## 'data.frame': 481 obs. of 25 variables:
## $ X : int 1 2 3 4 5 6 7 8 9 10 ...
## $ sample_ID : chr "H-C-01-DNR-WET" "H-C-01-DNR-WET" "H-C-01-DNR-WET" "H-C-01-DNR-WET" ...
## $ site : chr "HOH" "HOH" "HOH" "HOH" ...
## $ depth_cm : int 13 40 76 104 11 30 45 75 9 27 ...
## $ helper : chr "H-C-01-DNR-WET|13" "H-C-01-DNR-WET|40" "H-C-01-DNR-WET|76" "H-C-01-DNR-WET|104" ...
## $ thickness_cm : int 13 27 36 28 11 19 15 30 9 18 ...
## $ rock_perc : num 0 0 5 40 0 15 40 55 0 0 ...
## $ BD_g_cm3 : num 0.07 0.16 0.51 0.88 0.38 0.49 0.7 0.57 0.72 0.95 ...
## $ field_texture : chr "P" "MP" "MM" "C" ...
## $ field_texture_binned: chr "O" "O" "O" "C" ...
## $ redox : chr NA NA NA NA ...
## $ carbon_perc : num 39.56 38.31 12.14 2.37 15.65 ...
## $ carbon_stock_g_cm2 : num 0.376 1.678 2.105 0.352 0.65 ...
## $ NDYI : num 0.543 0.543 0.543 0.543 0.579 ...
## $ Geomorphon_Class : int 10 10 10 10 10 10 10 10 10 10 ...
## $ Temp : num 10.2 10.2 10.2 10.2 10.2 ...
## $ Precip : num 3070 3070 3070 3070 3054 ...
## $ NDVI : num 0.84 0.84 0.84 0.84 0.862 ...
## $ MNDWI : num -0.477 -0.477 -0.477 -0.477 -0.446 ...
## $ EVI : num 0.455 0.455 0.455 0.455 0.534 ...
## $ SCI : num 0.649 0.649 0.649 0.649 0.648 ...
## $ GEOLOGIC_AGE : chr "Pleistocene" "Pleistocene" "Pleistocene" "Pleistocene" ...
## $ WIP : num 0.873 0.873 0.873 0.873 0.896 ...
## $ x : num 411350 411350 411350 411350 409931 ...
## $ y : num 5295086 5295086 5295086 5295086 5294415 ...
## 'data.frame': 276 obs. of 25 variables:
## $ X : int 1 2 3 4 5 6 7 8 9 10 ...
## $ sample_ID : chr "CC_S26_R3" "CC_S26_R3" "CC_S26_R3" "CC_S27_R3" ...
## $ upper_depth : int 0 30 60 0 30 60 0 30 60 0 ...
## $ lower_depth : int 30 60 100 30 60 100 30 60 100 30 ...
## $ SOC_stock_spline : num 0.1636 0.0983 0.0531 0.6126 0.6936 ...
## $ site : chr "COL" "COL" "COL" "COL" ...
## $ geomorphons : chr "f" "f" "f" "i" ...
## $ HLI : num 0.849 0.849 0.849 0.502 0.502 ...
## $ Precip : num 629 629 629 664 664 ...
## $ Temp : num 6.66 6.66 6.66 6.03 6.03 ...
## $ WIP : num 0.209 0.209 0.209 0.127 0.127 ...
## $ tree_canopy_cover: num 31 31 31 56 56 ...
## $ NDVI : num 0.537 0.537 0.537 0.739 0.739 ...
## $ MNDWI : num -0.517 -0.517 -0.517 -0.554 -0.554 ...
## $ EVI : num 0.292 0.292 0.292 0.387 0.387 ...
## $ SCI : num 0.511 0.511 0.511 0.616 0.616 ...
## $ SAVI : num 0.29 0.29 0.29 0.383 0.383 ...
## $ EMBI : num 0.188 0.188 0.188 0.162 0.162 ...
## $ DSI : num 0.92 0.92 0.92 0.625 0.625 ...
## $ DSWI1 : num 1.16 1.16 1.16 1.61 1.61 ...
## $ NDYI : num 0.226 0.226 0.226 0.36 0.36 ...
## $ LSWI : num 0.0556 0.0556 0.0556 0.2314 0.2314 ...
## $ ANDWI : num -0.539 -0.539 -0.539 -0.643 -0.643 ...
## $ LITHOLOGY : chr "continental glacial outwash, Fraser-age" "continental glacial outwash, Fraser-age" "continental glacial outwash, Fraser-age" "continental glacial outwash, Fraser-age" ...
## $ GEOLOGIC_AGE : chr "Pleistocene" "Pleistocene" "Pleistocene" "Pleistocene" ...
With many potential predictors we can eliminate some by examining correlation plots
There is weak correlation between SOC_stock_spline
and
the selected covariates. Additionally there is some collinearity between
predictors. The spectral predictors such as EVI (Enhanced Vegetation
Index) and SCI (Soil Condition Index) are well correlated with each
other and also with NDVI (Normalized Difference Vegetation Index). We
can remove these based on the correlation coefficient > 0.7.
Now we can begin examining the distribution of carbon stock values in the dataset to choose the appropriate transformation. We also scale and center the numeric predictor variables.
## 'data.frame': 276 obs. of 25 variables:
## $ X : num [1:276, 1] -1.72 -1.71 -1.7 -1.69 -1.67 ...
## ..- attr(*, "scaled:center")= num 138
## ..- attr(*, "scaled:scale")= num 79.8
## $ sample_ID : chr "CC_S26_R3" "CC_S26_R3" "CC_S26_R3" "CC_S27_R3" ...
## $ upper_depth : num [1:276, 1] -1.1849 0.0492 1.2833 -1.1849 0.0492 ...
## ..- attr(*, "scaled:center")= num 28.8
## ..- attr(*, "scaled:scale")= num 24.3
## $ lower_depth : int 30 60 100 30 60 100 30 60 100 30 ...
## $ SOC_stock_spline : num 0.1636 0.0983 0.0531 0.6126 0.6936 ...
## $ site : chr "COL" "COL" "COL" "COL" ...
## $ geomorphons : chr "f" "f" "f" "i" ...
## $ HLI : num [1:276, 1] 1.264 1.264 1.264 -0.907 -0.907 ...
## ..- attr(*, "scaled:center")= num 0.647
## ..- attr(*, "scaled:scale")= num 0.16
## $ Precip : num [1:276, 1] -1.21 -1.21 -1.21 -1.18 -1.18 ...
## ..- attr(*, "scaled:center")= num 1870
## ..- attr(*, "scaled:scale")= num 1022
## $ Temp : num [1:276, 1] -1.09 -1.09 -1.09 -1.48 -1.48 ...
## ..- attr(*, "scaled:center")= num 8.42
## ..- attr(*, "scaled:scale")= num 1.62
## $ WIP : num [1:276, 1] -0.801 -0.801 -0.801 -1.08 -1.08 ...
## ..- attr(*, "scaled:center")= num 0.443
## ..- attr(*, "scaled:scale")= num 0.292
## $ tree_canopy_cover: num [1:276, 1] -2.25 -2.25 -2.25 -0.264 -0.264 ...
## ..- attr(*, "scaled:center")= num 59.3
## ..- attr(*, "scaled:scale")= num 12.6
## $ NDVI : num [1:276, 1] -3.5 -3.5 -3.5 -1.1 -1.1 ...
## ..- attr(*, "scaled:center")= num 0.832
## ..- attr(*, "scaled:scale")= num 0.0843
## $ MNDWI : num [1:276, 1] -0.324 -0.324 -0.324 -1.105 -1.105 ...
## ..- attr(*, "scaled:center")= num -0.502
## ..- attr(*, "scaled:scale")= num 0.0475
## $ EVI : num [1:276, 1] -1.932 -1.932 -1.932 -0.953 -0.953 ...
## ..- attr(*, "scaled:center")= num 0.48
## ..- attr(*, "scaled:scale")= num 0.0972
## $ SCI : num [1:276, 1] -2.81 -2.81 -2.81 -0.8 -0.8 ...
## ..- attr(*, "scaled:center")= num 0.658
## ..- attr(*, "scaled:scale")= num 0.0526
## $ SAVI : num [1:276, 1] -2.17 -2.17 -2.17 -1.02 -1.02 ...
## ..- attr(*, "scaled:center")= num 0.466
## ..- attr(*, "scaled:scale")= num 0.0811
## $ EMBI : num [1:276, 1] 1.52 1.52 1.52 1.3 1.3 ...
## ..- attr(*, "scaled:center")= num 0.00762
## ..- attr(*, "scaled:scale")= num 0.119
## $ DSI : num [1:276, 1] 3.03 3.03 3.03 1.2 1.2 ...
## ..- attr(*, "scaled:center")= num 0.432
## ..- attr(*, "scaled:scale")= num 0.161
## $ DSWI1 : num [1:276, 1] -1.99 -1.99 -1.99 -1.36 -1.36 ...
## ..- attr(*, "scaled:center")= num 2.58
## ..- attr(*, "scaled:scale")= num 0.714
## $ NDYI : num [1:276, 1] -2.215 -2.215 -2.215 -0.683 -0.683 ...
## ..- attr(*, "scaled:center")= num 0.419
## ..- attr(*, "scaled:scale")= num 0.0874
## $ LSWI : num [1:276, 1] -2.61 -2.61 -2.61 -1.33 -1.33 ...
## ..- attr(*, "scaled:center")= num 0.413
## ..- attr(*, "scaled:scale")= num 0.137
## $ ANDWI : num [1:276, 1] 2.92 2.92 2.92 1.08 1.08 ...
## ..- attr(*, "scaled:center")= num -0.703
## ..- attr(*, "scaled:scale")= num 0.0564
## $ LITHOLOGY : chr "continental glacial outwash, Fraser-age" "continental glacial outwash, Fraser-age" "continental glacial outwash, Fraser-age" "continental glacial outwash, Fraser-age" ...
## $ GEOLOGIC_AGE : chr "Pleistocene" "Pleistocene" "Pleistocene" "Pleistocene" ...
Now build models using log transformed carbon stock data. We need to
specify that sample_ID
is a random effect because of the
multiple samples at one location. lower_depth
will be a
random slope to adjust model based on how it is affected by depth. We
could use a Generalized Linear Model or a Generalized Linear Mixed Model
here too but they often fail to converge. We will model with the
log-transformed soil carbon stocks in the linear mixed model
# Full model with hypothesized parameters
mod1 <- lmer(log10(SOC_stock_spline) ~ WIP * Precip * Temp *
ANDWI * NDYI + (GEOLOGIC_AGE) + lower_depth + (lower_depth ||
sample_ID), data = wa_spl_dat_scale, REML = F)
# No interactions
mod2 <- lmer(log10(SOC_stock_spline) ~ WIP + Precip + Temp +
ANDWI + NDYI + (GEOLOGIC_AGE) + lower_depth + (lower_depth ||
sample_ID), data = wa_spl_dat_scale, REML = F)
# No spectral
mod3 <- lmer(log10(SOC_stock_spline) ~ WIP * Precip * Temp +
(GEOLOGIC_AGE) + lower_depth + (lower_depth || sample_ID),
data = wa_spl_dat_scale, REML = F)
# No geology
mod4 <- lmer(log10(SOC_stock_spline) ~ WIP * Precip * Temp +
ANDWI + NDYI + lower_depth + (lower_depth || sample_ID),
data = wa_spl_dat_scale, REML = F)
# No WIP w/ interaction
mod5 <- lmer(log10(SOC_stock_spline) ~ Precip * Temp + ANDWI +
NDYI + (GEOLOGIC_AGE) + lower_depth + (lower_depth || sample_ID),
data = wa_spl_dat_scale, REML = F)
## Warning in checkConv(attr(opt, "derivs"), opt$par, ctrl = control$checkConv, :
## Model failed to converge with max|grad| = 1.5562 (tol = 0.002, component 1)
# No WIP w/o interaction
mod6 <- lmer(log10(SOC_stock_spline) ~ Precip + Temp + ANDWI +
NDYI + (GEOLOGIC_AGE) + lower_depth + (lower_depth || sample_ID),
data = wa_spl_dat_scale, REML = F)
# No climate
mod7 <- lmer(log10(SOC_stock_spline) ~ WIP + +ANDWI + NDYI +
(GEOLOGIC_AGE) + lower_depth + (lower_depth || sample_ID),
data = wa_spl_dat_scale, REML = F)
# No climate No spectral
mod8 <- lmer(log10(SOC_stock_spline) ~ WIP + (GEOLOGIC_AGE) +
lower_depth + (lower_depth || sample_ID), data = wa_spl_dat_scale,
REML = F)
## Warning in checkConv(attr(opt, "derivs"), opt$par, ctrl = control$checkConv, :
## Model failed to converge with max|grad| = 0.0035401 (tol = 0.002, component 1)
# Just WIP
mod9 <- lmer(log10(SOC_stock_spline) ~ WIP + lower_depth + (lower_depth ||
sample_ID), data = wa_spl_dat_scale, REML = F)
# NULL
mod10 <- lmer(log10(SOC_stock_spline) ~ lower_depth + (lower_depth ||
sample_ID), data = wa_spl_dat_scale, REML = F)
## Warning in checkConv(attr(opt, "derivs"), opt$par, ctrl = control$checkConv, :
## Model failed to converge with max|grad| = 0.431569 (tol = 0.002, component 1)
Note models 5, 8, and 10, failed to converge
Pairwise comparisons between the top, largest model and the rest show if models with fewer parameters are significantly different. Considering AIC will help determine the most parsimonious model with the best fit to the data.
Models | npar | AIC | BIC | logLik | deviance | Chisq | Df | Pr(>Chisq) | Significant |
---|---|---|---|---|---|---|---|---|---|
mod2 | 15 | 220.4259 | 274.7319 | -95.21296 | 190.4259 | NA | NA | NA | |
mod1 | 41 | 235.9051 | 384.3416 | -76.95256 | 153.9051 | 36.52080 | 26 | 0.0824988 | |
mod3 | 17 | 223.7913 | 285.3381 | -94.89564 | 189.7913 | NA | NA | NA | |
mod1 | 41 | 235.9051 | 384.3416 | -76.95256 | 153.9051 | 35.88615 | 24 | 0.0562988 | |
mod4 | 14 | 227.0389 | 277.7245 | -99.51945 | 199.0389 | NA | NA | NA | |
mod1 | 41 | 235.9051 | 384.3416 | -76.95256 | 153.9051 | 45.13377 | 27 | 0.0157401 | *** |
mod5 | 15 | 229.4435 | 283.7495 | -99.72175 | 199.4435 | NA | NA | NA | |
mod1 | 41 | 235.9051 | 384.3416 | -76.95256 | 153.9051 | 45.53837 | 26 | 0.0102674 | *** |
mod6 | 14 | 229.4083 | 280.0939 | -100.70415 | 201.4083 | NA | NA | NA | |
mod1 | 41 | 235.9051 | 384.3416 | -76.95256 | 153.9051 | 47.50318 | 27 | 0.0087196 | *** |
mod7 | 13 | 230.4420 | 277.5072 | -102.22099 | 204.4420 | NA | NA | NA | |
mod1 | 41 | 235.9051 | 384.3416 | -76.95256 | 153.9051 | 50.53685 | 28 | 0.0056307 | *** |
mod8 | 11 | 247.7352 | 287.5596 | -112.86760 | 225.7352 | NA | NA | NA | |
mod1 | 41 | 235.9051 | 384.3416 | -76.95256 | 153.9051 | 71.83007 | 30 | 0.0000275 | *** |
mod9 | 6 | 245.6252 | 267.3476 | -116.81262 | 233.6252 | NA | NA | NA | |
mod1 | 41 | 235.9051 | 384.3416 | -76.95256 | 153.9051 | 79.72011 | 35 | 0.0000243 | *** |
mod10 | 5 | 247.8359 | 265.9379 | -118.91797 | 237.8359 | NA | NA | NA | |
mod1 | 41 | 235.9051 | 384.3416 | -76.95256 | 153.9051 | 83.93081 | 36 | 0.0000107 | *** |
From the ANOVAs it looks like model 2
&
model 3
performed just as well as
model 1
meaning we should proceed with these.
model 2
removed interactions and model 3
removed all the spectral metrics but kept interaction. However,
model 2
has 2 less parameters than model 3
which gives it a lower AIC when compared together.
npar | AIC | BIC | logLik | deviance | Chisq | Df | Pr(>Chisq) | |
---|---|---|---|---|---|---|---|---|
mod2 | 15 | 220.4259 | 274.7319 | -95.21296 | 190.4259 | NA | NA | NA |
mod3 | 17 | 223.7913 | 285.3381 | -94.89564 | 189.7913 | 0.6346545 | 2 | 0.7280925 |
To hone in on differences between variations in model 2
we can remove some of the variables and compare AICs for a more
parsimonious fit.
# No ANDWI
mod2.1 <- lmer(log10(SOC_stock_spline) ~ WIP + Precip + Temp +
NDYI + (GEOLOGIC_AGE) + lower_depth + (lower_depth || sample_ID),
data = wa_spl_dat_scale, REML = F)
# No ANDWI, No NDYI
mod2.2 <- lmer(log10(SOC_stock_spline) ~ WIP + Precip + Temp +
(GEOLOGIC_AGE) + lower_depth + (lower_depth || sample_ID),
data = wa_spl_dat_scale, REML = F)
## Warning in checkConv(attr(opt, "derivs"), opt$par, ctrl = control$checkConv, :
## Model failed to converge with max|grad| = 0.0070341 (tol = 0.002, component 1)
# No NDYI
mod2.3 <- lmer(log10(SOC_stock_spline) ~ WIP + Precip + Temp +
ANDWI + (GEOLOGIC_AGE) + lower_depth + (lower_depth || sample_ID),
data = wa_spl_dat_scale, REML = F)
## Warning in checkConv(attr(opt, "derivs"), opt$par, ctrl = control$checkConv, :
## Model failed to converge with max|grad| = 0.00238014 (tol = 0.002, component 1)
# No Geology
mod2.4 <- lmer(log10(SOC_stock_spline) ~ WIP + Precip + Temp +
ANDWI + NDYI + lower_depth + (lower_depth || sample_ID),
data = wa_spl_dat_scale, REML = F)
## Warning in checkConv(attr(opt, "derivs"), opt$par, ctrl = control$checkConv, :
## Model failed to converge with max|grad| = 0.171981 (tol = 0.002, component 1)
# No NDYI or Geology
mod2.5 <- lmer(log10(SOC_stock_spline) ~ WIP + Precip + Temp +
ANDWI + lower_depth + (lower_depth || sample_ID), data = wa_spl_dat_scale,
REML = F)
## Warning in checkConv(attr(opt, "derivs"), opt$par, ctrl = control$checkConv, :
## Model failed to converge with max|grad| = 0.0427664 (tol = 0.002, component 1)
# No ANDWI or Geology
mod2.6 <- lmer(log10(SOC_stock_spline) ~ WIP + Precip + Temp +
NDYI + lower_depth + (lower_depth || sample_ID), data = wa_spl_dat_scale,
REML = F)
# No spectral or Geology
mod2.7 <- lmer(log10(SOC_stock_spline) ~ WIP + Precip + Temp +
lower_depth + (lower_depth || sample_ID), data = wa_spl_dat_scale,
REML = F)
Note models 2, 3, 4, 5 failed to converge
Models | npar | AIC | BIC | logLik | deviance | Chisq | Df | Pr(>Chisq) | Significant |
---|---|---|---|---|---|---|---|---|---|
mod2.1 | 14 | 218.8356 | 269.5212 | -95.41779 | 190.8356 | NA | NA | NA | |
mod2 | 15 | 220.4259 | 274.7319 | -95.21296 | 190.4259 | 0.409648 | 1 | 0.5221482 | |
mod2.2 | 13 | 219.4581 | 266.5234 | -96.72907 | 193.4581 | NA | NA | NA | |
mod2 | 15 | 220.4259 | 274.7319 | -95.21296 | 190.4259 | 3.032215 | 2 | 0.2195649 | |
mod2.3 | 14 | 221.3225 | 272.0082 | -96.66127 | 193.3225 | NA | NA | NA | |
mod2 | 15 | 220.4259 | 274.7319 | -95.21296 | 190.4259 | 2.896615 | 1 | 0.0887658 | |
mod2.4 | 10 | 221.4128 | 257.6168 | -100.70639 | 201.4128 | NA | NA | NA | |
mod2 | 15 | 220.4259 | 274.7319 | -95.21296 | 190.4259 | 10.986856 | 5 | 0.0516412 | |
mod2.5 | 9 | 227.2392 | 259.8228 | -104.61960 | 209.2392 | NA | NA | NA | |
mod2 | 15 | 220.4259 | 274.7319 | -95.21296 | 190.4259 | 18.813263 | 6 | 0.0044909 | *** |
mod2.6 | 9 | 220.0621 | 252.6457 | -101.03105 | 202.0621 | NA | NA | NA | |
mod2 | 15 | 220.4259 | 274.7319 | -95.21296 | 190.4259 | 11.636166 | 6 | 0.0705954 | |
mod2.7 | 8 | 226.0611 | 255.0243 | -105.03055 | 210.0611 | NA | NA | NA | |
mod2 | 15 | 220.4259 | 274.7319 | -95.21296 | 190.4259 | 19.635164 | 7 | 0.0064141 | *** |
Model 2.1
and 2.2
are the lowest AIC and
not significantly different from the fit in Model 2
.
Because Model 2.2
failed to converge, we will move forward
with Model 2.1
.
Now we can look at the table of all models and compare AICs. We find
that Model 2.1 has the lowest AIC followed by Model 2.2
which failed to converge
AIC | formula | name | delta |
---|---|---|---|
235.91 | WIP * Precip * Temp * ANDWI * NDYI + (GEOLOGIC_AGE) + lower_depth + ((1 | sample_ID) + (0 + lower_depth | sample_ID)) | mod1 | 17.07 |
220.43 | WIP + Precip + Temp + ANDWI + NDYI + (GEOLOGIC_AGE) + lower_depth + ((1 | sample_ID) + (0 + lower_depth | sample_ID)) | mod2 | 1.59 |
223.79 | WIP * Precip * Temp + (GEOLOGIC_AGE) + lower_depth + ((1 | sample_ID) + (0 + lower_depth | sample_ID)) | mod3 | 4.95 |
227.04 | WIP * Precip * Temp + ANDWI + NDYI + lower_depth + ((1 | sample_ID) + (0 + lower_depth | sample_ID)) | mod4 | 8.20 |
229.44 | Precip * Temp + ANDWI + NDYI + (GEOLOGIC_AGE) + lower_depth + ((1 | sample_ID) + (0 + lower_depth | sample_ID)) | mod5 | 10.60 |
229.41 | Precip + Temp + ANDWI + NDYI + (GEOLOGIC_AGE) + lower_depth + ((1 | sample_ID) + (0 + lower_depth | sample_ID)) | mod6 | 10.57 |
230.44 | WIP + +ANDWI + NDYI + (GEOLOGIC_AGE) + lower_depth + ((1 | sample_ID) + (0 + lower_depth | sample_ID)) | mod7 | 11.60 |
247.74 | WIP + (GEOLOGIC_AGE) + lower_depth + ((1 | sample_ID) + (0 + lower_depth | sample_ID)) | mod8 | 28.90 |
245.63 | WIP + lower_depth + ((1 | sample_ID) + (0 + lower_depth | sample_ID)) | mod9 | 26.79 |
218.84 | WIP + Precip + Temp + NDYI + (GEOLOGIC_AGE) + lower_depth + ((1 | sample_ID) + (0 + lower_depth | sample_ID)) | mod2.1 | 0.00 |
219.46 | WIP + Precip + Temp + (GEOLOGIC_AGE) + lower_depth + ((1 | sample_ID) + (0 + lower_depth | sample_ID)) | mod2.2 | 0.62 |
221.32 | WIP + Precip + Temp + ANDWI + (GEOLOGIC_AGE) + lower_depth + ((1 | sample_ID) + (0 + lower_depth | sample_ID)) | mod2.3 | 2.48 |
221.41 | WIP + Precip + Temp + ANDWI + NDYI + lower_depth + ((1 | sample_ID) + (0 + lower_depth | sample_ID)) | mod2.4 | 2.57 |
227.24 | WIP + Precip + Temp + ANDWI + lower_depth + ((1 | sample_ID) + (0 + lower_depth | sample_ID)) | mod2.5 | 8.40 |
220.06 | WIP + Precip + Temp + NDYI + lower_depth + ((1 | sample_ID) + (0 + lower_depth | sample_ID)) | mod2.6 | 1.22 |
226.06 | WIP + Precip + Temp + lower_depth + ((1 | sample_ID) + (0 + lower_depth | sample_ID)) | mod2.7 | 7.22 |
The model \(R^2\) for
mod2.1
is 0.892
Confidence intervals calculated with bootstrapping show that
WIP
and Precip
are the most significant
contributors to SOC stock. the overlapping confidence intervals do not
show much separation between WIP
and Precip
but there is more variation in the Precip
confidence
interval.
Model 2.1 | |||
---|---|---|---|
Predictors | Estimates | CI | p |
(Intercept) | -0.24 | -0.63 – 0.11 | 0.192 |
WIP | 0.12 | 0.05 – 0.19 | <0.001 |
Precip | 0.18 | 0.03 – 0.32 | 0.020 |
Temp | -0.00 | -0.12 – 0.12 | 0.992 |
NDYI | 0.07 | -0.02 – 0.16 | 0.116 |
GEOLOGIC AGE [Miocene-Eocene] |
-0.44 | -0.90 – 0.09 | 0.108 |
GEOLOGIC AGE [Oligocene-Eocene] |
0.09 | -0.29 – 0.52 | 0.654 |
GEOLOGIC AGE [Pleistocene] |
0.04 | -0.33 – 0.46 | 0.844 |
GEOLOGIC AGE [pre-Tertiary] |
-0.05 | -0.56 – 0.48 | 0.846 |
GEOLOGIC AGE [Quaternary] | -0.28 | -0.70 – 0.20 | 0.280 |
lower depth | -0.01 | -0.01 – -0.01 | <0.001 |
Random Effects | |||
σ2 | 0.05 | ||
τ00 sample_ID | 0.04 | ||
τ11 sample_ID.lower_depth | 0.00 | ||
ρ01 | |||
ρ01 | |||
ICC | 0.40 | ||
N sample_ID | 96 | ||
Observations | 276 | ||
Marginal R2 / Conditional R2 | 0.554 / 0.731 |
Looking at some partial dependence plots we can see some
relationships between the predicted values and WIP
and
Precip
. The pattern in Precip
shows the
differences between the two study areas. with wetter areas having a
higher range of SOC stock
##
## Attaching package: 'ggpubr'
## The following object is masked from 'package:cowplot':
##
## get_legend
## The following object is masked from 'package:terra':
##
## rotate
We can now visualize our candidate model 2.1 in a predicted vs. actual plot
With the explicit hypothesized approach, we can only evaluate so many models. However, there may be interactions between variables that might be more powerful controls and could give insight into relationships between controls.
The dredge
function can be used to look through multiple
combinations of models from a globally defined model. The number of
parameters are limited so we run two models: one with the
NDYI
parameter and another with the MNDWI
parameter included with Temp
, Precip
, and
WIP
fully interacting. GEOLOGIC_AGE
is also
included as an additional, non-interaction parameter and random effects
are constant.
gmod1 <- lmer(log10(SOC_stock_spline) ~ WIP * Temp * Precip *
NDYI + GEOLOGIC_AGE + lower_depth + (lower_depth || sample_ID),
data = wa_spl_dat_scale, REML = F, na.action = "na.fail")
dredge1 <- dredge(gmod1, beta = "sd")
(Intercept) | GEOLOGIC_AGE | lower_depth | NDYI | Precip | Temp | WIP | NDYI:Precip | NDYI:Temp | NDYI:WIP | Precip:Temp | Precip:WIP | Temp:WIP | NDYI:Precip:Temp | NDYI:Precip:WIP | NDYI:Temp:WIP | Precip:Temp:WIP | NDYI:Precip:Temp:WIP | df | logLik | AICc | delta | weight | |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
24575 | 0 | NA | -0.3859947 | 0.7855336 | -0.1358170 | 0.3672985 | 0.1104673 | -0.3213815 | 0.4300134 | 0.1622869 | -0.1453279 | -0.2295751 | 0.0928943 | -0.5578764 | NA | 0.1604055 | NA | NA | 17 | -90.27224 | 216.9166 | 0.000000 | 0.2995148 |
48 | 0 |
|
-0.3836525 | 0.1317391 | 0.3088780 | NA | 0.2106672 | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | 13 | -95.41784 | 218.2250 | 1.308419 | 0.1557036 |
14335 | 0 | NA | -0.3851104 | 0.7752096 | -0.0954976 | 0.3346213 | 0.0855746 | -0.1921094 | 0.2650052 | 0.1003702 | -0.1163907 | -0.1221449 | NA | -0.5543885 | 0.1354660 | NA | NA | NA | 16 | -92.10787 | 218.3161 | 1.399547 | 0.1487683 |
57343 | 0 | NA | -0.3860028 | 0.7545347 | -0.1191113 | 0.3462410 | 0.1768655 | -0.3177870 | 0.4410082 | 0.1212566 | -0.1486504 | -0.1461895 | 0.0151525 | -0.5340127 | NA | 0.2205693 | -0.11752 | NA | 18 | -89.84743 | 218.3563 | 1.439757 | 0.1458072 |
1072 | 0 |
|
-0.3836264 | 0.1396997 | 0.2998503 | NA | 0.2037453 | NA | NA | NA | NA | -0.0801549 | NA | NA | NA | NA | NA | NA | 14 | -94.51771 | 218.6446 | 1.728044 | 0.1262346 |
32767 | 0 | NA | -0.3865633 | 0.7828342 | -0.1474136 | 0.3813838 | 0.1424312 | -0.3788402 | 0.4947691 | 0.1940555 | -0.1510079 | -0.2464318 | 0.0962508 | -0.5516318 | -0.1055494 | 0.2452163 | NA | NA | 18 | -90.00966 | 218.6808 | 1.764225 | 0.1239715 |
By using dredge
we find that there are a few candidate
models that fit the data well.
The difference between the model AICs is lower than a common
threshold of 2 but there is a difference between the top model and
others. This model contains multiple interactions including
NDYI:Precip
, NDYI:Temp
, NDYI:WIP
Precip:Temp
, Precip:WIP
,
Temp:WIP
,NDYI:Precip:Temp
We can visualize some of these interactions with interaction plots
against SOC stock. It looks like there is strong interaction and
separation with the NDYI
parameter and WIP
We can look at the bootstrapped confidence intervals to examine the
significance of the predictors between the two models. Interestingly the
WIP
parameter alone is not significant. And there is high
effect sizes for NDYI
, NDYI:Temp
, and
(NDYI × Precip) × Temp
Dredge Model | |||
---|---|---|---|
Predictors | Estimates | CI | p |
(Intercept) | -0.16 | -0.30 – -0.02 | 0.014 |
lower depth | -0.01 | -0.01 – -0.01 | <0.001 |
NDYI | 0.44 | 0.24 – 0.64 | <0.001 |
Precip | -0.07 | -0.27 – 0.12 | 0.428 |
Temp | 0.21 | 0.02 – 0.39 | 0.028 |
WIP | 0.06 | -0.01 – 0.13 | 0.074 |
NDYI × Precip | -0.20 | -0.33 – -0.06 | 0.002 |
NDYI × Temp | 0.25 | 0.08 – 0.42 | 0.006 |
NDYI × WIP | 0.09 | 0.01 – 0.18 | 0.038 |
Precip × Temp | -0.11 | -0.26 – 0.03 | 0.104 |
Precip × WIP | -0.14 | -0.27 – 0.00 | 0.064 |
Temp × WIP | 0.06 | -0.07 – 0.16 | 0.326 |
(NDYI × Precip) × Temp | -0.29 | -0.47 – -0.11 | 0.002 |
(NDYI × Temp) × WIP | 0.08 | 0.01 – 0.15 | 0.032 |
Random Effects | |||
σ2 | 0.05 | ||
τ00 sample_ID | 0.02 | ||
τ11 sample_ID.lower_depth | 0.00 | ||
ρ01 | |||
ρ01 | |||
ICC | 0.32 | ||
N sample_ID | 96 | ||
Observations | 276 | ||
Marginal R2 / Conditional R2 | 0.611 / 0.735 |
The R\(^{2}\) of the Dredge Model is R\(^{2}\) = 0.897
We now can examine the plot with the Dredge Model
Using Random Forest can see if a more flexible, machine learning model can capture any non-linear relationships or interaction relationships between parameters.
lower_depth
is included as a predictor here in the data
setup
We then use the tuneRF
to choose the appropriate
mtry
number
rf_model <- randomForest((SOC_stock_spline) ~ ., ntree = 1000,
mtry = 4, importance = TRUE, data = full)
plot(rf_model)
rf.full <- predict(rf_model, newdata = full)
vip::vip(rf_model)
61.63% of the out of bag (OOB) variation is explained. Looks
likePrecip
, lower_depth
, WIP
, and
NDYI
are the big drivers.
error.variable | value |
---|---|
MAE | 0.1306526 |
MAE^2 | 0.0170701 |
RMSE | 0.1768086 |
RMSE non-log | 1.1934027 |
Stdev of non-log SOC stock | 0.4999442 |
R^2 on full dataset | 0.9230161 |
Looking at partial dependency plots we can see how
Precip
and NDYI
have higher influence on
separating variation in model predictions. WIP
less so but
seems to play a role on the higher end.
Here, we show the fit between sampled and predicted SOC stocks for the full Random Forest Model