Mapping soil organic carbon (SOC) as a function climatic, topographic, and geologic factors across spatial extents is key for identifying vulnerable SOC stocks. Yet, large, vulnerable SOC stocks are contained in the organic soils of wetlands and can occur in close proximity to non-wetland soils but are controlled and function differently due to saturated soil conditions. Thus, wetland SOC is often omitted or underrepresented in SOC mapping and modeling making it difficult to assess controls compared to non-wetland soil types. Here, we use geospatial data and SOC stocks from 481 soil horizons in 96 pedons in both wetlands and non-wetlands across three ecoregion study areas of the Pacific Northwest to assess climatic, geologic, and topographic controls on SOC.
WIP is a significant predictor of SOC stock
WIP interaction with climate (Precip & Temp) is a significant predictor of SOC stock
Geology is a significant predictor of SOC stock
This dataset has observations of SOC% and SOC stock for 96 locations in Washington state. There are geospatial covariates as well that are used to build models to predict SOC% and/or SOC stock
## 'data.frame': 481 obs. of 25 variables:
## $ X : int 1 2 3 4 5 6 7 8 9 10 ...
## $ sample_ID : chr "H-C-01-DNR-WET" "H-C-01-DNR-WET" "H-C-01-DNR-WET" "H-C-01-DNR-WET" ...
## $ site : chr "HOH" "HOH" "HOH" "HOH" ...
## $ depth_cm : int 13 40 76 104 11 30 45 75 9 27 ...
## $ helper : chr "H-C-01-DNR-WET|13" "H-C-01-DNR-WET|40" "H-C-01-DNR-WET|76" "H-C-01-DNR-WET|104" ...
## $ thickness_cm : int 13 27 36 28 11 19 15 30 9 18 ...
## $ rock_perc : num 0 0 5 40 0 15 40 55 0 0 ...
## $ BD_g_cm3 : num 0.07 0.16 0.51 0.88 0.38 0.49 0.7 0.57 0.72 0.95 ...
## $ field_texture : chr "P" "MP" "MM" "C" ...
## $ field_texture_binned: chr "O" "O" "O" "C" ...
## $ redox : chr NA NA NA NA ...
## $ carbon_perc : num 39.56 38.31 12.14 2.37 15.65 ...
## $ carbon_stock_g_cm2 : num 0.376 1.678 2.105 0.352 0.65 ...
## $ NDYI : num 0.543 0.543 0.543 0.543 0.579 ...
## $ Geomorphon_Class : int 10 10 10 10 10 10 10 10 10 10 ...
## $ Temp : num 10.2 10.2 10.2 10.2 10.2 ...
## $ Precip : num 3070 3070 3070 3070 3054 ...
## $ NDVI : num 0.84 0.84 0.84 0.84 0.862 ...
## $ MNDWI : num -0.477 -0.477 -0.477 -0.477 -0.446 ...
## $ EVI : num 0.455 0.455 0.455 0.455 0.534 ...
## $ SCI : num 0.649 0.649 0.649 0.649 0.648 ...
## $ GEOLOGIC_AGE : chr "Pleistocene" "Pleistocene" "Pleistocene" "Pleistocene" ...
## $ WIP : num 0.873 0.873 0.873 0.873 0.896 ...
## $ x : num 411350 411350 411350 411350 409931 ...
## $ y : num 5295086 5295086 5295086 5295086 5294415 ...
Examining covariate predictors using correlation plots
There is weak correlation between carbon_stock_g_cm2
and
the selected covariates. Additionally there is some collinearity between
predictors. The spectral predictors such as EVI (Enhanced Vegetation
Index) and SCI (Soil Condition Index) are well correlated with each
other and also with NDVI (Normalized Difference Vegetation Index). We
can remove these based on the correlation coefficient > 0.7.
Now we can begin examining the distribution of carbon stock values in the dataset to choose the appropriate transformation. We also scale and center the numeric predictor variables.
## 'data.frame': 481 obs. of 25 variables:
## $ X : num [1:481, 1] -1.73 -1.72 -1.71 -1.71 -1.7 ...
## ..- attr(*, "scaled:center")= num 241
## ..- attr(*, "scaled:scale")= num 139
## $ sample_ID : chr "H-C-01-DNR-WET" "H-C-01-DNR-WET" "H-C-01-DNR-WET" "H-C-01-DNR-WET" ...
## $ site : chr "HOH" "HOH" "HOH" "HOH" ...
## $ depth_cm : num [1:481, 1] -0.983 -0.306 0.596 1.297 -1.033 ...
## ..- attr(*, "scaled:center")= num 52.2
## ..- attr(*, "scaled:scale")= num 39.9
## $ helper : chr "H-C-01-DNR-WET|13" "H-C-01-DNR-WET|40" "H-C-01-DNR-WET|76" "H-C-01-DNR-WET|104" ...
## $ thickness_cm : num [1:481, 1] -0.501 0.443 1.05 0.51 -0.636 ...
## ..- attr(*, "scaled:center")= num 20.4
## ..- attr(*, "scaled:scale")= num 14.8
## $ rock_perc : num [1:481, 1] -0.693 -0.693 -0.477 1.031 -0.693 ...
## ..- attr(*, "scaled:center")= num 16.1
## ..- attr(*, "scaled:scale")= num 23.2
## $ BD_g_cm3 : num [1:481, 1] -1.805 -1.516 -0.391 0.797 -0.809 ...
## ..- attr(*, "scaled:center")= num 0.632
## ..- attr(*, "scaled:scale")= num 0.311
## $ field_texture : chr "P" "MP" "MM" "C" ...
## $ field_texture_binned: chr "O" "O" "O" "C" ...
## $ redox : chr NA NA NA NA ...
## $ carbon_perc : num 39.56 38.31 12.14 2.37 15.65 ...
## $ carbon_stock_g_cm2 : num 0.376 1.678 2.105 0.352 0.65 ...
## $ NDYI : num [1:481, 1] 1.51 1.51 1.51 1.51 1.93 ...
## ..- attr(*, "scaled:center")= num 0.414
## ..- attr(*, "scaled:scale")= num 0.0858
## $ Geomorphon_Class : num [1:481, 1] 0.777 0.777 0.777 0.777 0.777 ...
## ..- attr(*, "scaled:center")= num 8.02
## ..- attr(*, "scaled:scale")= num 2.55
## $ Temp : num [1:481, 1] 1.14 1.14 1.14 1.14 1.16 ...
## ..- attr(*, "scaled:center")= num 8.33
## ..- attr(*, "scaled:scale")= num 1.61
## $ Precip : num [1:481, 1] 1.23 1.23 1.23 1.23 1.21 ...
## ..- attr(*, "scaled:center")= num 1804
## ..- attr(*, "scaled:scale")= num 1031
## $ NDVI : num [1:481, 1] 0.138 0.138 0.138 0.138 0.407 ...
## ..- attr(*, "scaled:center")= num 0.829
## ..- attr(*, "scaled:scale")= num 0.0823
## $ MNDWI : num [1:481, 1] 0.475 0.475 0.475 0.475 1.154 ...
## ..- attr(*, "scaled:center")= num -0.499
## ..- attr(*, "scaled:scale")= num 0.0467
## $ EVI : num [1:481, 1] -0.171 -0.171 -0.171 -0.171 0.645 ...
## ..- attr(*, "scaled:center")= num 0.471
## ..- attr(*, "scaled:scale")= num 0.0965
## $ SCI : num [1:481, 1] -0.0964 -0.0964 -0.0964 -0.0964 -0.1109 ...
## ..- attr(*, "scaled:center")= num 0.654
## ..- attr(*, "scaled:scale")= num 0.0511
## $ GEOLOGIC_AGE : chr "Pleistocene" "Pleistocene" "Pleistocene" "Pleistocene" ...
## $ WIP : num [1:481, 1] 1.52 1.52 1.52 1.52 1.61 ...
## ..- attr(*, "scaled:center")= num 0.436
## ..- attr(*, "scaled:scale")= num 0.286
## $ x : num [1:481, 1] -1.07 -1.07 -1.07 -1.07 -1.08 ...
## ..- attr(*, "scaled:center")= num 610372
## ..- attr(*, "scaled:scale")= num 185155
## $ y : num [1:481, 1] -0.0158 -0.0158 -0.0158 -0.0158 -0.0238 ...
## ..- attr(*, "scaled:center")= num 5296406
## ..- attr(*, "scaled:scale")= num 83700
Now build models using log transformed carbon stock data. We need to
specify that sample_ID
is a random effect because of the
multiple samples at one location. depth
can be a random
slope to adjust model based on how it is affected by depth. We could use
a Generalized Linear Model or a Generalized Linear Mixed Model here too
but they often fail to converge. I am starting with the log-transformed
linear mixed model first to test the hypotheses
# Full model with hypothesized parameters
mod1 <- lmer(log(carbon_stock_g_cm2) ~ WIP * Precip * Temp +
MNDWI + NDVI + NDYI + (GEOLOGIC_AGE) + (1 + (depth_cm) |
sample_ID), data = wa_dat_scale, REML = F)
# No interactions
mod2 <- lmer(log(carbon_stock_g_cm2) ~ WIP + Precip + Temp +
MNDWI + NDVI + NDYI + (GEOLOGIC_AGE) + (1 + (depth_cm) |
sample_ID), data = wa_dat_scale, REML = F)
# No spectral
mod3 <- lmer(log(carbon_stock_g_cm2) ~ WIP * Precip * Temp +
(GEOLOGIC_AGE) + (1 + (depth_cm) | sample_ID), data = wa_dat_scale,
REML = F)
# No geology
mod4 <- lmer(log(carbon_stock_g_cm2) ~ WIP * Precip * Temp +
MNDWI + NDVI + NDYI + (1 + (depth_cm) | sample_ID), data = wa_dat_scale,
REML = F)
# No WIP w/ interaction
mod5 <- lmer(log(carbon_stock_g_cm2) ~ Precip * Temp + MNDWI +
NDVI + NDYI + (GEOLOGIC_AGE) + (1 + (depth_cm) | sample_ID),
data = wa_dat_scale, REML = F)
# No WIP w/o interaction
mod6 <- lmer(log(carbon_stock_g_cm2) ~ Precip + Temp + MNDWI +
NDVI + NDYI + (GEOLOGIC_AGE) + (1 + (depth_cm) | sample_ID),
data = wa_dat_scale, REML = F)
# No climate
mod7 <- lmer(log(carbon_stock_g_cm2) ~ WIP + MNDWI + NDVI + NDYI +
(GEOLOGIC_AGE) + (1 + (depth_cm) | sample_ID), data = wa_dat_scale,
REML = F)
# No climate No spectral
mod8 <- lmer(log(carbon_stock_g_cm2) ~ WIP + (GEOLOGIC_AGE) +
(1 + (depth_cm) | sample_ID), data = wa_dat_scale, REML = F)
# Just WIP
mod9 <- lmer(log(carbon_stock_g_cm2) ~ WIP + (1 + (depth_cm) |
sample_ID), data = wa_dat_scale, REML = F)
Pairwise comparisons between the top, global model and the rest. Note the table contains mod1 multiple times and has been encoded with an additional number
Models | npar | AIC | BIC | logLik | deviance | Chisq | Df | Pr(>Chisq) | Significant |
---|---|---|---|---|---|---|---|---|---|
mod2 | 16 | 1394.726 | 1461.540 | -681.3628 | 1362.726 | NA | NA | NA | |
mod1 | 20 | 1398.184 | 1481.701 | -679.0918 | 1358.184 | 4.541993 | 4 | 0.3375972 | |
mod3 | 17 | 1397.114 | 1468.104 | -681.5572 | 1363.114 | NA | NA | NA | |
mod1 | 20 | 1398.184 | 1481.701 | -679.0918 | 1358.184 | 4.930755 | 3 | 0.1769384 | |
mod4 | 15 | 1402.769 | 1465.407 | -686.3847 | 1372.769 | NA | NA | NA | |
mod1 | 20 | 1398.184 | 1481.701 | -679.0918 | 1358.184 | 14.585647 | 5 | 0.0122876 |
|
mod5 | 16 | 1406.090 | 1472.904 | -687.0450 | 1374.090 | NA | NA | NA | |
mod1 | 20 | 1398.184 | 1481.701 | -679.0918 | 1358.184 | 15.906251 | 4 | 0.0031476 | ** |
mod6 | 15 | 1405.705 | 1468.343 | -687.8523 | 1375.705 | NA | NA | NA | |
mod1 | 20 | 1398.184 | 1481.701 | -679.0918 | 1358.184 | 17.521028 | 5 | 0.0036107 | ** |
mod7 | 14 | 1402.863 | 1461.325 | -687.4314 | 1374.863 | NA | NA | NA | |
mod1 | 20 | 1398.184 | 1481.701 | -679.0918 | 1358.184 | 16.679065 | 6 | 0.0105380 |
|
mod8 | 11 | 1424.304 | 1470.238 | -701.1519 | 1402.304 | NA | NA | NA | |
mod1 | 20 | 1398.184 | 1481.701 | -679.0918 | 1358.184 | 44.120165 | 9 | 0.0000013 | *** |
mod9 | 6 | 1422.216 | 1447.271 | -705.1081 | 1410.216 | NA | NA | NA | |
mod1 | 20 | 1398.184 | 1481.701 | -679.0918 | 1358.184 | 52.032483 | 14 | 0.0000028 | *** |
From the ANOVAs it looks like model 2
&
model 3
performed just as well as
model 1
meaning we should proceed with these.
model 2
removed interactions and model 3
removed all the spectral metrics but kept interaction. However,
model 2
has 1 less parameter than model 3
which gives it a lower AIC when compared together.
npar | AIC | BIC | logLik | deviance | Chisq | Df | Pr(>Chisq) | |
---|---|---|---|---|---|---|---|---|
mod2 | 16 | 1394.726 | 1461.540 | -681.3628 | 1362.726 | NA | NA | NA |
mod3 | 17 | 1397.114 | 1468.104 | -681.5572 | 1363.114 | 0 | 1 | 1 |
# No MNDWI
mod2.1 <- lmer(log(carbon_stock_g_cm2) ~ WIP + Precip + Temp +
NDVI + NDYI + (GEOLOGIC_AGE) + (1 + (depth_cm) | sample_ID),
data = wa_dat_scale, REML = F)
# No MNDWI, No NDVI
mod2.2 <- lmer(log(carbon_stock_g_cm2) ~ WIP + Precip + Temp +
NDYI + (GEOLOGIC_AGE) + (1 + (depth_cm) | sample_ID), data = wa_dat_scale,
REML = F)
# No NDYI, No NDVI
mod2.3 <- lmer(log(carbon_stock_g_cm2) ~ WIP + Precip + Temp +
MNDWI + (GEOLOGIC_AGE) + (1 + (depth_cm) | sample_ID), data = wa_dat_scale,
REML = F)
# No NDYI, No MNDWI
mod2.4 <- lmer(log(carbon_stock_g_cm2) ~ WIP + Precip + Temp +
NDVI + (GEOLOGIC_AGE) + (1 + (depth_cm) | sample_ID), data = wa_dat_scale,
REML = F)
# No NDVI
mod2.5 <- lmer(log(carbon_stock_g_cm2) ~ WIP + Precip + Temp +
NDYI + MNDWI + (GEOLOGIC_AGE) + (1 + (depth_cm) | sample_ID),
data = wa_dat_scale, REML = F)
# No NDYI
mod2.6 <- lmer(log(carbon_stock_g_cm2) ~ WIP + Precip + Temp +
NDVI + MNDWI + (GEOLOGIC_AGE) + (1 + (depth_cm) | sample_ID),
data = wa_dat_scale, REML = F)
# No spectral
mod2.7 <- lmer(log(carbon_stock_g_cm2) ~ WIP + Precip + Temp +
(GEOLOGIC_AGE) + (1 + (depth_cm) | sample_ID), data = wa_dat_scale,
REML = F)
Models | npar | AIC | BIC | logLik | deviance | Chisq | Df | Pr(>Chisq) | Significant |
---|---|---|---|---|---|---|---|---|---|
mod2.1 | 15 | 1393.356 | 1455.994 | -681.6780 | 1363.356 | NA | NA | NA | |
mod2 | 16 | 1394.726 | 1461.540 | -681.3628 | 1362.726 | 0.6303396 | 1 | 0.4272308 | |
mod2.2 | 14 | 1391.475 | 1449.937 | -681.7375 | 1363.475 | NA | NA | NA | |
mod2 | 16 | 1394.726 | 1461.540 | -681.3628 | 1362.726 | 0.7492708 | 2 | 0.6875399 | |
mod2.3 | 14 | 1391.954 | 1450.417 | -681.9772 | 1363.954 | NA | NA | NA | |
mod2 | 16 | 1394.726 | 1461.540 | -681.3628 | 1362.726 | 1.2287240 | 2 | 0.5409859 | |
mod2.4 | 14 | 1394.233 | 1452.695 | -683.1163 | 1366.233 | NA | NA | NA | |
mod2 | 16 | 1394.726 | 1461.540 | -681.3628 | 1362.726 | 3.5070131 | 2 | 0.1731657 | |
mod2.5 | 15 | 1392.893 | 1455.531 | -681.4467 | 1362.893 | NA | NA | NA | |
mod2 | 16 | 1394.726 | 1461.540 | -681.3628 | 1362.726 | 0.1677554 | 1 | 0.6821144 | |
mod2.6 | 15 | 1393.914 | 1456.552 | -681.9568 | 1363.914 | NA | NA | NA | |
mod2 | 16 | 1394.726 | 1461.540 | -681.3628 | 1362.726 | 1.1879664 | 1 | 0.2757401 | |
mod2.7 | 13 | 1393.302 | 1447.588 | -683.6508 | 1367.302 | NA | NA | NA | |
mod2 | 16 | 1394.726 | 1461.540 | -681.3628 | 1362.726 | 4.5758474 | 3 | 0.2056238 |
Model 2.2 and 2.3 are the lowest AIC and not significantly different
from the fit in Model 2. The choice is either between MNDWI
or NDYI
to be included in the model with WIP
,
Precip
, Temp
, and Geology.
To
test , I want to see if WIP
interacts with these
variables.
# No MNDWI, No NDVI
mod2.2.1 <- lmer(log(carbon_stock_g_cm2) ~ Precip + Temp + WIP:NDYI +
(GEOLOGIC_AGE) + (1 + (depth_cm) | sample_ID), data = wa_dat_scale,
REML = F)
# No NDYI, No NDVI
mod2.3.1 <- lmer(log(carbon_stock_g_cm2) ~ Precip + Temp + WIP:MNDWI +
(GEOLOGIC_AGE) + (1 + (depth_cm) | sample_ID), data = wa_dat_scale,
REML = F)
npar | AIC | BIC | logLik | deviance | Chisq | Df | Pr(>Chisq) | Significant | |
---|---|---|---|---|---|---|---|---|---|
mod2.2.1 | 13 | 1409.712 | 1463.998 | -691.8560 | 1383.712 | NA | NA | NA | |
mod2.2 | 14 | 1391.475 | 1449.937 | -681.7375 | 1363.475 | 20.23698 | 1 | 6.8e-06 | *** |
mod2.3.1 | 13 | 1409.469 | 1463.755 | -691.7343 | 1383.469 | NA | NA | NA | |
mod2.3 | 14 | 1391.954 | 1450.417 | -681.9772 | 1363.954 | 19.51414 | 1 | 1.0e-05 | *** |
Doesn’t look like there are any significant interactions that improve model fit
Now we can look at the table of all models and compare AICs
AIC | formula | name | delta |
---|---|---|---|
1398.18 | WIP * Precip * Temp + MNDWI + NDVI + NDYI + (GEOLOGIC_AGE) + (1 + (depth_cm) | sample_ID) | mod1 | 6.71 |
1394.73 | WIP + Precip + Temp + MNDWI + NDVI + NDYI + (GEOLOGIC_AGE) + (1 + (depth_cm) | sample_ID) | mod2 | 3.26 |
1397.11 | WIP * Precip * Temp + (GEOLOGIC_AGE) + (1 + (depth_cm) | sample_ID) | mod3 | 5.64 |
1402.77 | WIP * Precip * Temp + MNDWI + NDVI + NDYI + (1 + (depth_cm) | sample_ID) | mod4 | 11.30 |
1406.09 | Precip * Temp + MNDWI + NDVI + NDYI + (GEOLOGIC_AGE) + (1 + (depth_cm) | sample_ID) | mod5 | 14.62 |
1405.7 | Precip + Temp + MNDWI + NDVI + NDYI + (GEOLOGIC_AGE) + (1 + (depth_cm) | sample_ID) | mod6 | 14.23 |
1402.86 | WIP + MNDWI + NDVI + NDYI + (GEOLOGIC_AGE) + (1 + (depth_cm) | sample_ID) | mod7 | 11.39 |
1424.3 | WIP + (GEOLOGIC_AGE) + (1 + (depth_cm) | sample_ID) | mod8 | 32.83 |
1422.22 | WIP + (1 + (depth_cm) | sample_ID) | mod9 | 30.75 |
1393.36 | WIP + Precip + Temp + NDVI + NDYI + (GEOLOGIC_AGE) + (1 + (depth_cm) | sample_ID) | mod2.1 | 1.89 |
1391.47 | WIP + Precip + Temp + NDYI + (GEOLOGIC_AGE) + (1 + (depth_cm) | sample_ID) | mod2.2 | 0.00 |
1391.95 | WIP + Precip + Temp + MNDWI + (GEOLOGIC_AGE) + (1 + (depth_cm) | sample_ID) | mod2.3 | 0.48 |
1394.23 | WIP + Precip + Temp + NDVI + (GEOLOGIC_AGE) + (1 + (depth_cm) | sample_ID) | mod2.4 | 2.76 |
1392.89 | WIP + Precip + Temp + NDYI + MNDWI + (GEOLOGIC_AGE) + (1 + (depth_cm) | sample_ID) | mod2.5 | 1.42 |
1393.91 | WIP + Precip + Temp + NDVI + MNDWI + (GEOLOGIC_AGE) + (1 + (depth_cm) | sample_ID) | mod2.6 | 2.44 |
1393.3 | WIP + Precip + Temp + (GEOLOGIC_AGE) + (1 + (depth_cm) | sample_ID) | mod2.7 | 1.83 |
1409.71 | Precip + Temp + WIP:NDYI + (GEOLOGIC_AGE) + (1 + (depth_cm) | sample_ID) | mod2.2.1 | 18.24 |
1409.47 | Precip + Temp + WIP:MNDWI + (GEOLOGIC_AGE) + (1 + (depth_cm) | sample_ID) | mod2.3.1 | 18.00 |
Looks like the best model fit to the data according to AIC is the
mod2.2
or mod2.3
which do not include any
interactions. The model \(R^2\) for
mod2.2
is 0.699 compared to the \(R^2\) for mod2.3
is 0.701
Confidence intervals calculated with bootstrapping between these two
models show that NDYI
is a significant predictor compared
to MNDWI
. Therefore, we use mod2.2
as our
candidate model
Model 2.2 | Model 2.3 | |||||
---|---|---|---|---|---|---|
Predictors | Estimates | CI | p | Estimates | CI | p |
(Intercept) | -1.21 | -1.91 – -0.51 | 0.002 | -1.27 | -1.98 – -0.56 | 0.002 |
WIP | 0.24 | 0.12 – 0.38 | <0.001 | 0.28 | 0.15 – 0.40 | <0.001 |
Precip | 0.35 | 0.10 – 0.59 | 0.006 | 0.48 | 0.28 – 0.69 | <0.001 |
Temp | -0.01 | -0.24 – 0.21 | 0.912 | -0.03 | -0.26 – 0.20 | 0.772 |
NDYI | 0.16 | 0.00 – 0.33 | 0.046 | |||
GEOLOGIC AGE [Miocene-Eocene] |
-1.08 | -2.09 – -0.12 | 0.038 | -1.14 | -2.13 – -0.17 | 0.028 |
GEOLOGIC AGE [Oligocene-Eocene] |
0.21 | -0.54 – 0.95 | 0.546 | 0.26 | -0.50 – 1.00 | 0.462 |
GEOLOGIC AGE [Pleistocene] |
-0.04 | -0.79 – 0.66 | 0.908 | 0.00 | -0.73 – 0.70 | 0.994 |
GEOLOGIC AGE [pre-Tertiary] |
0.02 | -0.97 – 0.92 | 0.974 | 0.03 | -0.96 – 0.96 | 0.942 |
GEOLOGIC AGE [Quaternary] | -0.59 | -1.45 – 0.26 | 0.168 | -0.67 | -1.49 – 0.18 | 0.122 |
MNDWI | 0.11 | -0.01 – 0.23 | 0.070 | |||
Random Effects | ||||||
σ2 | 0.64 | 0.64 | ||||
τ00 | 0.65 sample_ID | 0.61 sample_ID | ||||
τ11 | 0.51 sample_ID.depth_cm | 0.50 sample_ID.depth_cm | ||||
ρ01 | 0.94 sample_ID | 0.93 sample_ID | ||||
ICC | 0.51 | 0.49 | ||||
N | 96 sample_ID | 96 sample_ID | ||||
Observations | 481 | 481 | ||||
Marginal R2 / Conditional R2 | 0.174 / 0.591 | 0.182 / 0.582 |
We can now visualize our candidate model 2.2.
The dredge
function can be used to look through multiple
combinations of models from a globally defined model. The number of
parameters are limited so we run two models: one with the
NDYI
parameter and another with the MNDWI
parameter included with Temp
, Precip
, and
WIP
fully interacting. GEOLOGIC_AGE
is also
included as an additional, non-interaction parameter and random effects
are constant.
gmod1 <- lmer(log(carbon_stock_g_cm2) ~ WIP * Temp * Precip *
NDYI + GEOLOGIC_AGE + (1 + (depth_cm) | sample_ID), data = wa_dat_scale,
REML = F, na.action = "na.fail")
gmod2 <- lmer(log(carbon_stock_g_cm2) ~ WIP * Temp * Precip *
MNDWI + GEOLOGIC_AGE + (1 + (depth_cm) | sample_ID), data = wa_dat_scale,
REML = F, na.action = "na.fail")
dredge1 <- dredge(gmod1, beta = "sd")
dredge2 <- dredge(gmod2, beta = "sd")
(Intercept) | GEOLOGIC_AGE | NDYI | Precip | Temp | WIP | NDYI:Precip | NDYI:Temp | NDYI:WIP | Precip:Temp | Precip:WIP | Temp:WIP | NDYI:Precip:Temp | NDYI:Precip:WIP | NDYI:Temp:WIP | Precip:Temp:WIP | NDYI:Precip:Temp:WIP | df | logLik | AICc | delta | weight | |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
24 | 0 |
|
0.1299973 | 0.2590203 | NA | 0.1872057 | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | 13 | -681.7415 | 1390.262 | 0.0000000 | 0.3260432 |
536 | 0 |
|
0.1361008 | 0.2516699 | NA | 0.1831620 | NA | NA | NA | NA | -0.0527466 | NA | NA | NA | NA | NA | NA | 14 | -681.1778 | 1391.257 | 0.9944499 | 0.1983047 |
22 | 0 |
|
NA | 0.3486281 | NA | 0.2158639 | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | 12 | -683.7578 | 1392.182 | 1.9198364 | 0.1248498 |
56 | 0 |
|
0.1448841 | 0.2478062 | NA | 0.1834030 | -0.022364 | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | 14 | -681.6511 | 1392.204 | 1.9410966 | 0.1235297 |
152 | 0 |
|
0.1270359 | 0.2618069 | NA | 0.1888159 | NA | NA | -0.0070658 | NA | NA | NA | NA | NA | NA | NA | NA | 14 | -681.7318 | 1392.365 | 2.1023649 | 0.1139600 |
32 | 0 |
|
0.1289036 | 0.2660776 | -0.0081215 | 0.1868638 | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | 14 | -681.7375 | 1392.376 | 2.1137575 | 0.1133127 |
(Intercept) | GEOLOGIC_AGE | MNDWI | Precip | Temp | WIP | MNDWI:Precip | MNDWI:Temp | MNDWI:WIP | Precip:Temp | Precip:WIP | Temp:WIP | MNDWI:Precip:Temp | MNDWI:Precip:WIP | MNDWI:Temp:WIP | Precip:Temp:WIP | MNDWI:Precip:Temp:WIP | df | logLik | AICc | delta | weight | |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
56 | 0 |
|
0.0668338 | 0.3640831 | NA | 0.2067567 | 0.0735801 | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | 14 | -680.9016 | 1390.704 | 0.0000000 | 0.24640525 |
24 | 0 |
|
0.0907875 | 0.3500011 | NA | 0.2136153 | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | 13 | -682.0105 | 1390.800 | 0.0959663 | 0.23486113 |
568 | 0 |
|
0.0722226 | 0.3608701 | NA | 0.2036170 | 0.0747569 | NA | NA | NA | -0.0550311 | NA | NA | NA | NA | NA | NA | 15 | -680.2725 | 1391.577 | 0.8727253 | 0.15927223 |
536 | 0 |
|
0.0965784 | 0.3466353 | NA | 0.2106885 | NA | NA | NA | NA | -0.0544050 | NA | NA | NA | NA | NA | NA | 14 | -681.4154 | 1391.732 | 1.0276861 | 0.14739772 |
22 | 0 |
|
NA | 0.3486281 | NA | 0.2158639 | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | 12 | -683.7578 | 1392.182 | 1.4778537 | 0.11768961 |
64 | 0 |
|
0.0634709 | 0.3963320 | -0.0408569 | 0.2036933 | 0.0765553 | NA | NA | NA | NA | NA | NA | NA | NA | NA | NA | 15 | -680.7958 | 1392.624 | 1.9194222 | 0.09437407 |
By using dredge
we find that there are a few candidate
models that fit the data well.
The difference between the model AICs is negligible. The first
includes no interactions between the parameters and removes
Temp
as a predictor. The second includes an interaction
between MNDWI
and Precip
while also excluding
Temp
.
Dredge Model 1 | Dredge Model 2 | |||||
---|---|---|---|---|---|---|
Predictors | Estimates | CI | p | Estimates | CI | p |
(Intercept) | -1.21 | -1.86 – -0.52 | 0.002 | -1.25 | -1.89 – -0.56 | 0.002 |
GEOLOGIC AGE [Miocene-Eocene] |
-1.09 | -2.10 – -0.14 | 0.036 | -1.23 | -2.22 – -0.23 | 0.010 |
GEOLOGIC AGE [Oligocene-Eocene] |
0.21 | -0.53 – 0.92 | 0.554 | 0.25 | -0.48 – 0.97 | 0.472 |
GEOLOGIC AGE [Pleistocene] |
-0.05 | -0.77 – 0.60 | 0.872 | -0.04 | -0.73 – 0.63 | 0.916 |
GEOLOGIC AGE [pre-Tertiary] |
0.00 | -1.01 – 0.90 | 0.994 | -0.03 | -1.02 – 0.86 | 0.960 |
GEOLOGIC AGE [Quaternary] | -0.63 | -1.43 – 0.23 | 0.154 | -0.72 | -1.46 – 0.09 | 0.080 |
NDYI | 0.16 | 0.01 – 0.33 | 0.038 | |||
Precip | 0.34 | 0.17 – 0.50 | <0.001 | 0.47 | 0.35 – 0.59 | <0.001 |
WIP | 0.24 | 0.12 – 0.38 | <0.001 | 0.27 | 0.15 – 0.40 | <0.001 |
MNDWI | 0.09 | -0.04 – 0.21 | 0.194 | |||
MNDWI × Precip | 0.09 | -0.04 – 0.20 | 0.152 | |||
Random Effects | ||||||
σ2 | 0.64 | 0.64 | ||||
τ00 | 0.65 sample_ID | 0.58 sample_ID | ||||
τ11 | 0.51 sample_ID.depth_cm | 0.50 sample_ID.depth_cm | ||||
ρ01 | 0.94 sample_ID | 0.93 sample_ID | ||||
ICC | 0.50 | 0.47 | ||||
N | 96 sample_ID | 96 sample_ID | ||||
Observations | 481 | 481 | ||||
Marginal R2 / Conditional R2 | 0.175 / 0.591 | 0.193 / 0.576 |
Dredge Model 1 includes NDYI
, Precip
,
WIP
, and GEOLOGIC_AGE
as significant
predictors. Dredge Model 2 does not have significant confidence
intervals for the MNDWI
and MNDWIxPrecip
parameters.
There is also minimal differences between the R\(^{2}\) between the two models where Dredge Model 1 has R\(^{2}\) = 0.699 and Dredge Model 2 has R\(^{2}\) = 0.698
We now can examine the plot with the Dredge Model 1
## `geom_smooth()` using formula = 'y ~ x'
## `geom_smooth()` using formula = 'y ~ x'
### Random Forest Machine Learning
I also want to try Random Forest in order to see if a more flexible, machine learning model can capture any non-linear relationships with SOC and other variables.
depth_cm
is included as a predictor here in the data
setup
## mtry = 3 OOB error = 0.8515942
## Searching left ...
## mtry = 2 OOB error = 0.8907029
## -0.04592407 1e-04
## Searching right ...
## mtry = 4 OOB error = 0.8483588
## 0.003799248 1e-04
## mtry = 6 OOB error = 0.8291902
## 0.02259492 1e-04
## mtry = 9 OOB error = 0.8306786
## -0.001794988 1e-04
## mtry OOBError
## 2 2 0.8907029
## 3 3 0.8515942
## 4 4 0.8483588
## 6 6 0.8291902
## 9 9 0.8306786
rf_model <- randomForest((carbon_stock_g_cm2) ~ ., ntree = 1000,
mtry = 6, importance = TRUE, data = full)
rf_model
##
## Call:
## randomForest(formula = (carbon_stock_g_cm2) ~ ., data = full, ntree = 1000, mtry = 6, importance = TRUE)
## Type of random forest: regression
## Number of trees: 1000
## No. of variables tried at each split: 6
##
## Mean of squared residuals: 0.8269377
## % Var explained: 50.93
plot(rf_model)
varImpPlot(rf_model)
~50% of the out of bag (OOB) variation is explained. Looks like after
depth_cm
Precip
, Temp
,
WIP
, and NDYI
are the big drivers. The RF
model does not appear to have
error.variable | value |
---|---|
MAE | 0.3397694 |
MAE^2 | 0.1154432 |
RMSE | 0.4511324 |
RMSE non-log | 1.5700892 |
Stdev of non-log SOC stock | 0.4698826 |
R^2 on full dataset | 0.8994381 |
Look at partial dependency plots
##
## Attaching package: 'gridExtra'
## The following object is masked from 'package:randomForest':
##
## combine
## The following object is masked from 'package:dplyr':
##
## combine
## The following object is masked from 'package:spatialEco':
##
## combine
##
## Attaching package: 'grid'
## The following object is masked from 'package:spatialEco':
##
## explode
## The following object is masked from 'package:terra':
##
## depth
## [1] "1.578 mean square error rf model from training"
## [1] "0.113 R^2 rf model from training"
## [1] "0.07 R^2 rf model from testing "
Here, we show the fit between sampled and predicted SOC stocks for the full Random Forest Model
## `geom_smooth()` using formula = 'y ~ x'
## `geom_smooth()` using formula = 'y ~ x'