Predicting soil aggregate stability using readily available soil properties and machine learning techniques
Journal: Catena
Volume: 187
Pages: 104408
Publication type: ISI
Abstract
Aggregate stability is a measure of soil quality, as the presence of stable aggregates relates to a wide range of soil ecosystem services. However, aggregate stability is not reported in most soil surveys, so predictive models have been the focus of increasing attention as an alternative in the absence of direct measurements. Therefore, the objective of this study was to develop a new model for predicting aggregate stability using two machine learning techniques: an Artificial Neural Network (ANN) model and a Generalized Linear Model (GLM). These techniques were applied to a soil dataset described in terms of soil texture, organic matter content, pH, and water-stable aggregates. This dataset included 109 soil samples obtained at 0–17 cm soil depth from hyperarid, arid, semiarid, and humid regions in Chile, including agricultural soils, shrubland, and forestland. Most soil textures in this dataset were sandy loam, loam, and clay loam, and each soil property had a large range of values. Aggregate stability was measured and computed as the percentage of water-stable aggregates using a wet sieving apparatus, and the ANN and GLM models were constructed and evaluated by repeated cross-validation (80% and 20% of the dataset for training and testing, respectively). The ANN and GLM models were compared by computing the modified r² (r²_adj), Root Mean Square Error (RMSE), and Mean Absolute Error (MAE). The results demonstrated a positive gradient of aggregate stability from arid (40% on average) to humid (87% on average) regions, which is related to the increase in organic matter content and decrease in pH. Organic matter content and pH exhibited a significant correlation with aggregate stability, with r = 0.56 and r = −0.73, respectively. Moreover, among the fractions used to compute the soil texture, the clay content exhibited the highest correlation with aggregate stability (r = 0.30). These variables were used for training and testing the ANN and GLM models.
The ANN model achieved superior performance in terms of RMSE, r²_adj, and MAE in the cross-validation procedure, and showed r² = 0.80 for training and r² = 0.82 for testing. The GLM yielded r² = 0.59 and r² = 0.63 for training and testing, respectively. Therefore, despite the limitations observed when implementing the ANN, its use is recommended instead of the GLM as a reference model. Considering the small number of easily measured variables, this study provides two models that can be coupled with other existing soil routines or can be used directly to complete soil surveys where aggregate stability was not measured.
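As a rough illustration of the evaluation described in the abstract, the sketch below computes RMSE, MAE, and an adjusted r² over repeated random 80%/20% splits. It rests on several assumptions that are not taken from the paper: the "modified r²" is treated as the usual adjusted coefficient of determination, an ordinary least-squares fit stands in for the GLM, the function names are hypothetical, and the data in the demo are synthetic placeholders rather than the study's 109 samples.

```python
import numpy as np


def rmse(y_true, y_pred):
    """Root Mean Square Error."""
    y_true, y_pred = np.asarray(y_true, float), np.asarray(y_pred, float)
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))


def mae(y_true, y_pred):
    """Mean Absolute Error."""
    y_true, y_pred = np.asarray(y_true, float), np.asarray(y_pred, float)
    return float(np.mean(np.abs(y_true - y_pred)))


def r2(y_true, y_pred):
    """Coefficient of determination, 1 - SS_res / SS_tot."""
    y_true, y_pred = np.asarray(y_true, float), np.asarray(y_pred, float)
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    return float(1.0 - ss_res / ss_tot)


def adjusted_r2(y_true, y_pred, n_predictors):
    """Adjusted r2 (assumed here to be what the abstract calls 'modified r2')."""
    n = len(y_true)
    return 1.0 - (1.0 - r2(y_true, y_pred)) * (n - 1) / (n - n_predictors - 1)


def repeated_holdout(X, y, n_repeats=50, test_fraction=0.2, seed=0):
    """Mean test-set (adjusted r2, RMSE, MAE) over repeated random splits.

    A least-squares linear model is used as a simple stand-in for the GLM;
    it is not the authors' implementation.
    """
    rng = np.random.default_rng(seed)
    n = len(y)
    n_test = int(round(test_fraction * n))
    scores = []
    for _ in range(n_repeats):
        idx = rng.permutation(n)
        test, train = idx[:n_test], idx[n_test:]
        # Add an intercept column and fit by ordinary least squares.
        A_train = np.column_stack([np.ones(len(train)), X[train]])
        coef, *_ = np.linalg.lstsq(A_train, y[train], rcond=None)
        A_test = np.column_stack([np.ones(len(test)), X[test]])
        pred = A_test @ coef
        scores.append((adjusted_r2(y[test], pred, X.shape[1]),
                       rmse(y[test], pred),
                       mae(y[test], pred)))
    return np.mean(scores, axis=0)


if __name__ == "__main__":
    # Synthetic placeholder data (NOT the study's measurements): three
    # predictors loosely named after organic matter, pH, and clay content.
    rng = np.random.default_rng(42)
    X = rng.normal(size=(109, 3))
    y = 60.0 + 8.0 * X[:, 0] - 10.0 * X[:, 1] + 3.0 * X[:, 2] + rng.normal(scale=8.0, size=109)
    adj, err_rmse, err_mae = repeated_holdout(X, y)
    print(f"adjusted r2 = {adj:.2f}, RMSE = {err_rmse:.2f}, MAE = {err_mae:.2f}")
```

Averaging the metrics over many random splits, rather than relying on one split, reduces the dependence of the reported scores on which samples happen to fall in the test set, which matters for a dataset of this size.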