Agriculture, Vol. 15, Pages 2008: Enhanced Spatially Explicit Modeling of Soil Particle Size and Texture Classification Using a Novel Two-Point Machine Learning Hybrid Framework



Agriculture doi: 10.3390/agriculture15192008

Authors:
Liya Qin
Zong Wang
Xiaoyuan Zhang

Accurately predicting soil particle size fractions (PSFs) and classifying soil texture types are essential for soil resource assessment and sustainable land management. PSFs, comprising clay, silt, and sand, form a compositional dataset constrained to sum to 100%. The practical implications of incorporating compositional data characteristics into PSF mapping remain insufficiently explored. This study applies a two-point machine learning (TPML) model, which integrates spatial autocorrelation and attribute similarity, to improve both the quantitative prediction of PSFs and the categorical classification of soil texture types in the Heihe River Basin, China. TPML was compared with random forest regression kriging (RFRK), random forest (RF), XGBoost, and ordinary kriging (OK), and a novel TPML-C model was developed for multi-class classification tasks. TPML achieved R² values of 0.58, 0.55, and 0.64 for clay, silt, and sand, respectively. Among all models, the predictions of ALR_TPML (TPML applied to additive log-ratio (ALR) transformed data) agreed most consistently with the observed variability, with predicted ranges of 2.63–98.28% for silt, 0.26–36.16% for clay, and 0.64–96.90% for sand. Across all models, the dominant soil texture types were Sandy Loam (SaLo), Loamy Sand (LoSa), and Silty Loam (SiLo). For soil texture classification, TPML with raw, ALR-transformed, and isometric log-ratio (ILR) transformed data reached right ratios of 61.09%, 55.78%, and 60.00%, correctly identifying 25, 26, and 27 types out of 43, respectively. TPML with raw data performed strongly in both regression and classification and was better able to separate ambiguous class boundaries. Log-ratio transformations, particularly ILR, further improved classification performance by addressing the constraints of compositional data. These findings demonstrate the promise of hybrid machine learning approaches for digital soil mapping and precision agriculture.
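
The log-ratio transformations mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the part order (clay, silt, sand), the use of sand as the ALR denominator, the particular orthonormal ILR basis `V`, and the sample values in `psf` are all assumptions made for illustration. It also assumes every fraction is strictly positive, since log-ratios are undefined for zeros.

```python
import numpy as np

def closure(x, total=100.0):
    """Rescale each row so its parts sum to `total` (percent by default)."""
    x = np.asarray(x, dtype=float)
    return total * x / x.sum(axis=-1, keepdims=True)

def alr(x):
    """Additive log-ratio: log of each part over the last part (assumed: sand)."""
    x = np.asarray(x, dtype=float)
    return np.log(x[..., :-1] / x[..., -1:])

def alr_inv(y, total=100.0):
    """Back-transform ALR coordinates to a composition summing to `total`."""
    y = np.asarray(y, dtype=float)
    expy = np.exp(y)
    # The denominator part corresponds to exp(0) = 1.
    parts = np.concatenate([expy, np.ones_like(expy[..., :1])], axis=-1)
    return closure(parts, total)

# One possible orthonormal basis for three parts (an assumed choice;
# any orthonormal basis of the clr plane gives a valid ILR).
V = np.array([
    [1 / np.sqrt(2), 1 / np.sqrt(6)],
    [-1 / np.sqrt(2), 1 / np.sqrt(6)],
    [0.0, -2 / np.sqrt(6)],
])

def ilr(x):
    """Isometric log-ratio: centred log-ratio projected onto basis V."""
    logx = np.log(np.asarray(x, dtype=float))
    clr = logx - logx.mean(axis=-1, keepdims=True)
    return clr @ V

def ilr_inv(z, total=100.0):
    """Back-transform ILR coordinates to a composition summing to `total`."""
    z = np.asarray(z, dtype=float)
    return closure(np.exp(z @ V.T), total)

# Hypothetical samples: columns are clay, silt, sand in percent.
psf = np.array([[20.0, 45.0, 35.0],
                [5.0, 15.0, 80.0]])

alr_roundtrip = alr_inv(alr(psf))
ilr_roundtrip = ilr_inv(ilr(psf))
```

In a workflow of this kind, a regression model would be fitted to the transformed coordinates and its predictions back-transformed, so that the predicted clay, silt, and sand fractions are positive and sum to 100% by construction.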


