Gully erosion poses a serious hazard to critical resources such as soil, water, and vegetation cover within watersheds. Spatial maps of gully erosion hazard can therefore be instrumental in mitigating its negative consequences. Among the various methods used to explore and map gully erosion, advanced learning techniques, especially deep learning (DL) models, are highly capable of spatial mapping and can provide accurate predictions of gully erosion at different scales (e.g., local, regional, continental, and global). In this paper, we applied two DL models, namely a simple recurrent neural network (RNN) and a gated recurrent unit (GRU), to map land susceptibility to gully erosion in the Shamil-Minab plain, Hormozgan province, southern Iran. To address the inherent black-box nature of DL models, we applied three novel interpretability methods: SHapley Additive exPlanations (SHAP), ceteris paribus and partial dependence (CP-PD) profiles, and permutation feature importance (PFI). Using the Boruta algorithm, we identified seven important features that control gully erosion: soil bulk density, clay content, elevation, land use type, vegetation cover, sand content, and silt content. These features, along with an inventory map of gully erosion (split into a 70 % training dataset and a 30 % test dataset), were used to generate spatial maps of gully erosion with the DL models. According to the Kolmogorov-Smirnov (KS) statistic, used here as the performance assessment measure, the simple RNN model (KS = 91.6) outperformed the GRU model (KS = 66.6). Based on the results of the simple RNN model, 7.4 %, 14.5 %, 18.9 %, 31.2 % and 28 % of the total area of the plain were classified into the very-low, low, moderate, high and very-high hazard classes, respectively. According to the SHAP plots, CP-PD profiles, and PFI measures, soil silt content, vegetation cover (NDVI) and land use type had the highest impact on the model's output.
Overall, the DL modelling techniques and interpretation methods used in this study proved helpful in generating spatial maps of soil erosion hazard, especially gully erosion. Their interpretability can support sustainable watershed management.
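The abstract does not spell out how the KS statistic was computed. A minimal sketch, assuming the common classifier-evaluation form (the largest gap between the cumulative distributions of predicted scores for gully and non-gully points, expressed in percent), could look as follows. The function name `ks_statistic` and the toy scores are illustrative assumptions, not the authors' code or data.

```python
def ks_statistic(scores, labels):
    """Two-sample KS statistic (in percent) between the predicted-score
    distributions of the positive (1) and negative (0) classes.

    Assumption: KS is reported on a 0-100 scale, as the values in the
    abstract suggest; the source does not state the exact formula.
    """
    pos = sorted(s for s, y in zip(scores, labels) if y == 1)
    neg = sorted(s for s, y in zip(scores, labels) if y == 0)
    if not pos or not neg:
        raise ValueError("both classes must be present")

    best = 0.0
    i = j = 0
    # Walk over every distinct score and compare the two empirical CDFs there.
    for t in sorted(set(pos) | set(neg)):
        while i < len(pos) and pos[i] <= t:
            i += 1
        while j < len(neg) and neg[j] <= t:
            j += 1
        gap = abs(i / len(pos) - j / len(neg))
        best = max(best, gap)
    return 100.0 * best


# Toy example (hypothetical scores): the classes overlap partially.
toy_scores = [0.10, 0.40, 0.35, 0.80]
toy_labels = [0, 0, 1, 1]
print(ks_statistic(toy_scores, toy_labels))
```

Under this definition, a higher KS means the model separates the two classes more cleanly, which is consistent with the simple RNN being ranked above the GRU.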
Soil erosion by wind poses a significant threat to various regions across the globe, such as drylands in the Middle East and Iran. Wind erosion hazard maps can help identify the regions at highest risk of wind erosion and are a valuable tool for mitigating its destructive consequences. This study aims to map wind erosion hazards by developing an interpretable (explainable) model based on machine learning (ML) and the Shapley additive exPlanation (SHAP) interpretation technique. Four ML models were used: random forest (RF), support vector machine (SVM), extreme gradient boosting (XGB), and quadratic discriminant analysis (QDA). Thirteen features associated with wind erosion were mapped spatially and subjected to a multivariate adaptive regression spline (MARS) feature selection algorithm; tolerance coefficient (TC) and variance inflation factor (VIF) statistical tests were then used to explore multicollinearity among the variables. The MARS analysis showed that eight features (elevation (or DEM), soil bulk density, precipitation, aspect, slope, soil sand content, vegetation cover (or NDVI), and lithology) were the most effective predictors of wind erosion, and no collinearity existed among them. The ML models were used to rank the effective features, and the research introduces the application of an interpretable ML model for interpreting the predictive model's output. The ranking of effective features by RF, taken as the most typical ML model, revealed that elevation and soil bulk density were the two most important features. All four ML models performed with high accuracy, with values above 90% for both the area under the receiver operating characteristic curve (AUROC) and the precision-recall (PR) curve.
According to the PR curve, the SVM model performed slightly better than the others, and its results revealed that 20.9%, 23%, and 16.6% of the total area of Hormozgan Province fall into the moderate, high, and very high wind erosion hazard classes, respectively. SHAP revealed that soil sand content and elevation are the most important variables contributing to the predictive model output. Overall, our research is one of the pioneering applications of interpretable ML models to mapping wind erosion hazards in southern Iran. We recommend that future research address interpretability in order to better understand predictive model outputs.
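The multicollinearity check mentioned above can be illustrated with a short sketch. It relies on the standard identity that the VIF of feature j equals the j-th diagonal element of the inverse correlation matrix, with tolerance being its reciprocal. The function names and the toy columns are hypothetical, and the abstract does not state which cut-off values the authors used; a VIF above roughly 5 to 10 is a common rule of thumb for problematic collinearity.

```python
def correlation_matrix(columns):
    """Pearson correlation matrix for a list of equally long feature columns."""
    n = len(columns[0])
    means = [sum(c) / n for c in columns]
    centered = [[v - m for v in c] for c, m in zip(columns, means)]
    norms = [sum(v * v for v in c) ** 0.5 for c in centered]
    return [
        [sum(a * b for a, b in zip(ci, cj)) / (ni * nj)
         for cj, nj in zip(centered, norms)]
        for ci, ni in zip(centered, norms)
    ]


def invert(matrix):
    """Gauss-Jordan inversion with partial pivoting (pure Python)."""
    n = len(matrix)
    aug = [list(row) + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("singular matrix: features are perfectly collinear")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(n):
            if r != col:
                f = aug[r][col]
                aug[r] = [a - f * b for a, b in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def vif_and_tolerance(columns):
    """Return one (VIF, tolerance) pair per feature column."""
    inv = invert(correlation_matrix(columns))
    return [(inv[j][j], 1.0 / inv[j][j]) for j in range(len(columns))]


# Toy example with two hypothetical features whose correlation is 0.8.
feature_a = [1.0, 2.0, 3.0, 4.0]
feature_b = [1.0, 3.0, 2.0, 4.0]
for vif, tol in vif_and_tolerance([feature_a, feature_b]):
    print(round(vif, 3), round(tol, 3))
```

In practice the same quantities would be computed on the eight selected feature rasters sampled at the inventory points; the pure-Python inversion is used here only to keep the sketch dependency-free.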