Interpretability of Statistical, Machine Learning, and Deep Learning Models for Landslide Susceptibility Mapping in Three Gorges Reservoir Area
- URL: http://arxiv.org/abs/2405.11762v2
- Date: Wed, 29 May 2024 13:02:11 GMT
- Title: Interpretability of Statistical, Machine Learning, and Deep Learning Models for Landslide Susceptibility Mapping in Three Gorges Reservoir Area
- Authors: Cheng Chen, Lei Fan
- Abstract summary: Landslide susceptibility mapping (LSM) is crucial for identifying high-risk areas and informing prevention strategies.
This study investigates the interpretability of statistical, machine learning (ML), and deep learning (DL) models in predicting landslide susceptibility.
- Score: 4.314875317825748
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Landslide susceptibility mapping (LSM) is crucial for identifying high-risk areas and informing prevention strategies. This study investigates the interpretability of statistical, machine learning (ML), and deep learning (DL) models in predicting landslide susceptibility. This is achieved by incorporating various relevant interpretation methods and two types of input factors: a comprehensive set of 19 contributing factors that are statistically relevant to landslides, as well as a dedicated set of 9 triggering factors directly associated with triggering landslides. Given that model performance is a crucial metric in LSM, our investigations into interpretability naturally involve assessing and comparing LSM accuracy across the different models considered. In our investigation, the convolutional neural network model achieved the highest accuracy (0.8447 with 19 factors; 0.8048 with 9 factors), while Extreme Gradient Boosting and Support Vector Machine also demonstrated strong predictive capabilities, outperforming conventional statistical models. These findings indicate that DL and sophisticated ML algorithms can effectively capture the complex relationships between input factors and landslide occurrence. However, the interpretability of predictions varied among different models, particularly when using the broader set of 19 contributing factors. Explanation methods like SHAP, LIME, and DeepLIFT also led to variations in interpretation results. Using a comprehensive set of 19 contributing factors improved prediction accuracy but introduced complexities and inconsistencies in model interpretations. Focusing on a dedicated set of 9 triggering factors sacrificed some predictive power but enhanced interpretability, as evidenced by more consistent key factors identified across various models and alignment with the findings of field investigation reports....
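As a rough illustration of the kind of pipeline the abstract describes, the sketch below fits an XGBoost susceptibility classifier on a tabular factor set and inspects it with SHAP. The factor names, file path, and hyperparameters are placeholders, not the paper's actual data or configuration; LIME and DeepLIFT would be applied analogously, with DeepLIFT targeting the CNN rather than the tree model.

```python
import pandas as pd
import shap
import xgboost as xgb
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

# Hypothetical triggering factors; the paper's sets have 9 or 19 factors.
factors = ["slope", "elevation", "rainfall", "distance_to_river",
           "lithology", "land_use", "ndvi", "aspect", "curvature"]

df = pd.read_csv("landslide_inventory.csv")        # placeholder path
X, y = df[factors], df["landslide"]                # 1 = landslide, 0 = stable
# Assumes categorical factors (e.g. lithology) are already numerically encoded.
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3,
                                          stratify=y, random_state=42)

model = xgb.XGBClassifier(n_estimators=300, max_depth=4, learning_rate=0.05)
model.fit(X_tr, y_tr)
print("accuracy:", accuracy_score(y_te, model.predict(X_te)))

# SHAP: per-factor contribution to each sample's predicted susceptibility.
explainer = shap.TreeExplainer(model)
shap_values = explainer.shap_values(X_te)
shap.summary_plot(shap_values, X_te, feature_names=factors)
```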
Related papers
- Explainable AI Integrated Feature Engineering for Wildfire Prediction [1.7934287771173114]
We conducted a thorough assessment of various machine learning algorithms for both classification and regression tasks relevant to predicting wildfires.
For classifying different types or stages of wildfires, the XGBoost model outperformed others in terms of accuracy and robustness.
The Random Forest regression model showed superior results in predicting the extent of wildfire-affected areas.
arXiv Detail & Related papers (2024-04-01T21:12:44Z)
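A minimal sketch of the two model families this summary mentions, assuming a generic tabular wildfire dataset; the column names, file path, and integer-encoded stage labels are illustrative assumptions.

```python
import pandas as pd
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import train_test_split
from xgboost import XGBClassifier

df = pd.read_csv("wildfire_records.csv")           # placeholder path
features = ["temperature", "humidity", "wind_speed", "fuel_moisture"]

# Classification: wildfire type/stage (labels assumed integer-encoded).
Xc_tr, Xc_te, yc_tr, yc_te = train_test_split(df[features], df["fire_stage"],
                                              random_state=0)
clf = XGBClassifier(n_estimators=200).fit(Xc_tr, yc_tr)
print("stage accuracy:", clf.score(Xc_te, yc_te))

# Regression: extent of the affected area.
Xr_tr, Xr_te, yr_tr, yr_te = train_test_split(df[features], df["burned_area_ha"],
                                              random_state=0)
reg = RandomForestRegressor(n_estimators=300, random_state=0).fit(Xr_tr, yr_tr)
print("burned-area R^2:", reg.score(Xr_te, yr_te))
```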
- Decoding Susceptibility: Modeling Misbelief to Misinformation Through a Computational Approach [61.04606493712002]
Susceptibility to misinformation describes the degree of belief in unverifiable claims, which is not directly observable.
Existing susceptibility studies heavily rely on self-reported beliefs.
We propose a computational approach to model users' latent susceptibility levels.
arXiv Detail & Related papers (2023-11-16T07:22:56Z)
- Identifiable Latent Polynomial Causal Models Through the Lens of Change [82.14087963690561]
Causal representation learning aims to unveil latent high-level causal representations from observed low-level data.
One of its primary tasks is to provide reliable assurance of identifying these latent causal models, known as identifiability.
arXiv Detail & Related papers (2023-10-24T07:46:10Z)
- Mind the instructions: a holistic evaluation of consistency and interactions in prompt-based learning [14.569770617709073]
We present a detailed analysis of which design choices cause instabilities and inconsistencies in task predictions.
We show that spurious correlations between input distributions and labels pose only a minor problem for prompted models.
We statistically analyse the results to show which factors are the most influential, interactive or stable.
arXiv Detail & Related papers (2023-10-20T13:25:24Z)
- Selection of contributing factors for predicting landslide susceptibility using machine learning and deep learning models [5.097453589594454]
Landslides are a common natural disaster that can cause casualties, property safety threats and economic losses.
It is important to understand or predict the probability of landslide occurrence at potentially risky sites.
In this study, the impact of the selection of contributing factors on the accuracy of landslide susceptibility predictions was investigated.
arXiv Detail & Related papers (2023-09-12T09:00:17Z)
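One way to operationalise this kind of factor selection is to rank candidate factors by mutual information with the landslide label and keep the top-scoring subset. This is a generic approach shown for illustration, not necessarily the paper's selection procedure; the factor file is a placeholder.

```python
import pandas as pd
from sklearn.feature_selection import mutual_info_classif

df = pd.read_csv("landslide_factors.csv")          # placeholder path
candidates = [c for c in df.columns if c != "landslide"]

scores = mutual_info_classif(df[candidates], df["landslide"], random_state=0)
ranking = sorted(zip(candidates, scores), key=lambda t: t[1], reverse=True)
top_factors = [name for name, _ in ranking[:9]]    # e.g. keep a 9-factor subset
print(top_factors)
```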
- A prediction and behavioural analysis of machine learning methods for modelling travel mode choice [0.26249027950824505]
We conduct a systematic comparison of different modelling approaches, across multiple modelling problems, in terms of the key factors likely to affect model choice.
Results indicate that the models with the highest disaggregate predictive performance provide poorer estimates of behavioural indicators and aggregate mode shares.
It is also observed that the MNL model performs robustly in a variety of situations, though ML techniques can improve the estimates of behavioural indices such as Willingness to Pay.
arXiv Detail & Related papers (2023-01-11T11:10:32Z)
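The contrast above can be probed in a few lines: fit a multinomial logit (MNL) and a boosted-tree classifier, then compare both per-trip accuracy and the aggregate mode shares implied by their predictions. Feature names, mode labels, and the data path are hypothetical.

```python
import pandas as pd
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

df = pd.read_csv("trips.csv")                      # placeholder path
X = df[["travel_time", "cost", "income", "distance"]]
y = df["mode"]                                     # e.g. car / bus / walk
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

models = {"MNL": LogisticRegression(max_iter=1000),
          "GBM": GradientBoostingClassifier(random_state=0)}
for name, m in models.items():
    m.fit(X_tr, y_tr)
    pred = m.predict(X_te)
    print(name, "accuracy:", m.score(X_te, y_te))
    # Aggregate mode shares implied by the hard predictions.
    print(name, "predicted shares:\n", pd.Series(pred).value_counts(normalize=True))
```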
- Estimate Deformation Capacity of Non-Ductile RC Shear Walls using Explainable Boosting Machine [0.0]
This study aims to develop a fully explainable machine learning model to predict the deformation capacity of non-ductile reinforced concrete shear walls.
The proposed Explainable Boosting Machine (EBM)-based model is an interpretable, robust, naturally explainable glass-box model, yet it provides high accuracy comparable to its black-box counterparts.
arXiv Detail & Related papers (2023-01-11T09:20:29Z)
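A minimal glass-box sketch using the interpretml package's Explainable Boosting Machine regressor; the wall-property column names and data path are placeholders, not the study's dataset.

```python
import pandas as pd
from interpret import show
from interpret.glassbox import ExplainableBoostingRegressor

df = pd.read_csv("shear_walls.csv")                # placeholder path
features = ["aspect_ratio", "axial_load_ratio", "wall_thickness",
            "concrete_strength", "reinforcement_ratio"]

ebm = ExplainableBoostingRegressor(random_state=0)
ebm.fit(df[features], df["deformation_capacity"])

# Global explanation: per-feature shape functions and importances.
show(ebm.explain_global())
```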
- Accuracy on the Line: On the Strong Correlation Between Out-of-Distribution and In-Distribution Generalization [89.73665256847858]
We show that out-of-distribution performance is strongly correlated with in-distribution performance for a wide range of models and distribution shifts.
Specifically, we demonstrate strong correlations between in-distribution and out-of-distribution performance on variants of CIFAR-10 & ImageNet.
We also investigate cases where the correlation is weaker, for instance some synthetic distribution shifts from CIFAR-10-C and the tissue classification dataset Camelyon17-WILDS.
arXiv Detail & Related papers (2021-07-09T19:48:23Z)
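The headline correlation can be reproduced in miniature: collect ID and OOD test accuracies for several models and correlate them, optionally after a probit transform as in the paper. The accuracy values below are made up for illustration.

```python
import numpy as np
from scipy.stats import norm, pearsonr

# Made-up (ID, OOD) test accuracies for five models.
id_acc = np.array([0.91, 0.88, 0.94, 0.85, 0.90])
ood_acc = np.array([0.72, 0.66, 0.78, 0.61, 0.70])

r_raw, _ = pearsonr(id_acc, ood_acc)
r_probit, _ = pearsonr(norm.ppf(id_acc), norm.ppf(ood_acc))
print(f"raw r = {r_raw:.3f}, probit-scaled r = {r_probit:.3f}")
```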
- Latent Causal Invariant Model [128.7508609492542]
Current supervised learning can learn spurious correlations during the data-fitting process.
We propose a Latent Causal Invariance Model (LaCIM) which pursues causal prediction.
arXiv Detail & Related papers (2020-11-04T10:00:27Z)
- Accurate and Robust Feature Importance Estimation under Distribution Shifts [49.58991359544005]
PRoFILE is a novel feature importance estimation method.
We show significant improvements over state-of-the-art approaches, both in terms of fidelity and robustness.
arXiv Detail & Related papers (2020-09-30T05:29:01Z)
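As a crude stand-in for this kind of analysis (a generic permutation-importance baseline, not the PRoFILE method itself), one can compare importance rankings computed on in-distribution data and on a shifted copy; the data here are synthetic.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.inspection import permutation_importance

# Synthetic stand-in: an in-distribution split and a crudely shifted copy.
X, y = make_classification(n_samples=2000, n_features=8, n_informative=4,
                           random_state=0)
X_shift = X + np.random.default_rng(0).normal(0.0, 0.5, size=X.shape)

model = RandomForestClassifier(random_state=0).fit(X, y)

def ranking(Xs, ys):
    imp = permutation_importance(model, Xs, ys, n_repeats=10, random_state=0)
    return np.argsort(-imp.importances_mean)        # feature indices, best first

print("in-distribution ranking:", ranking(X, y))
print("shifted-data ranking:   ", ranking(X_shift, y))
```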
- Influence Functions in Deep Learning Are Fragile [52.31375893260445]
Influence functions approximate the effect of individual training samples on test-time predictions.
Influence estimates are fairly accurate for shallow networks.
Hessian regularization is important for obtaining high-quality influence estimates.
arXiv Detail & Related papers (2020-06-25T18:25:59Z)
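A small worked example of an influence-function estimate for logistic regression, including the damping term (the Hessian regularization noted above); the data are synthetic and the setup is illustrative, not the paper's experiments.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

X, y = make_classification(n_samples=500, n_features=5, random_state=0)
clf = LogisticRegression(C=1e6, fit_intercept=False, max_iter=1000).fit(X, y)
w = clf.coef_.ravel()                               # near-unregularized weights

def grad(x, label):
    p = 1.0 / (1.0 + np.exp(-x @ w))
    return (p - label) * x                          # log-loss gradient w.r.t. w

p_all = 1.0 / (1.0 + np.exp(-X @ w))
H = (X * (p_all * (1.0 - p_all))[:, None]).T @ X / len(X)   # empirical Hessian
damping = 0.01                                      # Hessian regularization term
H_damped = H + damping * np.eye(H.shape[0])

# Influence of training point 0 on the loss at point 1:
influence = -grad(X[1], y[1]) @ np.linalg.solve(H_damped, grad(X[0], y[0]))
print("estimated influence:", influence)
```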