Explainable AI Integrated Feature Selection for Landslide Susceptibility Mapping using TreeSHAP
- URL: http://arxiv.org/abs/2201.03225v2
- Date: Tue, 27 Jun 2023 02:36:46 GMT
- Title: Explainable AI Integrated Feature Selection for Landslide Susceptibility Mapping using TreeSHAP
- Authors: Muhammad Sakib Khan Inan and Istiakur Rahman
- Abstract summary: Early prediction of landslide susceptibility using a data-driven approach is a pressing need.
We employed state-of-the-art machine learning algorithms including XgBoost, LR, KNN, SVM, and Adaboost for landslide susceptibility prediction.
An optimized version of XgBoost, combined with a 40% feature reduction, outperformed all other classifiers in terms of popular evaluation metrics.
- License: http://creativecommons.org/licenses/by-sa/4.0/
- Abstract: Landslides are a regular occurrence and an alarming threat to human
life and property in the era of anthropogenic global warming. Early prediction
of landslide susceptibility using a data-driven approach is a pressing need. In
this study, we explored the salient features that best describe landslide
susceptibility with state-of-the-art machine learning methods, employing
XgBoost, LR, KNN, SVM, and Adaboost for landslide susceptibility prediction. To
find the best hyperparameters of each individual classifier for optimized
performance, we incorporated Grid Search with 10-fold Cross-Validation. In this
context, the optimized version of XgBoost outperformed all other classifiers
with a cross-validation weighted F1 score of 94.62%. Following this empirical
evidence, we explored the XgBoost classifier with TreeSHAP, a game-theory-based
algorithm for explaining machine learning models, to identify salient features
such as SLOPE, ELEVATION, and TWI that contribute most to the performance of
the XgBoost classifier, and features such as LANDUSE, NDVI, and SPI that have
less effect on model performance. Based on the TreeSHAP explanation of the
features, we selected the 9 most significant landslide causal factors out of
15. Evidently, the optimized version of XgBoost, together with a 40% feature
reduction, outperformed all other classifiers in terms of popular evaluation
metrics, with a cross-validation weighted F1 score of 95.01% on the training
set and an AUC score of 97%.
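The pipeline described above (hyperparameter tuning via Grid Search with 10-fold CV, then importance-driven feature reduction from 15 to 9 factors) can be sketched as follows. This is a minimal illustration, not the authors' implementation: it substitutes scikit-learn's GradientBoostingClassifier for XgBoost, impurity-based feature importances for TreeSHAP values, and a synthetic dataset for the real landslide causal factors.

```python
# Hedged sketch of the paper's workflow: tune a gradient-boosted tree model with
# Grid Search + 10-fold CV (weighted F1), then keep the 9 most important of 15
# features (a 40% reduction). GradientBoostingClassifier stands in for XgBoost,
# and impurity importances stand in for TreeSHAP; the data is synthetic.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import GridSearchCV, cross_val_score

# Synthetic stand-in for the 15 landslide causal factors (SLOPE, ELEVATION, ...).
X, y = make_classification(n_samples=300, n_features=15, n_informative=9,
                           random_state=0)

# Step 1: grid search over a small hypothetical parameter grid, 10-fold CV,
# scored by weighted F1 as in the paper.
grid = GridSearchCV(
    GradientBoostingClassifier(random_state=0),
    param_grid={"n_estimators": [50, 100], "max_depth": [2, 3]},
    scoring="f1_weighted",
    cv=10,
)
grid.fit(X, y)
best = grid.best_estimator_

# Step 2: rank features and keep the top 9, mirroring the TreeSHAP-driven
# selection (here approximated with impurity-based importances).
top9 = np.argsort(best.feature_importances_)[::-1][:9]
X_reduced = X[:, top9]

# Compare cross-validated weighted F1 before and after feature reduction.
score_full = cross_val_score(best, X, y, cv=10, scoring="f1_weighted").mean()
score_reduced = cross_val_score(best, X_reduced, y, cv=10,
                                scoring="f1_weighted").mean()
print(f"weighted F1, all 15 features: {score_full:.3f}")
print(f"weighted F1, top 9 features:  {score_reduced:.3f}")
```

In the paper the ranking comes from TreeSHAP (e.g. `shap.TreeExplainer` applied to the tuned XgBoost model) rather than impurity importances; the selection-and-re-evaluation step is otherwise the same.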
Related papers
- Improved Adaboost Algorithm for Web Advertisement Click Prediction Based on Long Short-Term Memory Networks [2.7959678888027906]
This paper explores an improved Adaboost algorithm based on Long Short-Term Memory Networks (LSTM).
By comparing it with several common machine learning algorithms, the paper analyses the advantages of the new model in ad click prediction.
It is shown that the improved algorithm proposed in this paper performs well in user ad click prediction with an accuracy of 92%.
arXiv Detail & Related papers (2024-08-08T03:27:02Z) - Zero-Inflated Tweedie Boosted Trees with CatBoost for Insurance Loss Analytics [0.8287206589886881]
We modify the Tweedie regression model to address its limitations in modeling aggregate claims for various types of insurance.
Our recommended approach involves a refined modeling of the zero-claim process, together with the integration of boosting methods.
Our modeling results reveal a marked improvement in model performance, showcasing its potential to deliver more accurate predictions.
arXiv Detail & Related papers (2024-06-23T20:03:55Z) - Predictive Analytics of Varieties of Potatoes [2.336821989135698]
We explore the application of machine learning algorithms specifically to enhance the selection process of Russet potato clones in breeding trials.
This study addresses the challenge of efficiently identifying high-yield, disease-resistant, and climate-resilient potato varieties.
arXiv Detail & Related papers (2024-04-04T00:49:05Z) - SMaRt: Improving GANs with Score Matching Regularity [94.81046452865583]
Generative adversarial networks (GANs) usually struggle in learning from highly diverse data, whose underlying manifold is complex.
We show that score matching serves as a promising solution to this issue thanks to its capability of persistently pushing the generated data points towards the real data manifold.
We propose to improve the optimization of GANs with score matching regularity (SMaRt).
arXiv Detail & Related papers (2023-11-30T03:05:14Z) - Robust Learning with Progressive Data Expansion Against Spurious Correlation [65.83104529677234]
We study the learning process of a two-layer nonlinear convolutional neural network in the presence of spurious features.
Our analysis suggests that imbalanced data groups and easily learnable spurious features can lead to the dominance of spurious features during the learning process.
We propose a new training algorithm called PDE that efficiently enhances the model's robustness for a better worst-group performance.
arXiv Detail & Related papers (2023-06-08T05:44:06Z) - Boosting Differentiable Causal Discovery via Adaptive Sample Reweighting [62.23057729112182]
Differentiable score-based causal discovery methods learn a directed acyclic graph from observational data.
We propose a model-agnostic framework to boost causal discovery performance by dynamically learning the adaptive weights for the Reweighted Score function, ReScore.
arXiv Detail & Related papers (2023-03-06T14:49:59Z) - An efficient hybrid classification approach for COVID-19 based on Harris Hawks Optimization and Salp Swarm Optimization [0.0]
This study presents a hybrid binary version of the Harris Hawks Optimization algorithm (HHO) and Salp Swarm Optimization (SSA) for Covid-19 classification.
The proposed algorithm (HHOSSA) achieved 96% accuracy with the SVM, and 98% accuracy with each of two other classifiers.
arXiv Detail & Related papers (2022-12-25T19:52:18Z) - FOSTER: Feature Boosting and Compression for Class-Incremental Learning [52.603520403933985]
Deep neural networks suffer from catastrophic forgetting when learning new categories.
We propose a novel two-stage learning paradigm FOSTER, empowering the model to learn new categories adaptively.
arXiv Detail & Related papers (2022-04-10T11:38:33Z) - MIO : Mutual Information Optimization using Self-Supervised Binary Contrastive Learning [19.5917119072985]
We model contrastive learning into a binary classification problem to predict if a pair is positive or not.
The proposed method outperforms the state-of-the-art algorithms on benchmark datasets like STL-10, CIFAR-10, CIFAR-100.
arXiv Detail & Related papers (2021-11-24T17:51:29Z) - No Fear of Heterogeneity: Classifier Calibration for Federated Learning with Non-IID Data [78.69828864672978]
A central challenge in training classification models in the real-world federated system is learning with non-IID data.
We propose a novel and simple algorithm called Classifier Calibration with Virtual Representations (CCVR), which adjusts the classifier using virtual representations sampled from an approximated Gaussian mixture model.
Experimental results demonstrate that CCVR achieves state-of-the-art performance on popular federated learning benchmarks including CIFAR-10, CIFAR-100, and CINIC-10.
arXiv Detail & Related papers (2021-06-09T12:02:29Z) - Adversarial Feature Augmentation and Normalization for Visual Recognition [109.6834687220478]
Recent advances in computer vision take advantage of adversarial data augmentation to ameliorate the generalization ability of classification models.
Here, we present an effective and efficient alternative that advocates adversarial augmentation on intermediate feature embeddings.
We validate the proposed approach across diverse visual recognition tasks with representative backbone networks.
arXiv Detail & Related papers (2021-03-22T20:36:34Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed papers (including all information) and is not responsible for any consequences.