Selection of contributing factors for predicting landslide
susceptibility using machine learning and deep learning models
- URL: http://arxiv.org/abs/2309.06062v2
- Date: Wed, 13 Sep 2023 01:03:30 GMT
- Title: Selection of contributing factors for predicting landslide
susceptibility using machine learning and deep learning models
- Authors: Cheng Chen and Lei Fan
- Abstract summary: Landslides are a common natural disaster that can cause casualties, threats to property safety and economic losses.
It is important to understand or predict the probability of landslide occurrence at potentially risky sites.
In this study, the impact of the selection of contributing factors on the accuracy of landslide susceptibility predictions was investigated.
- Score: 5.097453589594454
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Landslides are a common natural disaster that can cause casualties,
threats to property safety and economic losses. Therefore, it is important to understand or
predict the probability of landslide occurrence at potentially risky sites. A
commonly used means is to carry out a landslide susceptibility assessment based
on a landslide inventory and a set of landslide contributing factors. This can
be readily achieved using machine learning (ML) models such as logistic
regression (LR), support vector machine (SVM), random forest (RF), extreme
gradient boosting (XGBoost), or deep learning (DL) models such as convolutional
neural network (CNN) and long short-term memory (LSTM). As the input data for
these models, landslide contributing factors have varying influences on
landslide occurrence. Therefore, it is logically feasible to select more
important contributing factors and eliminate less relevant ones, with the aim
of increasing the prediction accuracy of these models. However, selecting more
important factors is still a challenging task and there is no generally
accepted method. Furthermore, the effects of factor selection using various
methods on the prediction accuracy of ML and DL models are unclear. In this
study, the impact of the selection of contributing factors on the accuracy of
landslide susceptibility predictions using ML and DL models was investigated.
Five methods for selecting contributing factors were considered for all the
aforementioned ML and DL models: Information Gain Ratio (IGR),
Recursive Feature Elimination (RFE), Particle Swarm Optimization (PSO), Least
Absolute Shrinkage and Selection Operator (LASSO) and Harris Hawk Optimization
(HHO). In addition, autoencoder-based factor selection methods for DL models
were also investigated. To assess their performances, an exhaustive approach
was adopted,...
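To make the factor-selection idea concrete, the following is a minimal sketch of comparing selection methods on synthetic data. It uses mutual information as a stand-in for IGR, plus RFE and LASSO, and checks each selected subset with a downstream random forest; the dataset, the choice of keeping five factors, and all variable names are illustrative assumptions, not the authors' pipeline.

```python
# Hypothetical sketch: three factor-selection methods (stand-ins for IGR,
# RFE and LASSO from the abstract) on synthetic "contributing factors".
import numpy as np
from sklearn.datasets import make_classification
from sklearn.feature_selection import mutual_info_classif, RFE
from sklearn.linear_model import LogisticRegression, LassoCV
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

# Synthetic landslide-like inventory: 500 sites, 12 candidate factors,
# of which only 5 are actually informative.
X, y = make_classification(n_samples=500, n_features=12, n_informative=5,
                           n_redundant=3, random_state=0)
k = 5  # number of factors to keep (an arbitrary choice for this sketch)

# 1) Information-gain-style ranking (mutual information as a proxy for IGR).
mi = mutual_info_classif(X, y, random_state=0)
mi_idx = np.argsort(mi)[-k:]

# 2) Recursive Feature Elimination wrapped around logistic regression.
rfe = RFE(LogisticRegression(max_iter=1000), n_features_to_select=k).fit(X, y)
rfe_idx = np.flatnonzero(rfe.support_)

# 3) LASSO: keep the factors with the largest absolute coefficients.
lasso = LassoCV(cv=5, random_state=0).fit(X, y)
lasso_idx = np.argsort(np.abs(lasso.coef_))[-k:]

# Exhaustive-style check: score each subset with a downstream RF model.
rf = RandomForestClassifier(n_estimators=100, random_state=0)
baseline = cross_val_score(rf, X, y, cv=5).mean()
for name, idx in [("MI", mi_idx), ("RFE", rfe_idx), ("LASSO", lasso_idx)]:
    score = cross_val_score(rf, X[:, sorted(idx)], y, cv=5).mean()
    print(f"{name}: factors {sorted(int(i) for i in idx)}, "
          f"CV accuracy {score:.3f} (all factors: {baseline:.3f})")
```

In practice the paper's exhaustive comparison would repeat this kind of loop across all selector/model pairings; the sketch only shows the mechanics of ranking factors and validating the reduced subset.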
Related papers
- Low-rank finetuning for LLMs: A fairness perspective [54.13240282850982]
Low-rank approximation techniques have become the de facto standard for fine-tuning Large Language Models.
This paper investigates the effectiveness of these methods in capturing the shift of fine-tuning datasets from the initial pre-trained data distribution.
We show that low-rank fine-tuning inadvertently preserves undesirable biases and toxic behaviors.
arXiv Detail & Related papers (2024-05-28T20:43:53Z)
- Interpretability of Statistical, Machine Learning, and Deep Learning Models for Landslide Susceptibility Mapping in Three Gorges Reservoir Area [4.314875317825748]
Landslide susceptibility mapping (LSM) is crucial for identifying high-risk areas and informing prevention strategies.
This study investigates the interpretability of statistical, machine learning (ML), and deep learning (DL) models in predicting landslide susceptibility.
arXiv Detail & Related papers (2024-05-20T03:46:42Z)
- Querying Easily Flip-flopped Samples for Deep Active Learning [63.62397322172216]
Active learning is a machine learning paradigm that aims to improve the performance of a model by strategically selecting and querying unlabeled data.
One effective selection strategy is to base it on the model's predictive uncertainty, which can be interpreted as a measure of how informative a sample is.
This paper proposes the least disagree metric (LDM) as the smallest probability of disagreement of the predicted label.
arXiv Detail & Related papers (2024-01-18T08:12:23Z)
- Variance of ML-based software fault predictors: are we really improving fault prediction? [0.3222802562733786]
We experimentally analyze the variance of a state-of-the-art fault prediction approach.
We observed a maximum variance of 10.10% in terms of the per-class accuracy metric.
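The variance question raised in this entry can be illustrated with a short sketch: retrain the same model while varying only the random seed and measure the spread of a per-class accuracy metric. The classifier, data, and seed count below are my own illustrative assumptions, not the paper's experimental setup.

```python
# Minimal sketch (illustrative, not the paper's setup): how much does
# per-class accuracy move when only the training seed changes?
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import recall_score
from sklearn.model_selection import train_test_split

# Imbalanced data mimicking a fault-prediction task (20% "faulty" class).
X, y = make_classification(n_samples=600, n_features=20,
                           weights=[0.8, 0.2], random_state=42)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=42)

# Per-class accuracy (recall) on the minority class over 10 training seeds.
scores = []
for seed in range(10):
    clf = RandomForestClassifier(n_estimators=50, random_state=seed)
    clf.fit(X_tr, y_tr)
    scores.append(recall_score(y_te, clf.predict(X_te), pos_label=1))

spread = (max(scores) - min(scores)) * 100  # percentage points
print(f"Per-class accuracy spread across seeds: {spread:.2f} points")
```

A non-trivial spread from seed changes alone is exactly why single-run comparisons of fault predictors can be misleading.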
arXiv Detail & Related papers (2023-10-26T09:31:32Z)
- To Repeat or Not To Repeat: Insights from Scaling LLM under Token-Crisis [50.31589712761807]
Large language models (LLMs) are notoriously token-hungry during pre-training, and high-quality text data on the web is approaching its scaling limit for LLMs.
We investigate the consequences of repeating pre-training data, revealing that the model is susceptible to overfitting.
We then examine the key factors contributing to multi-epoch degradation, finding that significant factors include dataset size, model parameters, and training objectives.
arXiv Detail & Related papers (2023-05-22T17:02:15Z)
- Generative Causal Representation Learning for Out-of-Distribution Motion Forecasting [13.99348653165494]
We propose Generative Causal Representation Learning (GCRL) to facilitate knowledge transfer under distribution shifts.
While we evaluate the effectiveness of our proposed method in human trajectory prediction models, GCRL can be applied to other domains as well.
arXiv Detail & Related papers (2023-02-17T00:30:44Z)
- A prediction and behavioural analysis of machine learning methods for modelling travel mode choice [0.26249027950824505]
We conduct a systematic comparison of different modelling approaches, across multiple modelling problems, in terms of the key factors likely to affect model choice.
Results indicate that the models with the highest disaggregate predictive performance provide poorer estimates of behavioural indicators and aggregate mode shares.
It is also observed that the MNL model performs robustly in a variety of situations, though ML techniques can improve the estimates of behavioural indices such as Willingness to Pay.
arXiv Detail & Related papers (2023-01-11T11:10:32Z)
- Measuring Causal Effects of Data Statistics on Language Model's `Factual' Predictions [59.284907093349425]
Large amounts of training data are one of the major reasons for the high performance of state-of-the-art NLP models.
We provide a language for describing how training data influences predictions, through a causal framework.
Our framework bypasses the need to retrain expensive models and allows us to estimate causal effects based on observational data alone.
arXiv Detail & Related papers (2022-07-28T17:36:24Z)
- Latent Causal Invariant Model [128.7508609492542]
Current supervised learning can learn spurious correlation during the data-fitting process.
We propose a Latent Causal Invariance Model (LaCIM) which pursues causal prediction.
arXiv Detail & Related papers (2020-11-04T10:00:27Z)
- Influence Functions in Deep Learning Are Fragile [52.31375893260445]
Influence functions approximate the effect of training samples on test-time predictions.
Influence estimates are fairly accurate for shallow networks.
Hessian regularization is important for obtaining high-quality influence estimates.
arXiv Detail & Related papers (2020-06-25T18:25:59Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences of its use.