Enhancing Variable Importance in Random Forests: A Novel Application of Global Sensitivity Analysis
- URL: http://arxiv.org/abs/2407.14194v1
- Date: Fri, 19 Jul 2024 10:45:36 GMT
- Title: Enhancing Variable Importance in Random Forests: A Novel Application of Global Sensitivity Analysis
- Authors: Giulia Vannucci, Roberta Siciliano, Andrea Saltelli
- Abstract summary: The present work provides an application of Global Sensitivity Analysis to supervised machine learning methods such as Random Forests.
Global Sensitivity Analysis is primarily used in mathematical modelling to investigate the effect of the uncertainties of the input variables on the output.
A simulation study shows that our proposal can be used to explore what advances can be achieved either in terms of efficiency, explanatory ability, or simply by way of confirming existing results.
- Score: 0.9954382983583578
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The present work provides an application of Global Sensitivity Analysis to supervised machine learning methods such as Random Forests. These methods act as black boxes, selecting features in high-dimensional data sets so as to provide accurate classifiers when new data are fed into the system. In supervised machine learning, predictors are generally ranked by importance based on their contribution to the final prediction. Global Sensitivity Analysis is primarily used in mathematical modelling to investigate the effect of the uncertainties of the input variables on the output. We apply it here as a novel way to rank the input features by their importance to the explainability of the data generating process, shedding light on how the response is determined by the dependence structure of its predictors. A simulation study shows that our proposal can be used to explore what advances can be achieved either in terms of efficiency, explanatory ability, or simply by way of confirming existing results.
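To make the idea concrete, here is a minimal sketch (not the authors' exact procedure) of ranking a fitted Random Forest's inputs by total Sobol' indices, using the Jansen (1999) estimator with the forest's prediction function as the model. The toy data-generating process, variable names, and independent-uniform input assumption are all illustrative:

```python
# Rank the inputs of a fitted Random Forest by total Sobol' indices,
# then compare with the forest's built-in impurity importances.
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(0)
d, n_train, n_mc = 4, 2000, 4096

# Toy data-generating process: y depends strongly on x0 and x1,
# weakly on x2, and not at all on x3.
X = rng.uniform(-1, 1, size=(n_train, d))
y = 2 * X[:, 0] + X[:, 1] ** 2 + 0.3 * X[:, 2] + rng.normal(0, 0.1, n_train)
rf = RandomForestRegressor(n_estimators=200, random_state=0).fit(X, y)
f = rf.predict

# Jansen estimator for the total index of input i:
#   ST_i = E[(f(A) - f(AB_i))^2] / (2 Var(f)),
# where AB_i equals the base sample A except column i is taken from B.
A = rng.uniform(-1, 1, size=(n_mc, d))
B = rng.uniform(-1, 1, size=(n_mc, d))
fA = f(A)
var_f = fA.var()
ST = np.empty(d)
for i in range(d):
    AB_i = A.copy()
    AB_i[:, i] = B[:, i]
    ST[i] = np.mean((fA - f(AB_i)) ** 2) / (2 * var_f)

print("total Sobol' indices:   ", np.round(ST, 3))
print("RF impurity importances:", np.round(rf.feature_importances_, 3))
```

Unlike impurity-based importances, which reflect the internals of one fitted forest, the total indices are a property of the prediction function as a whole, which is what lets them speak to the data generating process.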
Related papers
- A Novel Hybrid Feature Importance and Feature Interaction Detection Framework for Predictive Optimization in Industry 4.0 Applications [1.0870564199697297]
This paper proposes a novel hybrid framework that combines a feature importance detector, local interpretable model-agnostic explanations (LIME), with a feature interaction detector, neural interaction detection (NID); a usage sketch of the LIME component follows this entry.
The experimental outcomes reveal an improvement of up to 9.56% in the R2 score and a reduction of up to 24.05% in the root mean square error.
arXiv Detail & Related papers (2024-03-04T13:22:53Z)
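As a hedged illustration of the LIME half of that hybrid (the NID component is not shown; the classifier and data below are placeholders, not from the paper), the lime package can be used like this:

```python
# Minimal LIME usage sketch: fit a toy classifier, then ask LIME for a
# sparse local surrogate explanation of a single prediction.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from lime.lime_tabular import LimeTabularExplainer

X, y = make_classification(n_samples=500, n_features=8, random_state=0)
model = GradientBoostingClassifier(random_state=0).fit(X, y)

explainer = LimeTabularExplainer(
    X, feature_names=[f"x{i}" for i in range(8)], mode="classification"
)
# LIME perturbs the instance, queries the model, and fits a weighted
# linear surrogate; its coefficients act as local feature importances.
exp = explainer.explain_instance(X[0], model.predict_proba, num_features=5)
print(exp.as_list())  # (feature condition, local weight) pairs
```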
- Prospector Heads: Generalized Feature Attribution for Large Models & Data [82.02696069543454]
We introduce prospector heads, an efficient and interpretable alternative to explanation-based attribution methods.
We demonstrate how prospector heads enable improved interpretation and discovery of class-specific patterns in input data.
arXiv Detail & Related papers (2024-02-18T23:01:28Z)
- Factor Importance Ranking and Selection using Total Indices [0.0]
A factor importance measure ought to characterize the feature's predictive potential without relying on a specific prediction algorithm.
We present the equivalence between predictiveness potential and total Sobol' indices from global sensitivity analysis (the defining formula is recalled after this entry).
We introduce a novel consistent estimator that can be directly estimated from noisy data.
arXiv Detail & Related papers (2024-01-01T16:02:06Z)
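For reference, the total Sobol' index invoked above has a standard definition (the notation here is the usual one from the sensitivity-analysis literature, not copied from the paper):

$$ S_i^{T} \;=\; \frac{\mathbb{E}_{X_{\sim i}}\!\left[\operatorname{Var}_{X_i}\!\left(Y \mid X_{\sim i}\right)\right]}{\operatorname{Var}(Y)} $$

i.e. the expected share of output variance left unexplained when every input except $X_i$ is fixed, which is why it reads naturally as a model-free measure of a feature's predictive potential.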
- LaPLACE: Probabilistic Local Model-Agnostic Causal Explanations [1.0370398945228227]
We introduce LaPLACE-explainer, designed to provide probabilistic cause-and-effect explanations for machine learning models.
The LaPLACE-explainer component leverages the concept of a Markov blanket to establish statistical boundaries between relevant and non-relevant features.
Our approach offers causal explanations and outperforms LIME and SHAP in terms of local accuracy and consistency of explained features.
arXiv Detail & Related papers (2023-10-01T04:09:59Z)
- ASPEST: Bridging the Gap Between Active Learning and Selective Prediction [56.001808843574395]
Selective prediction aims to learn a reliable model that abstains from making predictions when uncertain.
Active learning aims to lower the overall labeling effort, and hence human dependence, by querying the most informative examples.
In this work, we introduce a new learning paradigm, active selective prediction, which aims to query more informative samples from the shifted target domain.
arXiv Detail & Related papers (2023-04-07T23:51:07Z)
- Generative Causal Representation Learning for Out-of-Distribution Motion Forecasting [13.99348653165494]
We propose Generative Causal Representation Learning (GCRL) to facilitate knowledge transfer under distribution shifts.
While we evaluate the effectiveness of our proposed method in human trajectory prediction models, GCRL can be applied to other domains as well.
arXiv Detail & Related papers (2023-02-17T00:30:44Z)
- Deep networks for system identification: a Survey [56.34005280792013]
System identification learns mathematical descriptions of dynamic systems from input-output data.
The main aim of the identified model is to predict new data from previous observations.
We discuss architectures commonly adopted in the literature, like feedforward, convolutional, and recurrent networks.
arXiv Detail & Related papers (2023-01-30T12:38:31Z)
- Prediction-Powered Inference [68.97619568620709]
Prediction-powered inference is a framework for performing valid statistical inference when an experimental dataset is supplemented with predictions from a machine-learning system.
The framework yields simple algorithms for computing provably valid confidence intervals for quantities such as means, quantiles, and linear and logistic regression coefficients.
Prediction-powered inference could enable researchers to draw valid and more data-efficient conclusions using machine learning; a sketch of the mean estimator follows this entry.
arXiv Detail & Related papers (2023-01-23T18:59:28Z)
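Here is a minimal sketch of the prediction-powered estimate of a mean with its normal-approximation confidence interval; the function name and the toy data are illustrative, not taken from the paper or its codebase:

```python
import numpy as np
from scipy.stats import norm

def ppi_mean_ci(y_lab, f_lab, f_unlab, alpha=0.05):
    """Prediction-powered estimate of E[Y] with a (1 - alpha) CI.

    y_lab:   observed labels on the small labeled set
    f_lab:   model predictions f(X) on the labeled set
    f_unlab: model predictions f(X) on the large unlabeled set
    """
    n, N = len(y_lab), len(f_unlab)
    rectifier = np.mean(y_lab - f_lab)      # measures the model's bias
    theta = np.mean(f_unlab) + rectifier    # bias-corrected estimate
    # Variance combines Monte Carlo error on the unlabeled predictions
    # with estimation error of the rectifier on the labeled set.
    se = np.sqrt(np.var(f_unlab, ddof=1) / N + np.var(y_lab - f_lab, ddof=1) / n)
    z = norm.ppf(1 - alpha / 2)
    return theta, (theta - z * se, theta + z * se)

# Toy check with a deliberately biased predictor: the interval should
# still cover the true mean E[Y] = 1.0.
rng = np.random.default_rng(0)
y_lab = rng.normal(1.0, 1.0, size=200)
f_lab = y_lab + 0.5 + rng.normal(0, 0.3, size=200)
y_unlab = rng.normal(1.0, 1.0, size=20000)           # never observed
f_unlab = y_unlab + 0.5 + rng.normal(0, 0.3, size=20000)
print(ppi_mean_ci(y_lab, f_lab, f_unlab))
```

The rectifier term is what keeps the interval valid even when the model is biased; when the model is accurate, the rectifier's variance is small, which is where the data-efficiency gain comes from.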
- What Should I Know? Using Meta-gradient Descent for Predictive Feature Discovery in a Single Stream of Experience [63.75363908696257]
Computational reinforcement learning seeks to construct an agent's perception of the world through predictions of future sensations.
An open challenge in this line of work is determining from the infinitely many predictions that the agent could possibly make which predictions might best support decision-making.
We introduce a meta-gradient descent process by which an agent learns 1) what predictions to make, 2) the estimates for its chosen predictions, and 3) how to use those estimates to generate policies that maximize future reward.
arXiv Detail & Related papers (2022-06-13T21:31:06Z)
- Preference Enhanced Social Influence Modeling for Network-Aware Cascade Prediction [59.221668173521884]
We propose a novel framework to improve cascade size prediction by enhancing user preference modeling.
Our end-to-end method makes the user activation process of information diffusion more adaptive and accurate.
arXiv Detail & Related papers (2022-04-18T09:25:06Z)