Challenges in Variable Importance Ranking Under Correlation
- URL: http://arxiv.org/abs/2402.03447v1
- Date: Mon, 5 Feb 2024 19:02:13 GMT
- Title: Challenges in Variable Importance Ranking Under Correlation
- Authors: Annie Liang, Thomas Jemielita, Andy Liaw, Vladimir Svetnik, Lingkang Huang, Richard Baumgartner, Jason M. Klusowski
- Abstract summary: We present a comprehensive simulation study investigating the impact of feature correlation on the assessment of variable importance.
While one would expect no correlation between knockoff variables and their corresponding predictor variables, we prove that this correlation increases linearly beyond a certain correlation threshold between the predictor variables.
- Score: 6.718144470265263
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Variable importance plays a pivotal role in interpretable machine learning as
it helps measure the impact of factors on the output of the prediction model.
Model agnostic methods based on the generation of "null" features via
permutation (or related approaches) can be applied. Such analysis is often
utilized in pharmaceutical applications due to its ability to interpret
black-box models, including tree-based ensembles. A major challenge and
significant confounder in variable importance estimation, however, is the
presence of between-feature correlation. Recently, several adjustments to
marginal permutation utilizing feature knockoffs were proposed to address this
issue, such as the variable importance measure known as conditional predictive
impact (CPI). Assessment and evaluation of such approaches is the focus of our
work. We first present a comprehensive simulation study investigating the
impact of feature correlation on the assessment of variable importance. We then
theoretically prove the limitation that highly correlated features pose for the
CPI through the knockoff construction. While we expect that there is never any
correlation between knockoff variables and their corresponding predictor
variables, we prove that this correlation increases linearly beyond a certain
correlation threshold between the predictor variables. Our findings emphasize
the absence of free lunch when dealing with high feature correlation, as well
as the necessity of understanding the utility and limitations behind methods in
variable importance estimation.
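The threshold behaviour described in the abstract can be illustrated with a minimal sketch. It assumes the standard equicorrelated Gaussian knockoff construction, in which every pair of features shares the same correlation `rho` and the knockoff parameter is `s = min(2 * lambda_min(Sigma), 1)`, so that the correlation between a feature and its own knockoff is `1 - s`. The function name `knockoff_self_correlation` and the choice of `p = 5` features are hypothetical, and the paper's own setting and proof may differ from this simplified case.

```python
import numpy as np


def knockoff_self_correlation(rho: float, p: int = 5) -> float:
    """Correlation between a feature and its own knockoff.

    Assumes an equicorrelated correlation matrix (all off-diagonal
    entries equal to rho) and the equicorrelated knockoff choice
    s = min(2 * lambda_min(Sigma), 1). Illustrative only.
    """
    # Build the p x p correlation matrix with constant off-diagonal rho.
    sigma = np.full((p, p), rho, dtype=float)
    np.fill_diagonal(sigma, 1.0)

    # eigvalsh returns eigenvalues in ascending order for symmetric input.
    lambda_min = np.linalg.eigvalsh(sigma)[0]

    # Equicorrelated knockoff parameter, capped at 1.
    s = min(2.0 * lambda_min, 1.0)

    # Cov(X_j, knockoff_j) = Sigma_jj - s = 1 - s for unit-variance features.
    return 1.0 - s


if __name__ == "__main__":
    for rho in np.linspace(0.0, 0.9, 10):
        corr = knockoff_self_correlation(float(rho))
        print(f"rho = {rho:.1f}  corr(X_j, knockoff_j) = {corr:.2f}")
```

Under these assumptions the feature-knockoff correlation stays at zero while `rho` is at most 0.5 and then grows as `2 * rho - 1`, which is one concrete instance of the linear increase past a threshold that the abstract refers to.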
Related papers
- Local Learning for Covariate Selection in Nonparametric Causal Effect Estimation with Latent Variables [13.12743473333296]
Estimating causal effects from nonexperimental data is a fundamental problem in many fields of science.
We propose a novel local learning approach for covariate selection in nonparametric causal effect estimation.
We validate our algorithm through extensive experiments on both synthetic and real-world data.
arXiv Detail & Related papers (2024-11-25T12:08:54Z)
- Nonparametric Partial Disentanglement via Mechanism Sparsity: Sparse Actions, Interventions and Sparse Temporal Dependencies [58.179981892921056]
This work introduces a novel principle for disentanglement we call mechanism sparsity regularization.
We propose a representation learning method that induces disentanglement by simultaneously learning the latent factors.
We show that the latent factors can be recovered by regularizing the learned causal graph to be sparse.
arXiv Detail & Related papers (2024-01-10T02:38:21Z)
- Identifiable Latent Polynomial Causal Models Through the Lens of Change [82.14087963690561]
Causal representation learning aims to unveil latent high-level causal representations from observed low-level data.
One of its primary tasks is to provide reliable assurance of identifying these latent causal models, known as identifiability.
arXiv Detail & Related papers (2023-10-24T07:46:10Z)
- A Notion of Feature Importance by Decorrelation and Detection of Trends by Random Forest Regression [1.675857332621569]
We introduce a novel notion of feature importance based on the well-studied Gram-Schmidt decorrelation method.
We propose two estimators for identifying trends in the data using random forest regression.
arXiv Detail & Related papers (2023-03-02T11:01:49Z)
- Data-Driven Influence Functions for Optimization-Based Causal Inference [105.5385525290466]
We study a constructive algorithm that approximates Gateaux derivatives for statistical functionals by finite differencing.
We study the case where probability distributions are not known a priori but need to be estimated from data.
arXiv Detail & Related papers (2022-08-29T16:16:22Z)
- Decorrelated Variable Importance [0.0]
We propose a method for mitigating the effect of correlation by defining a modified version of LOCO.
This new parameter is difficult to estimate nonparametrically, but we show how to estimate it using semiparametric models.
arXiv Detail & Related papers (2021-11-21T16:31:36Z)
- Variational Causal Networks: Approximate Bayesian Inference over Causal Structures [132.74509389517203]
We introduce a parametric variational family modelled by an autoregressive distribution over the space of discrete DAGs.
In experiments, we demonstrate that the proposed variational posterior is able to provide a good approximation of the true posterior.
arXiv Detail & Related papers (2021-06-14T17:52:49Z)
- Latent Causal Invariant Model [128.7508609492542]
Current supervised learning can learn spurious correlation during the data-fitting process.
We propose a Latent Causal Invariance Model (LaCIM) which pursues causal prediction.
arXiv Detail & Related papers (2020-11-04T10:00:27Z)
- Estimating Causal Effects with the Neural Autoregressive Density Estimator [6.59529078336196]
We use neural autoregressive density estimators to estimate causal effects within Pearl's do-calculus framework.
We show that the approach can retrieve causal effects from non-linear systems without explicitly modeling the interactions between the variables.
arXiv Detail & Related papers (2020-08-17T13:12:38Z)
- On Disentangled Representations Learned From Correlated Data [59.41587388303554]
We bridge the gap to real-world scenarios by analyzing the behavior of the most prominent disentanglement approaches on correlated data.
We show that systematically induced correlations in the dataset are being learned and reflected in the latent representations.
We also demonstrate how to resolve these latent correlations, either using weak supervision during training or by post-hoc correcting a pre-trained model with a small number of labels.
arXiv Detail & Related papers (2020-06-14T12:47:34Z)
This list is automatically generated from the titles and abstracts of the papers on this site.