Feature Selection for Discovering Distributional Treatment Effect Modifiers
- URL: http://arxiv.org/abs/2206.00516v1
- Date: Wed, 1 Jun 2022 14:25:32 GMT
- Title: Feature Selection for Discovering Distributional Treatment Effect Modifiers
- Authors: Yoichi Chikahara, Makoto Yamada, Hisashi Kashima
- Abstract summary: We propose a framework for finding features relevant to the difference in treatment effects.
We derive a feature importance measure that quantifies how strongly the feature attributes influence the discrepancy between potential outcome distributions.
We then develop a feature selection algorithm that can control the type I error rate to the desired level.
- Score: 37.09619678733784
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Finding the features relevant to the difference in treatment effects is
essential to unveil the underlying causal mechanisms. Existing methods seek
such features by measuring how greatly the feature attributes affect the degree
of the {\it conditional average treatment effect} (CATE). However, these
methods may overlook important features because CATE, a measure of the average
treatment effect, cannot detect differences in distribution parameters other
than the mean (e.g., variance). To resolve this weakness of existing methods,
we propose a feature selection framework for discovering {\it distributional
treatment effect modifiers}. We first formulate a feature importance measure
that quantifies how strongly the feature attributes influence the discrepancy
between potential outcome distributions. Then we derive its computationally
efficient estimator and develop a feature selection algorithm that can control
the type I error rate to the desired level. Experimental results show that our
framework successfully discovers important features and outperforms the
existing mean-based method.
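The abstract's two ingredients, a discrepancy measure between potential outcome distributions and a selection test that controls the type I error rate, can be illustrated with a toy sketch. This is not the paper's actual estimator: it uses a kernel MMD between observed treated and control outcomes within the two subgroups of a binary feature, plus a simple permutation test; the function names and the score definition are illustrative assumptions.

```python
import math
import random

def rbf_kernel(a, b, gamma=1.0):
    # Gaussian kernel on scalar outcomes
    return math.exp(-gamma * (a - b) ** 2)

def mmd2(xs, ys, gamma=1.0):
    """Biased estimate of the squared maximum mean discrepancy between two samples."""
    kxx = sum(rbf_kernel(x1, x2, gamma) for x1 in xs for x2 in xs) / len(xs) ** 2
    kyy = sum(rbf_kernel(y1, y2, gamma) for y1 in ys for y2 in ys) / len(ys) ** 2
    kxy = sum(rbf_kernel(x, y, gamma) for x in xs for y in ys) / (len(xs) * len(ys))
    return kxx + kyy - 2 * kxy

def modifier_score(f, t, y):
    """Hypothetical importance of binary feature f: the change in the
    treated-vs-control outcome discrepancy across its two subgroups."""
    groups = {}
    for fi, ti, yi in zip(f, t, y):
        groups.setdefault((fi, ti), []).append(yi)
    if any((fv, tv) not in groups for fv in (0, 1) for tv in (0, 1)):
        return 0.0  # a subgroup is empty; no comparison possible
    return abs(mmd2(groups[(1, 1)], groups[(1, 0)]) -
               mmd2(groups[(0, 1)], groups[(0, 0)]))

def permutation_pvalue(f, t, y, n_perm=100, seed=0):
    """Permutation test: shuffling f simulates the null hypothesis that the
    feature does not modify the effect, giving type I error control."""
    rng = random.Random(seed)
    observed = modifier_score(f, t, y)
    f_perm, exceed = list(f), 0
    for _ in range(n_perm):
        rng.shuffle(f_perm)
        if modifier_score(f_perm, t, y) >= observed:
            exceed += 1
    return (exceed + 1) / (n_perm + 1)
```

Because the score compares full outcome distributions rather than means, a feature that changes only the outcome variance under treatment would still register, which is exactly the gap in CATE-based methods the abstract points out.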
Related papers
- Causal Feature Selection via Transfer Entropy [59.999594949050596]
Causal discovery aims to identify causal relationships between features with observational data.
We introduce a new causal feature selection approach that relies on the forward and backward feature selection procedures.
We provide theoretical guarantees on the regression and classification errors for both the exact and the finite-sample cases.
arXiv Detail & Related papers (2023-10-17T08:04:45Z)
- Benchmarking Bayesian Causal Discovery Methods for Downstream Treatment Effect Estimation [137.3520153445413]
A notable gap exists in the evaluation of causal discovery methods, where insufficient emphasis is placed on downstream inference.
We evaluate seven established baseline causal discovery methods including a newly proposed method based on GFlowNets.
Our results demonstrate that some of the evaluated algorithms can effectively capture a wide range of useful and diverse ATE modes.
arXiv Detail & Related papers (2023-07-11T02:58:10Z)
- Data-Driven Influence Functions for Optimization-Based Causal Inference [105.5385525290466]
We study a constructive algorithm that approximates Gateaux derivatives for statistical functionals by finite differencing.
We study the case where probability distributions are not known a priori but need to be estimated from data.
arXiv Detail & Related papers (2022-08-29T16:16:22Z)
- Benchmarking Heterogeneous Treatment Effect Models through the Lens of Interpretability [82.29775890542967]
Estimating personalized effects of treatments is a complex, yet pervasive problem.
Recent developments in the machine learning literature on heterogeneous treatment effect estimation gave rise to many sophisticated, but opaque, tools.
We use post-hoc feature importance methods to identify features that influence the model's predictions.
arXiv Detail & Related papers (2022-06-16T17:59:05Z)
- Neuroevolutionary Feature Representations for Causal Inference [0.0]
We propose a novel approach for learning feature representations to aid the estimation of the conditional average treatment effect or CATE.
Our method focuses on an intermediate layer in a neural network trained to predict the outcome from the features.
arXiv Detail & Related papers (2022-05-21T09:13:04Z)
- Learning Infomax and Domain-Independent Representations for Causal Effect Inference with Real-World Data [9.601837205635686]
We learn Infomax and Domain-Independent Representations to address these challenges.
We show that our method achieves state-of-the-art performance on causal effect inference.
arXiv Detail & Related papers (2022-02-22T13:35:15Z)
- To Impute or not to Impute? -- Missing Data in Treatment Effect Estimation [84.76186111434818]
We identify a new missingness mechanism, which we term mixed confounded missingness (MCM), where some missingness determines treatment selection and other missingness is determined by treatment selection.
We show that naively imputing all data leads to poor performing treatment effects models, as the act of imputation effectively removes information necessary to provide unbiased estimates.
Our solution is selective imputation, where we use insights from MCM to inform precisely which variables should be imputed and which should not.
arXiv Detail & Related papers (2022-02-04T12:08:31Z)
- Comparing interpretability and explainability for feature selection [0.6015898117103068]
We investigate the performance of variable importance as a feature selection method across various black-box and interpretable machine learning methods.
The results show that regardless of whether we use the native variable importance method or SHAP, XGBoost fails to clearly distinguish between relevant and irrelevant features.
arXiv Detail & Related papers (2021-05-11T20:01:23Z)
- A Class of Algorithms for General Instrumental Variable Models [29.558215059892206]
Causal treatment effect estimation is a key problem that arises in a variety of real-world settings.
We provide a method for causal effect bounding in continuous distributions.
arXiv Detail & Related papers (2020-06-11T12:32:24Z)
- Nonparametric Feature Impact and Importance [0.6123324869194193]
We give mathematical definitions of feature impact and importance, derived from partial dependence curves, that operate directly on the data.
To assess quality, we show that features ranked by these definitions are competitive with existing feature selection techniques.
arXiv Detail & Related papers (2020-06-08T17:07:35Z)
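The partial-dependence idea behind the last entry's impact definitions can be sketched in a few lines. This is a rough illustration under assumed definitions, not the paper's exact estimator: `feature_impact` here measures the mean absolute deviation of the partial dependence curve, and the model and grid are hypothetical.

```python
def partial_dependence(model, X, j, grid):
    """Average model prediction as feature j is forced to each grid value,
    while the other features keep their observed values."""
    curve = []
    for v in grid:
        preds = [model([v if k == j else xk for k, xk in enumerate(x)]) for x in X]
        curve.append(sum(preds) / len(preds))
    return curve

def feature_impact(model, X, j, grid):
    """Illustrative impact score: how much the partial dependence curve of
    feature j deviates from its own mean (flat curve => zero impact)."""
    curve = partial_dependence(model, X, j, grid)
    mean = sum(curve) / len(curve)
    return sum(abs(c - mean) for c in curve) / len(curve)
```

A feature the model ignores yields a flat curve and an impact of zero, so ranking features by this score is a crude, model-agnostic form of feature selection.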
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences of its use.