Variable importance for causal forests: breaking down the heterogeneity
of treatment effects
- URL: http://arxiv.org/abs/2308.03369v1
- Date: Mon, 7 Aug 2023 07:43:42 GMT
- Title: Variable importance for causal forests: breaking down the heterogeneity
of treatment effects
- Authors: Clément Bénard, Julie Josse (PREMEDICAL)
- Abstract summary: We develop a new variable importance algorithm for causal forests.
We show how to retrain the forest once a confounding variable has been removed.
Experiments on simulated, semi-synthetic, and real data show that our importance measure performs well.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Causal random forests provide efficient estimates of heterogeneous treatment
effects. However, forest algorithms are also well known for their black-box
nature and therefore do not characterize how input variables are involved in
treatment effect heterogeneity, which is a strong practical limitation. In this
article, we develop a new variable importance algorithm for causal forests to
quantify the impact of each input on the heterogeneity of treatment effects.
The proposed approach is inspired by the drop-and-relearn principle, widely
used for regression problems. Importantly, we show how to retrain the forest
once a confounding variable has been removed. If the confounder is not involved in
the treatment effect heterogeneity, the local centering step enforces
consistency of the importance measure. Otherwise, when a confounder also
impacts heterogeneity, we introduce a corrective term in the retrained causal
forest to recover consistency. Additionally, experiments on simulated,
semi-synthetic, and real data show the good performance of our importance
measure, which outperforms competitors on several test cases. Experiments also
show that our approach can be efficiently extended to groups of variables,
providing key insights in practice.
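The drop-and-relearn principle mentioned in the abstract can be illustrated on a plain regression problem. The sketch below is not the authors' causal forest algorithm and does not include their corrective term for confounders; it only shows the generic idea of refitting a model without a variable (or a group of variables) and measuring the loss in accuracy. The k-nearest-neighbour learner, the synthetic data, the normalization by the response variance, and all function names are assumptions made for illustration.

```python
import random


def knn_predict(train_x, train_y, query, k):
    """Predict by averaging the responses of the k nearest training points."""
    def sq_dist(row):
        return sum((a - b) ** 2 for a, b in zip(row, query))
    nearest = sorted(range(len(train_x)), key=lambda i: sq_dist(train_x[i]))[:k]
    return sum(train_y[i] for i in nearest) / len(nearest)


def holdout_mse(train_x, train_y, test_x, test_y, k):
    """Mean squared error of the k-NN learner on held-out data."""
    errors = [(knn_predict(train_x, train_y, q, k) - y) ** 2
              for q, y in zip(test_x, test_y)]
    return sum(errors) / len(errors)


def drop_columns(rows, cols):
    """Return a copy of the rows without the column indices in `cols`."""
    return [[v for j, v in enumerate(row) if j not in cols] for row in rows]


def drop_and_relearn_importance(train_x, train_y, test_x, test_y, groups, k=5):
    """Importance of each group: increase in held-out error after the group
    is dropped and the model is relearned, divided by the response variance
    (normalization chosen here for illustration)."""
    full_error = holdout_mse(train_x, train_y, test_x, test_y, k)
    mean_y = sum(test_y) / len(test_y)
    var_y = sum((y - mean_y) ** 2 for y in test_y) / len(test_y)
    scores = []
    for group in groups:
        cols = set(group)
        reduced_error = holdout_mse(drop_columns(train_x, cols), train_y,
                                    drop_columns(test_x, cols), test_y, k)
        scores.append((reduced_error - full_error) / var_y)
    return scores


def make_data(n, rng):
    """Hypothetical data: only the first of three inputs drives the response."""
    rows = [[rng.random() for _ in range(3)] for _ in range(n)]
    ys = [3.0 * row[0] + rng.gauss(0.0, 0.1) for row in rows]
    return rows, ys


rng = random.Random(0)
train_x, train_y = make_data(200, rng)
test_x, test_y = make_data(100, rng)
groups = [(0,), (1,), (2,), (1, 2)]  # single variables and one group
scores = drop_and_relearn_importance(train_x, train_y, test_x, test_y, groups)
for group, score in zip(groups, scores):
    print(group, round(score, 3))
```

In this toy setup the first variable should receive a much larger score than the two noise variables, whether the noise variables are dropped one at a time or as a group. Scores for irrelevant inputs can come out slightly negative, since removing noise can help a nearest-neighbour learner.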
Related papers
- A Bayesian Classification Trees Approach to Treatment Effect Variation with Noncompliance [0.5356944479760104]
Estimating varying treatment effects in randomized trials with noncompliance is inherently challenging.
Existing flexible machine learning methods are highly sensitive to the weak instruments problem.
We present a Bayesian Causal Forest model for binary response variables in scenarios with noncompliance.
arXiv Detail & Related papers (2024-08-14T18:33:55Z)
- Why do Random Forests Work? Understanding Tree Ensembles as Self-Regularizing Adaptive Smoothers [68.76846801719095]
We argue that the current high-level dichotomy into bias- and variance-reduction prevalent in statistics is insufficient to understand tree ensembles.
We show that forests can improve upon trees by three distinct mechanisms that are usually implicitly entangled.
arXiv Detail & Related papers (2024-02-02T15:36:43Z)
- Theoretical and Empirical Advances in Forest Pruning [0.0]
We revisit forest pruning, an approach that aims to have the best of both worlds: the accuracy of regression forests and the interpretability of regression trees.
We prove the advantage of a Lasso-pruned forest over its unpruned counterpart under extremely weak assumptions.
We find that in the vast majority of scenarios tested, there is at least one forest-pruning method that yields equal or better accuracy than the original full forest.
arXiv Detail & Related papers (2024-01-10T20:02:47Z)
- A Causal Framework for Decomposing Spurious Variations [68.12191782657437]
We develop tools for decomposing spurious variations in Markovian and Semi-Markovian models.
We prove the first results that allow a non-parametric decomposition of spurious effects.
The described approach has several applications, ranging from explainable and fair AI to questions in epidemiology and medicine.
arXiv Detail & Related papers (2023-06-08T09:40:28Z)
- Nonparametric Identifiability of Causal Representations from Unknown Interventions [63.1354734978244]
We study causal representation learning, the task of inferring latent causal variables and their causal relations from mixtures of the variables.
Our goal is to identify both the ground truth latents and their causal graph up to a set of ambiguities which we show to be irresolvable from interventional data.
arXiv Detail & Related papers (2023-06-01T10:51:58Z)
- Hybrid Censored Quantile Regression Forest to Assess the Heterogeneous Effects [4.194179127753325]
We develop a hybrid forest approach called Hybrid Censored Quantile Regression Forest (HCQRF) to assess the heterogeneous effects varying with high-dimensional variables.
We propose a variable importance decomposition to measure the impact of a variable on the treatment effect function.
arXiv Detail & Related papers (2022-12-12T03:01:36Z)
- What Makes Forest-Based Heterogeneous Treatment Effect Estimators Work? [1.1050303097572156]
We show that both methods can be understood in terms of the same parameters and confounding assumptions under L2 loss.
In the randomized setting, both approaches performed akin to the new blended versions in a benchmark study.
arXiv Detail & Related papers (2022-06-21T12:45:07Z)
- SurvITE: Learning Heterogeneous Treatment Effects from Time-to-Event Data [83.50281440043241]
We study the problem of inferring heterogeneous treatment effects from time-to-event data.
We propose a novel deep learning method for treatment-specific hazard estimation based on balancing representations.
arXiv Detail & Related papers (2021-10-26T20:13:17Z)
- On Inductive Biases for Heterogeneous Treatment Effect Estimation [91.3755431537592]
We investigate how to exploit structural similarities of an individual's potential outcomes (POs) under different treatments.
We compare three end-to-end learning strategies to overcome this problem.
arXiv Detail & Related papers (2021-06-07T16:30:46Z)
- Learning Decomposed Representation for Counterfactual Inference [53.36586760485262]
The fundamental problem in treatment effect estimation from observational data is confounder identification and balancing.
Most of the previous methods realized confounder balancing by treating all observed pre-treatment variables as confounders, ignoring further identifying confounders and non-confounders.
We propose a synergistic learning framework to 1) identify confounders by learning representations of both confounders and non-confounders, 2) balance confounder with sample re-weighting technique, and simultaneously 3) estimate the treatment effect in observational studies via counterfactual inference.
arXiv Detail & Related papers (2020-06-12T09:50:42Z)
- Estimating heterogeneous treatment effects with right-censored data via causal survival forests [2.624902795082451]
We introduce causal survival forests, which can be used to estimate heterogeneous treatment effects in a survival and observational setting.
Our approach relies on estimating equations to robustly adjust for both censoring and selection effects under unconfoundedness.
arXiv Detail & Related papers (2020-01-27T16:22:05Z)
This list is automatically generated from the titles and abstracts of the papers in this site.