Related papers: The Honest Truth About Causal Trees: Accuracy Limits for Heterogeneous Treatment Effect Estimation

URL: http://arxiv.org/abs/2509.11381v1
Date: Sun, 14 Sep 2025 18:29:45 GMT
Title: The Honest Truth About Causal Trees: Accuracy Limits for Heterogeneous Treatment Effect Estimation
Authors: Matias D. Cattaneo, Jason M. Klusowski, Ruiqi Rae Yu
Abstract summary: "Honest" causal decision tree estimators have become the standard in both academia and industry. We demonstrate that these popular heterogeneous treatment effect estimators cannot achieve a polynomial-in-$n$ convergence rate under basic conditions. As a result, these commonly used estimators can exhibit poor performance in practice, and even be inconsistent in some settings.
Score: 7.404568009919416
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Recursive decision trees have emerged as a leading methodology for heterogeneous causal treatment effect estimation and inference in experimental and observational settings. These procedures are fitted using the celebrated CART (Classification And Regression Tree) algorithm [Breiman et al., 1984], or custom variants thereof, and hence are believed to be "adaptive" to high-dimensional data, sparsity, or other specific features of the underlying data generating process. Athey and Imbens [2016] proposed several "honest" causal decision tree estimators, which have become the standard in both academia and industry. We study their estimators, and variants thereof, and establish lower bounds on their estimation error. We demonstrate that these popular heterogeneous treatment effect estimators cannot achieve a polynomial-in-$n$ convergence rate under basic conditions, where $n$ denotes the sample size. Contrary to common belief, honesty does not resolve these limitations and at best delivers negligible logarithmic improvements in sample size or dimension. As a result, these commonly used estimators can exhibit poor performance in practice, and even be inconsistent in some settings. Our theoretical insights are empirically validated through simulations.

Related papers

Orthogonal Causal Calibration [55.28164682911196]
We develop general algorithms for reducing the task of causal calibration to that of calibrating a standard (non-causal) predictive model. Our results are exceedingly general, showing that essentially any existing calibration algorithm can be used in causal settings.
arXiv Detail & Related papers (2024-06-04T03:35:25Z)
Advancing Counterfactual Inference through Nonlinear Quantile Regression [77.28323341329461]
We propose a framework for efficient and effective counterfactual inference implemented with neural networks. The proposed approach enhances the capacity to generalize estimated counterfactual outcomes to unseen data. Empirical results conducted on multiple datasets offer compelling support for our theoretical assertions.
arXiv Detail & Related papers (2023-06-09T08:30:51Z)
B-Learner: Quasi-Oracle Bounds on Heterogeneous Causal Effects Under Hidden Confounding [51.74479522965712]
We propose a meta-learner called the B-Learner, which can efficiently learn sharp bounds on the CATE function under limits on hidden confounding. We prove its estimates are valid, sharp, efficient, and have a quasi-oracle property with respect to the constituent estimators under more general conditions than existing methods.
arXiv Detail & Related papers (2023-04-20T18:07:19Z)
On Variance Estimation of Random Forests [0.0]
This paper develops an unbiased variance estimator based on incomplete U-statistics. We show that our estimators enjoy lower bias and more accurate confidence interval coverage without additional computational costs.
arXiv Detail & Related papers (2022-02-18T03:35:47Z)
Evaluating Causal Inference Methods [0.4588028371034407]
We introduce a deep generative model-based framework, Credence, to validate causal inference methods.
arXiv Detail & Related papers (2022-02-09T00:21:22Z)
Learning to Estimate Without Bias [57.82628598276623]
The Gauss-Markov theorem states that the weighted least squares estimator is a linear minimum variance unbiased estimator (MVUE) in linear models. In this paper, we take a first step towards extending this result to nonlinear settings via deep learning with bias constraints. A second motivation for the Bias Constrained Estimator (BCE) is in applications where multiple estimates of the same unknown are averaged for improved performance.
arXiv Detail & Related papers (2021-10-24T10:23:51Z)
A cautionary tale on fitting decision trees to data from additive models: generalization lower bounds [9.546094657606178]
We study the generalization performance of decision trees with respect to different generative regression models. This allows us to elicit their inductive bias, that is, the assumptions the algorithms make (or do not make) to generalize to new data. We prove a sharp squared error generalization lower bound for a large class of decision tree algorithms fitted to sparse additive models.
arXiv Detail & Related papers (2021-10-18T21:22:40Z)
Near-optimal inference in adaptive linear regression [60.08422051718195]
Even simple methods like least squares can exhibit non-normal behavior when data is collected in an adaptive manner. We propose a family of online debiasing estimators to correct these distributional anomalies in least squares estimation. We demonstrate the usefulness of our theory via applications to multi-armed bandit, autoregressive time series estimation, and active learning with exploration.
arXiv Detail & Related papers (2021-07-05T21:05:11Z)
Discovering Heterogeneous Treatment Effects in Regression Discontinuity Designs [0.0]
The paper proposes a causal supervised machine learning algorithm to uncover treatment effect heterogeneity in fuzzy regression discontinuity designs. We study the performance of the method through Monte Carlo simulations and apply it to uncover various sources of heterogeneity in the impact of attending a better secondary school in Romania.
arXiv Detail & Related papers (2021-06-22T09:47:28Z)
Deconfounding Scores: Feature Representations for Causal Effect Estimation with Weak Overlap [140.98628848491146]
We introduce deconfounding scores, which induce better overlap without biasing the target of estimation. We show that deconfounding scores satisfy a zero-covariance condition that is identifiable in observed data. In particular, we show that this technique could be an attractive alternative to standard regularizations.
arXiv Detail & Related papers (2021-04-12T18:50:11Z)
Sparse Bayesian Causal Forests for Heterogeneous Treatment Effects Estimation [0.0]
This paper develops a sparsity-inducing version of Bayesian Causal Forests. It is designed to estimate heterogeneous treatment effects using observational data.
arXiv Detail & Related papers (2021-02-12T15:24:50Z)
Causal Expectation-Maximisation [70.45873402967297]
We show that causal inference is NP-hard even in models characterised by polytree-shaped graphs. We introduce the causal EM algorithm to reconstruct the uncertainty about the latent variables from data about categorical manifest variables. We argue that there appears to be an unnoticed limitation to the trending idea that counterfactual bounds can often be computed without knowledge of the structural equations.
arXiv Detail & Related papers (2020-11-04T10:25:13Z)
On Low-rank Trace Regression under General Sampling Distribution [9.699586426043885]
We show that cross-validated estimators satisfy near-optimal error bounds under general assumptions. We also show that the cross-validated estimator outperforms the theory-inspired approach of selecting the parameter.
arXiv Detail & Related papers (2019-04-18T02:56:00Z)

This list is automatically generated from the titles and abstracts of the papers on this site.