Related papers: Estimating Causal Effects with Double Machine Learning -- A Method Evaluation

URL: http://arxiv.org/abs/2403.14385v2
Date: Tue, 30 Apr 2024 10:42:42 GMT
Title: Estimating Causal Effects with Double Machine Learning -- A Method Evaluation
Authors: Jonathan Fuhr, Philipp Berens, Dominik Papies
Abstract summary: We review one of the most prominent methods, "double/debiased machine learning" (DML). Our findings indicate that the application of a suitably flexible machine learning algorithm within DML improves the adjustment for various nonlinear confounding relationships. When estimating the effects of air pollution on housing prices, we find that DML estimates are consistently larger than estimates of less flexible methods.
Score: 5.904095466127043
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: The estimation of causal effects with observational data continues to be a very active research area. In recent years, researchers have developed new frameworks which use machine learning to relax classical assumptions necessary for the estimation of causal effects. In this paper, we review one of the most prominent methods - "double/debiased machine learning" (DML) - and empirically evaluate it by comparing its performance on simulated data relative to more traditional statistical methods, before applying it to real-world data. Our findings indicate that the application of a suitably flexible machine learning algorithm within DML improves the adjustment for various nonlinear confounding relationships. This advantage enables a departure from traditional functional form assumptions typically necessary in causal effect estimation. However, we demonstrate that the method continues to critically depend on standard assumptions about causal structure and identification. When estimating the effects of air pollution on housing prices in our application, we find that DML estimates are consistently larger than estimates of less flexible methods. From our overall results, we provide actionable recommendations for specific choices researchers must make when applying DML in practice.
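
To make the abstract's point about nonlinear confounding concrete, here is a minimal sketch of the partialling-out form of DML with cross-fitting for a partially linear model. It is an illustration under stated assumptions, not the authors' code or setup: the data are simulated with a made-up quadratic confounder, the function names (`fit_binned_mean`, `dml_plr`, `naive_slope`, `simulate`) are hypothetical, and a simple histogram regression stands in for the "suitably flexible machine learning algorithm".

```python
import random


def fit_binned_mean(xs, ys, lo=-2.0, hi=2.0, bins=20):
    """Histogram regression (stand-in for a flexible ML learner):
    predicts the training mean of y within each equal-width bin of x."""
    width = (hi - lo) / bins

    def bin_of(x):
        # Clamp so values at or beyond the edges fall into the outer bins.
        return min(bins - 1, max(0, int((x - lo) / width)))

    sums, counts = [0.0] * bins, [0] * bins
    for x, y in zip(xs, ys):
        b = bin_of(x)
        sums[b] += y
        counts[b] += 1
    overall = sum(ys) / len(ys)  # fallback for empty bins
    means = [s / c if c else overall for s, c in zip(sums, counts)]
    return lambda x: means[bin_of(x)]


def dml_plr(y, d, x, n_folds=2):
    """Cross-fitted partialling-out estimate of theta in
    Y = theta * D + m(X) + noise (partially linear model)."""
    n = len(y)
    res_y, res_d = [0.0] * n, [0.0] * n
    for k in range(n_folds):
        train = [i for i in range(n) if i % n_folds != k]
        test = [i for i in range(n) if i % n_folds == k]
        # Nuisance functions E[Y|X] and E[D|X], fitted on the other fold(s).
        l_hat = fit_binned_mean([x[i] for i in train], [y[i] for i in train])
        g_hat = fit_binned_mean([x[i] for i in train], [d[i] for i in train])
        for i in test:
            res_y[i] = y[i] - l_hat(x[i])
            res_d[i] = d[i] - g_hat(x[i])
    # Regress outcome residuals on treatment residuals (no intercept).
    return sum(a * b for a, b in zip(res_d, res_y)) / sum(a * a for a in res_d)


def naive_slope(y, d):
    """Simple regression of Y on D that ignores the confounder X."""
    my, md = sum(y) / len(y), sum(d) / len(d)
    cov = sum((a - md) * (b - my) for a, b in zip(d, y))
    var = sum((a - md) ** 2 for a in d)
    return cov / var


def simulate(n=4000, theta=1.5, seed=0):
    """Assumed toy design: X confounds D and Y only through X**2."""
    rng = random.Random(seed)
    x = [rng.uniform(-2, 2) for _ in range(n)]
    d = [xi ** 2 + rng.gauss(0, 1) for xi in x]
    y = [theta * di + 2 * xi ** 2 + rng.gauss(0, 1) for di, xi in zip(d, x)]
    return y, d, x


if __name__ == "__main__":
    y, d, x = simulate()
    print("naive estimate:", naive_slope(y, d))
    print("DML estimate:  ", dml_plr(y, d, x))
```

In this toy design the confounder acts only through X squared, so a regression that ignores X (or adjusts for it linearly) stays biased, while the residual-on-residual step recovers a value close to the simulated effect. As the abstract stresses, this works only because X is observed and the causal structure is assumed correctly.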

Related papers

CausalPFN: Amortized Causal Effect Estimation via In-Context Learning [15.645599403885605]
CausalPFN infers causal effects for new observational datasets out-of-the-box. Our approach achieves superior average performance on heterogeneous and average treatment effect estimation benchmarks. CausalPFN provides calibrated uncertainty estimates to support reliable decision-making based on Bayesian principles.
arXiv Detail & Related papers (2025-06-09T16:31:06Z)
Black Box Causal Inference: Effect Estimation via Meta Prediction [56.277798874118425]
We frame causal inference as a dataset-level prediction problem, offloading algorithm design to the learning process. The approach we introduce, called black box causal inference (BBCI), builds estimators in a black-box manner by learning to predict causal effects from sampled dataset-effect pairs. We demonstrate accurate estimation of average treatment effects (ATEs) and conditional average treatment effects (CATEs) with BBCI across several causal inference problems.
arXiv Detail & Related papers (2025-03-07T23:43:19Z)
C-XGBoost: A tree boosting model for causal effect estimation [8.246161706153805]
Causal effect estimation aims at estimating the Average Treatment Effect as well as the Conditional Average Treatment Effect of a treatment on an outcome from the available data. We propose a new causal inference model, named C-XGBoost, for the prediction of potential outcomes.
arXiv Detail & Related papers (2024-03-31T17:43:37Z)
A Double Machine Learning Approach to Combining Experimental and Observational Data [59.29868677652324]
We propose a double machine learning approach to combine experimental and observational studies. Our framework tests for violations of external validity and ignorability under milder assumptions.
arXiv Detail & Related papers (2023-07-04T02:53:11Z)
B-Learner: Quasi-Oracle Bounds on Heterogeneous Causal Effects Under Hidden Confounding [51.74479522965712]
We propose a meta-learner called the B-Learner, which can efficiently learn sharp bounds on the CATE function under limits on hidden confounding. We prove its estimates are valid, sharp, efficient, and have a quasi-oracle property with respect to the constituent estimators under more general conditions than existing methods.
arXiv Detail & Related papers (2023-04-20T18:07:19Z)
Hyperparameter Tuning and Model Evaluation in Causal Effect Estimation [2.7823528791601686]
This paper investigates the interplay between the four different aspects of model evaluation for causal effect estimation. We find that most causal estimators are roughly equivalent in performance if tuned thoroughly enough. We call for more research into causal model evaluation to unlock the optimum performance not currently being delivered even by state-of-the-art procedures.
arXiv Detail & Related papers (2023-03-02T17:03:02Z)
Data-Driven Estimation of Heterogeneous Treatment Effects [15.140272661540655]
Estimating how a treatment affects different individuals, known as heterogeneous treatment effect estimation, is an important problem in empirical sciences. We provide a survey of state-of-the-art data-driven methods for heterogeneous treatment effect estimation using machine learning.
arXiv Detail & Related papers (2023-01-16T21:36:49Z)
Systematic Evaluation of Predictive Fairness [60.0947291284978]
Mitigating bias in training on biased datasets is an important open problem. We examine the performance of various debiasing methods across multiple tasks. We find that data conditions have a strong influence on relative model performance.
arXiv Detail & Related papers (2022-10-17T05:40:13Z)
Robust Causal Learning for the Estimation of Average Treatment Effects [14.96459402684986]
We propose a Robust Causal Learning (RCL) method to offset the deficiencies of the Double/Debiased Machine Learning (DML) estimators. Empirically, comprehensive experiments show that the RCL estimators give more stable estimates of the causal parameters than the DML estimators.
arXiv Detail & Related papers (2022-09-05T07:35:58Z)
An evaluation framework for comparing causal inference models [3.1372269816123994]
We use the proposed evaluation methodology to compare several state-of-the-art causal effect estimation models. The main motivation behind this approach is the elimination of the influence of a small number of instances or simulations on the benchmarking process.
arXiv Detail & Related papers (2022-08-31T21:04:20Z)
Counterfactual Maximum Likelihood Estimation for Training Deep Networks [83.44219640437657]
Deep learning models are prone to learning spurious correlations that should not be learned as predictive clues. We propose a causality-based training framework to reduce the spurious correlations caused by observable confounders. We conduct experiments on two real-world tasks: Natural Language Inference (NLI) and Image Captioning.
arXiv Detail & Related papers (2021-06-07T17:47:16Z)
Performance metrics for intervention-triggering prediction models do not reflect an expected reduction in outcomes from using the model [71.9860741092209]
Clinical researchers often select among and evaluate risk prediction models. Standard metrics calculated from retrospective data are only related to model utility under certain assumptions. When predictions are delivered repeatedly throughout time, the relationship between standard metrics and utility is further complicated.
arXiv Detail & Related papers (2020-06-02T16:26:49Z)
Localized Debiased Machine Learning: Efficient Inference on Quantile Treatment Effects and Beyond [69.83813153444115]
We consider an efficient estimating equation for the (local) quantile treatment effect ((L)QTE) in causal inference. Debiased machine learning (DML) is a data-splitting approach to estimating high-dimensional nuisances. We propose localized debiased machine learning (LDML), which avoids this burdensome step.
arXiv Detail & Related papers (2019-12-30T14:42:52Z)

This list is automatically generated from the titles and abstracts of the papers on this site.