RCT Rejection Sampling for Causal Estimation Evaluation
- URL: http://arxiv.org/abs/2307.15176v3
- Date: Wed, 31 Jan 2024 12:40:40 GMT
- Title: RCT Rejection Sampling for Causal Estimation Evaluation
- Authors: Katherine A. Keith, Sergey Feldman, David Jurgens, Jonathan Bragg,
Rohit Bhattacharya
- Abstract summary: Confounding is a significant obstacle to unbiased estimation of causal effects from observational data.
We build on a promising empirical evaluation strategy that simplifies evaluation design and uses real data.
We show our algorithm indeed results in low bias when oracle estimators are evaluated on confounded samples.
- Score: 25.845034753006367
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Confounding is a significant obstacle to unbiased estimation of causal
effects from observational data. For settings with high-dimensional covariates
-- such as text data, genomics, or the behavioral social sciences --
researchers have proposed methods to adjust for confounding by adapting machine
learning methods to the goal of causal estimation. However, empirical
evaluation of these adjustment methods has been challenging and limited. In
this work, we build on a promising empirical evaluation strategy that
simplifies evaluation design and uses real data: subsampling randomized
controlled trials (RCTs) to create confounded observational datasets while
using the average causal effects from the RCTs as ground-truth. We contribute a
new sampling algorithm, which we call RCT rejection sampling, and provide
theoretical guarantees that causal identification holds in the observational
data to allow for valid comparisons to the ground-truth RCT. Using synthetic
data, we show our algorithm indeed results in low bias when oracle estimators
are evaluated on the confounded samples, which is not always the case for a
previously proposed algorithm. In addition to this identification result, we
highlight several finite data considerations for evaluation designers who plan
to use RCT rejection sampling on their own datasets. As a proof of concept, we
implement an example evaluation pipeline and walk through these finite data
considerations with a novel, real-world RCT -- which we release publicly --
consisting of approximately 70k observations and text data as high-dimensional
covariates. Together, these contributions build towards a broader agenda of
improved empirical evaluation for causal estimation.
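
The abstract describes RCT rejection sampling only at a high level. Below is a minimal, hypothetical Python sketch of the general idea: accept each RCT unit with probability proportional to a researcher-chosen, covariate-dependent propensity divided by the trial's marginal treatment rate, so that treatment becomes confounded in the subsample while the RCT's average causal effect remains the ground truth. The function name, acceptance ratio, and toy data-generating process are illustrative assumptions, not the paper's exact algorithm.

```python
import numpy as np

rng = np.random.default_rng(0)

def rct_rejection_sample(X, T, Y, propensity_fn):
    """Subsample an RCT so treatment depends on covariates (induced confounding).

    Illustrative sketch: keep unit i with probability proportional to
    p_star(T_i | X_i) / p(T_i), where p_star is a researcher-chosen confounded
    propensity and p(T_i) is the RCT's marginal treatment probability.
    """
    p_t1 = T.mean()                              # marginal P(T=1) in the RCT
    p_star1 = propensity_fn(X)                   # desired P(T=1 | X) after subsampling
    ratio = np.where(T == 1, p_star1 / p_t1, (1 - p_star1) / (1 - p_t1))
    accept_prob = ratio / ratio.max()            # scale so every probability is <= 1
    keep = rng.uniform(size=len(T)) < accept_prob
    return X[keep], T[keep], Y[keep]

# Toy RCT: one binary covariate, randomized treatment, true ATE = 2.
n = 100_000
X = rng.binomial(1, 0.5, size=n)
T = rng.binomial(1, 0.5, size=n)
Y = 2 * T + 3 * X + rng.normal(size=n)

# Induce confounding: units with X = 1 are much more likely to remain treated.
Xs, Ts, Ys = rct_rejection_sample(X, T, Y, lambda x: 0.2 + 0.6 * x)

# The naive difference in means on the confounded subsample is now biased
# relative to the RCT ground truth of 2, which is what an evaluation exploits.
print(Ys[Ts == 1].mean() - Ys[Ts == 0].mean())
```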
Related papers
- Risk and cross validation in ridge regression with correlated samples [72.59731158970894]
We characterize the in- and out-of-sample risks of ridge regression when the data points have arbitrary correlations.
We further extend our analysis to the case where the test point has non-trivial correlations with the training set, a setting often encountered in time series forecasting.
We validate our theory across a variety of high-dimensional data.
arXiv Detail & Related papers (2024-08-08T17:27:29Z)
- Task-specific experimental design for treatment effect estimation [59.879567967089145]
Large randomised controlled trials (RCTs) are the standard for causal inference.
Recent work has proposed more sample-efficient alternatives to RCTs, but these are not adaptable to the downstream application for which the causal effect is sought.
We develop a task-specific approach to experimental design and derive sampling strategies customised to particular downstream applications.
arXiv Detail & Related papers (2023-06-08T18:10:37Z)
- Improved Policy Evaluation for Randomized Trials of Algorithmic Resource Allocation [54.72195809248172]
We present a new estimator built on a novel concept: retrospectively reshuffling participants across experimental arms at the end of an RCT.
We prove theoretically that such an estimator is more accurate than common estimators based on sample means.
arXiv Detail & Related papers (2023-02-06T05:17:22Z)
- Falsification before Extrapolation in Causal Effect Estimation [6.715453431174765]
Causal effects in populations are often estimated using observational datasets.
We propose a meta-algorithm that attempts to reject observational estimates that are biased.
arXiv Detail & Related papers (2022-09-27T21:47:23Z)
- Evaluating Causal Inference Methods [0.4588028371034407]
We introduce a deep generative model-based framework, Credence, to validate causal inference methods.
arXiv Detail & Related papers (2022-02-09T00:21:22Z)
- Assessment of Treatment Effect Estimators for Heavy-Tailed Data [70.72363097550483]
A central obstacle in the objective assessment of treatment effect (TE) estimators in randomized control trials (RCTs) is the lack of ground truth (or validation set) to test their performance.
We provide a novel cross-validation-like methodology to address this challenge.
We evaluate our methodology across 709 RCTs implemented in the Amazon supply chain.
arXiv Detail & Related papers (2021-12-14T17:53:01Z)
- Learning Bias-Invariant Representation by Cross-Sample Mutual Information Minimization [77.8735802150511]
We propose a cross-sample adversarial debiasing (CSAD) method to remove the bias information misused by the target task.
The correlation measurement plays a critical role in adversarial debiasing and is conducted by a cross-sample neural mutual information estimator.
We conduct thorough experiments on publicly available datasets to validate the advantages of the proposed method over state-of-the-art approaches.
arXiv Detail & Related papers (2021-08-11T21:17:02Z)
- Doing Great at Estimating CATE? On the Neglected Assumptions in Benchmark Comparisons of Treatment Effect Estimators [91.3755431537592]
We show that even in arguably the simplest setting, estimation under ignorability assumptions can be misleading.
We consider two popular machine learning benchmark datasets for evaluation of heterogeneous treatment effect estimators.
We highlight that the inherent characteristics of the benchmark datasets favor some algorithms over others.
arXiv Detail & Related papers (2021-07-28T13:21:27Z)
- How and Why to Use Experimental Data to Evaluate Methods for Observational Causal Inference [7.551130027327462]
We describe and analyze observational sampling from randomized controlled trials (OSRCT).
This method can be used to create constructed observational data sets with corresponding unbiased estimates of treatment effect.
We then perform a large-scale evaluation of seven causal inference methods over 37 data sets.
arXiv Detail & Related papers (2020-10-06T21:44:01Z)