Related papers: How and Why to Use Experimental Data to Evaluate Methods for Observational Causal Inference

URL: http://arxiv.org/abs/2010.03051v2
Date: Wed, 7 Jul 2021 17:19:30 GMT
Title: How and Why to Use Experimental Data to Evaluate Methods for Observational Causal Inference
Authors: Amanda Gentzel, Purva Pruthi, David Jensen
Abstract summary: We describe and analyze observational sampling from randomized controlled trials (OSRCT). This method can be used to create constructed observational data sets with corresponding unbiased estimates of treatment effect. We then perform a large-scale evaluation of seven causal inference methods over 37 data sets.
Score: 7.551130027327462
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Methods that infer causal dependence from observational data are central to many areas of science, including medicine, economics, and the social sciences. A variety of theoretical properties of these methods have been proven, but empirical evaluation remains a challenge, largely due to the lack of observational data sets for which treatment effect is known. We describe and analyze observational sampling from randomized controlled trials (OSRCT), a method for evaluating causal inference methods using data from randomized controlled trials (RCTs). This method can be used to create constructed observational data sets with corresponding unbiased estimates of treatment effect, substantially increasing the number of data sets available for empirical evaluation of causal inference methods. We show that, in expectation, OSRCT creates data sets that are equivalent to those produced by randomly sampling from empirical data sets in which all potential outcomes are available. We then perform a large-scale evaluation of seven causal inference methods over 37 data sets, drawn from RCTs, as well as simulators, real-world computational systems, and observational data sets augmented with a synthetic response variable. We find notable performance differences when comparing across data from different sources, demonstrating the importance of using data from a variety of sources when evaluating any causal inference method.
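The abstract does not spell out the sampling procedure, so the following is only a minimal sketch of one way a constructed observational data set could be drawn from RCT data. It assumes a biased treatment choice is drawn for each unit from a covariate-dependent probability and the unit is kept only when that choice matches its randomized assignment. The synthetic RCT, the logistic biasing function, the use of the full RCT's difference in means as the reference estimate, and all names (`osrct_sample`, `diff_in_means`, `make_synthetic_rct`) are illustrative assumptions, not the authors' implementation.

```python
import math
import random


def osrct_sample(rows, bias_prob, rng):
    """Subsample RCT rows so that treatment becomes dependent on a covariate.

    rows: list of dicts with keys "c" (covariate), "t" (0/1 randomized
          treatment), and "y" (outcome).
    bias_prob: hypothetical biasing function, maps a covariate value to the
               probability that the biased process selects treatment 1.
    """
    kept = []
    for row in rows:
        # Draw the treatment a biased, observational-style process would pick.
        t_selected = 1 if rng.random() < bias_prob(row["c"]) else 0
        # Keep the unit only if the RCT happened to assign that treatment.
        if t_selected == row["t"]:
            kept.append(row)
    return kept


def diff_in_means(rows):
    """Unadjusted difference in mean outcome between treated and control."""
    treated = [r["y"] for r in rows if r["t"] == 1]
    control = [r["y"] for r in rows if r["t"] == 0]
    return sum(treated) / len(treated) - sum(control) / len(control)


def make_synthetic_rct(n, rng, effect=2.0):
    """Toy RCT (an assumption for illustration): coin-flip treatment,
    outcome depends on treatment and on one covariate."""
    rows = []
    for _ in range(n):
        c = rng.gauss(0.0, 1.0)
        t = rng.randint(0, 1)
        y = effect * t + 3.0 * c + rng.gauss(0.0, 1.0)
        rows.append({"c": c, "t": t, "y": y})
    return rows


rng = random.Random(0)
rct = make_synthetic_rct(20000, rng)
# Units with a larger covariate are more likely to end up treated.
biased = osrct_sample(rct, lambda c: 1.0 / (1.0 + math.exp(-2.0 * c)), rng)

rct_estimate = diff_in_means(rct)       # reference estimate from randomized data
naive_estimate = diff_in_means(biased)  # confounded estimate on the subsample
print(len(rct), len(biased), rct_estimate, naive_estimate)
```

In this toy setting the unadjusted estimate on the biased subsample should drift away from the randomized estimate because the covariate now influences both treatment and outcome. A causal inference method under evaluation would be run on `biased` and scored against the randomized reference.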

Related papers

Robust estimation of heterogeneous treatment effects in randomized trials leveraging external data [1.3124513975412255]
We propose the QR-learner, a model-agnostic learner that estimates conditional average treatment effects (CATE) within the trial population. It has the potential to reduce the CATE prediction mean squared error while maintaining consistency, even when the external data is not aligned with the trial. We apply the methods to a real-world dataset, demonstrating improvements in both CATE estimation and statistical power for detecting heterogeneous effects.
arXiv Detail & Related papers (2025-07-04T16:01:05Z)
An extensive simulation study evaluating the interaction of resampling techniques across multiple causal discovery contexts [2.0946534289186842]
We present theoretical results proving that certain resampling methods emulate the assignment of specific values to algorithm tuning parameters. We also report the results of extensive simulation experiments, which verify the theoretical result and provide substantial data.
arXiv Detail & Related papers (2025-03-19T17:18:18Z)
Combining Incomplete Observational and Randomized Data for Heterogeneous Treatment Effects [10.9134216137537]
Existing methods for integrating observational data with randomized data require complete observational data. We propose a resilient approach to Combine Incomplete Observational data and randomized data for HTE estimation.
arXiv Detail & Related papers (2024-10-28T06:19:14Z)
Approximating Counterfactual Bounds while Fusing Observational, Biased and Randomised Data Sources [64.96984404868411]
We address the problem of integrating data from multiple, possibly biased, observational and interventional studies. We show that the likelihood of the available data has no local maxima. We then show how the same approach can address the general case of multiple datasets.
arXiv Detail & Related papers (2023-07-31T11:28:24Z)
RCT Rejection Sampling for Causal Estimation Evaluation [25.845034753006367]
Confounding is a significant obstacle to unbiased estimation of causal effects from observational data. We build on a promising empirical evaluation strategy that simplifies evaluation design and uses real data. We show our algorithm indeed results in low bias when oracle estimators are evaluated on confounded samples.
arXiv Detail & Related papers (2023-07-27T20:11:07Z)
Data-Driven Estimation of Heterogeneous Treatment Effects [15.140272661540655]
Estimating how a treatment affects different individuals, known as heterogeneous treatment effect estimation, is an important problem in empirical sciences. We provide a survey of state-of-the-art data-driven methods for heterogeneous treatment effect estimation using machine learning.
arXiv Detail & Related papers (2023-01-16T21:36:49Z)
Combining Observational and Randomized Data for Estimating Heterogeneous Treatment Effects [82.20189909620899]
Estimating heterogeneous treatment effects is an important problem across many domains. Currently, most existing works rely exclusively on observational data. We propose to estimate heterogeneous treatment effects by combining large amounts of observational data and small amounts of randomized data.
arXiv Detail & Related papers (2022-02-25T18:59:54Z)
Evaluating Causal Inference Methods [0.4588028371034407]
We introduce a deep generative model-based framework, Credence, to validate causal inference methods.
arXiv Detail & Related papers (2022-02-09T00:21:22Z)
Doing Great at Estimating CATE? On the Neglected Assumptions in Benchmark Comparisons of Treatment Effect Estimators [91.3755431537592]
We show that even in arguably the simplest setting, estimation under ignorability assumptions can be misleading. We consider two popular machine learning benchmark datasets for evaluation of heterogeneous treatment effect estimators. We highlight that the inherent characteristics of the benchmark datasets favor some algorithms over others.
arXiv Detail & Related papers (2021-07-28T13:21:27Z)
Efficient Causal Inference from Combined Observational and Interventional Data through Causal Reductions [68.6505592770171]
Unobserved confounding is one of the main challenges when estimating causal effects. We propose a novel causal reduction method that replaces an arbitrary number of possibly high-dimensional latent confounders. We propose a learning algorithm to estimate the parameterized reduced model jointly from observational and interventional data.
arXiv Detail & Related papers (2021-03-08T14:29:07Z)
Enabling Counterfactual Survival Analysis with Balanced Representations [64.17342727357618]
Survival data are frequently encountered across diverse medical applications, e.g., drug development, risk profiling, and clinical trials. We propose a theoretically grounded unified framework for counterfactual inference applicable to survival outcomes.
arXiv Detail & Related papers (2020-06-14T01:15:00Z)
A Survey on Causal Inference [64.45536158710014]
Causal inference is a critical research topic across many domains, such as statistics, computer science, education, public policy and economics. Various causal effect estimation methods for observational data have sprung up.
arXiv Detail & Related papers (2020-02-05T21:35:29Z)

This list is automatically generated from the titles and abstracts of the papers on this site.