What can the millions of random treatments in nonexperimental data
reveal about causes?
- URL: http://arxiv.org/abs/2105.01152v1
- Date: Mon, 3 May 2021 20:13:34 GMT
- Title: What can the millions of random treatments in nonexperimental data
reveal about causes?
- Authors: Andre F. Ribeiro, Frank Neffke and Ricardo Hausmann
- Abstract summary: The article introduces one such model and a Bayesian approach to combine the $O(n^2)$ pairwise observations typically available in nonexperimental data.
We demonstrate that the proposed approach recovers causal effects in common NSW samples, as well as in arbitrary subpopulations and an order-of-magnitude larger supersample.
- Score: 0.0
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: We propose a new method to estimate causal effects from nonexperimental data.
Each pair of sample units is first associated with a stochastic 'treatment' -
differences in factors between units - and an effect - a resultant outcome
difference. It is then proposed that all such pairs can be combined to provide
more accurate estimates of causal effects in observational data, provided a
statistical model connecting combinatorial properties of treatments to the
accuracy and unbiasedness of their effects. The article introduces one such
model and a Bayesian approach to combine the $O(n^2)$ pairwise observations
typically available in nonexperimental data. This also leads to an
interpretation of nonexperimental datasets as incomplete, or noisy, versions of
ideal factorial experimental designs.
This approach to causal effect estimation has several advantages: (1) it
expands the number of observations, converting thousands of individuals into
millions of observational treatments; (2) starting with treatments closest to
the experimental ideal, it identifies noncausal variables that can be ignored
in the future, making estimation easier in each subsequent iteration while
departing minimally from experiment-like conditions; (3) it recovers individual
causal effects in heterogeneous populations. We evaluate the method in
simulations and the National Supported Work (NSW) program, an intensively
studied program whose effects are known from randomized field experiments. We
demonstrate that the proposed approach recovers causal effects in common NSW
samples, as well as in arbitrary subpopulations and an order-of-magnitude
larger supersample with the entire national program data, outperforming
Statistical, Econometric and Machine Learning estimators in all cases...
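
The pairwise construction described in the abstract can be illustrated with a minimal sketch. This is not the authors' model or their Bayesian estimator; it only shows how n units yield n(n-1)/2 pairs, each carrying a 'treatment' (the factors that differ between the two units) and an 'effect' (the outcome difference). The function name `pairwise_treatments`, the toy factors, and the outcome values are hypothetical, and treating single-factor pairs as "closest to the experimental ideal" is an assumption made here for illustration.

```python
from itertools import combinations


def pairwise_treatments(units):
    """Turn each pair of units into a (treatment, effect) observation.

    units: list of (factors, outcome), where factors maps a factor name
    to a numeric level. The treatment keeps only the factors that differ
    between the two units; the effect is the outcome difference.
    """
    pairs = []
    for (xa, ya), (xb, yb) in combinations(units, 2):
        treatment = {k: xb[k] - xa[k] for k in xa if xb[k] != xa[k]}
        effect = yb - ya
        pairs.append((treatment, effect))
    return pairs


# Hypothetical toy sample: two binary factors and a numeric outcome.
units = [
    ({"training": 0, "degree": 0}, 10.0),
    ({"training": 1, "degree": 0}, 12.0),
    ({"training": 1, "degree": 1}, 15.0),
    ({"training": 0, "degree": 1}, 13.5),
]

pairs = pairwise_treatments(units)

# Assumption for illustration: pairs that differ in exactly one factor
# resemble a controlled comparison most closely.
single_factor = [(t, e) for t, e in pairs if len(t) == 1]

print(len(pairs), "pairs from", len(units), "units")
print(len(single_factor), "pairs differ in exactly one factor")
```

With a few thousand units the same loop produces millions of pairs, which is the expansion the abstract refers to; a real implementation would need the statistical model that weights each pair, which this sketch omits.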
Related papers
- Causal Lifting of Neural Representations: Zero-Shot Generalization for Causal Inferences [56.23412698865433]
We focus on causal inferences on a target experiment with unlabeled factual outcomes, retrieved by a predictive model fine-tuned on a labeled similar experiment.
First, we show that factual outcome estimation via Empirical Risk Minimization (ERM) may fail to yield valid causal inferences on the target population.
We propose Deconfounded Empirical Risk Minimization (DERM), a new simple learning procedure minimizing the risk over a fictitious target population.
arXiv Detail & Related papers (2025-02-10T10:52:17Z)
- Efficient Randomized Experiments Using Foundation Models [10.606998433337894]
In this paper, we propose a novel approach that integrates the predictions from multiple foundation models while preserving valid statistical inference.
Our estimator offers substantial precision gains, equivalent to a reduction of up to 20% in the sample size needed to match the same precision as the standard estimator based on experimental data alone.
arXiv Detail & Related papers (2025-02-06T17:54:10Z)
- Multi-CATE: Multi-Accurate Conditional Average Treatment Effect Estimation Robust to Unknown Covariate Shifts [12.289361708127876]
We use methodology for learning multi-accurate predictors to post-process CATE T-learners.
We show how this approach can combine (large) confounded observational and (smaller) randomized datasets.
arXiv Detail & Related papers (2024-05-28T14:12:25Z)
- Identification of Single-Treatment Effects in Factorial Experiments [0.0]
I show that when multiple interventions are randomized in experiments, the effect any single intervention would have outside the experimental setting is not identified absent heroic assumptions.
Observational studies and factorial experiments provide information about potential-outcome distributions with zero and multiple interventions.
I show that researchers who rely on this type of design have to justify either linearity of functional forms or specify with Directed Acyclic Graphs how variables are related in the real world.
arXiv Detail & Related papers (2024-05-16T04:01:53Z)
- Approximating Counterfactual Bounds while Fusing Observational, Biased and Randomised Data Sources [64.96984404868411]
We address the problem of integrating data from multiple, possibly biased, observational and interventional studies.
We show that the likelihood of the available data has no local maxima.
We then show how the same approach can address the general case of multiple datasets.
arXiv Detail & Related papers (2023-07-31T11:28:24Z)
- Learning to Bound Counterfactual Inference in Structural Causal Models from Observational and Randomised Data [64.96984404868411]
We derive a likelihood characterisation for the overall data that leads us to extend a previous EM-based algorithm.
The new algorithm learns to approximate the (unidentifiability) region of model parameters from such mixed data sources.
It delivers interval approximations to counterfactual results, which collapse to points in the identifiable case.
arXiv Detail & Related papers (2022-12-06T12:42:11Z)
- Combining Observational and Randomized Data for Estimating Heterogeneous Treatment Effects [82.20189909620899]
Estimating heterogeneous treatment effects is an important problem across many domains.
Currently, most existing works rely exclusively on observational data.
We propose to estimate heterogeneous treatment effects by combining large amounts of observational data and small amounts of randomized data.
arXiv Detail & Related papers (2022-02-25T18:59:54Z)
- Efficient Causal Inference from Combined Observational and Interventional Data through Causal Reductions [68.6505592770171]
Unobserved confounding is one of the main challenges when estimating causal effects.
We propose a novel causal reduction method that replaces an arbitrary number of possibly high-dimensional latent confounders.
We propose a learning algorithm to estimate the parameterized reduced model jointly from observational and interventional data.
arXiv Detail & Related papers (2021-03-08T14:29:07Z)
- Enabling Counterfactual Survival Analysis with Balanced Representations [64.17342727357618]
Survival data are frequently encountered across diverse medical applications, e.g., drug development, risk profiling, and clinical trials.
We propose a theoretically grounded unified framework for counterfactual inference applicable to survival outcomes.
arXiv Detail & Related papers (2020-06-14T01:15:00Z)
This list is automatically generated from the titles and abstracts of the papers in this site.