Combining Observational and Randomized Data for Estimating Heterogeneous
Treatment Effects
- URL: http://arxiv.org/abs/2202.12891v1
- Date: Fri, 25 Feb 2022 18:59:54 GMT
- Title: Combining Observational and Randomized Data for Estimating Heterogeneous
Treatment Effects
- Authors: Tobias Hatt, Jeroen Berrevoets, Alicia Curth, Stefan Feuerriegel,
Mihaela van der Schaar
- Abstract summary: Estimating heterogeneous treatment effects is an important problem across many domains.
Currently, most existing works rely exclusively on observational data.
We propose to estimate heterogeneous treatment effects by combining large amounts of observational data and small amounts of randomized data.
- Score: 82.20189909620899
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Estimating heterogeneous treatment effects is an important problem across
many domains. In order to accurately estimate such treatment effects, one
typically relies on data from observational studies or randomized experiments.
Currently, most existing works rely exclusively on observational data, which is
often confounded and, hence, yields biased estimates. While observational data
is confounded, randomized data is unconfounded, but its sample size is usually
too small to learn heterogeneous treatment effects. In this paper, we propose
to estimate heterogeneous treatment effects by combining large amounts of
observational data and small amounts of randomized data via representation
learning. In particular, we introduce a two-step framework: first, we use
observational data to learn a shared structure (in form of a representation);
and then, we use randomized data to learn the data-specific structures. We
analyze the finite sample properties of our framework and compare them to
several natural baselines. As such, we derive conditions for when combining
observational and randomized data is beneficial, and for when it is not. Based
on this, we introduce a sample-efficient algorithm, called CorNet. We use
extensive simulation studies to verify the theoretical properties of CorNet and
multiple real-world datasets to demonstrate our method's superiority compared
to existing methods.
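The two-step idea in the abstract can be sketched in a few lines of NumPy. This is a deliberately simplified linear illustration, not the authors' CorNet: the representation here is a linear projection recovered from per-arm regressions, and every name (`simulate`, `learn_representation`, `fit_heads`, `predict_cate`) and every simulation setting is an assumption made for the example. The hidden confounder `u` shifts only the per-arm intercepts in the observational data, so the shared structure (the projection) can be learned from the large confounded sample, while the small randomized sample is used to refit the arm-specific heads.

```python
import numpy as np

rng = np.random.default_rng(0)
d = 20  # number of covariates (assumed for this toy setup)
beta0, beta1 = rng.normal(size=d), rng.normal(size=d)  # true per-arm effects

def simulate(n, randomized):
    """Toy data: a hidden confounder u drives treatment unless randomized."""
    X = rng.normal(size=(n, d))
    u = rng.normal(size=n)
    p = np.full(n, 0.5) if randomized else 1.0 / (1.0 + np.exp(-2.0 * u))
    t = rng.binomial(1, p)
    y = np.where(t == 1, X @ beta1, X @ beta0) + u + rng.normal(size=n)
    return X, t, y

def fit_linear(X, y):
    """Least-squares fit with intercept; returns (intercept, coefficients)."""
    A = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef[0], coef[1:]

def learn_representation(X_obs, t_obs, y_obs):
    """Step 1: learn a shared linear projection from observational data."""
    betas = [fit_linear(X_obs[t_obs == a], y_obs[t_obs == a])[1] for a in (0, 1)]
    W, _ = np.linalg.qr(np.column_stack(betas))  # d x 2, orthonormal columns
    return W

def fit_heads(W, X_rct, t_rct, y_rct):
    """Step 2: fit small per-arm heads on the representation with randomized data."""
    Z = X_rct @ W
    return [fit_linear(Z[t_rct == a], y_rct[t_rct == a]) for a in (0, 1)]

def predict_cate(W, heads, X):
    """Estimated treatment effect: difference of the two head predictions."""
    Z = X @ W
    (b0, w0), (b1, w1) = heads
    return (b1 + Z @ w1) - (b0 + Z @ w0)

# Large confounded sample for step 1, small randomized sample for step 2.
X_obs, t_obs, y_obs = simulate(5000, randomized=False)
X_rct, t_rct, y_rct = simulate(60, randomized=True)
X_test = rng.normal(size=(200, d))

W = learn_representation(X_obs, t_obs, y_obs)
heads = fit_heads(W, X_rct, t_rct, y_rct)
tau_hat = predict_cate(W, heads, X_test)
tau_true = X_test @ (beta1 - beta0)
print("combined MSE:", np.mean((tau_hat - tau_true) ** 2))
```

In this sketch the heads have only three parameters per arm, which is why a few dozen randomized samples can be enough to fit them. Whether combining helps in practice depends on how well the observational representation transfers, which is the question the paper's finite-sample analysis addresses.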
Related papers
- Combining Incomplete Observational and Randomized Data for Heterogeneous Treatment Effects [10.9134216137537]
Existing methods for integrating observational data with randomized data require complete observational data.
We propose a resilient approach to Combine Incomplete Observational data and randomized data for HTE estimation.
arXiv Detail & Related papers (2024-10-28T06:19:14Z)
- Approximating Counterfactual Bounds while Fusing Observational, Biased and Randomised Data Sources [64.96984404868411]
We address the problem of integrating data from multiple, possibly biased, observational and interventional studies.
We show that the likelihood of the available data has no local maxima.
We then show how the same approach can address the general case of multiple datasets.
arXiv Detail & Related papers (2023-07-31T11:28:24Z)
- Learning to Bound Counterfactual Inference in Structural Causal Models from Observational and Randomised Data [64.96984404868411]
We derive a likelihood characterisation for the overall data that leads us to extend a previous EM-based algorithm.
The new algorithm learns to approximate the (unidentifiability) region of model parameters from such mixed data sources.
It delivers interval approximations to counterfactual results, which collapse to points in the identifiable case.
arXiv Detail & Related papers (2022-12-06T12:42:11Z)
- Detecting hidden confounding in observational data using multiple environments [0.81585306387285]
We present a theory for testable conditional independencies that are only absent when there is hidden confounding.
In most cases, the proposed procedure correctly predicts the presence of hidden confounding.
arXiv Detail & Related papers (2022-05-27T12:20:09Z)
- Equivariance Allows Handling Multiple Nuisance Variables When Analyzing Pooled Neuroimaging Datasets [53.34152466646884]
In this paper, we show how bringing recent results on equivariant representation learning instantiated on structured spaces together with simple use of classical results on causal inference provides an effective practical solution.
We demonstrate how our model allows dealing with more than one nuisance variable under some assumptions and can enable analysis of pooled scientific datasets in scenarios that would otherwise entail removing a large portion of the samples.
arXiv Detail & Related papers (2022-03-29T04:54:06Z)
- What can the millions of random treatments in nonexperimental data reveal about causes? [0.0]
The article introduces one such model and a Bayesian approach to combine the $O(n^2)$ pairwise observations typically available in nonexperimental data.
We demonstrate that the proposed approach recovers causal effects in common NSW samples, as well as in arbitrary subpopulations and an order-of-magnitude larger supersample.
arXiv Detail & Related papers (2021-05-03T20:13:34Z)
- Multi-Source Causal Inference Using Control Variates [81.57072928775509]
We propose a general algorithm to estimate causal effects from multiple data sources.
We show theoretically that this reduces the variance of the ATE estimate.
We apply this framework to inference from observational data under an outcome selection bias.
arXiv Detail & Related papers (2021-03-30T21:20:51Z)
- Efficient Causal Inference from Combined Observational and Interventional Data through Causal Reductions [68.6505592770171]
Unobserved confounding is one of the main challenges when estimating causal effects.
We propose a novel causal reduction method that replaces an arbitrary number of possibly high-dimensional latent confounders.
We propose a learning algorithm to estimate the parameterized reduced model jointly from observational and interventional data.
arXiv Detail & Related papers (2021-03-08T14:29:07Z)
- How and Why to Use Experimental Data to Evaluate Methods for Observational Causal Inference [7.551130027327462]
We describe and analyze observational sampling from randomized controlled trials (OSRCT).
This method can be used to create constructed observational data sets with corresponding unbiased estimates of treatment effect.
We then perform a large-scale evaluation of seven causal inference methods over 37 data sets.
arXiv Detail & Related papers (2020-10-06T21:44:01Z)
- Tell Me Something I Don't Know: Randomization Strategies for Iterative Data Mining [0.6100370338020054]
We consider the problem of randomizing data so that previously discovered patterns or models are taken into account.
arXiv Detail & Related papers (2020-06-16T19:20:50Z)