Quantifying Ignorance in Individual-Level Causal-Effect Estimates under
Hidden Confounding
- URL: http://arxiv.org/abs/2103.04850v1
- Date: Mon, 8 Mar 2021 15:58:06 GMT
- Title: Quantifying Ignorance in Individual-Level Causal-Effect Estimates under
Hidden Confounding
- Authors: Andrew Jesson, Sören Mindermann, Yarin Gal, Uri Shalit
- Abstract summary: We study the problem of learning conditional average treatment effects (CATE) from high-dimensional, observational data with unobserved confounders.
We present a new parametric interval estimator suited for high-dimensional data.
- Score: 38.09565581056218
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We study the problem of learning conditional average treatment effects (CATE)
from high-dimensional, observational data with unobserved confounders.
Unobserved confounders introduce ignorance -- a level of unidentifiability --
about an individual's response to treatment by inducing bias in CATE estimates.
We present a new parametric interval estimator suited for high-dimensional
data that estimates a range of possible CATE values when given a predefined
bound on the level of hidden confounding. Further, previous interval estimators
do not account for ignorance about the CATE stemming from samples that may be
underrepresented in the original study, or samples that violate the overlap
assumption. Our novel interval estimator also incorporates model uncertainty so
that practitioners can be made aware of out-of-distribution data. We prove that
our estimator converges to tight bounds on CATE when there may be unobserved
confounding, and assess it using semi-synthetic, high-dimensional datasets.
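The idea of reporting an interval rather than a point estimate for the CATE can be illustrated with a deliberately simple worst-case (Manski-style) bound for a bounded outcome. This is a hedged sketch, not the paper's parametric estimator; all names and numbers are illustrative:

```python
def cate_interval(mu1, mu0, e, y_min=0.0, y_max=1.0):
    """Worst-case bounds on CATE(x) = E[Y(1)|x] - E[Y(0)|x] for Y in [y_min, y_max].

    mu1, mu0 : model estimates of E[Y | T=1, x] and E[Y | T=0, x]
    e        : estimated propensity score P(T=1 | x)
    The counterfactual means, unidentifiable under hidden confounding,
    are replaced by the outcome bounds.
    """
    # E[Y(1)|x] = e*E[Y|T=1,x] + (1-e)*E[Y(1)|T=0,x]; the second term is unobserved.
    ey1_lo = e * mu1 + (1 - e) * y_min
    ey1_hi = e * mu1 + (1 - e) * y_max
    # Symmetrically for E[Y(0)|x].
    ey0_lo = (1 - e) * mu0 + e * y_min
    ey0_hi = (1 - e) * mu0 + e * y_max
    return ey1_lo - ey0_hi, ey1_hi - ey0_lo

lo, hi = cate_interval(mu1=0.7, mu0=0.4, e=0.9)
# The naive point estimate mu1 - mu0 = 0.3 always falls inside the interval.
assert lo <= 0.3 <= hi
```

A bound on the level of hidden confounding, as in the paper, would tighten these worst-case intervals toward the point estimate instead of using the raw outcome range.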
Related papers
- Risk and cross validation in ridge regression with correlated samples [72.59731158970894]
We characterize the in- and out-of-sample risks of ridge regression when the data points have arbitrary correlations.
We further extend our analysis to the case where the test point has non-trivial correlations with the training set, a setting often encountered in time-series forecasting.
We validate our theory across a variety of high-dimensional datasets.
arXiv Detail & Related papers (2024-08-08T17:27:29Z) - Valid causal inference with unobserved confounding in high-dimensional
settings [0.0]
We show how valid semiparametric inference can be obtained in the presence of unobserved confounders and high-dimensional nuisance models.
We propose uncertainty intervals which allow for unobserved confounding, and show that the resulting inference is valid when the amount of unobserved confounding is small.
arXiv Detail & Related papers (2024-01-12T13:21:20Z) - Label Shift Estimators for Non-Ignorable Missing Data [2.605549784939959]
We consider the problem of estimating the mean of a random variable Y subject to non-ignorable missingness, i.e., where the missingness mechanism depends on Y.
We use our approach to estimate disease prevalence using a large health survey, comparing ignorable and non-ignorable approaches.
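The bias that motivates the non-ignorable treatment can be reproduced in a few lines. In this hypothetical simulation (distribution and missingness mechanism invented for illustration), the complete-case mean drifts away from the true mean precisely because missingness depends on Y itself:

```python
import numpy as np

rng = np.random.default_rng(2)
n = 200_000

y = rng.standard_normal(n)                 # true mean is 0
# Non-ignorable (MNAR) mechanism: larger Y values are more likely observed.
p_observe = 1.0 / (1.0 + np.exp(-y))
observed = rng.random(n) < p_observe

complete_case_mean = y[observed].mean()    # the "ignorable" estimate, biased upward
print(complete_case_mean, y.mean())
```

An estimator that ignores the mechanism averages only the over-represented large values; correcting it requires modeling how observation probability depends on Y, which is the setting the paper addresses.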
arXiv Detail & Related papers (2023-10-27T16:50:13Z) - Falsification before Extrapolation in Causal Effect Estimation [6.715453431174765]
Causal effects in populations are often estimated using observational datasets.
We propose a meta-algorithm that attempts to reject observational estimates that are biased.
arXiv Detail & Related papers (2022-09-27T21:47:23Z) - Deconfounding Temporal Autoencoder: Estimating Treatment Effects over
Time Using Noisy Proxies [15.733136147164032]
Estimating individualized treatment effects (ITEs) from observational data is crucial for decision-making.
We develop the Deconfounding Temporal Autoencoder, a novel method that leverages observed noisy proxies to learn a hidden embedding.
We demonstrate the effectiveness of our DTA by improving over state-of-the-art benchmarks by a substantial margin.
arXiv Detail & Related papers (2021-12-06T13:14:31Z) - Identifiable Energy-based Representations: An Application to Estimating
Heterogeneous Causal Effects [83.66276516095665]
Conditional average treatment effects (CATEs) allow us to understand the effect heterogeneity across a large population of individuals.
Typical CATE learners assume all confounding variables are measured in order for the CATE to be identifiable.
We propose an energy-based model (EBM) that learns a low-dimensional representation of the variables by employing a noise contrastive loss function.
arXiv Detail & Related papers (2021-08-06T10:39:49Z) - Imputation-Free Learning from Incomplete Observations [73.15386629370111]
We introduce the importance-guided stochastic gradient descent (IGSGD) method to train models to perform inference from inputs containing missing values without imputation.
We employ reinforcement learning (RL) to adjust the gradients used to train the models via back-propagation.
Our imputation-free predictions outperform the traditional two-step imputation-based predictions using state-of-the-art imputation methods.
arXiv Detail & Related papers (2021-07-05T12:44:39Z) - Efficient Causal Inference from Combined Observational and
Interventional Data through Causal Reductions [68.6505592770171]
Unobserved confounding is one of the main challenges when estimating causal effects.
We propose a novel causal reduction method that replaces an arbitrary number of possibly high-dimensional latent confounders with a single latent confounder.
We propose a learning algorithm to estimate the parameterized reduced model jointly from observational and interventional data.
arXiv Detail & Related papers (2021-03-08T14:29:07Z) - Performance metrics for intervention-triggering prediction models do not
reflect an expected reduction in outcomes from using the model [71.9860741092209]
Clinical researchers often select among and evaluate risk prediction models.
Standard metrics calculated from retrospective data are only related to model utility under certain assumptions.
When predictions are delivered repeatedly throughout time, the relationship between standard metrics and utility is further complicated.
arXiv Detail & Related papers (2020-06-02T16:26:49Z) - Causal Inference With Selectively Deconfounded Data [22.624714904663424]
We consider the benefit of incorporating a large confounded observational dataset (confounder unobserved) alongside a small deconfounded observational dataset (confounder revealed) when estimating the Average Treatment Effect (ATE).
Our theoretical results suggest that the inclusion of confounded data can significantly reduce the quantity of deconfounded data required to estimate the ATE to within a desired accuracy level.
arXiv Detail & Related papers (2020-02-25T18:46:19Z)
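The value of even a small deconfounded sample, as in the last entry, comes from correcting the bias that confounded data carries. A minimal simulation (data-generating process invented for illustration) shows the naive difference in means missing the true ATE, while back-door adjustment over the revealed confounder recovers it:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 200_000

u = rng.binomial(1, 0.5, n)                      # confounder
t = rng.binomial(1, 0.2 + 0.6 * u)               # treatment depends on U
y = 1.0 * t + 2.0 * u + rng.standard_normal(n)   # true ATE = 1.0

# Confounded data (U unobserved): naive difference in means is biased.
naive_ate = y[t == 1].mean() - y[t == 0].mean()

# Deconfounded data (U revealed): back-door adjustment over U.
adjusted_ate = sum(
    (y[(t == 1) & (u == g)].mean() - y[(t == 0) & (u == g)].mean()) * (u == g).mean()
    for g in (0, 1)
)
print(naive_ate, adjusted_ate)
```

Here the naive estimate absorbs the effect of U on Y; the paper's theoretical question is how little deconfounded data suffices once the plentiful confounded data is also used.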
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.