Valid causal inference with unobserved confounding in high-dimensional
settings
- URL: http://arxiv.org/abs/2401.06564v1
- Date: Fri, 12 Jan 2024 13:21:20 GMT
- Title: Valid causal inference with unobserved confounding in high-dimensional
settings
- Authors: Niloofar Moosavi, Tetiana Gorbach, Xavier de Luna
- Abstract summary: We show how valid semiparametric inference can be obtained in the presence of unobserved confounders and high-dimensional nuisance models.
We propose uncertainty intervals which allow for unobserved confounding, and show that the resulting inference is valid when the amount of unobserved confounding is small.
- Score: 0.0
- License: http://creativecommons.org/licenses/by-sa/4.0/
- Abstract: Various methods have recently been proposed to estimate causal effects with
confidence intervals that are uniformly valid over a set of data generating
processes when high-dimensional nuisance models are estimated by
post-model-selection or machine learning estimators. These methods typically
require that all the confounders are observed to ensure identification of the
effects. We contribute by showing how valid semiparametric inference can be
obtained in the presence of unobserved confounders and high-dimensional
nuisance models. We propose uncertainty intervals which allow for unobserved
confounding, and show that the resulting inference is valid when the amount of
unobserved confounding is small relative to the sample size; the latter is
formalized in terms of convergence rates. Simulation experiments illustrate the
finite sample properties of the proposed intervals and investigate an
alternative procedure that improves the empirical coverage of the intervals
when the amount of unobserved confounding is large. Finally, a case study on
the effect of smoking during pregnancy on birth weight is used to illustrate
the use of the methods introduced to perform a sensitivity analysis to
unobserved confounding.
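The idea of an uncertainty interval that allows for unobserved confounding can be sketched as follows. This is a minimal illustration, not the estimator proposed in the paper: it assumes the analyst already has a point estimate, its standard error, and a plausible range for the confounding bias, and it takes the union of the usual Wald intervals over that range. The function names, the bias range, and all numbers are hypothetical.

```python
from statistics import NormalDist


def confidence_interval(estimate, se, level=0.95):
    """Standard Wald interval, appropriate when all confounders are observed."""
    z = NormalDist().inv_cdf(0.5 + level / 2.0)
    return estimate - z * se, estimate + z * se


def uncertainty_interval(estimate, se, bias_lower, bias_upper, level=0.95):
    """Union of Wald intervals over every bias value in [bias_lower, bias_upper].

    Assumption: bias = E[estimate] - true effect, so the bias-corrected
    estimate is (estimate - bias) and the standard error does not depend
    on the bias value.
    """
    if bias_lower > bias_upper:
        raise ValueError("bias_lower must not exceed bias_upper")
    z = NormalDist().inv_cdf(0.5 + level / 2.0)
    lower = estimate - bias_upper - z * se  # largest bias gives the lowest bound
    upper = estimate - bias_lower + z * se  # smallest bias gives the highest bound
    return lower, upper


# Hypothetical inputs, chosen only for illustration.
estimate, se = 1.0, 0.1
ci = confidence_interval(estimate, se)
ui = uncertainty_interval(estimate, se, bias_lower=-0.2, bias_upper=0.2)
print("confidence interval:", ci)
print("uncertainty interval:", ui)
```

With a bias range of exactly zero the two intervals coincide; widening the assumed bias range widens the uncertainty interval, which is how such a construction can serve as a sensitivity analysis.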
Related papers
- On the Identification of Temporally Causal Representation with Instantaneous Dependence [50.14432597910128]
Temporally causal representation learning aims to identify the latent causal process from time series observations.
Most methods require the assumption that the latent causal processes do not have instantaneous relations.
We propose an IDentification framework for instantaneOus Latent dynamics.
arXiv Detail & Related papers (2024-05-24T08:08:05Z)
- Data Attribution for Diffusion Models: Timestep-induced Bias in Influence Estimation [53.27596811146316]
Diffusion models operate over a sequence of timesteps instead of instantaneous input-output relationships in previous contexts.
We present Diffusion-TracIn that incorporates this temporal dynamics and observe that samples' loss gradient norms are highly dependent on timestep.
We introduce Diffusion-ReTrac as a re-normalized adaptation that enables the retrieval of training samples more targeted to the test sample of interest.
arXiv Detail & Related papers (2024-01-17T07:58:18Z)
- Identifiable causal inference with noisy treatment and no side information [6.432072145009342]
This study proposes a model that assumes a continuous treatment variable that is inaccurately measured.
We prove that our model's causal effect estimates are identifiable, even without side information and knowledge of the measurement error variance.
Our work extends the range of applications in which reliable causal inference can be conducted.
arXiv Detail & Related papers (2023-06-18T18:38:10Z)
- Nonparametric Identifiability of Causal Representations from Unknown Interventions [63.1354734978244]
We study causal representation learning, the task of inferring latent causal variables and their causal relations from mixtures of the variables.
Our goal is to identify both the ground truth latents and their causal graph up to a set of ambiguities which we show to be irresolvable from interventional data.
arXiv Detail & Related papers (2023-06-01T10:51:58Z)
- Efficient estimation of weighted cumulative treatment effects by double/debiased machine learning [3.086361225427304]
We propose a class of one-step cross-fitted double/debiased machine learning estimators for the weighted cumulative causal effect.
We apply the proposed methods to real-world observational data from a UK primary care database to compare the effects of anti-diabetic drugs on cancer outcomes.
arXiv Detail & Related papers (2023-05-03T18:19:18Z)
- Monotonicity and Double Descent in Uncertainty Estimation with Gaussian Processes [52.92110730286403]
It is commonly believed that the marginal likelihood should be reminiscent of cross-validation metrics and that both should deteriorate with larger input dimensions.
We prove that, by tuning hyperparameters, the performance, as measured by the marginal likelihood, improves monotonically with the input dimension.
We also prove that cross-validation metrics exhibit qualitatively different behavior that is characteristic of double descent.
arXiv Detail & Related papers (2022-10-14T08:09:33Z)
- Counterfactual inference for sequential experiments [17.817769460838665]
We consider after-study statistical inference for sequentially designed experiments wherein multiple units are assigned treatments for multiple time points.
Our goal is to provide inference guarantees for the counterfactual mean at the smallest possible scale.
We illustrate our theory via several simulations and a case study involving data from a mobile health clinical trial HeartSteps.
arXiv Detail & Related papers (2022-02-14T17:24:27Z)
- Efficient Causal Inference from Combined Observational and Interventional Data through Causal Reductions [68.6505592770171]
Unobserved confounding is one of the main challenges when estimating causal effects.
We propose a novel causal reduction method that replaces an arbitrary number of possibly high-dimensional latent confounders.
We propose a learning algorithm to estimate the parameterized reduced model jointly from observational and interventional data.
arXiv Detail & Related papers (2021-03-08T14:29:07Z)
- The Hidden Uncertainty in a Neural Networks Activations [105.4223982696279]
The distribution of a neural network's latent representations has been successfully used to detect out-of-distribution (OOD) data.
This work investigates whether this distribution correlates with a model's epistemic uncertainty, thus indicating its ability to generalise to novel inputs.
arXiv Detail & Related papers (2020-12-05T17:30:35Z)
- MissDeepCausal: Causal Inference from Incomplete Data Using Deep Latent Variable Models [14.173184309520453]
State-of-the-art methods for causal inference don't consider missing values.
Missing data require an adapted unconfoundedness hypothesis.
Latent confounders whose distribution is learned through variational autoencoders adapted to missing values are considered.
arXiv Detail & Related papers (2020-02-25T12:58:07Z)
This list is automatically generated from the titles and abstracts of the papers in this site.