Automating the Selection of Proxy Variables of Unmeasured Confounders
- URL: http://arxiv.org/abs/2405.16130v1
- Date: Sat, 25 May 2024 08:53:49 GMT
- Title: Automating the Selection of Proxy Variables of Unmeasured Confounders
- Authors: Feng Xie, Zhengming Chen, Shanshan Luo, Wang Miao, Ruichu Cai, Zhi Geng
- Abstract summary: We extend the existing proxy variable estimator to accommodate scenarios where multiple unmeasured confounders exist between the treatments and the outcome.
We propose two data-driven methods for the selection of proxy variables and for the unbiased estimation of causal effects.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Recently, interest has grown in the use of proxy variables of unobserved confounding for inferring the causal effect in the presence of unmeasured confounders from observational data. One difficulty inhibiting the practical use is finding valid proxy variables of unobserved confounding to a target causal effect of interest. These proxy variables are typically justified by background knowledge. In this paper, we investigate the estimation of causal effects among multiple treatments and a single outcome, all of which are affected by unmeasured confounders, within a linear causal model, without prior knowledge of the validity of proxy variables. To be more specific, we first extend the existing proxy variable estimator, originally addressing a single unmeasured confounder, to accommodate scenarios where multiple unmeasured confounders exist between the treatments and the outcome. Subsequently, we present two different sets of precise identifiability conditions for selecting valid proxy variables of unmeasured confounders, based on the second-order statistics and higher-order statistics of the data, respectively. Moreover, we propose two data-driven methods for the selection of proxy variables and for the unbiased estimation of causal effects. Theoretical analysis demonstrates the correctness of our proposed algorithms. Experimental results on both synthetic and real-world data show the effectiveness of the proposed approach.
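To make the abstract's starting point concrete, the following is a minimal sketch of a proxy-variable estimator for the simplest linear case: one treatment, one outcome, a single unmeasured confounder, and two proxies whose validity is assumed to be known in advance. It is a textbook-style illustration, not the paper's selection algorithm or its multi-confounder extension. The names `simulate`, `proxy_effect`, `z`, and `w`, as well as all coefficients, are hypothetical and chosen only for this example.

```python
import numpy as np


def simulate(n, beta=2.0, seed=0):
    """Draw data from an assumed linear model with one hidden confounder u.

    x: treatment, y: outcome, z and w: proxies of u.
    Assumptions (illustrative only): z does not cause y, x does not cause w,
    and all noise terms are independent.
    """
    rng = np.random.default_rng(seed)
    u = rng.normal(size=n)                       # unmeasured confounder
    x = u + rng.normal(size=n)                   # treatment, confounded by u
    z = u + rng.normal(size=n)                   # proxy 1
    w = u + rng.normal(size=n)                   # proxy 2
    y = beta * x + 1.5 * u + rng.normal(size=n)  # outcome
    return x, y, z, w


def proxy_effect(x, y, z, w):
    """Estimate the effect of x on y using second-order statistics only.

    Under the model above, y = beta * x + gamma * w + error, where the error
    is uncorrelated with x and z. This gives two moment equations:
        cov(x, y) = beta * var(x)    + gamma * cov(x, w)
        cov(z, y) = beta * cov(z, x) + gamma * cov(z, w)
    which are solved for (beta, gamma).
    """
    c = np.cov(np.vstack([x, y, z, w]))  # row order: x=0, y=1, z=2, w=3
    a = np.array([[c[0, 0], c[0, 3]],
                  [c[2, 0], c[2, 3]]])
    b = np.array([c[0, 1], c[2, 1]])
    beta, _gamma = np.linalg.solve(a, b)
    return beta


if __name__ == "__main__":
    x, y, z, w = simulate(200_000, beta=2.0)
    naive = np.cov(x, y)[0, 1] / np.var(x, ddof=1)  # ignores confounding
    print("naive regression slope:", naive)
    print("proxy-based estimate:  ", proxy_effect(x, y, z, w))
```

With these coefficients the naive slope converges to 2.75 rather than the true effect of 2.0, while the proxy-based estimate should approach 2.0 as the sample grows. The sketch takes the valid proxies as given; the difficulty the paper addresses is choosing them from data when their validity is not known beforehand.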
Related papers
- Federated Causal Discovery from Heterogeneous Data [70.31070224690399]
We propose a novel FCD method attempting to accommodate arbitrary causal models and heterogeneous data.
These approaches involve constructing summary statistics as a proxy of the raw data to protect data privacy.
We conduct extensive experiments on synthetic and real datasets to show the efficacy of our method.
arXiv Detail & Related papers (2024-02-20T18:53:53Z) - Causal Inference from Text: Unveiling Interactions between Variables [20.677407402398405]
Existing methods only account for confounding covariables that affect both treatment and outcome.
The resulting estimation bias arises from insufficient consideration of non-confounding covariables.
In this work, we aim to mitigate the bias by unveiling interactions between different variables.
arXiv Detail & Related papers (2023-11-09T11:29:44Z) - Kernel Single Proxy Control for Deterministic Confounding [32.70182383946395]
We show that a single proxy variable is sufficient for causal estimation if the outcome is generated deterministically.
We prove and empirically demonstrate that we can successfully recover the causal effect on challenging synthetic benchmarks.
arXiv Detail & Related papers (2023-08-08T21:11:06Z) - Nonparametric Identifiability of Causal Representations from Unknown Interventions [63.1354734978244]
We study causal representation learning, the task of inferring latent causal variables and their causal relations from mixtures of the variables.
Our goal is to identify both the ground truth latents and their causal graph up to a set of ambiguities which we show to be irresolvable from interventional data.
arXiv Detail & Related papers (2023-06-01T10:51:58Z) - Disentangled Representation for Causal Mediation Analysis [25.114619307838602]
Causal mediation analysis is a method that is often used to reveal direct and indirect effects.
Deep learning shows promise in mediation analysis, but the current methods only assume latent confounders that affect treatment, mediator and outcome simultaneously.
We propose the Disentangled Mediation Analysis Variational AutoEncoder (DMAVAE), which disentangles the representations of latent confounders into three types to accurately estimate the natural direct effect, natural indirect effect and total effect.
arXiv Detail & Related papers (2023-02-19T23:37:17Z) - Debiasing Recommendation by Learning Identifiable Latent Confounders [49.16119112336605]
Confounding bias arises due to the presence of unmeasured variables that can affect both a user's exposure and feedback.
Existing methods either (1) make untenable assumptions about these unmeasured variables or (2) directly infer latent confounders from users' exposure.
We propose a novel method, i.e., identifiable deconfounder (iDCF), which leverages a set of proxy variables to resolve the aforementioned non-identification issue.
arXiv Detail & Related papers (2023-02-10T05:10:26Z) - Valid Inference After Causal Discovery [73.87055989355737]
We develop tools for valid post-causal-discovery inference.
We show that a naive combination of causal discovery and subsequent inference algorithms leads to highly inflated miscoverage rates.
arXiv Detail & Related papers (2022-08-11T17:40:45Z) - Combining Experimental and Observational Data for Identification of Long-Term Causal Effects [13.32091725929965]
We consider the task of estimating the causal effect of a treatment variable on a long-term outcome variable using data from an observational domain and an experimental domain.
The observational data is assumed to be confounded, so without further assumptions this dataset alone cannot be used for causal inference.
arXiv Detail & Related papers (2022-01-26T04:21:14Z) - Efficient Causal Inference from Combined Observational and Interventional Data through Causal Reductions [68.6505592770171]
Unobserved confounding is one of the main challenges when estimating causal effects.
We propose a novel causal reduction method that replaces an arbitrary number of possibly high-dimensional latent confounders.
We propose a learning algorithm to estimate the parameterized reduced model jointly from observational and interventional data.
arXiv Detail & Related papers (2021-03-08T14:29:07Z) - Stable Prediction via Leveraging Seed Variable [73.9770220107874]
Previous machine learning methods might exploit subtly spurious correlations in training data induced by non-causal variables for prediction.
We propose an algorithm based on conditional independence tests that separates causal variables, given a seed variable as prior knowledge, and adopts them for stable prediction.
Our algorithm outperforms state-of-the-art methods for stable prediction.
arXiv Detail & Related papers (2020-06-09T06:56:31Z)