Related papers: Automating the Selection of Proxy Variables of Unmeasured Confounders

URL: http://arxiv.org/abs/2405.16130v1
Date: Sat, 25 May 2024 08:53:49 GMT
Title: Automating the Selection of Proxy Variables of Unmeasured Confounders
Authors: Feng Xie, Zhengming Chen, Shanshan Luo, Wang Miao, Ruichu Cai, Zhi Geng
Abstract summary: We extend the existing proxy variable estimator to accommodate scenarios where multiple unmeasured confounders exist between the treatments and the outcome. We propose two data-driven methods for the selection of proxy variables and for the unbiased estimation of causal effects.
Score: 16.773841751009748
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Recently, interest has grown in the use of proxy variables of unobserved confounding for inferring the causal effect in the presence of unmeasured confounders from observational data. One difficulty inhibiting the practical use is finding valid proxy variables of unobserved confounding to a target causal effect of interest. These proxy variables are typically justified by background knowledge. In this paper, we investigate the estimation of causal effects among multiple treatments and a single outcome, all of which are affected by unmeasured confounders, within a linear causal model, without prior knowledge of the validity of proxy variables. To be more specific, we first extend the existing proxy variable estimator, originally addressing a single unmeasured confounder, to accommodate scenarios where multiple unmeasured confounders exist between the treatments and the outcome. Subsequently, we present two different sets of precise identifiability conditions for selecting valid proxy variables of unmeasured confounders, based on the second-order statistics and higher-order statistics of the data, respectively. Moreover, we propose two data-driven methods for the selection of proxy variables and for the unbiased estimation of causal effects. Theoretical analysis demonstrates the correctness of our proposed algorithms. Experimental results on both synthetic and real-world data show the effectiveness of the proposed approach.
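The abstract refers to an existing proxy variable estimator for a single unmeasured confounder in a linear model. The sketch below illustrates that classic single-confounder setting only, assuming two proxies Z and W of one hidden confounder U and a covariance-ratio formula that follows from the linear structural equations. It is not the paper's multi-confounder extension or its proxy-selection algorithms. The function names, coefficients, and simulated data are hypothetical and chosen purely for illustration.

```python
import numpy as np


def proxy_effect(x, y, z, w):
    """Estimate the effect of x on y from second-order statistics.

    Assumes (illustrative setting, not taken from the paper) a linear model
    with one hidden confounder U, where z and w are proxies that depend on U
    only, and all noise terms are mutually independent.
    """
    c = np.cov(np.vstack([x, y, z, w]))  # rows/cols ordered x, y, z, w
    s_xx, s_xy, s_xz, s_xw = c[0, 0], c[0, 1], c[0, 2], c[0, 3]
    s_yz, s_zw = c[1, 2], c[2, 3]
    # Solve the two moment equations
    #   cov(y, x) = beta * var(x)    + gamma * cov(w, x)
    #   cov(y, z) = beta * cov(x, z) + gamma * cov(w, z)
    # for beta.
    num = s_xy * s_zw - s_yz * s_xw
    den = s_xx * s_zw - s_xz * s_xw
    return num / den


def simulate(n=200_000, beta=2.0, seed=0):
    """Draw data from a hypothetical linear model with one hidden confounder."""
    rng = np.random.default_rng(seed)
    u = rng.normal(size=n)                      # unmeasured confounder
    z = 0.8 * u + rng.normal(size=n)            # proxy 1
    w = 1.2 * u + rng.normal(size=n)            # proxy 2
    x = 1.0 * u + rng.normal(size=n)            # treatment
    y = beta * x + 1.5 * u + rng.normal(size=n)  # outcome
    return x, y, z, w


x, y, z, w = simulate()
beta_proxy = proxy_effect(x, y, z, w)             # should be close to the true effect
beta_ols = np.cov(x, y)[0, 1] / np.var(x, ddof=1)  # naive regression, confounded
```

In this simulated setting the naive regression slope absorbs the confounding path through U, while the proxy-based ratio targets the true coefficient. The ratio is only well defined when both proxies actually depend on U and the treatment has variation that is not explained by U.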

Related papers

Data Fusion for Partial Identification of Causal Effects [62.56890808004615]
We propose a novel partial identification framework that enables researchers to answer key questions: Is the causal effect positive or negative? How severe must assumption violations be to overturn this conclusion? We apply our framework to the Project STAR study, which investigates the effect of classroom size on students' third-grade standardized test performance.
arXiv Detail & Related papers (2025-05-30T07:13:01Z)
Proximal Inference on Population Intervention Indirect Effect [8.296034406842345]
The population intervention indirect effect (PIIE) is a novel mediation effect representing the indirect component of the population intervention effect. This study extends the PIIE identification to settings where unmeasured confounders influence exposure-outcome, exposure-mediator, and mediator-outcome relationships.
arXiv Detail & Related papers (2025-04-16T08:14:55Z)
Regression-Based Estimation of Causal Effects in the Presence of Selection Bias and Confounding [52.1068936424622]
We consider the problem of estimating the expected causal effect $E[Y|do(X)]$ for a target variable $Y$ when treatment $X$ is set by intervention. In settings without selection bias or confounding, $E[Y|do(X)] = E[Y|X]$, which can be estimated using standard regression methods. We propose a framework that incorporates both selection bias and confounding.
arXiv Detail & Related papers (2025-03-26T13:43:37Z)
Density Ratio-based Proxy Causal Learning Without Density Ratios [26.49087216375106]
We address the setting of Proxy Causal Learning (PCL), which has the goal of estimating causal effects from observed data in the presence of hidden confounding. Two approaches have been proposed to perform causal effect estimation given proxy variables. We propose a practical and effective implementation of the second approach, which bypasses explicit density ratio estimation and is suitable for continuous and high-dimensional treatments.
arXiv Detail & Related papers (2025-03-11T12:27:54Z)
Black Box Causal Inference: Effect Estimation via Meta Prediction [56.277798874118425]
We frame causal inference as a dataset-level prediction problem, offloading algorithm design to the learning process. We introduce an approach, called black box causal inference (BBCI), that builds estimators in a black-box manner by learning to predict causal effects from sampled dataset-effect pairs. We demonstrate accurate estimation of average treatment effects (ATEs) and conditional average treatment effects (CATEs) with BBCI across several causal inference problems.
arXiv Detail & Related papers (2025-03-07T23:43:19Z)
Local Learning for Covariate Selection in Nonparametric Causal Effect Estimation with Latent Variables [13.12743473333296]
Estimating causal effects from nonexperimental data is a fundamental problem in many fields of science. We propose a novel local learning approach for covariate selection in nonparametric causal effect estimation. We validate our algorithm through extensive experiments on both synthetic and real-world data.
arXiv Detail & Related papers (2024-11-25T12:08:54Z)
Indiscriminate Disruption of Conditional Inference on Multivariate Gaussians [60.22542847840578]
Despite advances in adversarial machine learning, inference for Gaussian models in the presence of an adversary is notably understudied. We consider a self-interested attacker who wishes to disrupt a decisionmaker's conditional inference and subsequent actions by corrupting a set of evidentiary variables. To avoid detection, the attacker also desires the attack to appear plausible wherein plausibility is determined by the density of the corrupted evidence.
arXiv Detail & Related papers (2024-11-21T17:46:55Z)
Federated Causal Discovery from Heterogeneous Data [70.31070224690399]
We propose a novel FCD method attempting to accommodate arbitrary causal models and heterogeneous data. These approaches involve constructing summary statistics as a proxy of the raw data to protect data privacy. We conduct extensive experiments on synthetic and real datasets to show the efficacy of our method.
arXiv Detail & Related papers (2024-02-20T18:53:53Z)
Causal Inference from Text: Unveiling Interactions between Variables [20.677407402398405]
Existing methods only account for confounding covariables that affect both treatment and outcome. This bias arises from insufficient consideration of non-confounding covariables. In this work, we aim to mitigate the bias by unveiling interactions between different variables.
arXiv Detail & Related papers (2023-11-09T11:29:44Z)
Kernel Single Proxy Control for Deterministic Confounding [32.70182383946395]
We show that a single proxy variable is sufficient for causal estimation if the outcome is generated deterministically. We prove and empirically demonstrate that we can successfully recover the causal effect on challenging synthetic benchmarks.
arXiv Detail & Related papers (2023-08-08T21:11:06Z)
Nonparametric Identifiability of Causal Representations from Unknown Interventions [63.1354734978244]
We study causal representation learning, the task of inferring latent causal variables and their causal relations from mixtures of the variables. Our goal is to identify both the ground truth latents and their causal graph up to a set of ambiguities which we show to be irresolvable from interventional data.
arXiv Detail & Related papers (2023-06-01T10:51:58Z)
Disentangled Representation for Causal Mediation Analysis [25.114619307838602]
Causal mediation analysis is a method that is often used to reveal direct and indirect effects. Deep learning shows promise in mediation analysis, but the current methods only assume latent confounders that affect treatment, mediator and outcome simultaneously. We propose the Disentangled Mediation Analysis Variational AutoEncoder (DMAVAE), which disentangles the representations of latent confounders into three types to accurately estimate the natural direct effect, natural indirect effect and total effect.
arXiv Detail & Related papers (2023-02-19T23:37:17Z)
Debiasing Recommendation by Learning Identifiable Latent Confounders [49.16119112336605]
Confounding bias arises due to the presence of unmeasured variables that can affect both a user's exposure and feedback. Existing methods either (1) make untenable assumptions about these unmeasured variables or (2) directly infer latent confounders from users' exposure. We propose a novel method, i.e., identifiable deconfounder (iDCF), which leverages a set of proxy variables to resolve the aforementioned non-identification issue.
arXiv Detail & Related papers (2023-02-10T05:10:26Z)
Valid Inference After Causal Discovery [73.87055989355737]
We develop tools for valid post-causal-discovery inference. We show that a naive combination of causal discovery and subsequent inference algorithms leads to highly inflated miscoverage rates.
arXiv Detail & Related papers (2022-08-11T17:40:45Z)
Combining Experimental and Observational Data for Identification of Long-Term Causal Effects [13.32091725929965]
We consider the task of estimating the causal effect of a treatment variable on a long-term outcome variable using data from an observational domain and an experimental domain. The observational data is assumed to be confounded and hence, without further assumptions, this dataset alone cannot be used for causal inference.
arXiv Detail & Related papers (2022-01-26T04:21:14Z)
Stable Prediction via Leveraging Seed Variable [73.9770220107874]
Previous machine learning methods might exploit subtly spurious correlations in training data induced by non-causal variables for prediction. We propose a conditional-independence-test-based algorithm that separates causal variables using a seed variable as a prior, and adopts them for stable prediction. Our algorithm outperforms state-of-the-art methods for stable prediction.
arXiv Detail & Related papers (2020-06-09T06:56:31Z)

This list is automatically generated from the titles and abstracts of the papers on this site.