On the Effects of Irrelevant Variables in Treatment Effect Estimation with Deep Disentanglement
- URL: http://arxiv.org/abs/2407.20003v2
- Date: Mon, 26 Aug 2024 08:10:56 GMT
- Title: On the Effects of Irrelevant Variables in Treatment Effect Estimation with Deep Disentanglement
- Authors: Ahmad Saeed Khan, Erik Schaffernicht, Johannes Andreas Stork
- Abstract summary: Estimating treatment effects from observational data is paramount in healthcare, education, and economics.
Current deep disentanglement-based methods to address selection bias insufficiently handle irrelevant variables.
We disentangle pre-treatment variables with a deep embedding method and explicitly identify and represent irrelevant variables.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Estimating treatment effects from observational data is paramount in healthcare, education, and economics, but current deep disentanglement-based methods to address selection bias insufficiently handle irrelevant variables. We demonstrate in experiments that this leads to prediction errors. We disentangle pre-treatment variables with a deep embedding method and explicitly identify and represent irrelevant variables, in addition to instrumental, confounding, and adjustment latent factors. To this end, we introduce a reconstruction objective and create an embedding space for irrelevant variables using an attached autoencoder. Instead of relying on serendipitous suppression of irrelevant variables as in previous deep disentanglement approaches, we explicitly force irrelevant variables into this embedding space and employ orthogonalization to prevent irrelevant information from leaking into the latent space representations of the other factors. Our experiments with synthetic and real-world benchmark datasets show that we can better identify irrelevant variables and more precisely predict treatment effects than previous methods, while prediction quality degrades less when additional irrelevant variables are introduced.
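The abstract names two ingredients: a reconstruction objective for the irrelevant-variable embedding, and orthogonalization to keep irrelevant information out of the other factors. The sketch below illustrates one common way to express such terms, assuming a squared cross-covariance penalty and a mean-squared reconstruction error. The function names (`orthogonality_penalty`, `leakage_penalty`, `reconstruction_loss`) and the choice of penalty are illustrative assumptions, not the authors' actual loss.

```python
import numpy as np


def orthogonality_penalty(z_a: np.ndarray, z_b: np.ndarray) -> float:
    """Squared Frobenius norm of the batch cross-covariance of two embeddings.

    Assumed form of an orthogonality term (hypothetical, not from the paper).
    Inputs have shape (batch, dim_a) and (batch, dim_b).
    """
    # Center each embedding dimension over the batch.
    z_a = z_a - z_a.mean(axis=0, keepdims=True)
    z_b = z_b - z_b.mean(axis=0, keepdims=True)
    # Cross-covariance matrix of shape (dim_a, dim_b).
    cross = z_a.T @ z_b / z_a.shape[0]
    return float(np.sum(cross ** 2))


def leakage_penalty(z_irrelevant: np.ndarray, other_factors: dict) -> float:
    """Sum the penalty between the irrelevant embedding and each other factor."""
    return float(sum(orthogonality_penalty(z_irrelevant, z)
                     for z in other_factors.values()))


def reconstruction_loss(x: np.ndarray, x_hat: np.ndarray) -> float:
    """Mean squared error between inputs and their autoencoder reconstruction."""
    return float(np.mean((x - x_hat) ** 2))


# Toy batch of four samples with one-dimensional embeddings.
z_irr = np.array([[1.0], [-1.0], [1.0], [-1.0]])
z_conf = np.array([[1.0], [1.0], [-1.0], [-1.0]])  # uncorrelated with z_irr
z_leaky = z_conf + z_irr                           # carries irrelevant information

clean = leakage_penalty(z_irr, {"confounding": z_conf})
leaky = leakage_penalty(z_irr, {"confounding": z_leaky})
print(clean, leaky)
```

In this toy batch the penalty vanishes for the uncorrelated confounding embedding and becomes positive once irrelevant information is mixed into it, which is the behavior an orthogonalization term is meant to discourage during training.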
Related papers
- How important are the genes to explain the outcome - the asymmetric Shapley value as an honest importance metric for high-dimensional features [1.0499611180329804]
In clinical prediction settings the importance of a high-dimensional feature like genomics is often assessed by evaluating the change in predictive performance.
We suggest using asymmetric Shapley values as a more suitable alternative to quantify feature importance in the context of a mixed-dimensional prediction model.
arXiv Detail & Related papers (2026-03-05T15:58:50Z) - PIPCFR: Pseudo-outcome Imputation with Post-treatment Variables for Individual Treatment Effect Estimation [19.72208057455035]
We introduce Pseudo-outcome Imputation with Post-treatment Variables for Counterfactual Regression (PIPCFR), a novel approach that incorporates post-treatment variables to improve pseudo-outcome imputation.
We analyze the challenges inherent in utilizing post-treatment variables and establish a novel theoretical bound for ITE risk that explicitly connects post-treatment variables to ITE estimation accuracy.
arXiv Detail & Related papers (2025-12-21T13:57:26Z) - Temporal Latent Variable Structural Causal Model for Causal Discovery under External Interferences [53.308122815325326]
We introduce latent variables to represent unobserved factors that affect the observed data.
Specifically, to capture the causal strength and adjacency information, we propose a new temporal latent variable structural causal model.
Considering that expert knowledge can provide information about unknown interferences in certain scenarios, we develop a method that facilitates the incorporation of prior knowledge into parameter learning.
arXiv Detail & Related papers (2025-11-13T07:10:10Z) - Disentangled Graph Autoencoder for Treatment Effect Estimation [1.361700725822891]
We propose a novel disentangled variational graph autoencoder for treatment effect estimation on networked observational data.
Our graph encoder disentangles latent factors into instrumental, confounding, adjustment, and noisy factors, while enforcing factor independence using the Hilbert-Schmidt Independence Criterion.
arXiv Detail & Related papers (2024-12-19T03:44:49Z) - Local Learning for Covariate Selection in Nonparametric Causal Effect Estimation with Latent Variables [13.12743473333296]
Estimating causal effects from nonexperimental data is a fundamental problem in many fields of science.
We propose a novel local learning approach for covariate selection in nonparametric causal effect estimation.
We validate our algorithm through extensive experiments on both synthetic and real-world data.
arXiv Detail & Related papers (2024-11-25T12:08:54Z) - Causal Effect Estimation using identifiable Variational AutoEncoder with Latent Confounders and Post-Treatment Variables [18.34462010115951]
Estimating causal effects from observational data is challenging, especially in the presence of latent confounders.
We propose a novel method of joint Variational AutoEncoder (VAE) and identifiable Variational AutoEncoder (iVAE) for learning the representations of latent confounders and latent post-treatment variables.
arXiv Detail & Related papers (2024-08-13T22:13:25Z) - Disentangled Representation via Variational AutoEncoder for Continuous Treatment Effect Estimation [1.105274635981989]
We propose a novel Dose-Response curve estimator via Variational AutoEncoder (DRVAE).
We show that our model outperforms the current state-of-the-art methods.
arXiv Detail & Related papers (2024-06-04T13:41:07Z) - Challenges in Variable Importance Ranking Under Correlation [6.718144470265263]
We present a comprehensive simulation study investigating the impact of feature correlation on the assessment of variable importance.
While there is never any correlation between knockoff variables and their corresponding predictor variables, we prove that the correlation increases linearly beyond a certain correlation threshold between the predictor variables.
arXiv Detail & Related papers (2024-02-05T19:02:13Z) - Selective Nonparametric Regression via Testing [54.20569354303575]
We develop an abstention procedure via testing the hypothesis on the value of the conditional variance at a given point.
Unlike existing methods, the proposed one accounts not only for the value of the variance itself but also for the uncertainty of the corresponding variance predictor.
arXiv Detail & Related papers (2023-09-28T13:04:11Z) - Nonparametric Identifiability of Causal Representations from Unknown Interventions [63.1354734978244]
We study causal representation learning, the task of inferring latent causal variables and their causal relations from mixtures of the variables.
Our goal is to identify both the ground truth latents and their causal graph up to a set of ambiguities which we show to be irresolvable from interventional data.
arXiv Detail & Related papers (2023-06-01T10:51:58Z) - To Impute or not to Impute? -- Missing Data in Treatment Effect Estimation [84.76186111434818]
We identify a new missingness mechanism, which we term mixed confounded missingness (MCM), where some missingness determines treatment selection and other missingness is determined by treatment selection.
We show that naively imputing all data leads to poorly performing treatment effect models, as the act of imputation effectively removes information necessary to provide unbiased estimates.
Our solution is selective imputation, where we use insights from MCM to inform precisely which variables should be imputed and which should not.
arXiv Detail & Related papers (2022-02-04T12:08:31Z) - Efficient Causal Inference from Combined Observational and Interventional Data through Causal Reductions [68.6505592770171]
Unobserved confounding is one of the main challenges when estimating causal effects.
We propose a novel causal reduction method that replaces an arbitrary number of possibly high-dimensional latent confounders.
We propose a learning algorithm to estimate the parameterized reduced model jointly from observational and interventional data.
arXiv Detail & Related papers (2021-03-08T14:29:07Z) - Instrumental Variable Value Iteration for Causal Offline Reinforcement Learning [107.70165026669308]
In offline reinforcement learning (RL) an optimal policy is learned solely from a priori collected observational data.
We study a confounded Markov decision process where the transition dynamics admit an additive nonlinear functional form.
We propose a provably efficient IV-aided Value Iteration (IVVI) algorithm based on a primal-dual reformulation of the conditional moment restriction.
arXiv Detail & Related papers (2021-02-19T13:01:40Z) - Stable Prediction via Leveraging Seed Variable [73.9770220107874]
Previous machine learning methods might exploit subtly spurious correlations in training data induced by non-causal variables for prediction.
We propose an algorithm based on conditional independence tests that separates causal variables using a seed variable as a prior, and adopt them for stable prediction.
Our algorithm outperforms state-of-the-art methods for stable prediction.
arXiv Detail & Related papers (2020-06-09T06:56:31Z) - Causal query in observational data with hidden variables [0.0]
We develop a theorem for using local search to find a superset of the adjustment variables for causal effect estimation from observational data.
Based on the developed theorem, we propose a data-driven algorithm for causal query.
Experiments show that the proposed algorithm is faster and produces better causal effect estimation than an existing data-driven causal effect estimation method with hidden variables.
arXiv Detail & Related papers (2020-01-28T11:23:26Z)
This list is automatically generated from the titles and abstracts of the papers in this site.