Representation Learning Preserving Ignorability and Covariate Matching for Treatment Effects
- URL: http://arxiv.org/abs/2504.20579v1
- Date: Tue, 29 Apr 2025 09:33:56 GMT
- Title: Representation Learning Preserving Ignorability and Covariate Matching for Treatment Effects
- Authors: Praharsh Nanavati, Ranjitha Prasad, Karthikeyan Shanmugam
- Abstract summary: Estimating treatment effects from observational data is challenging due to hidden confounding. A common framework to address both hidden confounding and selection bias is missing.
- Score: 18.60804431844023
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Estimating treatment effects from observational data is challenging due to two main reasons: (a) hidden confounding, and (b) covariate mismatch (control and treatment groups not having identical distributions). Long lines of work exist that address only one of these issues at a time. To address the former, conventional techniques that require detailed knowledge in the form of causal graphs have been proposed. For the latter, covariate matching and importance weighting methods have been used. Recently, there has been progress in combining testable independencies with partial side information for tackling hidden confounding. A common framework to address both hidden confounding and selection bias is missing. We propose neural architectures that aim to learn a representation of pre-treatment covariates that is a valid adjustment and also satisfies covariate matching constraints. We combine two different neural architectures: one based on gradient matching across domains created by subsampling a suitable anchor variable that assumes causal side information, followed by the other, a covariate matching transformation. We prove that approximately invariant representations yield approximately valid adjustment sets, which enables an interval around the true causal effect. In contrast to usual sensitivity analysis, where an unknown nuisance parameter is varied, we have a testable approximation yielding a bound on the effect estimate. We also outperform various baselines with respect to ATE and PEHE errors on causal benchmarks that include IHDP, Jobs, Cattaneo, and an image-based Crowd Management dataset.
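The abstract names covariate matching and importance weighting as the standard remedies for covariate mismatch. As a minimal illustration of the importance-weighting idea (not the paper's neural method), the sketch below uses inverse-propensity weighting on synthetic data; the data-generating process, variable names, and use of the true propensity score are all hypothetical simplifications.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 20_000

# Synthetic observational data: one observed confounder x affects both
# treatment assignment and outcome (illustrative setup only).
x = rng.normal(size=n)
p = 1 / (1 + np.exp(-x))                          # true propensity P(T=1 | x)
t = rng.binomial(1, p)
y = 2.0 * t + x + rng.normal(scale=0.1, size=n)   # true ATE = 2.0

# Naive difference in means is biased upward: treated units have larger x.
naive = y[t == 1].mean() - y[t == 0].mean()

# Inverse-propensity weighting re-balances the two groups. Here the true
# propensity is used; in practice it would be estimated, e.g. by logistic
# regression on the observed covariates.
ate_ipw = np.mean(t * y / p - (1 - t) * y / (1 - p))
```

With 20,000 samples the IPW estimate lands close to the true effect of 2.0, while the naive contrast overshoots it because of confounding.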
Related papers
- A Two-Stage Interpretable Matching Framework for Causal Inference [0.6215404942415159]
Matching in causal inference from observational data aims to construct treatment and control groups with similar distributions of covariates. We introduce a novel Two-stage Interpretable Matching (TIM) framework for transparent and interpretable covariate matching. We use these high-quality matches to estimate the conditional average treatment effects (CATEs). Our results demonstrate that TIM improves CATE estimates, increases multivariate overlap, and scales effectively to high-dimensional data.
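As background for the matching framework above, the following sketch shows the simplest form of covariate matching: one-to-one nearest-neighbour matching on a single covariate, with the treatment effect estimated from matched-pair outcome differences. This is a generic textbook estimator, not the TIM method itself, and the data are synthetic.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical data: y = 3*t + x + noise, so the treatment effect is 3
# at every value of the covariate x.
n = 2_000
x = rng.uniform(-1, 1, size=n)
t = rng.binomial(1, 0.5, size=n)
y = 3.0 * t + x + rng.normal(scale=0.05, size=n)

# One-to-one nearest-neighbour matching on x: for each treated unit,
# find the closest control unit and take the outcome difference.
treated = np.where(t == 1)[0]
control = np.where(t == 0)[0]
nn = control[np.abs(x[treated, None] - x[control][None, :]).argmin(axis=1)]
pair_effects = y[treated] - y[nn]
ate_matching = pair_effects.mean()   # ≈ 3.0
```

Because the covariate is dense on [-1, 1], match distances are tiny and the matched-pair average recovers the effect with little bias; with sparse or high-dimensional covariates, match quality (the overlap issue TIM targets) becomes the binding constraint.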
arXiv Detail & Related papers (2025-04-13T16:17:52Z) - Towards Self-Supervised Covariance Estimation in Deep Heteroscedastic Regression [102.24287051757469]
We study self-supervised covariance estimation in deep heteroscedastic regression. We derive an upper bound on the 2-Wasserstein distance between normal distributions. Experiments over a wide range of synthetic and real datasets demonstrate that the proposed 2-Wasserstein bound, coupled with pseudo-label annotations, results in computationally cheaper yet accurate deep heteroscedastic regression.
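The quantity being bounded above has a well-known closed form in the Gaussian case. As context (this is the standard formula, not the paper's bound), the univariate version reads:

```python
import math

def w2_gaussians_1d(m1, s1, m2, s2):
    """Closed-form 2-Wasserstein distance between N(m1, s1^2) and N(m2, s2^2):
    W2^2 = (m1 - m2)^2 + (s1 - s2)^2."""
    return math.sqrt((m1 - m2) ** 2 + (s1 - s2) ** 2)

# Same variance, means 3 apart: the distance is just the mean gap.
d = w2_gaussians_1d(0.0, 1.0, 3.0, 1.0)   # 3.0
```

In the multivariate case the variance term becomes the Bures distance between covariance matrices, which is what makes direct optimization expensive and motivates an upper bound.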
arXiv Detail & Related papers (2025-02-14T22:37:11Z) - Robustly estimating heterogeneity in factorial data using Rashomon Partitions [4.76518127830168]
We develop an alternative perspective, called Rashomon Partition Sets (RPSs).
RPSs incorporate all partitions that have posterior values near the maximum a posteriori partition, even if they offer substantively different explanations.
We apply our method to three empirical settings: price effects on charitable giving, chromosomal structure (telomere length) and the introduction of microfinance.
arXiv Detail & Related papers (2024-04-02T17:53:28Z) - Causal Inference from Text: Unveiling Interactions between Variables [20.677407402398405]
Existing methods only account for confounding covariates that affect both treatment and outcome.
This bias arises from insufficient consideration of non-confounding covariates.
In this work, we aim to mitigate the bias by unveiling interactions between different variables.
arXiv Detail & Related papers (2023-11-09T11:29:44Z) - TIC-TAC: A Framework for Improved Covariance Estimation in Deep Heteroscedastic Regression [109.69084997173196]
Deep heteroscedastic regression involves jointly optimizing the mean and covariance of the predicted distribution using the negative log-likelihood.
Recent works show that this may result in sub-optimal convergence due to the challenges associated with covariance estimation.
We study whether the predicted covariance truly captures the randomness of the predicted mean.
Our results show that not only does TIC accurately learn the covariance, it additionally facilitates an improved convergence of the negative log-likelihood.
arXiv Detail & Related papers (2023-10-29T09:54:03Z) - Nonparametric Identifiability of Causal Representations from Unknown Interventions [63.1354734978244]
We study causal representation learning, the task of inferring latent causal variables and their causal relations from mixtures of the variables.
Our goal is to identify both the ground truth latents and their causal graph up to a set of ambiguities which we show to be irresolvable from interventional data.
arXiv Detail & Related papers (2023-06-01T10:51:58Z) - CEnt: An Entropy-based Model-agnostic Explainability Framework to Contrast Classifiers' Decisions [2.543865489517869]
We present a novel approach to locally contrast the prediction of any classifier.
Our Contrastive Entropy-based explanation method, CEnt, approximates a model locally by a decision tree to compute entropy information of different feature splits.
CEnt is the first non-gradient-based contrastive method generating diverse counterfactuals that do not necessarily exist in the training data, while satisfying immutability (e.g., race) and semi-immutability (e.g., age can only change in an increasing direction).
arXiv Detail & Related papers (2023-01-19T08:23:34Z) - Partial Identification with Noisy Covariates: A Robust Optimization Approach [94.10051154390237]
Causal inference from observational datasets often relies on measuring and adjusting for covariates.
We show that this robust optimization approach can extend a wide range of causal adjustment methods to perform partial identification.
Across synthetic and real datasets, we find that this approach provides ATE bounds with a higher coverage probability than existing methods.
arXiv Detail & Related papers (2022-02-22T04:24:26Z) - Deconfounding Scores: Feature Representations for Causal Effect Estimation with Weak Overlap [140.98628848491146]
We introduce deconfounding scores, which induce better overlap without biasing the target of estimation.
We show that deconfounding scores satisfy a zero-covariance condition that is identifiable in observed data.
In particular, we show that this technique could be an attractive alternative to standard regularizations.
arXiv Detail & Related papers (2021-04-12T18:50:11Z) - Invariant Representation Learning for Treatment Effect Estimation [12.269209356327526]
We develop Nearly Invariant Causal Estimation (NICE)
NICE uses invariant risk minimization (IRM) [Arj19] to learn a representation of the covariates that, under some assumptions, strips out bad controls but preserves sufficient information to adjust for confounding.
We evaluate NICE on both synthetic and semi-synthetic data.
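NICE builds on invariant risk minimization, whose IRMv1 training objective penalizes representations whose optimal classifier differs across environments. The toy sketch below computes that penalty for a one-dimensional representation under squared loss; the two environments, the feature names, and the data-generating process are hypothetical and are only meant to show why a spurious feature gets penalized while an invariant one does not.

```python
import numpy as np

rng = np.random.default_rng(3)

def irm_penalty(phi, y):
    """IRMv1 penalty for a 1-D representation phi under squared loss:
    the squared gradient of the environment risk mean((w*phi - y)^2)
    with respect to a scalar dummy classifier w, evaluated at w = 1."""
    grad = 2.0 * np.mean(phi * (phi - y))
    return grad ** 2

# Two hypothetical environments: an invariant feature generates y the same
# way in both, while a spurious feature's relation to y flips sign.
n = 50_000
pen_inv, pen_spur = 0.0, 0.0
for env_sign in (+1.0, -1.0):
    x_inv = rng.normal(size=n)
    y = x_inv + rng.normal(scale=0.1, size=n)
    x_spur = env_sign * y + rng.normal(scale=0.1, size=n)
    pen_inv += irm_penalty(x_inv, y)
    pen_spur += irm_penalty(x_spur, y)

# Summed over environments, the invariant feature's penalty stays near
# zero, while the spurious feature is heavily penalised in the flipped
# environment -- the signal IRM uses to strip out "bad controls".
```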
arXiv Detail & Related papers (2020-11-24T20:53:24Z) - Evaluating Prediction-Time Batch Normalization for Robustness under Covariate Shift [81.74795324629712]
We propose a method we call prediction-time batch normalization, which significantly improves model accuracy and calibration under covariate shift.
We show that prediction-time batch normalization provides complementary benefits to existing state-of-the-art approaches for improving robustness.
The method has mixed results when used alongside pre-training, and does not seem to perform as well under more natural types of dataset shift.
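The core move in prediction-time batch normalization is simple: instead of normalizing test activations with statistics accumulated during training, recompute the statistics on the incoming test batch. The sketch below shows the effect on a single normalization layer with synthetic activations; the shift and scale of the "test" batch are hypothetical, and a real network would apply this per layer with learned affine parameters.

```python
import numpy as np

def batchnorm(x, mean, var, eps=1e-5):
    """Normalize activations x (batch, features) with the given statistics."""
    return (x - mean) / np.sqrt(var + eps)

rng = np.random.default_rng(2)
train = rng.normal(loc=0.0, scale=1.0, size=(4096, 8))
# Covariate shift at test time: activations arrive shifted and rescaled.
test = rng.normal(loc=2.0, scale=3.0, size=(512, 8))

# Standard inference: reuse statistics from the training distribution,
# leaving the shifted test activations far off-centre.
out_train_stats = batchnorm(test, train.mean(axis=0), train.var(axis=0))

# Prediction-time BN: recompute statistics on the test batch itself,
# which re-centres and re-scales the activations.
out_test_stats = batchnorm(test, test.mean(axis=0), test.var(axis=0))
```

With training statistics the normalized test activations sit around +2 standard deviations; with test-batch statistics they are re-centred near zero, which is the mechanism behind the robustness gains (and also why the method can hurt when the test batch is small or the shift is benign, consistent with the mixed results noted above).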
arXiv Detail & Related papers (2020-06-19T05:08:43Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.