Incorporating Interventional Independence Improves Robustness against Interventional Distribution Shift
- URL: http://arxiv.org/abs/2507.05412v2
- Date: Mon, 14 Jul 2025 20:01:54 GMT
- Title: Incorporating Interventional Independence Improves Robustness against Interventional Distribution Shift
- Authors: Gautam Sreekumar, Vishnu Naresh Boddeti
- Abstract summary: Existing approaches treat interventional data like observational data, even when the underlying causal model is known. We propose RepLIn, a training algorithm that explicitly enforces the statistical independence induced by interventions.
- Score: 14.497130575562698
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We consider the problem of learning robust discriminative representations of causally-related latent variables. In addition to observational data, the training dataset also includes interventional data obtained through targeted interventions on some of these latent variables, with the goal of learning representations robust against the resulting interventional distribution shifts. Existing approaches treat interventional data like observational data, even when the underlying causal model is known, and ignore the independence relations that arise from these interventions. Since these approaches do not fully exploit the causal relational information resulting from interventions, they learn representations that produce large disparities in predictive performance on observational and interventional data, a gap that worsens when the number of interventional training samples is limited. In this paper, (1) we first identify a strong correlation between this performance disparity and adherence of the representations to the independence conditions induced by the interventional causal model. (2) For linear models, we derive sufficient conditions on the proportion of interventional data in the training dataset under which enforcing interventional independence between representations corresponding to the intervened node and its non-descendants lowers the error on interventional data. Combining these insights, (3) we propose RepLIn, a training algorithm that explicitly enforces this statistical independence during interventions. We demonstrate the utility of RepLIn on a synthetic dataset and on real image and text datasets for facial attribute classification and toxicity detection, respectively. Our experiments show that RepLIn scales with the number of nodes in the causal graph and improves the robustness of representations against interventional distribution shifts of both continuous and discrete latent variables.
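The abstract does not spell out how RepLIn enforces this independence, so the following is a minimal sketch under stated assumptions: a kernel-based (HSIC) independence penalty between the representation of the intervened node and the representations of its non-descendants, applied only to interventional samples. The names (`rbf_gram`, `hsic`, `replin_style_loss`) and the choice of HSIC are illustrative assumptions, not the authors' implementation.

```python
# Illustrative sketch (not the authors' code): add an independence penalty
# between the representation of the intervened node and that of its
# non-descendants, computed only on interventional samples. HSIC with RBF
# kernels is assumed here as the independence measure.
import torch

def rbf_gram(x: torch.Tensor, sigma: float = 1.0) -> torch.Tensor:
    """Gram matrix of an RBF kernel over a batch of vectors of shape (n, d)."""
    sq_dists = torch.cdist(x, x).pow(2)
    return torch.exp(-sq_dists / (2.0 * sigma ** 2))

def hsic(x: torch.Tensor, y: torch.Tensor) -> torch.Tensor:
    """Biased HSIC estimator; approaches zero when x and y are independent."""
    n = x.shape[0]
    k, l = rbf_gram(x), rbf_gram(y)
    h = torch.eye(n, device=x.device) - torch.full((n, n), 1.0 / n, device=x.device)
    return torch.trace(k @ h @ l @ h) / (n - 1) ** 2

def replin_style_loss(task_loss: torch.Tensor,
                      z_intervened: torch.Tensor,
                      z_nondesc: torch.Tensor,
                      is_interventional: torch.Tensor,
                      lam: float = 1.0) -> torch.Tensor:
    """Task loss plus an independence penalty on interventional samples only."""
    mask = is_interventional.bool()
    if mask.sum() >= 2:  # HSIC needs at least two samples to be defined
        penalty = hsic(z_intervened[mask], z_nondesc[mask])
    else:
        penalty = torch.zeros((), device=z_intervened.device)
    return task_loss + lam * penalty
```

Here `lam` trades off predictive accuracy against adherence to the interventional independence condition; the paper's linear analysis (point 2 of the abstract) ties the benefit of such a penalty to the proportion of interventional data, so in practice `lam` would likely be tuned with that proportion in mind.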
Related papers
- Learning Joint Interventional Effects from Single-Variable Interventions in Additive Models [49.567092222782435]
We show how to learn joint interventional effects using only observational data and single-variable interventions.
We propose a practical estimator that decomposes the causal effect into confounded and unconfounded contributions for each intervention variable.
arXiv Detail & Related papers (2025-06-05T12:20:50Z) - Causally Fair Node Classification on Non-IID Graph Data [9.363036392218435]
This paper addresses the prevalent challenge in fairness-aware ML algorithms.
We tackle the overlooked domain of non-IID, graph-based settings.
We develop the Message Passing Variational Autoencoder for Causal Inference.
arXiv Detail & Related papers (2025-05-03T02:05:51Z) - Learning Mixtures of Unknown Causal Interventions [14.788930098406027]
We consider the challenge of disentangling mixed interventional and observational data within Structural Equation Models (SEMs).
We demonstrate that conducting interventions, whether do or soft, yields distributions with sufficient diversity and properties to efficiently recover each component within the mixture.
As a result, the causal graph can be identified up to its interventional Markov Equivalence Class, similar to scenarios where no noise influences the generation of interventional data.
arXiv Detail & Related papers (2024-10-31T21:25:11Z) - Identifiable Latent Neural Causal Models [82.14087963690561]
Causal representation learning seeks to uncover latent, high-level causal representations from low-level observed data.
We determine the types of distribution shifts that do contribute to the identifiability of causal representations.
We translate our findings into a practical algorithm, allowing for the acquisition of reliable latent causal representations.
arXiv Detail & Related papers (2024-03-23T04:13:55Z) - Nonparametric Identifiability of Causal Representations from Unknown Interventions [63.1354734978244]
We study causal representation learning, the task of inferring latent causal variables and their causal relations from mixtures of the variables.
Our goal is to identify both the ground truth latents and their causal graph up to a set of ambiguities which we show to be irresolvable from interventional data.
arXiv Detail & Related papers (2023-06-01T10:51:58Z) - Interventional Causal Representation Learning [75.18055152115586]
Causal representation learning seeks to extract high-level latent factors from low-level sensory data.
Can interventional data facilitate causal representation learning?
We show that interventional data often carries geometric signatures of the latent factors' support.
arXiv Detail & Related papers (2022-09-24T04:59:03Z) - Differentiable Causal Discovery Under Latent Interventions [3.867363075280544]
Recent work has shown promising results in causal discovery by leveraging interventional data with gradient-based methods, even when the intervened variables are unknown.
We envision a scenario with an extensive dataset sampled from multiple intervention distributions and one observation distribution, but where we do not know which distribution each sample came from or how the intervention affected the system.
We propose a method based on neural networks and variational inference that addresses this scenario by framing it as learning a shared causal graph among an infinite mixture.
arXiv Detail & Related papers (2022-03-04T14:21:28Z) - Efficient Causal Inference from Combined Observational and Interventional Data through Causal Reductions [68.6505592770171]
Unobserved confounding is one of the main challenges when estimating causal effects.
We propose a novel causal reduction method that replaces an arbitrary number of possibly high-dimensional latent confounders with a single latent confounder.
We propose a learning algorithm to estimate the parameterized reduced model jointly from observational and interventional data.
arXiv Detail & Related papers (2021-03-08T14:29:07Z) - On Disentangled Representations Learned From Correlated Data [59.41587388303554]
We bridge the gap to real-world scenarios by analyzing the behavior of the most prominent disentanglement approaches on correlated data.
We show that systematically induced correlations in the dataset are being learned and reflected in the latent representations.
We also demonstrate how to resolve these latent correlations, either using weak supervision during training or by post-hoc correcting a pre-trained model with a small number of labels.
arXiv Detail & Related papers (2020-06-14T12:47:34Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences of its use.