ACTIVA: Amortized Causal Effect Estimation via Transformer-based Variational Autoencoder
- URL: http://arxiv.org/abs/2503.01290v2
- Date: Fri, 08 Aug 2025 10:12:06 GMT
- Title: ACTIVA: Amortized Causal Effect Estimation via Transformer-based Variational Autoencoder
- Authors: Andreas Sauter, Saber Salehkaleybar, Aske Plaat, Erman Acar,
- Abstract summary: We propose ACTIVA, a conditional variational autoencoder architecture for amortized causal inference.<n>ACTIVA learns a latent representation conditioned on observational inputs and intervention queries, enabling zero-shot inference.<n>We provide theoretical insights showing that ACTIVA predicts interventional distributions as mixtures over observationally equivalent causal models.
- Score: 7.987204219322316
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Predicting the distribution of outcomes under hypothetical interventions is crucial across healthcare, economics, and policy-making. However, existing methods often require restrictive assumptions, and are typically limited by the lack of amortization across problem instances. We propose ACTIVA, a transformer-based conditional variational autoencoder (VAE) architecture for amortized causal inference, which estimates interventional distributions directly from observational data without. ACTIVA learns a latent representation conditioned on observational inputs and intervention queries, enabling zero-shot inference by amortizing causal knowledge from diverse training scenarios. We provide theoretical insights showing that ACTIVA predicts interventional distributions as mixtures over observationally equivalent causal models. Empirical evaluations on synthetic and semi-synthetic datasets confirm the effectiveness of our amortized approach and highlight promising directions for future real-world applications.
Related papers
- Causally Fair Node Classification on Non-IID Graph Data [9.363036392218435]
This paper addresses the prevalent challenge in fairness-aware ML algorithms.<n>We tackle the overlooked domain of non-IID, graph-based settings.<n>We develop the Message Passing Variational Autoencoder for Causal Inference.
arXiv Detail & Related papers (2025-05-03T02:05:51Z) - Generative Intervention Models for Causal Perturbation Modeling [80.72074987374141]
In many applications, it is a priori unknown which mechanisms of a system are modified by an external perturbation.
We propose a generative intervention model (GIM) that learns to map these perturbation features to distributions over atomic interventions.
arXiv Detail & Related papers (2024-11-21T10:37:57Z) - Exogenous Matching: Learning Good Proposals for Tractable Counterfactual Estimation [1.9662978733004601]
We propose an importance sampling method for tractable and efficient estimation of counterfactual expressions.<n>By minimizing a common upper bound of counterfactual estimators, we transform the variance minimization problem into a conditional distribution learning problem.<n>We validate the theoretical results through experiments under various types and settings of Structural Causal Models (SCMs) and demonstrate the outperformance on counterfactual estimation tasks.
arXiv Detail & Related papers (2024-10-17T03:08:28Z) - Intervention-Aware Forecasting: Breaking Historical Limits from a System Perspective [13.150057548030558]
We propose an Intervention-Aware Time Series Forecasting (IATSF) framework explicitly designed to incorporate external interventions.<n>We propose a leak-free benchmark composed of temporally synchronized textual intervention data across synthetic and real-world scenarios.<n>We show that FIATS surpasses state-of-the-art methods, highlighting that forecasting improvements stem explicitly from modeling external interventions.
arXiv Detail & Related papers (2024-05-22T10:45:50Z) - Sample, estimate, aggregate: A recipe for causal discovery foundation models [28.116832159265964]
We train a supervised model that learns to predict a larger causal graph from the outputs of classical causal discovery algorithms run over subsets of variables.
Our approach is enabled by the observation that typical errors in the outputs of classical methods remain comparable across datasets.
Experiments on real and synthetic data demonstrate that this model maintains high accuracy in the face of misspecification or distribution shift.
arXiv Detail & Related papers (2024-02-02T21:57:58Z) - Flexible Nonparametric Inference for Causal Effects under the Front-Door Model [2.6900047294457683]
We develop novel one-step and targeted minimum loss-based estimators for both the average treatment effect and the average treatment effect on the treated under front-door assumptions.<n>Our estimators are built on multiple parameterizations of the observed data distribution, including approaches that avoid mediator density entirely.<n>We show how these constraints can be leveraged to improve the efficiency of causal effect estimators.
arXiv Detail & Related papers (2023-12-15T22:04:53Z) - Towards Characterizing Domain Counterfactuals For Invertible Latent Causal Models [15.817239008727789]
In this work, we analyze a specific type of causal query called domain counterfactuals, which hypothesizes what a sample would have looked like if it had been generated in a different domain.
We show that recovering the latent Structural Causal Model (SCM) is unnecessary for estimating domain counterfactuals.
We also develop a theoretically grounded practical algorithm that simplifies the modeling process to generative model estimation.
arXiv Detail & Related papers (2023-06-20T04:19:06Z) - Advancing Counterfactual Inference through Nonlinear Quantile Regression [77.28323341329461]
We propose a framework for efficient and effective counterfactual inference implemented with neural networks.
The proposed approach enhances the capacity to generalize estimated counterfactual outcomes to unseen data.
Empirical results conducted on multiple datasets offer compelling support for our theoretical assertions.
arXiv Detail & Related papers (2023-06-09T08:30:51Z) - Nonparametric Identifiability of Causal Representations from Unknown
Interventions [63.1354734978244]
We study causal representation learning, the task of inferring latent causal variables and their causal relations from mixtures of the variables.
Our goal is to identify both the ground truth latents and their causal graph up to a set of ambiguities which we show to be irresolvable from interventional data.
arXiv Detail & Related papers (2023-06-01T10:51:58Z) - CausalBench: A Large-scale Benchmark for Network Inference from
Single-cell Perturbation Data [61.088705993848606]
We introduce CausalBench, a benchmark suite for evaluating causal inference methods on real-world interventional data.
CaulBench incorporates biologically-motivated performance metrics, including new distribution-based interventional metrics.
arXiv Detail & Related papers (2022-10-31T13:04:07Z) - Data-Driven Influence Functions for Optimization-Based Causal Inference [105.5385525290466]
We study a constructive algorithm that approximates Gateaux derivatives for statistical functionals by finite differencing.
We study the case where probability distributions are not known a priori but need to be estimated from data.
arXiv Detail & Related papers (2022-08-29T16:16:22Z) - Neuro-Symbolic Entropy Regularization [78.16196949641079]
In structured prediction, the goal is to jointly predict many output variables that together encode a structured object.
One approach -- entropy regularization -- posits that decision boundaries should lie in low-probability regions.
We propose a loss, neuro-symbolic entropy regularization, that encourages the model to confidently predict a valid object.
arXiv Detail & Related papers (2022-01-25T06:23:10Z) - Estimation of Bivariate Structural Causal Models by Variational Gaussian
Process Regression Under Likelihoods Parametrised by Normalising Flows [74.85071867225533]
Causal mechanisms can be described by structural causal models.
One major drawback of state-of-the-art artificial intelligence is its lack of explainability.
arXiv Detail & Related papers (2021-09-06T14:52:58Z) - Which Invariance Should We Transfer? A Causal Minimax Learning Approach [18.71316951734806]
We present a comprehensive minimax analysis from a causal perspective.
We propose an efficient algorithm to search for the subset with minimal worst-case risk.
The effectiveness and efficiency of our methods are demonstrated on synthetic data and the diagnosis of Alzheimer's disease.
arXiv Detail & Related papers (2021-07-05T09:07:29Z) - Efficient Causal Inference from Combined Observational and
Interventional Data through Causal Reductions [68.6505592770171]
Unobserved confounding is one of the main challenges when estimating causal effects.
We propose a novel causal reduction method that replaces an arbitrary number of possibly high-dimensional latent confounders.
We propose a learning algorithm to estimate the parameterized reduced model jointly from observational and interventional data.
arXiv Detail & Related papers (2021-03-08T14:29:07Z) - Identification of Latent Variables From Graphical Model Residuals [0.0]
We present a novel method to control for the latent space when estimating a DAG by iteratively deriving proxies for the latent space from the residuals of the inferred model.
We show that any improvement of prediction of an outcome is intrinsically capped and cannot rise beyond a certain limit as compared to the confounded model.
arXiv Detail & Related papers (2021-01-07T02:28:49Z) - Causal Autoregressive Flows [4.731404257629232]
We highlight an intrinsic correspondence between a simple family of autoregressive normalizing flows and identifiable causal models.
We exploit the fact that autoregressive flow architectures define an ordering over variables, analogous to a causal ordering, to show that they are well-suited to performing a range of causal inference tasks.
arXiv Detail & Related papers (2020-11-04T13:17:35Z) - Latent Causal Invariant Model [128.7508609492542]
Current supervised learning can learn spurious correlation during the data-fitting process.
We propose a Latent Causal Invariance Model (LaCIM) which pursues causal prediction.
arXiv Detail & Related papers (2020-11-04T10:00:27Z) - Autoregressive flow-based causal discovery and inference [4.83420384410068]
Autoregressive flow models are well-suited to performing a range of causal inference tasks.
We exploit the fact that autoregressive architectures define an ordering over variables, analogous to a causal ordering.
We present examples over synthetic data where autoregressive flows, when trained under the correct causal ordering, are able to make accurate interventional and counterfactual predictions.
arXiv Detail & Related papers (2020-07-18T10:02:59Z) - On Disentangled Representations Learned From Correlated Data [59.41587388303554]
We bridge the gap to real-world scenarios by analyzing the behavior of the most prominent disentanglement approaches on correlated data.
We show that systematically induced correlations in the dataset are being learned and reflected in the latent representations.
We also demonstrate how to resolve these latent correlations, either using weak supervision during training or by post-hoc correcting a pre-trained model with a small number of labels.
arXiv Detail & Related papers (2020-06-14T12:47:34Z) - Estimating the Effects of Continuous-valued Interventions using
Generative Adversarial Networks [103.14809802212535]
We build on the generative adversarial networks (GANs) framework to address the problem of estimating the effect of continuous-valued interventions.
Our model, SCIGAN, is flexible and capable of simultaneously estimating counterfactual outcomes for several different continuous interventions.
To address the challenges presented by shifting to continuous interventions, we propose a novel architecture for our discriminator.
arXiv Detail & Related papers (2020-02-27T18:46:21Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.