Related papers: An Analysis of Causal Effect Estimation using Outcome Invariant Data Augmentation

URL: http://arxiv.org/abs/2510.25128v1
Date: Wed, 29 Oct 2025 03:17:19 GMT
Title: An Analysis of Causal Effect Estimation using Outcome Invariant Data Augmentation
Authors: Uzair Akbar, Niki Kilbertus, Hao Shen, Krikamol Muandet, Bo Dai
Abstract summary: The technique of data augmentation (DA) is often used in machine learning for regularization purposes. We present a unifying framework with topics in causal inference to make a case for the use of DA beyond just the i.i.d. setting.
Score: 21.69577679759595
License: http://creativecommons.org/licenses/by/4.0/
Abstract: The technique of data augmentation (DA) is often used in machine learning for regularization purposes to better generalize under i.i.d. settings. In this work, we present a unifying framework with topics in causal inference to make a case for the use of DA beyond just the i.i.d. setting, but for generalization across interventions as well. Specifically, we argue that when the outcome generating mechanism is invariant to our choice of DA, then such augmentations can effectively be thought of as interventions on the treatment generating mechanism itself. This can potentially help to reduce bias in causal effect estimation arising from hidden confounders. In the presence of such unobserved confounding we typically make use of instrumental variables (IVs) -- sources of treatment randomization that are conditionally independent of the outcome. However, IVs may not be as readily available as DA for many applications, which is the main motivation behind this work. By appropriately regularizing IV-based estimators, we introduce the concept of IV-like (IVL) regression for mitigating confounding bias and improving predictive performance across interventions even when certain IV properties are relaxed. Finally, we cast parameterized DA as an IVL regression problem and show that, when used in composition, it can simulate a worst-case application of such DA, further improving performance on causal estimation and generalization tasks beyond what simple DA may offer. This is shown both theoretically for the population case and via simulation experiments for the finite sample case using a simple linear example. We also present real data experiments to support our case.
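
Illustration: the abstract contrasts ordinary regression, which is biased by hidden confounders, with instrumental-variable estimation. The sketch below is a generic textbook example of that contrast, not the paper's IVL or DA-based method. It assumes a one-dimensional linear model, and every name and coefficient in it (simulate, ols_slope, iv_slope, the true effect of 2.0) is hypothetical and chosen only for illustration.

```python
import numpy as np


def simulate(n=20000, seed=0):
    """Draw data from an assumed linear model with a hidden confounder.

    z: instrument (affects y only through x)
    c: unobserved confounder (affects both x and y)
    The true causal effect of x on y is set to 2.0 (illustrative value).
    """
    rng = np.random.default_rng(seed)
    z = rng.normal(size=n)
    c = rng.normal(size=n)
    x = z + c + 0.5 * rng.normal(size=n)
    y = 2.0 * x + 3.0 * c + 0.5 * rng.normal(size=n)
    return z, x, y


def ols_slope(x, y):
    """Least-squares slope of y on x (no intercept; variables are zero-mean)."""
    return float(np.dot(x, y) / np.dot(x, x))


def iv_slope(z, x, y):
    """Single-instrument IV (Wald) estimate: cov(z, y) / cov(z, x)."""
    return float(np.dot(z, y) / np.dot(z, x))


if __name__ == "__main__":
    z, x, y = simulate()
    print("OLS estimate:", ols_slope(x, y))    # biased upward by the confounder
    print("IV estimate:", iv_slope(z, x, y))   # close to the true effect
```

Under these assumptions the least-squares slope absorbs the confounder's contribution and overshoots the true effect, while the instrument-based ratio recovers it up to sampling noise. The abstract's argument is that outcome-invariant DA may play a similar role when no such instrument is available.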

Related papers

Flow IV: Counterfactual Inference In Nonseparable Outcome Models Using Instrumental Variables [2.3213238782019316]
We show that under standard IV assumptions, along with the assumptions that latent noises in treatment and outcome are strictly monotonic and jointly Gaussian, the treatment-outcome relationship becomes uniquely identifiable from observed data. This enables counterfactual inference even in nonseparable models. We implement our approach by training a normalizing flow to maximize the likelihood of the observed data, demonstrating accurate recovery of the underlying outcome function.
arXiv Detail & Related papers (2025-08-02T11:24:03Z)
A Generative Framework for Causal Estimation via Importance-Weighted Diffusion Distillation [55.53426007439564]
Estimating individualized treatment effects from observational data is a central challenge in causal inference. Inverse probability weighting (IPW) is a well-established solution to this problem, but its integration into modern deep learning frameworks remains limited. We propose Importance-Weighted Diffusion Distillation (IWDD), a novel generative framework that combines the pretraining of diffusion models with importance-weighted score distillation.
arXiv Detail & Related papers (2025-05-16T17:00:52Z)
Sequential Treatment Effect Estimation with Unmeasured Confounders [24.064743106746885]
This paper studies the cumulative causal effects of sequential treatments in the presence of unmeasured confounders. We propose a novel Decomposing Sequential Instrumental Variable framework for CounterFactual Regression.
arXiv Detail & Related papers (2025-05-14T03:42:43Z)
Distributional Instrumental Variable Method [4.34680331569334]
The aim of this work is to estimate the entire interventional distribution. We propose a method called Distributional Instrumental Variable (DIV), which uses generative modelling in a nonlinear IV setting.
arXiv Detail & Related papers (2025-02-11T15:33:06Z)
Estimating Heterogeneous Treatment Effects by Combining Weak Instruments and Observational Data [44.31792000298105]
Accurately predicting conditional average treatment effects (CATEs) is crucial in personalized medicine and digital platform analytics. We develop a novel approach to combine IV and observational data to enable reliable CATE estimation.
arXiv Detail & Related papers (2024-06-10T16:40:55Z)
Geometry-Aware Instrumental Variable Regression [56.16884466478886]
We propose a transport-based IV estimator that takes into account the geometry of the data manifold through data-derivative information. We provide a simple plug-and-play implementation of our method that performs on par with related estimators in standard settings.
arXiv Detail & Related papers (2024-05-19T17:49:33Z)
Regularized DeepIV with Model Selection [72.17508967124081]
Regularized DeepIV (RDIV) regression can converge to the least-norm IV solution. Our method matches the current state-of-the-art convergence rate.
arXiv Detail & Related papers (2024-03-07T05:38:56Z)
Approximating Counterfactual Bounds while Fusing Observational, Biased and Randomised Data Sources [64.96984404868411]
We address the problem of integrating data from multiple, possibly biased, observational and interventional studies. We show that the likelihood of the available data has no local maxima. We then show how the same approach can address the general case of multiple datasets.
arXiv Detail & Related papers (2023-07-31T11:28:24Z)
Advancing Counterfactual Inference through Nonlinear Quantile Regression [77.28323341329461]
We propose a framework for efficient and effective counterfactual inference implemented with neural networks. The proposed approach enhances the capacity to generalize estimated counterfactual outcomes to unseen data. Empirical results conducted on multiple datasets offer compelling support for our theoretical assertions.
arXiv Detail & Related papers (2023-06-09T08:30:51Z)
Estimating individual treatment effects under unobserved confounding using binary instruments [21.563820572163337]
Estimating individual treatment effects (ITEs) from observational data is relevant in many fields such as personalized medicine. We propose a novel, multiply robust machine learning framework, called MRIV, for estimating ITEs using binary IVs.
arXiv Detail & Related papers (2022-08-17T21:25:09Z)
Identifying Causal Effects using Instrumental Time Series: Nuisance IV and Correcting for the Past [12.49477539101379]
We consider IV regression in time series models, such as vector auto-regressive (VAR) processes. Direct applications of i.i.d. techniques are generally inconsistent as they do not correctly adjust for dependencies in the past. We provide methods, prove their consistency, and show how the inferred causal effect can be used for distribution generalization.
arXiv Detail & Related papers (2022-03-11T16:29:48Z)
Efficient Causal Inference from Combined Observational and Interventional Data through Causal Reductions [68.6505592770171]
Unobserved confounding is one of the main challenges when estimating causal effects. We propose a novel causal reduction method that replaces an arbitrary number of possibly high-dimensional latent confounders. We propose a learning algorithm to estimate the parameterized reduced model jointly from observational and interventional data.
arXiv Detail & Related papers (2021-03-08T14:29:07Z)
Instrumental Variable Value Iteration for Causal Offline Reinforcement Learning [107.70165026669308]
In offline reinforcement learning (RL) an optimal policy is learned solely from a priori collected observational data. We study a confounded Markov decision process where the transition dynamics admit an additive nonlinear functional form. We propose a provably efficient IV-aided Value Iteration (IVVI) algorithm based on a primal-dual reformulation of the conditional moment restriction.
arXiv Detail & Related papers (2021-02-19T13:01:40Z)

This list is automatically generated from the titles and abstracts of the papers on this site.