A Generative Framework for Causal Estimation via Importance-Weighted Diffusion Distillation
- URL: http://arxiv.org/abs/2505.11444v1
- Date: Fri, 16 May 2025 17:00:52 GMT
- Title: A Generative Framework for Causal Estimation via Importance-Weighted Diffusion Distillation
- Authors: Xinran Song, Tianyu Chen, Mingyuan Zhou
- Abstract summary: Estimating individualized treatment effects from observational data is a central challenge in causal inference. Inverse probability weighting (IPW) is a well-established solution to this problem, but its integration into modern deep learning frameworks remains limited. We propose Importance-Weighted Diffusion Distillation (IWDD), a novel generative framework that combines the pretraining of diffusion models with importance-weighted score distillation.
- Score: 55.53426007439564
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Estimating individualized treatment effects from observational data is a central challenge in causal inference, largely due to covariate imbalance and confounding bias from non-randomized treatment assignment. While inverse probability weighting (IPW) is a well-established solution to this problem, its integration into modern deep learning frameworks remains limited. In this work, we propose Importance-Weighted Diffusion Distillation (IWDD), a novel generative framework that combines the pretraining of diffusion models with importance-weighted score distillation to enable accurate and fast causal estimation, including potential outcome prediction and treatment effect estimation. We demonstrate how IPW can be naturally incorporated into the distillation of pretrained diffusion models, and further introduce a randomization-based adjustment that eliminates the need to compute IPW explicitly, thereby simplifying computation and, more importantly, provably reducing the variance of gradient estimates. Empirical results show that IWDD achieves state-of-the-art out-of-sample prediction performance, with the highest win rates compared to other baselines, significantly improving causal estimation and supporting the development of individualized treatment strategies. We will release our PyTorch code for reproducibility and future research.
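The abstract describes the high-level recipe (pretrain a conditional diffusion model, then distill it with importance-weighted gradients) but not the exact objective. Below is a minimal, hypothetical PyTorch sketch of how per-sample IPW weights might enter a score-distillation update; all names (MLP, teacher_eps, student_gen, propensity, distillation_step) and architectures are illustrative assumptions, not identifiers from the IWDD release, and the paper's randomization-based adjustment that avoids computing IPW explicitly is deliberately not reproduced here.

```python
import torch
import torch.nn as nn

# Illustrative components only: a frozen pretrained teacher (noise predictor),
# a one-step student generator, and a propensity model for IPW weights.

class MLP(nn.Module):
    """Small fully connected network used for every component in this sketch."""
    def __init__(self, d_in, d_out, d_hidden=128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(d_in, d_hidden), nn.SiLU(),
            nn.Linear(d_hidden, d_hidden), nn.SiLU(),
            nn.Linear(d_hidden, d_out),
        )

    def forward(self, *args):
        return self.net(torch.cat(args, dim=-1))

d_x = 10                                # covariate dimension
teacher_eps = MLP(1 + d_x + 1 + 1, 1)   # inputs: noisy outcome, covariates, treatment, noise level
student_gen = MLP(d_x + 1 + 1, 1)       # inputs: covariates, treatment, latent -> outcome sample
propensity = MLP(d_x, 1)                # inputs: covariates -> logit of P(T = 1 | x)

for p in teacher_eps.parameters():      # the teacher stays frozen after pretraining
    p.requires_grad_(False)

opt = torch.optim.Adam(student_gen.parameters(), lr=1e-4)

def distillation_step(x, t, sigma_max=1.0):
    """One importance-weighted, SDS-style distillation update (simplified sketch)."""
    b = x.size(0)
    z = torch.randn(b, 1)
    y_fake = student_gen(x, t, z)            # student's one-step outcome sample
    sigma = sigma_max * torch.rand(b, 1)     # random noise level
    noise = torch.randn_like(y_fake)
    y_noisy = y_fake + sigma * noise

    with torch.no_grad():
        eps_hat = teacher_eps(y_noisy, x, t, sigma)              # teacher's noise prediction
        p1 = torch.sigmoid(propensity(x))                        # P(T = 1 | x)
        p_obs = torch.where(t > 0.5, p1, 1.0 - p1).clamp_min(1e-3)
        w = 1.0 / p_obs                                          # IPW weight per sample
        grad = eps_hat - noise                                   # teacher's correction direction

    # Surrogate loss whose gradient w.r.t. the student equals the IPW-weighted
    # average of the teacher's correction direction applied to the fake outcomes.
    loss = (w * grad * y_fake).mean()
    opt.zero_grad()
    loss.backward()
    opt.step()
    return loss.item()

# Toy call on synthetic covariates and treatments (all networks untrained here).
x = torch.randn(64, d_x)
t = torch.bernoulli(torch.sigmoid(x[:, :1]))
print(distillation_step(x, t))
```

In a real pipeline the teacher would first be pretrained on observed (outcome, covariate, treatment) triples and the propensity network fit separately (e.g. by logistic regression) before distillation begins; both are left untrained above purely to keep the sketch self-contained.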
Related papers
- DFW: A Novel Weighting Scheme for Covariate Balancing and Treatment Effect Estimation [0.0]
Estimating causal effects from observational data is challenging due to selection bias. We propose Deconfounding Factor Weighting (DFW), a novel propensity score-based approach. DFW prioritizes less confounded samples while mitigating the influence of highly confounded ones.
arXiv Detail & Related papers (2025-08-07T09:51:55Z) - Towards Understanding the Robustness of Diffusion-Based Purification: A Stochastic Perspective [65.10019978876863]
Diffusion-Based Purification (DBP) has emerged as an effective defense mechanism against adversarial attacks. In this paper, we propose that the intrinsic stochasticity of the DBP process is the primary factor driving robustness.
arXiv Detail & Related papers (2024-04-22T16:10:38Z) - Data Attribution for Diffusion Models: Timestep-induced Bias in Influence Estimation [53.27596811146316]
Diffusion models operate over a sequence of timesteps, rather than the instantaneous input-output relationships assumed in earlier attribution settings.
We present Diffusion-TracIn, which incorporates these temporal dynamics, and observe that samples' loss gradient norms are highly dependent on the timestep.
We introduce Diffusion-ReTrac as a re-normalized adaptation that enables the retrieval of training samples more targeted to the test sample of interest.
arXiv Detail & Related papers (2024-01-17T07:58:18Z) - High Precision Causal Model Evaluation with Conditional Randomization [10.23470075454725]
We introduce a novel low-variance estimator for causal error, dubbed the pairs estimator.
By applying the same IPW estimator to both the model and true experimental effects, our estimator effectively cancels out the variance due to IPW and achieves a smaller variance.
Our method offers a simple yet powerful solution for evaluating causal inference models in conditional randomization settings without complicated modification of the IPW estimator itself; a hedged sketch of this pairing idea is given after this list.
arXiv Detail & Related papers (2023-11-03T13:22:27Z) - Unmasking Bias in Diffusion Model Training [40.90066994983719]
Denoising diffusion models have emerged as a dominant approach for image generation.
They still suffer from slow convergence in training and color shift issues in sampling.
In this paper, we identify that these obstacles can be largely attributed to bias and suboptimality inherent in the default training paradigm.
arXiv Detail & Related papers (2023-10-12T16:04:41Z) - Reconstructing Graph Diffusion History from a Single Snapshot [87.20550495678907]
We propose a novel barycenter formulation for reconstructing Diffusion history from A single SnapsHot (DASH).
We prove that estimation error of diffusion parameters is unavoidable due to the NP-hardness of diffusion parameter estimation.
We also develop an effective solver named DIffusion hiTting Times with Optimal proposal (DITTO).
arXiv Detail & Related papers (2023-06-01T09:39:32Z) - Covariate-Balancing-Aware Interpretable Deep Learning models for Treatment Effect Estimation [15.465045049754336]
We propose an upper bound on the bias of average treatment effect estimation under the strong ignorability assumption.
We implement this upper bound as an objective function being minimized by leveraging a novel additive neural network architecture.
The proposed method is illustrated by re-examining the benchmark datasets for causal inference, and it outperforms the state of the art.
arXiv Detail & Related papers (2022-03-07T07:42:40Z) - When in Doubt: Neural Non-Parametric Uncertainty Quantification for Epidemic Forecasting [70.54920804222031]
Most existing forecasting models disregard uncertainty quantification, resulting in mis-calibrated predictions.
Recent works in deep neural models for uncertainty-aware time-series forecasting also have several limitations.
We model the forecasting task as a probabilistic generative process and propose a functional neural process model called EPIFNP.
arXiv Detail & Related papers (2021-06-07T18:31:47Z) - Efficient Causal Inference from Combined Observational and Interventional Data through Causal Reductions [68.6505592770171]
Unobserved confounding is one of the main challenges when estimating causal effects.
We propose a novel causal reduction method that replaces an arbitrary number of possibly high-dimensional latent confounders with a single latent confounder.
We propose a learning algorithm to estimate the parameterized reduced model jointly from observational and interventional data.
arXiv Detail & Related papers (2021-03-08T14:29:07Z) - Estimating heterogeneous survival treatment effect in observational data using machine learning [9.951103976634407]
Methods for estimating heterogeneous treatment effect in observational data have largely focused on continuous or binary outcomes.
Using flexible machine learning methods in the counterfactual framework is a promising approach to address challenges due to complex individual characteristics.
arXiv Detail & Related papers (2020-08-17T01:02:14Z) - The Counterfactual $\chi$-GAN [20.42556178617068]
Causal inference often relies on the counterfactual framework, which requires that treatment assignment is independent of the outcome.
This work proposes a generative adversarial network (GAN)-based model called the Counterfactual $\chi$-GAN (cGAN).
arXiv Detail & Related papers (2020-01-09T17:23:13Z)
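Referring back to the pairs-estimator entry above, the following is a minimal NumPy sketch of its variance-cancellation idea: the same IPW weights are applied both to the model's predicted potential outcomes and to the observed outcomes, so the weight-induced noise largely cancels in their difference. The synthetic data-generating process, the stand-in model predictions, and all variable names are assumptions for illustration, not details from that paper.

```python
import numpy as np

rng = np.random.default_rng(0)

def causal_error_estimates(n=2_000):
    """Compare a paired IPW error estimate with a naive one on synthetic data."""
    x = rng.normal(size=(n, 3))
    e = 1.0 / (1.0 + np.exp(-x[:, 0]))              # known propensity P(T = 1 | x)
    t = rng.binomial(1, e)
    y1 = x.sum(axis=1) + 1.0 + rng.normal(size=n)   # potential outcome under treatment
    y0 = x.sum(axis=1) + rng.normal(size=n)         # potential outcome under control
    y = np.where(t == 1, y1, y0)                    # observed (factual) outcome

    # Stand-in for a fitted causal model's predicted potential outcomes.
    mu1_hat = x.sum(axis=1) + 0.8
    mu0_hat = x.sum(axis=1) - 0.1

    w1, w0 = t / e, (1 - t) / (1 - e)               # IPW weights

    ipw_true = np.mean(w1 * y - w0 * y)                 # IPW estimate of the true effect
    ipw_model = np.mean(w1 * mu1_hat - w0 * mu0_hat)    # same weights on model predictions

    paired = ipw_model - ipw_true                   # shared weights -> much of the noise cancels
    naive = np.mean(mu1_hat - mu0_hat) - ipw_true   # weights enter only one term
    return paired, naive

# Repeat the experiment to see the variance reduction from pairing.
errors = np.array([causal_error_estimates() for _ in range(200)])
print("std of paired error estimate:", errors[:, 0].std())
print("std of naive  error estimate:", errors[:, 1].std())
```

Running the loop shows the paired estimate of the model's causal error fluctuating far less across replications than the naive one, which is the qualitative behavior that entry describes.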
This list is automatically generated from the titles and abstracts of the papers on this site.