Causal Effect Estimation with Latent Textual Treatments
- URL: http://arxiv.org/abs/2602.15730v1
- Date: Tue, 17 Feb 2026 17:06:12 GMT
- Title: Causal Effect Estimation with Latent Textual Treatments
- Authors: Omri Feldman, Amar Venugopal, Jann Spiess, Amir Feder
- Abstract summary: We present an end-to-end pipeline for the generation and causal estimation of latent textual interventions. Our work first performs hypothesis generation and steering via sparse autoencoders (SAEs), followed by robust causal estimation.
- Score: 9.451877252547197
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Understanding the causal effects of text on downstream outcomes is a central task in many applications. Estimating such effects requires researchers to run controlled experiments that systematically vary textual features. While large language models (LLMs) hold promise for generating text, producing and evaluating controlled variation requires more careful attention. In this paper, we present an end-to-end pipeline for the generation and causal estimation of latent textual interventions. Our work first performs hypothesis generation and steering via sparse autoencoders (SAEs), followed by robust causal estimation. Our pipeline addresses both computational and statistical challenges in text-as-treatment experiments. We demonstrate that naive estimation of causal effects suffers from significant bias as text inherently conflates treatment and covariate information. We describe the estimation bias induced in this setting and propose a solution based on covariate residualization. Our empirical results show that our pipeline effectively induces variation in target features and mitigates estimation error, providing a robust foundation for causal effect estimation in text-as-treatment settings.
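The bias from conflating treatment and covariate information, and the residualization remedy the abstract mentions, can be illustrated with a small simulation. This is only a minimal sketch of generic partialling-out (Frisch-Waugh-Lovell style) on synthetic data, not the paper's estimator: it assumes the covariates are observed numeric vectors and the outcome is linear, and the names `residualize`, `slope`, and `simulate`, along with all coefficients, are hypothetical choices made for illustration.

```python
import numpy as np


def residualize(v, X):
    """Return v minus its least-squares projection on X (intercept included)."""
    Xc = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(Xc, v, rcond=None)
    return v - Xc @ coef


def slope(y, t):
    """OLS slope from regressing y on t with an intercept."""
    t_c = t - t.mean()
    return float(t_c @ (y - y.mean()) / (t_c @ t_c))


def simulate(n=5000, tau=1.0, seed=0):
    """Synthetic data (assumed setup): treatment t is correlated with covariates X."""
    rng = np.random.default_rng(seed)
    X = rng.normal(size=(n, 3))
    # The treatment feature shares information with the covariates.
    t = X @ np.array([0.8, -0.5, 0.3]) + rng.normal(size=n)
    # The outcome depends on the treatment (true effect tau) and on the covariates.
    y = tau * t + X @ np.array([1.5, -1.0, 2.0]) + rng.normal(size=n)
    return y, t, X


y, t, X = simulate()

# Naive estimate: ignores that t and X are entangled, so it absorbs X's effect.
naive = slope(y, t)

# Residualized estimate: remove the part of y and t explained by X, then regress.
adjusted = slope(residualize(y, X), residualize(t, X))

print(f"naive estimate:        {naive:.3f}")
print(f"residualized estimate: {adjusted:.3f}")
```

In this toy setup the true effect is 1.0. The naive slope is pulled away from it because the treatment carries covariate information, while the residualized slope should land close to it. In a real text-as-treatment setting the covariates would themselves have to be extracted from text, which this sketch does not attempt.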
Related papers
- Text Rationalization for Robust Causal Effect Estimation [4.125187280299246]
High-dimensional text poses unique challenges for causal identification and estimation. Redundant or spurious textual features inflate dimensionality, producing extreme propensity scores, unstable weights, and inflated variance in effect estimates. We address these challenges with Confounding-Aware Token Rationalization (CATR), a framework that selects a sparse necessary subset of tokens.
arXiv Detail & Related papers (2025-12-05T02:18:45Z) - CausalPFN: Amortized Causal Effect Estimation via In-Context Learning [19.54034651361769]
CausalPFN infers causal effects for new observational datasets out of the box. Our approach achieves superior average performance on heterogeneous and average treatment effect estimation benchmarks. CausalPFN provides calibrated uncertainty estimates to support reliable decision-making based on Bayesian principles.
arXiv Detail & Related papers (2025-06-09T16:31:06Z) - Do-PFN: In-Context Learning for Causal Effect Estimation [75.62771416172109]
We show that prior-data fitted networks (PFNs) can be pre-trained on synthetic data to predict outcomes. Our approach allows for the accurate estimation of causal effects without knowledge of the underlying causal graph.
arXiv Detail & Related papers (2025-06-06T12:43:57Z) - Data Fusion for Partial Identification of Causal Effects [62.56890808004615]
We propose a novel partial identification framework that enables researchers to answer key questions: Is the causal effect positive or negative? How severe must assumption violations be to overturn this conclusion? We apply our framework to the Project STAR study, which investigates the effect of classroom size on students' third-grade standardized test performance.
arXiv Detail & Related papers (2025-05-30T07:13:01Z) - Black Box Causal Inference: Effect Estimation via Meta Prediction [56.277798874118425]
We frame causal inference as a dataset-level prediction problem, offloading algorithm design to the learning process. Our approach, called black box causal inference (BBCI), builds estimators in a black-box manner by learning to predict causal effects from sampled dataset-effect pairs. We demonstrate accurate estimation of average treatment effects (ATEs) and conditional average treatment effects (CATEs) with BBCI across several causal inference problems.
arXiv Detail & Related papers (2025-03-07T23:43:19Z) - End-To-End Causal Effect Estimation from Unstructured Natural Language Data [23.484226791467478]
We show how large, diverse observational text data can be mined with large language models (LLMs) to produce inexpensive causal effect estimates.
We introduce NATURAL, a novel family of causal effect estimators built with LLMs that operate over datasets of unstructured text.
Our results suggest that unstructured text data is a rich source of causal effect information, and NATURAL is a first step towards an automated pipeline to tap this resource.
arXiv Detail & Related papers (2024-07-09T16:38:48Z) - Benchmarking Bayesian Causal Discovery Methods for Downstream Treatment Effect Estimation [137.3520153445413]
A notable gap exists in the evaluation of causal discovery methods, where insufficient emphasis is placed on downstream inference.
We evaluate seven established baseline causal discovery methods including a newly proposed method based on GFlowNets.
The results of our study demonstrate that some of the algorithms studied are able to effectively capture a wide range of useful and diverse ATE modes.
arXiv Detail & Related papers (2023-07-11T02:58:10Z) - Interpretable Deep Causal Learning for Moderation Effects [0.0]
We address the problem of interpretability and targeted regularization in causal machine learning models.
We propose a novel deep counterfactual learning architecture for estimating individual treatment effects.
arXiv Detail & Related papers (2022-06-21T11:21:09Z) - SurvITE: Learning Heterogeneous Treatment Effects from Time-to-Event Data [83.50281440043241]
We study the problem of inferring heterogeneous treatment effects from time-to-event data.
We propose a novel deep learning method for treatment-specific hazard estimation based on balancing representations.
arXiv Detail & Related papers (2021-10-26T20:13:17Z) - Causal Effect Estimation using Variational Information Bottleneck [19.6760527269791]
Causal inference aims to estimate the causal effect in a causal relationship when an intervention is applied.
We propose a method to estimate causal effects using a Variational Information Bottleneck (CEVIB).
arXiv Detail & Related papers (2021-10-26T13:46:12Z) - Efficient Causal Inference from Combined Observational and Interventional Data through Causal Reductions [68.6505592770171]
Unobserved confounding is one of the main challenges when estimating causal effects.
We propose a novel causal reduction method that replaces an arbitrary number of possibly high-dimensional latent confounders.
We propose a learning algorithm to estimate the parameterized reduced model jointly from observational and interventional data.
arXiv Detail & Related papers (2021-03-08T14:29:07Z)
This list is automatically generated from the titles and abstracts of the papers on this site.