Related papers: Language Models as Causal Effect Generators

URL: http://arxiv.org/abs/2411.08019v2
Date: Mon, 22 Sep 2025 21:11:32 GMT
Title: Language Models as Causal Effect Generators
Authors: Lucius E. J. Bynum, Kyunghyun Cho
Abstract summary: We present sequence-driven structural causal models (SD-SCMs). An SD-SCM enables sampling from observational, interventional, and counterfactual distributions according to the desired causal structure. We propose a new type of benchmark for causal inference methods, generating individual-level counterfactual data to test treatment effect estimation.
Score: 48.696932388555894
License: http://creativecommons.org/licenses/by-sa/4.0/
Abstract: In this work, we present sequence-driven structural causal models (SD-SCMs), a framework for specifying causal models with user-defined structure and language-model-defined mechanisms. We characterize how an SD-SCM enables sampling from observational, interventional, and counterfactual distributions according to the desired causal structure. We then leverage this procedure to propose a new type of benchmark for causal inference methods, generating individual-level counterfactual data to test treatment effect estimation. We create an example benchmark consisting of thousands of datasets, and test a suite of popular estimation methods for average, conditional average, and individual treatment effect estimation. We find under this benchmark that (1) causal methods outperform non-causal methods and that (2) even state-of-the-art methods struggle with individualized effect estimation, suggesting this benchmark captures some inherent difficulties in causal estimation. Apart from generating data, this same technique can underpin the auditing of language models for (un)desirable causal effects, such as misinformation or discrimination. We believe SD-SCMs can serve as a useful tool in any application that would benefit from sequential data with controllable causal structure.
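
As a rough illustration of the three sampling modes named in the abstract, the sketch below builds a toy structural causal model in Python. It is not the authors' implementation: the graph (X -> T, X -> Y, T -> Y), the probabilities, and every function name are hypothetical, and a hand-written `mechanism` function stands in for the language model that would define the mechanisms in an SD-SCM.

```python
import random

# Hypothetical user-defined causal structure: X -> T, X -> Y, T -> Y.
PARENTS = {"X": [], "T": ["X"], "Y": ["X", "T"]}
ORDER = ["X", "T", "Y"]  # a topological order of the graph


def mechanism(name, parent_values, u):
    """Toy stand-in mechanism mapping parents and noise u in [0, 1) to a binary value.

    Assumption: in an SD-SCM a language model would play this role instead.
    """
    if name == "X":
        return int(u < 0.5)
    if name == "T":
        return int(u < (0.8 if parent_values["X"] else 0.2))
    # Outcome Y: probability rises with both the covariate and the treatment.
    p = 0.1 + 0.3 * parent_values["X"] + 0.4 * parent_values["T"]
    return int(u < p)


def sample_noise(rng):
    """Draw one exogenous noise value per variable (one 'unit')."""
    return {v: rng.random() for v in ORDER}


def evaluate(noise, do=None):
    """Compute all variables for a unit, optionally under an intervention `do`."""
    do = do or {}
    values = {}
    for v in ORDER:
        if v in do:
            values[v] = do[v]  # intervention overrides the mechanism
        else:
            parent_values = {p: values[p] for p in PARENTS[v]}
            values[v] = mechanism(v, parent_values, noise[v])
    return values


def individual_effect(noise):
    """Counterfactual contrast: same noise, treatment set to 1 versus 0."""
    return evaluate(noise, do={"T": 1})["Y"] - evaluate(noise, do={"T": 0})["Y"]


rng = random.Random(0)
units = [sample_noise(rng) for _ in range(2000)]
observational = [evaluate(u) for u in units]           # observational samples
interventional = [evaluate(u, do={"T": 1}) for u in units]  # samples under do(T=1)
ites = [individual_effect(u) for u in units]           # individual-level effects
ate = sum(ites) / len(ites)                            # average treatment effect
```

Because each unit's noise is held fixed across the two interventions, both potential outcomes are available for every unit. That is the kind of individual-level ground truth an estimator could be scored against in a benchmark of the sort the abstract describes; in this toy model the average effect is 0.4 by construction.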

Related papers

SALAD: Improving Robustness and Generalization through Contrastive Learning with Structure-Aware and LLM-Driven Augmented Data [15.366930934639838]
We propose SALAD, a novel approach to enhance model robustness and generalization. Our method generates structure-aware and counterfactually augmented data for contrastive learning. We validate our approach through experiments on three tasks: Sentiment Classification, Sexism Detection, and Natural Language Inference.
arXiv Detail & Related papers (2025-04-16T15:40:10Z)
A Causal Inference Framework for Data Rich Environments [17.588417435132538]
We show how classic models for potential outcomes and treatment assignments fit within our framework. For any estimator that has a fast enough estimation error rate for a certain nuisance parameter, we establish it is consistent for these various causal parameters.
arXiv Detail & Related papers (2025-04-02T13:04:26Z)
Model-free Methods for Event History Analysis and Efficient Adjustment (PhD Thesis) [55.2480439325792]
This thesis is a series of independent contributions to statistics unified by a model-free perspective. The first chapter elaborates on how a model-free perspective can be used to formulate flexible methods that leverage prediction techniques from machine learning. The second chapter studies the concept of local independence, which describes whether the evolution of one process is directly influenced by another.
arXiv Detail & Related papers (2025-02-11T19:24:09Z)
Influence Functions for Scalable Data Attribution in Diffusion Models [52.92223039302037]
Diffusion models have led to significant advancements in generative modelling. Yet their widespread adoption poses challenges regarding data attribution and interpretability. We develop an influence functions framework to address these challenges.
arXiv Detail & Related papers (2024-10-17T17:59:02Z)
Amortized Inference of Causal Models via Conditional Fixed-Point Iterations [17.427722515310606]
We propose amortized inference of Structural Causal Models (SCMs) by training a single model on multiple datasets sampled from different SCMs. We first use a transformer-based architecture for amortized learning of dataset embeddings, and then extend the Fixed-Point Approach (FiP) to infer SCMs conditionally on their dataset embeddings. As a byproduct, our method can generate observational and interventional data from novel SCMs at inference time, without updating parameters.
arXiv Detail & Related papers (2024-10-08T15:31:33Z)
Induced Covariance for Causal Discovery in Linear Sparse Structures [55.2480439325792]
Causal models seek to unravel the cause-effect relationships among variables from observed data. This paper introduces a novel causal discovery algorithm designed for settings in which variables exhibit linearly sparse relationships.
arXiv Detail & Related papers (2024-10-02T04:01:38Z)
Standardizing Structural Causal Models [80.21199731817698]
We propose internally-standardized structural causal models (iSCMs) for benchmarking algorithms. By construction, iSCMs are not $\operatorname{Var}$-sortable, and as we show experimentally, not $\operatorname{R}^2$-sortable either for commonly-used graph families.
arXiv Detail & Related papers (2024-06-17T14:52:21Z)
Can LLMs Separate Instructions From Data? And What Do We Even Mean By That? [60.50127555651554]
Large Language Models (LLMs) show impressive results in numerous practical applications, but they lack essential safety features. This makes them vulnerable to manipulations such as indirect prompt injections and generally unsuitable for safety-critical tasks. We introduce a formal measure for instruction-data separation and an empirical variant that is calculable from a model's outputs.
arXiv Detail & Related papers (2024-03-11T15:48:56Z)
iSCAN: Identifying Causal Mechanism Shifts among Nonlinear Additive Noise Models [48.33685559041322]
This paper focuses on identifying the causal mechanism shifts in two or more related datasets over the same set of variables. Code implementing the proposed method is open-source and publicly available at https://github.com/kevinsbello/iSCAN.
arXiv Detail & Related papers (2023-06-30T01:48:11Z)
Representation Disentaglement via Regularization by Causal Identification [3.9160947065896803]
We propose the use of a causal collider structured model to describe the underlying data generative process assumptions in disentangled representation learning. For this, we propose regularization by identification (ReI), a modular regularization engine designed to align the behavior of large scale generative models with the disentanglement constraints imposed by causal identification.
arXiv Detail & Related papers (2023-02-28T23:18:54Z)
Variable Importance Matching for Causal Inference [73.25504313552516]
We describe a general framework called Model-to-Match that achieves these goals. Model-to-Match uses variable importance measurements to construct a distance metric. We operationalize the Model-to-Match framework with LASSO.
arXiv Detail & Related papers (2023-02-23T00:43:03Z)
Learning Latent Structural Causal Models [31.686049664958457]
In machine learning tasks, one often operates on low-level data like image pixels or high-dimensional vectors. We present a tractable approximate inference method which performs joint inference over the causal variables, structure and parameters of the latent Structural Causal Model.
arXiv Detail & Related papers (2022-10-24T20:09:44Z)
An evaluation framework for comparing causal inference models [3.1372269816123994]
We use the proposed evaluation methodology to compare several state-of-the-art causal effect estimation models. The main motivation behind this approach is the elimination of the influence of a small number of instances or simulation on the benchmarking process.
arXiv Detail & Related papers (2022-08-31T21:04:20Z)
Evaluating Causal Inference Methods [0.4588028371034407]
We introduce a deep generative model-based framework, Credence, to validate causal inference methods.
arXiv Detail & Related papers (2022-02-09T00:21:22Z)
Estimation of Bivariate Structural Causal Models by Variational Gaussian Process Regression Under Likelihoods Parametrised by Normalising Flows [74.85071867225533]
Causal mechanisms can be described by structural causal models. One major drawback of state-of-the-art artificial intelligence is its lack of explainability.
arXiv Detail & Related papers (2021-09-06T14:52:58Z)
A Subsampling-Based Method for Causal Discovery on Discrete Data [18.35147325731821]
In this work, we propose a subsampling-based method to test the independence between the generating schemes of the cause and that of the mechanism. Our methodology works for both discrete and categorical data and does not imply any functional model on the data, making it a more flexible approach.
arXiv Detail & Related papers (2021-08-31T17:11:58Z)
Harmonization with Flow-based Causal Inference [12.739380441313022]
This paper presents a normalizing-flow-based method to perform counterfactual inference upon a structural causal model (SCM) to harmonize medical data. We evaluate on multiple, large, real-world medical datasets to observe that this method leads to better cross-domain generalization compared to state-of-the-art algorithms.
arXiv Detail & Related papers (2021-06-12T19:57:35Z)
Selecting Treatment Effects Models for Domain Adaptation Using Causal Knowledge [82.5462771088607]
We propose a novel model selection metric specifically designed for ITE methods under the unsupervised domain adaptation setting. In particular, we propose selecting models whose predictions of interventions' effects satisfy known causal structures in the target domain.
arXiv Detail & Related papers (2021-02-11T21:03:14Z)

This list is automatically generated from the titles and abstracts of the papers on this site.