Related papers: Context-Aware Reasoning On Parametric Knowledge for Inferring Causal Variables

Context-Aware Reasoning On Parametric Knowledge for Inferring Causal Variables

URL: http://arxiv.org/abs/2409.02604v2
Date: Thu, 25 Sep 2025 08:38:43 GMT
Title: Context-Aware Reasoning On Parametric Knowledge for Inferring Causal Variables
Authors: Ivaxi Sheth, Sahar Abdelnabi, Mario Fritz,
Abstract summary: We introduce a novel benchmark where the objective is to complete a partial causal graph.<n>We show the strong ability of LLMs to hypothesize the backdoor variables between a cause and its effect.<n>Unlike simple memorization of fixed associations, our task requires the LLM to reason according to the context of the entire graph.
Score: 49.31233968546582
License: http://creativecommons.org/licenses/by-sa/4.0/
Abstract: Scientific discovery catalyzes human intellectual advances, driven by the cycle of hypothesis generation, experimental design, evaluation, and assumption refinement. Central to this process is causal inference, uncovering the mechanisms behind observed phenomena. While randomized experiments provide strong inferences, they are often infeasible due to ethical or practical constraints. However, observational studies are prone to confounding or mediating biases. While crucial, identifying such backdoor paths is expensive and heavily depends on scientists' domain knowledge to generate hypotheses. We introduce a novel benchmark where the objective is to complete a partial causal graph. We design a benchmark with varying difficulty levels with over 4000 queries. We show the strong ability of LLMs to hypothesize the backdoor variables between a cause and its effect. Unlike simple knowledge memorization of fixed associations, our task requires the LLM to reason according to the context of the entire graph.

Related papers

CausalFlip: A Benchmark for LLM Causal Judgment Beyond Semantic Matching [50.65932158912512]
We propose a new causal reasoning benchmark, CausalFlip, to encourage the development of new large language models.<n>CaulFlip consists of causal judgment questions built over event triples that could form different confounder, chain, and collider relations.<n>We evaluate LLMs under multiple training paradigms, including answer-only training, explicit Chain-of-Thought supervision, and a proposed internalized causal reasoning approach.
arXiv Detail & Related papers (2026-02-23T18:06:15Z)
Consistency Is Not Always Correct: Towards Understanding the Role of Exploration in Post-Training Reasoning [75.79451512757844]
Foundation models exhibit broad knowledge but limited task-specific reasoning.<n> RLVR and inference scaling motivate post-training strategies such as RLVR and inference scaling.<n>We show that RLVR induces a squeezing effect, reducing reasoning entropy and forgetting some correct paths.
arXiv Detail & Related papers (2025-11-10T18:25:26Z)
Failure Modes of LLMs for Causal Reasoning on Narratives [51.19592551510628]
We investigate the interaction between world knowledge and logical reasoning.<n>We find that state-of-the-art large language models (LLMs) often rely on superficial generalizations.<n>We show that simple reformulations of the task can elicit more robust reasoning behavior.
arXiv Detail & Related papers (2024-10-31T12:48:58Z)
Causal Representation Learning in Temporal Data via Single-Parent Decoding [66.34294989334728]
Scientific research often seeks to understand the causal structure underlying high-level variables in a system. Scientists typically collect low-level measurements, such as geographically distributed temperature readings. We propose a differentiable method, Causal Discovery with Single-parent Decoding, that simultaneously learns the underlying latents and a causal graph over them.
arXiv Detail & Related papers (2024-10-09T15:57:50Z)
Smoke and Mirrors in Causal Downstream Tasks [59.90654397037007]
This paper looks at the causal inference task of treatment effect estimation, where the outcome of interest is recorded in high-dimensional observations. We compare 6 480 models fine-tuned from state-of-the-art visual backbones, and find that the sampling and modeling choices significantly affect the accuracy of the causal estimate. Our results suggest that future benchmarks should carefully consider real downstream scientific questions, especially causal ones.
arXiv Detail & Related papers (2024-05-27T13:26:34Z)
How Likely Do LLMs with CoT Mimic Human Reasoning? [31.86489714330338]
Chain-of-thought emerges as a promising technique for eliciting reasoning capabilities from Large Language Models (LLMs)<n>We use causal analysis to understand the relationships between the problem instruction, reasoning, and the answer in LLMs.
arXiv Detail & Related papers (2024-02-25T10:13:04Z)
Identifiable Latent Polynomial Causal Models Through the Lens of Change [82.14087963690561]
Causal representation learning aims to unveil latent high-level causal representations from observed low-level data. One of its primary tasks is to provide reliable assurance of identifying these latent causal models, known as identifiability.
arXiv Detail & Related papers (2023-10-24T07:46:10Z)
Nonlinearity, Feedback and Uniform Consistency in Causal Structural Learning [0.8158530638728501]
Causal Discovery aims to find automated search methods for learning causal structures from observational data. This thesis focuses on two questions in causal discovery: (i) providing an alternative definition of k-Triangle Faithfulness that (i) is weaker than strong faithfulness when applied to the Gaussian family of distributions, and (ii) under the assumption that the modified version of Strong Faithfulness holds.
arXiv Detail & Related papers (2023-08-15T01:23:42Z)
A Causal Framework for Decomposing Spurious Variations [68.12191782657437]
We develop tools for decomposing spurious variations in Markovian and Semi-Markovian models. We prove the first results that allow a non-parametric decomposition of spurious effects. The described approach has several applications, ranging from explainable and fair AI to questions in epidemiology and medicine.
arXiv Detail & Related papers (2023-06-08T09:40:28Z)
Decoding Causality by Fictitious VAR Modeling [0.0]
We first set up an equilibrium for the cause-effect relations using a fictitious vector autoregressive model. In the equilibrium, long-run relations are identified from noise, and spurious ones are negligibly close to zero. We also apply the approach to estimating the causal factors' contribution to climate change.
arXiv Detail & Related papers (2021-11-14T22:43:02Z)
Causal Discovery in Linear Structural Causal Models with Deterministic Relations [27.06618125828978]
We focus on the task of causal discovery form observational data. We derive a set of necessary and sufficient conditions for unique identifiability of the causal structure.
arXiv Detail & Related papers (2021-10-30T21:32:42Z)
Systematic Evaluation of Causal Discovery in Visual Model Based Reinforcement Learning [76.00395335702572]
A central goal for AI and causality is the joint discovery of abstract representations and causal structure. Existing environments for studying causal induction are poorly suited for this objective because they have complicated task-specific causal graphs. In this work, our goal is to facilitate research in learning representations of high-level variables as well as causal structures among them.
arXiv Detail & Related papers (2021-07-02T05:44:56Z)
To do or not to do: finding causal relations in smart homes [2.064612766965483]
This paper introduces a new way to learn causal models from a mixture of experiments on the environment and observational data. The core of our method is the use of selected interventions, especially our learning takes into account the variables where it is impossible to intervene. We use our method on a smart home simulation, a use case where knowing causal relations pave the way towards explainable systems.
arXiv Detail & Related papers (2021-05-20T22:36:04Z)
A Critical Look At The Identifiability of Causal Effects with Deep Latent Variable Models [2.326384409283334]
We use causal effect variational autoencoder (CEVAE) as a case study. CEVAE seems to work reliably under some simple scenarios, but it does not identify the correct causal effect with a misspecified latent variable or a complex data distribution. Our results show that the question of identifiability cannot be disregarded, and we argue that more attention should be paid to it in future work.
arXiv Detail & Related papers (2021-02-12T17:43:18Z)

This list is automatically generated from the titles and abstracts of the papers in this site.