Related papers: Can LLMs Leverage Observational Data? Towards Data-Driven Causal Discovery with LLMs

Can LLMs Leverage Observational Data? Towards Data-Driven Causal Discovery with LLMs

URL: http://arxiv.org/abs/2504.10936v1
Date: Tue, 15 Apr 2025 07:32:35 GMT
Title: Can LLMs Leverage Observational Data? Towards Data-Driven Causal Discovery with LLMs
Authors: Yuni Susanti, Michael Färber,
Abstract summary: Causal discovery traditionally relies on statistical methods applied to observational data.<n>Recent advancements in Large Language Models (LLMs) have introduced new possibilities for causal discovery.
Score: 10.573861741540853
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Causal discovery traditionally relies on statistical methods applied to observational data, often requiring large datasets and assumptions about underlying causal structures. Recent advancements in Large Language Models (LLMs) have introduced new possibilities for causal discovery by providing domain expert knowledge. However, it remains unclear whether LLMs can effectively process observational data for causal discovery. In this work, we explore the potential of LLMs for data-driven causal discovery by integrating observational data for LLM-based reasoning. Specifically, we examine whether LLMs can effectively utilize observational data through two prompting strategies: pairwise prompting and breadth first search (BFS)-based prompting. In both approaches, we incorporate the observational data directly into the prompt to assess LLMs' ability to infer causal relationships from such data. Experiments on benchmark datasets show that incorporating observational data enhances causal discovery, boosting F1 scores by up to 0.11 point using both pairwise and BFS LLM-based prompting, while outperforming traditional statistical causal discovery baseline by up to 0.52 points. Our findings highlight the potential and limitations of LLMs for data-driven causal discovery, demonstrating their ability to move beyond textual metadata and effectively interpret and utilize observational data for more informed causal reasoning. Our studies lays the groundwork for future advancements toward fully LLM-driven causal discovery.

Related papers

CARE: Turning LLMs Into Causal Reasoning Expert [14.390784501489733]
We show that large language models (LLMs) lack the ability to identify causal relationships.<n>We propose CARE, a framework that enhances LLMs' causal-reasoning ability.
arXiv Detail & Related papers (2025-11-20T03:34:16Z)
Realizing LLMs' Causal Potential Requires Science-Grounded, Novel Benchmarks [20.409472830397455]
Recent claims of strong performance by Large Language Models (LLMs) on causal discovery are undermined by a key flaw: many evaluations rely on benchmarks likely included in pretraining corpora.<n>We challenge this narrative by asking: Do LLMs truly reason about causal structure, and how can we measure it without memorization concerns?<n>We argue that realizing LLMs' potential for causal analysis requires two shifts: (P.1) developing robust evaluation protocols based on recent scientific studies to guard against dataset leakage, and (P.2) designing hybrid methods that combine LLM-derived knowledge with data-driven statistics.
arXiv Detail & Related papers (2025-10-18T14:58:04Z)
Pushing LLMs to Their Logical Reasoning Bound: The Role of Data Reasoning Intensity [59.27594125465172]
We introduce Data Reasoning Intensity (DRI), a novel metric that quantifies the latent logical reasoning complexity of samples.<n>We then introduce a re-cognizing optimization strategy that systematically enhances the logical reasoning intensity of training data.
arXiv Detail & Related papers (2025-09-29T14:20:04Z)
Mitigating Hidden Confounding by Progressive Confounder Imputation via Large Language Models [46.92706900119399]
We make the first attempt to mitigate hidden confounding using large language models (LLMs)<n>We propose ProCI, a framework that elicits the semantic and world knowledge of LLMs to iteratively generate, impute, and validate hidden confounders.<n>Extensive experiments demonstrate that ProCI uncovers meaningful confounders and significantly improves treatment effect estimation.
arXiv Detail & Related papers (2025-06-26T03:49:13Z)
Aligning Language Models with Observational Data: Opportunities and Risks from a Causal Perspective [0.0]
We study the challenges and opportunities of fine-tuning large language models using observational data.<n>We show that while observational outcomes can provide valuable supervision, directly fine-tuning models on such data can lead them to learn spurious correlations.<n>We propose DeconfoundLM, a method that explicitly removes the effect of known confounders from reward signals.
arXiv Detail & Related papers (2025-05-30T18:44:09Z)
R1-Searcher: Incentivizing the Search Capability in LLMs via Reinforcement Learning [87.30285670315334]
textbfR1-Searcher is a novel two-stage outcome-based RL approach designed to enhance the search capabilities of Large Language Models. Our framework relies exclusively on RL, without requiring process rewards or distillation for a cold start. Our experiments demonstrate that our method significantly outperforms previous strong RAG methods, even when compared to the closed-source GPT-4o-mini.
arXiv Detail & Related papers (2025-03-07T17:14:44Z)
Can Large Language Models Help Experimental Design for Causal Discovery? [94.66802142727883]
Large Language Model Guided Intervention Targeting (LeGIT) is a robust framework that effectively incorporates LLMs to augment existing numerical approaches for the intervention targeting in causal discovery.<n>LeGIT demonstrates significant improvements and robustness over existing methods and even surpasses humans.
arXiv Detail & Related papers (2025-03-03T03:43:05Z)
From Pre-training Corpora to Large Language Models: What Factors Influence LLM Performance in Causal Discovery Tasks? [51.42906577386907]
This study explores the factors influencing the performance of Large Language Models (LLMs) in causal discovery tasks. A higher frequency of causal mentions correlates with better model performance, suggesting that extensive exposure to causal information during training enhances the models' causal discovery capabilities.
arXiv Detail & Related papers (2024-07-29T01:45:05Z)
Enhancing Temporal Understanding in LLMs for Semi-structured Tables [50.59009084277447]
We conduct a comprehensive analysis of temporal datasets to pinpoint the specific limitations of large language models (LLMs) Our investigation leads to enhancements in TempTabQA, a dataset specifically designed for temporal temporal question answering. We introduce a novel approach, C.L.E.A.R. to strengthen LLM capabilities in this domain.
arXiv Detail & Related papers (2024-07-22T20:13:10Z)
Anomaly Detection of Tabular Data Using LLMs [54.470648484612866]
We show that pre-trained large language models (LLMs) are zero-shot batch-level anomaly detectors. We propose an end-to-end fine-tuning strategy to bring out the potential of LLMs in detecting real anomalies.
arXiv Detail & Related papers (2024-06-24T04:17:03Z)
Causal Graph Discovery with Retrieval-Augmented Generation based Large Language Models [23.438388321411693]
Causal graph recovery is traditionally done using statistical estimation-based methods or based on individual's knowledge about variables of interests. We propose a novel method that leverages large language models (LLMs) to deduce causal relationships in general causal graph recovery tasks.
arXiv Detail & Related papers (2024-02-23T13:02:10Z)
Discovery of the Hidden World with Large Language Models [95.58823685009727]
This paper presents Causal representatiOn AssistanT (COAT) that introduces large language models (LLMs) to bridge the gap. LLMs are trained on massive observations of the world and have demonstrated great capability in extracting key information from unstructured data. COAT also adopts CDs to find causal relations among the identified variables as well as to provide feedback to LLMs to iteratively refine the proposed factors.
arXiv Detail & Related papers (2024-02-06T12:18:54Z)
Can We Utilize Pre-trained Language Models within Causal Discovery Algorithms? [0.2303687191203919]
Causal reasoning of Pre-trained Language Models (PLMs) relies solely on text-based descriptions. We propose a new framework that integrates prior knowledge obtained from PLM with a causal discovery algorithm.
arXiv Detail & Related papers (2023-11-19T03:31:30Z)
Can Large Language Models Infer Causation from Correlation? [104.96351414570239]
We test the pure causal inference skills of large language models (LLMs) We formulate a novel task Corr2Cause, which takes a set of correlational statements and determines the causal relationship between the variables. We show that these models achieve almost close to random performance on the task.
arXiv Detail & Related papers (2023-06-09T12:09:15Z)
Causal Reasoning and Large Language Models: Opening a New Frontier for Causality [29.433401785920065]
Large language models (LLMs) can generate causal arguments with high probability. LLMs may be used by human domain experts to save effort in setting up a causal analysis.
arXiv Detail & Related papers (2023-04-28T19:00:43Z)

This list is automatically generated from the titles and abstracts of the papers in this site.