Related papers: Causal Graph Discovery with Retrieval-Augmented Generation based Large Language Models

URL: http://arxiv.org/abs/2402.15301v2
Date: Tue, 18 Jun 2024 05:51:50 GMT
Title: Causal Graph Discovery with Retrieval-Augmented Generation based Large Language Models
Authors: Yuzhe Zhang, Yipeng Zhang, Yidong Gan, Lina Yao, Chen Wang
Abstract summary: Causal graph recovery is traditionally done using statistical estimation-based methods or based on individuals' knowledge about the variables of interest. We propose a novel method that leverages large language models (LLMs) to deduce causal relationships in general causal graph recovery tasks.
Score: 23.438388321411693
License: http://creativecommons.org/licenses/by-nc-nd/4.0/
Abstract: Causal graph recovery is traditionally done using statistical estimation-based methods or based on individuals' knowledge about the variables of interest. These approaches often suffer from data collection biases and the limitations of individuals' knowledge. The advance of large language models (LLMs) provides opportunities to address these problems. We propose a novel method that leverages LLMs to deduce causal relationships in general causal graph recovery tasks. This method leverages knowledge compressed in LLMs and knowledge LLMs extract from scientific publication databases, as well as experiment data about the factors of interest, to achieve this goal. Our method provides a prompting strategy to extract associational relationships among those factors and a mechanism to perform causality verification for these associations. Compared to other LLM-based methods that directly instruct LLMs to do highly complex causal reasoning, our method shows a clear advantage in causal graph quality on benchmark datasets. More importantly, as causality among some factors may change as new research results emerge, our method shows sensitivity to new evidence in the literature and can provide useful information for updating causal graphs accordingly.

Related papers

Paths to Causality: Finding Informative Subgraphs Within Knowledge Graphs for Knowledge-Based Causal Discovery [10.573861741540853]
We introduce a novel approach that integrates Knowledge Graphs (KGs) with Large Language Models (LLMs) to enhance knowledge-based causal discovery. Our approach identifies informative metapath-based subgraphs within KGs and further refines the selection of these subgraphs using Learning-to-Rank-based models. Our method outperforms most baselines by up to 44.4 points in F1 scores, evaluated across diverse LLMs and KGs.
arXiv Detail & Related papers (2025-06-10T13:13:55Z)
Fairness-Driven LLM-based Causal Discovery with Active Learning and Dynamic Scoring [1.5498930424110338]
Causal discovery (CD) plays a pivotal role in numerous scientific fields by clarifying the causal relationships that underlie phenomena observed in diverse disciplines. Despite significant advancements in CD algorithms, their application faces challenges due to the high computational demands and complexities of large-scale data. This paper introduces a framework that leverages Large Language Models (LLMs) for CD, utilizing a metadata-based approach akin to the reasoning processes of human experts.
arXiv Detail & Related papers (2025-03-21T22:58:26Z)
Can Large Language Models Help Experimental Design for Causal Discovery? [94.66802142727883]
Large Language Model Guided Intervention Targeting (LeGIT) is a robust framework that effectively incorporates LLMs to augment existing numerical approaches for the intervention targeting in causal discovery. LeGIT demonstrates significant improvements and robustness over existing methods and even surpasses humans.
arXiv Detail & Related papers (2025-03-03T03:43:05Z)
Preference Leakage: A Contamination Problem in LLM-as-a-judge [69.96778498636071]
Large Language Models (LLMs) as judges and LLM-based data synthesis have emerged as two fundamental LLM-driven data annotation methods. In this work, we expose preference leakage, a contamination problem in LLM-as-a-judge caused by the relatedness between the synthetic data generators and LLM-based evaluators.
arXiv Detail & Related papers (2025-02-03T17:13:03Z)
Discovery of Maximally Consistent Causal Orders with Large Language Models [0.8192907805418583]
Causal discovery is essential for understanding complex systems. Traditional methods often rely on strong, untestable assumptions. We propose a novel method to derive a class of acyclic tournaments.
arXiv Detail & Related papers (2024-12-18T16:37:51Z)
Counterfactual Causal Inference in Natural Language with Large Language Models [9.153187514369849]
We propose an end-to-end causal structure discovery and causal inference method from natural language. We first use an LLM to extract the instantiated causal variables from text data and build a causal graph. We then conduct counterfactual inference on the estimated graph.
arXiv Detail & Related papers (2024-10-08T21:53:07Z)
From Pre-training Corpora to Large Language Models: What Factors Influence LLM Performance in Causal Discovery Tasks? [51.42906577386907]
This study explores the factors influencing the performance of Large Language Models (LLMs) in causal discovery tasks. A higher frequency of causal mentions correlates with better model performance, suggesting that extensive exposure to causal information during training enhances the models' causal discovery capabilities.
arXiv Detail & Related papers (2024-07-29T01:45:05Z)
ALCM: Autonomous LLM-Augmented Causal Discovery Framework [2.1470800327528843]
We introduce a new framework, named Autonomous LLM-Augmented Causal Discovery Framework (ALCM), to synergize data-driven causal discovery algorithms and Large Language Models. The ALCM consists of three integral components: causal structure learning, causal wrapper, and LLM-driven causal refiner. We evaluate the ALCM framework by implementing two demonstrations on seven well-known datasets.
arXiv Detail & Related papers (2024-05-02T21:27:45Z)
CausalBench: A Comprehensive Benchmark for Causal Learning Capability of LLMs [27.362012903540492]
The ability to understand causality significantly impacts the competence of large language models (LLMs) in output explanation and counterfactual reasoning.
arXiv Detail & Related papers (2024-04-09T14:40:08Z)
ExaRanker-Open: Synthetic Explanation for IR using Open-Source LLMs [60.81649785463651]
We introduce ExaRanker-Open, where we adapt and explore the use of open-source language models to generate explanations. Our findings reveal that incorporating explanations consistently enhances neural rankers, with benefits escalating as the LLM size increases.
arXiv Detail & Related papers (2024-02-09T11:23:14Z)
Multi-modal Causal Structure Learning and Root Cause Analysis [67.67578590390907]
We propose Mulan, a unified multi-modal causal structure learning method for root cause localization. We leverage a log-tailored language model to facilitate log representation learning, converting log sequences into time-series data. We also introduce a novel key performance indicator-aware attention mechanism for assessing modality reliability and co-learning a final causal graph.
arXiv Detail & Related papers (2024-02-04T05:50:38Z)
Zero-shot Causal Graph Extrapolation from Text via LLMs [50.596179963913045]
We evaluate the ability of large language models (LLMs) to infer causal relations from natural language. LLMs show competitive performance in a benchmark of pairwise relations without needing (explicit) training samples. We extend our approach to extrapolating causal graphs through iterated pairwise queries.
arXiv Detail & Related papers (2023-12-22T13:14:38Z)
Mitigating Large Language Model Hallucinations via Autonomous Knowledge Graph-based Retrofitting [51.7049140329611]
This paper proposes Knowledge Graph-based Retrofitting (KGR) to mitigate factual hallucination during the reasoning process. Experiments show that KGR can significantly improve the performance of LLMs on factual QA benchmarks.
arXiv Detail & Related papers (2023-11-22T11:08:38Z)
From Query Tools to Causal Architects: Harnessing Large Language Models for Advanced Causal Discovery from Data [19.264745484010106]
Large Language Models (LLMs) exhibit exceptional abilities for causal analysis between concepts in numerous societally impactful domains. Recent research on LLM performance in various causal discovery and inference tasks has given rise to a new ladder in the classical three-stage framework of causality. We propose a novel framework that combines knowledge-based LLM causal analysis with data-driven causal structure learning.
arXiv Detail & Related papers (2023-06-29T12:48:00Z)
Can Large Language Models Infer Causation from Correlation? [104.96351414570239]
We test the pure causal inference skills of large language models (LLMs). We formulate a novel task, Corr2Cause, which takes a set of correlational statements and determines the causal relationship between the variables. We show that these models achieve close to random performance on the task.
arXiv Detail & Related papers (2023-06-09T12:09:15Z)
Causal Reasoning and Large Language Models: Opening a New Frontier for Causality [29.433401785920065]
Large language models (LLMs) can generate causal arguments with high probability. LLMs may be used by human domain experts to save effort in setting up a causal analysis.
arXiv Detail & Related papers (2023-04-28T19:00:43Z)

This list is automatically generated from the titles and abstracts of the papers on this site.