Related papers: Zero-shot Causal Graph Extrapolation from Text via LLMs

Related papers

Linear Reasoning vs. Proof by Cases: Obstacles for Large Language Models in FOL Problem Solving [11.939133563702066]
We introduce a novel first-order logic (FOL) dataset named PC-FOL, annotated by professional mathematicians.<n>All instances in this dataset are equipped with a manually written natural language proof, clearly distinguishing it from conventional linear reasoning datasets.
arXiv Detail & Related papers (2026-02-24T14:53:34Z)
Hallucination Detection in LLMs via Topological Divergence on Attention Graphs [64.74977204942199]
Hallucination, i.e., generating factually incorrect content, remains a critical challenge for large language models. We introduce TOHA, a TOpology-based HAllucination detector in the RAG setting.
arXiv Detail & Related papers (2025-04-14T10:06:27Z)
ExpliCa: Evaluating Explicit Causal Reasoning in Large Language Models [75.05436691700572]
We introduce ExpliCa, a new dataset for evaluating Large Language Models (LLMs) in explicit causal reasoning. We tested seven commercial and open-source LLMs on ExpliCa through prompting and perplexity-based metrics. Surprisingly, models tend to confound temporal relations with causal ones, and their performance is also strongly influenced by the linguistic order of the events.
arXiv Detail & Related papers (2025-02-21T14:23:14Z)
JustLogic: A Comprehensive Benchmark for Evaluating Deductive Reasoning in Large Language Models [51.99046112135311]
We introduce JustLogic, a synthetically generated deductive reasoning benchmark for rigorous evaluation of Large Language Models. JustLogic is highly complex, capable of generating a diverse range of linguistic patterns, vocabulary, and argument structures. Our experimental results reveal that most state-of-the-art (SOTA) LLMs perform significantly worse than the human average.
arXiv Detail & Related papers (2025-01-24T15:49:10Z)
Argumentation Computation with Large Language Models : A Benchmark Study [6.0682923348298194]
Large language models (LLMs) have made significant advancements in neuro-symbolic computing. We aim to investigate the capability of LLMs in determining the extensions of various abstract argumentation semantics.
arXiv Detail & Related papers (2024-12-21T18:23:06Z)
Counterfactual Causal Inference in Natural Language with Large Language Models [9.153187514369849]
We propose an end-to-end causal structure discovery and causal inference method from natural language. We first use an LLM to extract the instantiated causal variables from text data and build a causal graph. We then conduct counterfactual inference on the estimated graph.
arXiv Detail & Related papers (2024-10-08T21:53:07Z)
Evaluating the Correctness of Inference Patterns Used by LLMs for Judgment [53.17596274334017]
We evaluate the correctness of the detailed inference patterns of an LLM behind its seemingly correct outputs.<n>Experiments show that even when the language generation results appear correct, a significant portion of the inference patterns used by the LLM for the legal judgment may represent misleading or irrelevant logic.
arXiv Detail & Related papers (2024-10-06T08:33:39Z)
Enhancing Logical Reasoning in Large Language Models through Graph-based Synthetic Data [53.433309883370974]
This work explores the potential and limitations of using graph-based synthetic reasoning data as training signals to enhance Large Language Models' reasoning capabilities. Our experiments, conducted on two established natural language reasoning tasks, demonstrate that supervised fine-tuning with synthetic graph-based reasoning data effectively enhances LLMs' reasoning performance without compromising their effectiveness on other standard evaluation benchmarks.
arXiv Detail & Related papers (2024-09-19T03:39:09Z)
Prompt-based vs. Fine-tuned LLMs Toward Causal Graph Verification [0.0]
This work aims toward an application of natural language processing (NLP) technology for automatic verification of causal graphs using text sources. We compare the performance of two types of NLP models: (1) pre-trained language models fine-tuned for causal relation classification task and, (2) prompt-based LLMs.
arXiv Detail & Related papers (2024-05-29T09:06:18Z)
The Emergence of Large Language Models in Static Analysis: A First Look through Micro-Benchmarks [3.848607479075651]
We investigate the role that current Large Language Models (LLMs) can play in improving callgraph analysis and type inference for Python programs. Our study reveals that LLMs show promising results in type inference, demonstrating higher accuracy than traditional methods, yet they exhibit limitations in callgraph analysis.
arXiv Detail & Related papers (2024-02-27T16:53:53Z)
CLadder: Assessing Causal Reasoning in Language Models [82.8719238178569]
We investigate whether large language models (LLMs) can coherently reason about causality. We propose a new NLP task, causal inference in natural language, inspired by the "causal inference engine" postulated by Judea Pearl et al.
arXiv Detail & Related papers (2023-12-07T15:12:12Z)
Causal Inference Using LLM-Guided Discovery [34.040996887499425]
We show that the topological order over graph variables (causal order) alone suffices for causal effect inference. We propose a robust technique of obtaining causal order from Large Language Models (LLMs) Our approach significantly improves causal ordering accuracy as compared to discovery algorithms.
arXiv Detail & Related papers (2023-10-23T17:23:56Z)
Interpretable Medical Diagnostics with Structured Data Extraction by Large Language Models [59.89454513692417]
Tabular data is often hidden in text, particularly in medical diagnostic reports. We propose a novel, simple, and effective methodology for extracting structured tabular data from textual medical reports, called TEMED-LLM. We demonstrate that our approach significantly outperforms state-of-the-art text classification models in medical diagnostics.
arXiv Detail & Related papers (2023-06-08T09:12:28Z)
Evaluating statistical language models as pragmatic reasoners [39.72348730045737]
We evaluate the capacity of large language models to infer meanings of pragmatic utterances. We find that LLMs can derive context-grounded, human-like distributions over the interpretations of several complex pragmatic utterances. Results inform the inferential capacity of statistical language models, and their use in pragmatic and semantic parsing applications.
arXiv Detail & Related papers (2023-05-01T18:22:10Z)
Can large language models build causal graphs? [54.74910640970968]
Large language models (LLMs) represent an opportunity to ease the process of building causal graphs. LLMs have been shown to be brittle to the choice of probing words, context, and prompts that the user employs.
arXiv Detail & Related papers (2023-03-07T22:05:31Z)
Did the Cat Drink the Coffee? Challenging Transformers with Generalized Event Knowledge [59.22170796793179]
Transformers Language Models (TLMs) were tested on a benchmark for the textitdynamic estimation of thematic fit Our results show that TLMs can reach performances that are comparable to those achieved by SDM. However, additional analysis consistently suggests that TLMs do not capture important aspects of event knowledge.
arXiv Detail & Related papers (2021-07-22T20:52:26Z)

This list is automatically generated from the titles and abstracts of the papers in this site.