Do Large Language Models Reason Causally Like Us? Even Better?
- URL: http://arxiv.org/abs/2502.10215v1
- Date: Fri, 14 Feb 2025 15:09:15 GMT
- Title: Do Large Language Models Reason Causally Like Us? Even Better?
- Authors: Hanna M. Dettki, Brenden M. Lake, Charley M. Wu, Bob Rehder
- Abstract summary: Large language models (LLMs) have shown impressive capabilities in generating human-like text.
We compare causal reasoning in humans and four LLMs using tasks based on collider graphs.
We find that LLMs reason causally along a spectrum from human-like to normative inference, with alignment shifting based on model, context, and task.
- Score: 7.749713014052951
- Abstract: Causal reasoning is a core component of intelligence. Large language models (LLMs) have shown impressive capabilities in generating human-like text, raising questions about whether their responses reflect true understanding or statistical patterns. We compared causal reasoning in humans and four LLMs using tasks based on collider graphs, in which agents rated the likelihood of a query variable occurring given evidence from the other variables. We found that LLMs reason causally along a spectrum from human-like to normative inference, with alignment shifting based on model, context, and task. Overall, GPT-4o and Claude showed the most normative behavior, including "explaining away", whereas Gemini-Pro and GPT-3.5 did not. Although all agents deviated from the expected independence of causes (Claude the least), they exhibited strong associative reasoning and predictive inference when assessing the likelihood of the effect given its causes. These findings underscore the need to assess AI biases as LLMs increasingly assist human decision-making.
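To make the "explaining away" pattern concrete, here is a minimal Python sketch of normative inference in a collider graph C1 -> E <- C2. The noisy-OR parameterization and the specific numbers (priors, causal strengths, leak) are illustrative assumptions, not the paper's actual task parameters:

```python
from itertools import product

P_C = 0.3   # assumed prior P(C1=1) = P(C2=1), illustrative only
W = 0.8     # assumed causal strength of each cause on E
LEAK = 0.1  # assumed background ("leak") probability of E

def p_effect(c1, c2):
    """Noisy-OR: P(E=1 | C1=c1, C2=c2)."""
    return 1 - (1 - LEAK) * (1 - W) ** c1 * (1 - W) ** c2

def joint(c1, c2, e):
    """Joint P(C1, C2, E) under the collider factorization P(C1)P(C2)P(E|C1,C2)."""
    pc1 = P_C if c1 else 1 - P_C
    pc2 = P_C if c2 else 1 - P_C
    pe = p_effect(c1, c2)
    return pc1 * pc2 * (pe if e else 1 - pe)

def posterior_c1(e, c2=None):
    """P(C1=1 | E=e), or P(C1=1 | E=e, C2=c2) if c2 is given, by enumeration."""
    c2_states = [c2] if c2 is not None else [0, 1]
    num = sum(joint(1, s, e) for s in c2_states)
    den = sum(joint(c, s, e) for c, s in product([0, 1], c2_states))
    return num / den

print(f"prior P(C1=1)       = {P_C:.3f}")
print(f"P(C1=1 | E=1)       = {posterior_c1(1):.3f}")        # ~0.539: raised by the effect
print(f"P(C1=1 | E=1, C2=1) = {posterior_c1(1, c2=1):.3f}")  # ~0.335: explained away
```

Observing the effect raises the posterior on C1 above its prior, but additionally learning that C2 is present pushes it back down toward the prior; that drop is the normative "explaining away" the abstract reports in GPT-4o and Claude but not in Gemini-Pro and GPT-3.5.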
Related papers
- Do Large Language Models Show Biases in Causal Learning? [3.0264418764647605]
Causal learning is the cognitive process of developing the ability to make causal inferences from available information.
This research investigates whether large language models (LLMs) develop causal illusions.
arXiv Detail & Related papers (2024-12-13T19:03:48Z)
- COLD: Causal reasOning in cLosed Daily activities [7.782872276680731]
We propose the COLD (Causal reasOning in cLosed Daily activities) framework.
It is built upon human understanding of daily real-world activities to reason about the causal nature of events.
We show that the proposed framework facilitates the creation of an enormous number of causal queries.
arXiv Detail & Related papers (2024-11-29T06:37:13Z)
- Failure Modes of LLMs for Causal Reasoning on Narratives [51.19592551510628]
We investigate the causal reasoning abilities of large language models (LLMs) through the representative problem of inferring causal relationships from narratives.
We find that even state-of-the-art language models rely on unreliable shortcuts, both in terms of the narrative presentation and their parametric knowledge.
arXiv Detail & Related papers (2024-10-31T12:48:58Z)
- Are UFOs Driving Innovation? The Illusion of Causality in Large Language Models [0.0]
This research investigates whether large language models develop the illusion of causality in real-world settings.
We evaluated and compared news headlines generated by GPT-4o-Mini, Claude-3.5-Sonnet, and Gemini-1.5-Pro.
We found that Claude-3.5-Sonnet exhibits the lowest degree of causal illusion, consistent with experiments on Correlation-to-Causation Exaggeration.
arXiv Detail & Related papers (2024-10-15T15:20:49Z)
- Hypothesizing Missing Causal Variables with LLMs [55.28678224020973]
We formulate a novel task where the input is a partial causal graph with missing variables, and the output is a hypothesis about the missing variables to complete the partial graph.
We show the strong ability of LLMs to hypothesize the mediation variables between a cause and its effect.
We also observe surprising results where some of the open-source models outperform the closed GPT-4 model.
arXiv Detail & Related papers (2024-09-04T10:37:44Z)
- CLadder: Assessing Causal Reasoning in Language Models [82.8719238178569]
We investigate whether large language models (LLMs) can coherently reason about causality.
We propose a new NLP task, causal inference in natural language, inspired by the "causal inference engine" postulated by Judea Pearl et al.
arXiv Detail & Related papers (2023-12-07T15:12:12Z)
- MoCa: Measuring Human-Language Model Alignment on Causal and Moral Judgment Tasks [49.60689355674541]
A rich literature in cognitive science has studied people's causal and moral intuitions.
This work has revealed a number of factors that systematically influence people's judgments.
We test whether large language models (LLMs) make causal and moral judgments about text-based scenarios that align with human participants.
arXiv Detail & Related papers (2023-10-30T15:57:32Z)
- Visual Abductive Reasoning [85.17040703205608]
Abductive reasoning seeks the likeliest possible explanation for partial observations.
We propose a new task and dataset, Visual Abductive Reasoning (VAR), for examining the abductive reasoning ability of machine intelligence in everyday visual situations.
arXiv Detail & Related papers (2022-03-26T10:17:03Z)
- Systematic Evaluation of Causal Discovery in Visual Model Based Reinforcement Learning [76.00395335702572]
A central goal for AI and causality is the joint discovery of abstract representations and causal structure.
Existing environments for studying causal induction are poorly suited for this objective because they have complicated task-specific causal graphs.
In this work, our goal is to facilitate research in learning representations of high-level variables as well as causal structures among them.
arXiv Detail & Related papers (2021-07-02T05:44:56Z)