Related papers: COLD: Causal reasOning in cLosed Daily activities

URL: http://arxiv.org/abs/2411.19500v1
Date: Fri, 29 Nov 2024 06:37:13 GMT
Title: COLD: Causal reasOning in cLosed Daily activities
Authors: Abhinav Joshi, Areeb Ahmad, Ashutosh Modi
Abstract summary: We propose the COLD (Causal reasOning in cLosed Daily activities) framework. It is built upon human understanding of daily real-world activities to reason about the causal nature of events. We show that the proposed framework facilitates the creation of an enormous number of causal queries.
Score: 7.782872276680731
License: http://creativecommons.org/licenses/by-nc-nd/4.0/
Abstract: Large Language Models (LLMs) have shown state-of-the-art performance in a variety of tasks, including arithmetic and reasoning; however, to gauge the intellectual capabilities of LLMs, causal reasoning has become a reliable proxy for validating a general understanding of the mechanics and intricacies of the world similar to humans. Previous works in natural language processing (NLP) have either focused on open-ended causal reasoning via causal commonsense reasoning (CCR) or framed a symbolic representation-based question answering for theoretically backed-up analysis via a causal inference engine. The former adds an advantage of real-world grounding but lacks theoretically backed-up analysis/validation, whereas the latter is far from real-world grounding. In this work, we bridge this gap by proposing the COLD (Causal reasOning in cLosed Daily activities) framework, which is built upon human understanding of daily real-world activities to reason about the causal nature of events. We show that the proposed framework facilitates the creation of an enormous number of causal queries (~9 million) and comes close to the mini-Turing test, simulating causal reasoning to evaluate the understanding of a daily real-world task. We evaluate multiple LLMs on the created causal queries and find that causal reasoning is challenging even for activities trivial to humans. We further explore the causal reasoning abilities of LLMs using the backdoor criterion to determine the causal strength between events.

Related papers

Unveiling Causal Reasoning in Large Language Models: Reality or Mirage? [62.17959154852391]
Causal reasoning capability is critical in advancing large language models toward strong artificial intelligence. We show that large language models (LLMs) are only capable of performing shallow (level-1) causal reasoning. We propose G2-Reasoner, a method that incorporates general knowledge and goal-oriented prompts into LLMs' causal reasoning processes.
arXiv Detail & Related papers (2025-06-26T13:11:01Z)
Com$^2$: A Causal-Guided Benchmark for Exploring Complex Commonsense Reasoning in Large Language Models [40.47361817762135]
Large language models (LLMs) have mastered abundant simple and explicit commonsense knowledge through pre-training. LLMs struggle to reason with complex and implicit commonsense knowledge that is derived from simple ones. We propose a benchmark Com$^2$ focusing on complex commonsense reasoning.
arXiv Detail & Related papers (2025-06-08T09:53:08Z)
What's Missing in Vision-Language Models? Probing Their Struggles with Causal Order Reasoning [26.671128120554457]
Causal reasoning is fundamental to solving complex high-level reasoning tasks. Existing benchmarks often include a mixture of reasoning questions. We introduce VQA-Causal and VCR-Causal to isolate and rigorously evaluate causal reasoning abilities.
arXiv Detail & Related papers (2025-06-01T07:17:46Z)
Causal Cartographer: From Mapping to Reasoning Over Counterfactual Worlds [9.153187514369849]
Causal world models can answer counterfactual questions about an environment of interest. It requires understanding the underlying causes behind chains of events and conducting causal inference for unseen distributions. We show that our approach can extract causal knowledge while reducing inference costs and spurious correlations.
arXiv Detail & Related papers (2025-05-20T14:14:05Z)
Towards Quantifying Commonsense Reasoning with Mechanistic Insights [7.124379028448955]
We argue that a proxy of commonsense reasoning can be maintained as a graphical structure. We create an annotation scheme for capturing this implicit knowledge in the form of a graphical structure for 37 daily human activities. We find that the created resource can be used to frame an enormous number of commonsense queries.
arXiv Detail & Related papers (2025-04-14T10:21:59Z)
Failure Modes of LLMs for Causal Reasoning on Narratives [51.19592551510628]
We investigate the causal reasoning abilities of large language models (LLMs) through the representative problem of inferring causal relationships from narratives. We find that even state-of-the-art language models rely on unreliable shortcuts, both in terms of the narrative presentation and their parametric knowledge.
arXiv Detail & Related papers (2024-10-31T12:48:58Z)
Language Agents Meet Causality -- Bridging LLMs and Causal World Models [50.79984529172807]
We propose a framework that integrates causal representation learning with large language models. This framework learns a causal world model, with causal variables linked to natural language expressions. We evaluate the framework on causal inference and planning tasks across temporal scales and environmental complexities.
arXiv Detail & Related papers (2024-10-25T18:36:37Z)
Improving Causal Reasoning in Large Language Models: A Survey [16.55801836321059]
Causal reasoning is a crucial aspect of intelligence, essential for problem-solving, decision-making, and understanding the world. Large language models (LLMs) can generate rationales for their outputs, but their ability to reliably perform causal reasoning remains uncertain.
arXiv Detail & Related papers (2024-10-22T04:18:19Z)
Cause and Effect: Can Large Language Models Truly Understand Causality? [1.2334534968968969]
This research proposes a novel architecture called the Context Aware Reasoning Enhancement with Counterfactual Analysis (CARE CA) framework. The proposed framework incorporates an explicit causal detection module with ConceptNet and counterfactual statements, as well as implicit causal detection through Large Language Models. The knowledge from ConceptNet enhances the performance of multiple causal reasoning tasks such as causal discovery, causal identification and counterfactual reasoning.
arXiv Detail & Related papers (2024-02-28T08:02:14Z)
How Likely Do LLMs with CoT Mimic Human Reasoning? [31.86489714330338]
Chain-of-thought emerges as a promising technique for eliciting reasoning capabilities from Large Language Models (LLMs). We use causal analysis to understand the relationships between the problem instruction, reasoning, and the answer in LLMs.
arXiv Detail & Related papers (2024-02-25T10:13:04Z)
CLadder: Assessing Causal Reasoning in Language Models [82.8719238178569]
We investigate whether large language models (LLMs) can coherently reason about causality. We propose a new NLP task, causal inference in natural language, inspired by the "causal inference engine" postulated by Judea Pearl et al.
arXiv Detail & Related papers (2023-12-07T15:12:12Z)
Concise and Organized Perception Facilitates Reasoning in Large Language Models [31.238220405009617]
Exploiting large language models (LLMs) to tackle reasoning has garnered growing attention. It still remains highly challenging to achieve satisfactory results in complex logical problems, characterized by plenty of premises within the context and requiring multi-hop reasoning. In this work, we first examine the mechanism from the perspective of information flow and reveal that LLMs confront difficulties akin to human-like cognitive biases when dealing with disordered and irrelevant content in reasoning tasks.
arXiv Detail & Related papers (2023-10-05T04:47:49Z)
Towards LogiGLUE: A Brief Survey and A Benchmark for Analyzing Logical Reasoning Capabilities of Language Models [56.34029644009297]
Large language models (LLMs) have demonstrated the ability to overcome various limitations of formal Knowledge Representation (KR) systems. LLMs excel most in abductive reasoning, followed by deductive reasoning, while they are least effective at inductive reasoning. We study single-task training, multi-task training, and "chain-of-thought" knowledge distillation fine-tuning techniques to assess model performance.
arXiv Detail & Related papers (2023-10-02T01:00:50Z)
Towards CausalGPT: A Multi-Agent Approach for Faithful Knowledge Reasoning via Promoting Causal Consistency in LLMs [60.244412212130264]
Causal-Consistency Chain-of-Thought harnesses multi-agent collaboration to bolster the faithfulness and causality of foundation models. Our framework demonstrates significant superiority over state-of-the-art methods through extensive and comprehensive evaluations.
arXiv Detail & Related papers (2023-08-23T04:59:21Z)
Causal Reasoning and Large Language Models: Opening a New Frontier for Causality [29.433401785920065]
Large language models (LLMs) can generate causal arguments with high probability. LLMs may be used by human domain experts to save effort in setting up a causal analysis.
arXiv Detail & Related papers (2023-04-28T19:00:43Z)
Causal Inference Principles for Reasoning about Commonsense Causality [93.19149325083968]
Commonsense causality reasoning aims at identifying plausible causes and effects in natural language descriptions that are deemed reasonable by an average person. Existing work usually relies on deep language models wholeheartedly, and is potentially susceptible to confounding co-occurrences. Motivated by classical causal principles, we articulate the central question of CCR and draw parallels between human subjects in observational studies and natural languages. We propose a novel framework, ROCK, to Reason O(A)bout Commonsense K(C)ausality, which utilizes temporal signals as incidental supervision.
arXiv Detail & Related papers (2022-01-31T06:12:39Z)

This list is automatically generated from the titles and abstracts of the papers on this site.