Related papers: Analysing Chain of Thought Dynamics: Active Guidance or Unfaithful Post-hoc Rationalisation?

Analysing Chain of Thought Dynamics: Active Guidance or Unfaithful Post-hoc Rationalisation?

URL: http://arxiv.org/abs/2508.19827v1
Date: Wed, 27 Aug 2025 12:25:29 GMT
Title: Analysing Chain of Thought Dynamics: Active Guidance or Unfaithful Post-hoc Rationalisation?
Authors: Samuel Lewis-Lim, Xingwei Tan, Zhixue Zhao, Nikolaos Aletras,
Abstract summary: Chain-of-Thought (CoT) often yields limited gains for soft-reasoning problems.<n>We investigate the dynamics and faithfulness of CoT in soft-reasoning tasks across instruction-tuned, reasoning and reasoning-distilled models.
Score: 32.02698064940949
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Recent work has demonstrated that Chain-of-Thought (CoT) often yields limited gains for soft-reasoning problems such as analytical and commonsense reasoning. CoT can also be unfaithful to a model's actual reasoning. We investigate the dynamics and faithfulness of CoT in soft-reasoning tasks across instruction-tuned, reasoning and reasoning-distilled models. Our findings reveal differences in how these models rely on CoT, and show that CoT influence and faithfulness are not always aligned.

Related papers

Bypassing the Rationale: Causal Auditing of Implicit Reasoning in Language Models [0.0]
Chain-of-thought (CoT) prompting is widely used as a reasoning aid and is often treated as a transparency mechanism.<n>We introduce a causal, layerwise audit of CoT faithfulness based on activation patching.<n>We find that CoT-specific influence is typically depth-localized into narrow "reasoning windows"
arXiv Detail & Related papers (2026-02-03T20:27:49Z)
AdvChain: Adversarial Chain-of-Thought Tuning for Robust Safety Alignment of Large Reasoning Models [62.70575022567081]
We propose AdvChain, an alignment paradigm that teaches models dynamic self-correction through adversarial CoT tuning.<n>Our work establishes a new direction for building more robust and reliable reasoning models.
arXiv Detail & Related papers (2025-09-29T04:27:23Z)
Deep Hidden Cognition Facilitates Reliable Chain-of-Thought Reasoning [33.30315111732609]
Chain of Thought (CoT) reasoning has demonstrated remarkable deep reasoning capabilities.<n>However, its reliability is often undermined by the accumulation of errors in intermediate steps.<n>This paper introduces an approach to calibrate the CoT reasoning accuracy by leveraging the model's intrinsic veracity encoding.
arXiv Detail & Related papers (2025-07-14T07:41:35Z)
Unveiling Confirmation Bias in Chain-of-Thought Reasoning [12.150655660758359]
Chain-of-thought (CoT) prompting has been widely adopted to enhance the reasoning capabilities of large language models (LLMs)<n>This work presents a novel perspective to understand CoT behavior through the lens of textitconfirmation bias in cognitive psychology.
arXiv Detail & Related papers (2025-06-14T01:30:17Z)
A Closer Look at Bias and Chain-of-Thought Faithfulness of Large (Vision) Language Models [53.18562650350898]
Chain-of-thought (CoT) reasoning enhances performance of large language models.<n>We present the first comprehensive study of CoT faithfulness in large vision-language models.
arXiv Detail & Related papers (2025-05-29T18:55:05Z)
The Curse of CoT: On the Limitations of Chain-of-Thought in In-Context Learning [56.574829311863446]
Chain-of-Thought (CoT) prompting has been widely recognized for its ability to enhance reasoning capabilities in large language models (LLMs)<n>We demonstrate that CoT and its reasoning variants consistently underperform direct answering across varying model scales and benchmark complexities.<n>Our analysis uncovers a fundamental hybrid mechanism of explicit-implicit reasoning driving CoT's performance in pattern-based ICL.
arXiv Detail & Related papers (2025-04-07T13:51:06Z)
Can Large Language Models Detect Errors in Long Chain-of-Thought Reasoning? [57.17826305464394]
o1-like models produce the long Chain-of-Thought (CoT) reasoning steps to improve the reasoning abilities of existing Large Language Models (LLMs)<n>We introduce the DeltaBench, including the generated long CoTs from different o1-like models for different reasoning tasks.<n>Based on DeltaBench, we first perform fine-grained analysis of the generated long CoTs to discover the effectiveness and efficiency of different o1-like models.
arXiv Detail & Related papers (2025-02-26T17:59:27Z)
When More is Less: Understanding Chain-of-Thought Length in LLMs [51.631483479081645]
Large Language Models (LLMs) employ Chain-of-Thought (CoT) reasoning to deconstruct complex problems.<n>This paper argues that longer CoTs are often presumed superior, arguing that longer is not always better.
arXiv Detail & Related papers (2025-02-11T05:28:59Z)
A Hopfieldian View-based Interpretation for Chain-of-Thought Reasoning [48.51969964676017]
Chain-of-Thought (CoT) holds a significant place in augmenting the reasoning performance for large language models. We propose a Read-and-Control approach for controlling the accuracy of CoT.
arXiv Detail & Related papers (2024-06-18T04:07:13Z)
Towards Better Chain-of-Thought: A Reflection on Effectiveness and Faithfulness [17.6082037230676]
Chain-of-thought (CoT) prompting demonstrates varying performance under different reasoning tasks.<n>Previous work attempts to evaluate it but falls short in providing an in-depth analysis of patterns that influence the CoT.<n>We identify key factors that influence CoT effectiveness on performance improvement, including problem difficulty, information gain, and information flow.
arXiv Detail & Related papers (2024-05-29T09:17:46Z)
Measuring Faithfulness in Chain-of-Thought Reasoning [19.074147845029355]
Large language models (LLMs) perform better when they produce step-by-step, "Chain-of-Thought" (CoT) reasoning before answering a question. It is unclear if the stated reasoning is a faithful explanation of the model's actual reasoning (i.e., its process for answering the question) We investigate hypotheses for how CoT reasoning may be unfaithful, by examining how the model predictions change when we intervene on the CoT.
arXiv Detail & Related papers (2023-07-17T01:08:39Z)

This list is automatically generated from the titles and abstracts of the papers in this site.