Thought Anchors: Which LLM Reasoning Steps Matter?
- URL: http://arxiv.org/abs/2506.19143v3
- Date: Tue, 05 Aug 2025 20:34:19 GMT
- Title: Thought Anchors: Which LLM Reasoning Steps Matter?
- Authors: Paul C. Bogdan, Uzay Macar, Neel Nanda, Arthur Conmy
- Abstract summary: We argue that analyzing reasoning traces at the sentence level is a promising approach to understanding reasoning processes. Each method provides evidence for the existence of thought anchors, reasoning steps that have outsized importance. We present a case study showing converging patterns across methods that map how a model performs multi-step reasoning.
- Score: 3.4384069916863913
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Reasoning large language models have recently achieved state-of-the-art performance in many fields. However, their long-form chain-of-thought reasoning creates interpretability challenges as each generated token depends on all previous ones, making the computation harder to decompose. We argue that analyzing reasoning traces at the sentence level is a promising approach to understanding reasoning processes. We present three complementary attribution methods: (1) a black-box method measuring each sentence's counterfactual importance by comparing final answers across 100 rollouts conditioned on the model generating that sentence or one with a different meaning; (2) a white-box method of aggregating attention patterns between pairs of sentences, which identifies "broadcasting" sentences that receive disproportionate attention from all future sentences via "receiver" attention heads; (3) a causal attribution method measuring logical connections between sentences by suppressing attention toward one sentence and measuring the effect on each future sentence's tokens. Each method provides evidence for the existence of thought anchors, reasoning steps that have outsized importance and that disproportionately influence the subsequent reasoning process. These thought anchors are typically planning or backtracking sentences. We provide an open-source tool (www.thought-anchors.com) for visualizing the outputs of our methods, and present a case study showing converging patterns across methods that map how a model performs multi-step reasoning. The consistency across methods demonstrates the potential of sentence-level analysis for a deeper understanding of reasoning models.
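To make method (1) concrete, here is a minimal sketch of counterfactual importance estimation via resampled rollouts. The `rollout` callable is a hypothetical stand-in for sampling a completion from the model and extracting its final answer; the paper additionally filters resampled rollouts by whether the replacement sentence actually differs in meaning, which is omitted here.

```python
from collections import Counter

# Hypothetical stand-in for a model call: sample a completion of `prefix`
# and return the final answer it reaches. Not part of the paper's tooling.
def rollout(prefix: str) -> str:
    raise NotImplementedError

def counterfactual_importance(question: str, sentences: list[str],
                              index: int, n_rollouts: int = 100) -> float:
    """Estimate how much sentence `index` matters for the final answer.

    Compares the answer distribution of rollouts that include the sentence
    against rollouts resampled from just before it (where the model is free
    to produce a sentence with a different meaning).
    """
    prefix_without = question + " " + " ".join(sentences[:index])
    prefix_with = question + " " + " ".join(sentences[:index + 1])

    with_counts = Counter(rollout(prefix_with) for _ in range(n_rollouts))
    without_counts = Counter(rollout(prefix_without) for _ in range(n_rollouts))

    # Total-variation distance between the two answer distributions:
    # a large value means the sentence strongly shifts the final answer.
    answers = set(with_counts) | set(without_counts)
    return 0.5 * sum(abs(with_counts[a] - without_counts[a]) / n_rollouts
                     for a in answers)
```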
Related papers
- Think Clearly: Improving Reasoning via Redundant Token Pruning [57.01254508252785]
We show that deliberately removing redundancy in the reasoning process significantly improves performance. Our method improves overall accuracy across reasoning-intensive benchmarks without any training.
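As a rough illustration of the idea (not the paper's actual algorithm), one can score each reasoning sentence with some importance signal and drop the lowest-scoring ones; `importance` below is a hypothetical placeholder for whatever redundancy signal the method derives.

```python
# A minimal, hypothetical sketch of redundancy pruning: score each reasoning
# sentence and keep only the top fraction, preserving original order.
def prune_redundant(sentences: list[str],
                    importance, keep_ratio: float = 0.7) -> list[str]:
    scored = sorted(((importance(s), i) for i, s in enumerate(sentences)),
                    reverse=True)
    keep = {i for _, i in scored[:max(1, int(len(sentences) * keep_ratio))]}
    return [s for i, s in enumerate(sentences) if i in keep]
```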
arXiv Detail & Related papers (2025-06-17T06:04:01Z)
- Beyond the Last Answer: Your Reasoning Trace Uncovers More than You Think [51.0691253204425]
We analyze intermediate reasoning steps, termed subthoughts, asking whether the final answer reliably represents the model's optimal conclusion. Our approach involves segmenting a reasoning trace into sequential subthoughts based on linguistic cues. We find that aggregating the answers reached from these subthoughts by selecting the most frequent one (the mode) often yields significantly higher accuracy than relying solely on the answer derived from the original complete trace.
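A minimal sketch of the two ingredients as summarized above: cue-based segmentation of a trace into subthoughts, and mode aggregation over the answers obtained by letting the model conclude from each subthought prefix (the model calls themselves are not shown). The cue list is an illustrative assumption.

```python
import re
from collections import Counter

# Linguistic cues that often open a new subthought; illustrative only.
CUES = ("Wait", "Alternatively", "But", "So", "Hmm", "Let me")

def split_subthoughts(trace: str) -> list[str]:
    """Segment a reasoning trace at sentence starts that match a cue word."""
    sentences = re.split(r"(?<=[.!?])\s+", trace)
    chunks, current = [], []
    for s in sentences:
        if current and s.startswith(CUES):
            chunks.append(" ".join(current))
            current = []
        current.append(s)
    if current:
        chunks.append(" ".join(current))
    return chunks

def mode_answer(answers: list[str]) -> str:
    """Aggregate the answers reached after each subthought by majority vote."""
    return Counter(answers).most_common(1)[0][0]
```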
arXiv Detail & Related papers (2025-04-29T12:39:07Z)
- How Do LLMs Perform Two-Hop Reasoning in Context? [76.79936191530784]
Two-hop reasoning refers to the process of inferring a conclusion by making two logical steps. Despite recent progress in large language models (LLMs), we find, surprisingly, that they can fail at solving simple two-hop reasoning problems. We train a 3-layer Transformer from scratch on a synthetic two-hop reasoning task and reverse-engineer its internal information flow.
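For intuition, here is a toy generator for synthetic two-hop instances of the general shape such a task takes; the paper's exact data format may differ.

```python
import random

# Toy two-hop instance: "A is B. B is C. What is A?" -> "C".
# Entity names are arbitrary symbols; this mirrors the shape of a
# two-hop task, not the paper's exact dataset.
def make_two_hop_example(rng: random.Random, vocab: list[str]):
    a, b, c = rng.sample(vocab, 3)
    context = f"{a} is {b}. {b} is {c}."
    question = f"What is {a}?"
    return context + " " + question, c

rng = random.Random(0)
vocab = [f"tok{i}" for i in range(50)]
prompt, answer = make_two_hop_example(rng, vocab)
```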
arXiv Detail & Related papers (2025-02-19T17:46:30Z)
- Self-Harmonized Chain of Thought [8.540320749424172]
Chain-of-thought (CoT) prompting has demonstrated the capacity of large language models to perform complex reasoning through intermediate steps. We propose ECHO, a novel method that unifies diverse solution paths into a consistent and effective reasoning pattern.
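One plausible reading of the unification step, sketched at a high level: start from one zero-shot rationale per question, then repeatedly regenerate each rationale with the others as demonstrations so they converge toward a shared pattern. Both `zero_shot_cot` and `regenerate` are hypothetical model-call placeholders, not ECHO's published API.

```python
# High-level sketch of a harmonization loop: each rationale is iteratively
# rewritten in the context of the others, pulling them toward one pattern.
def harmonize(questions: list[str], zero_shot_cot, regenerate,
              n_rounds: int = 3) -> list[str]:
    rationales = [zero_shot_cot(q) for q in questions]
    for _ in range(n_rounds):
        for i, q in enumerate(questions):
            others = [r for j, r in enumerate(rationales) if j != i]
            rationales[i] = regenerate(q, demonstrations=others)
    return rationales
```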
arXiv Detail & Related papers (2024-09-06T06:57:04Z)
- Contrastive Chain-of-Thought Prompting [74.10511560147293]
We propose contrastive chain of thought to enhance language model reasoning.
Compared to the conventional chain of thought, our approach provides both valid and invalid reasoning demonstrations.
Our experiments on reasoning benchmarks demonstrate that contrastive chain of thought can serve as a general enhancement of chain-of-thought prompting.
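A minimal sketch of how such a prompt might be assembled, pairing each question with a valid and an invalid rationale; the demonstration content is illustrative, not taken from the paper.

```python
# Build a contrastive chain-of-thought prompt: each demonstration shows the
# model both how to reason and how not to, before posing the real question.
def contrastive_prompt(demos, question: str) -> str:
    parts = []
    for q, good, bad in demos:
        parts.append(f"Question: {q}\n"
                     f"Correct explanation: {good}\n"
                     f"Wrong explanation: {bad}\n")
    parts.append(f"Question: {question}\nCorrect explanation:")
    return "\n".join(parts)

demo = [("2 + 3 * 2 = ?",
         "Multiply first: 3 * 2 = 6, then add 2 to get 8.",
         "Add first: 2 + 3 = 5, then multiply by 2 to get 10.")]
print(contrastive_prompt(demo, "10 - 2 * 3 = ?"))
```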
arXiv Detail & Related papers (2023-11-15T18:54:01Z)
- Implicit Chain of Thought Reasoning via Knowledge Distillation [58.80851216530288]
Instead of explicitly producing the chain of thought reasoning steps, we use the language model's internal hidden states to perform implicit reasoning.
We find that this approach enables solving tasks previously not solvable without explicit chain-of-thought, at a speed comparable to no chain-of-thought.
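A toy sketch of the distillation idea, assuming PyTorch: a small emulator reads only a question encoding and is trained so its layerwise states match hidden states recorded from a teacher that produced an explicit chain of thought. Dimensions, architecture, and the mean-squared objective are illustrative assumptions, not the paper's setup.

```python
import torch
import torch.nn as nn

class Emulator(nn.Module):
    """Tiny stand-in for a student that reasons 'vertically' across layers."""
    def __init__(self, d_model: int = 64, n_layers: int = 4):
        super().__init__()
        self.layers = nn.ModuleList(nn.Linear(d_model, d_model)
                                    for _ in range(n_layers))

    def forward(self, question_state: torch.Tensor) -> list[torch.Tensor]:
        states, h = [], question_state
        for layer in self.layers:
            h = torch.tanh(layer(h))
            states.append(h)
        return states

emulator = Emulator()
question_state = torch.randn(8, 64)                       # question encodings
teacher_states = [torch.randn(8, 64) for _ in range(4)]   # from explicit CoT
# Train the emulator's internal states toward the teacher's CoT states.
loss = sum(nn.functional.mse_loss(s, t)
           for s, t in zip(emulator(question_state), teacher_states))
loss.backward()
```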
arXiv Detail & Related papers (2023-11-02T17:59:49Z)
- HOP, UNION, GENERATE: Explainable Multi-hop Reasoning without Rationale Supervision [118.0818807474809]
This work proposes a principled, probabilistic approach for training explainable multi-hop QA systems without rationale supervision.
Our approach performs multi-hop reasoning by explicitly modeling rationales as sets, enabling the model to capture interactions between documents and sentences within a document.
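A schematic reading of "modeling rationales as sets": score a candidate answer by summing over small sentence subsets, weighting each subset by a rationale model. `p_rationale` and `p_answer` are hypothetical callables; the paper's actual probabilistic training objective is more involved.

```python
from itertools import combinations

# Marginalize an answer's score over candidate rationale sets (subsets of
# sentences), rather than over single sentences in isolation.
def answer_score(answer: str, sentences: list[str],
                 p_rationale, p_answer, max_size: int = 2) -> float:
    total = 0.0
    for k in range(1, max_size + 1):
        for subset in combinations(sentences, k):
            total += p_rationale(set(subset)) * p_answer(answer, set(subset))
    return total
```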
arXiv Detail & Related papers (2023-05-23T16:53:49Z)
- Learning to Reason and Memorize with Self-Notes [51.17609489687686]
Large language models have been shown to struggle with multi-step reasoning, and they do not retain previous reasoning steps for future use.
We propose a simple method for solving both of these problems by allowing the model to take Self-Notes.
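A minimal sketch of the interleaving idea: while reading the input chunk by chunk, the model may emit a short note that is appended to the running context so later steps can use it. `maybe_note` is a hypothetical placeholder for a model call returning a note string or None.

```python
# Read context incrementally, letting the model interject notes inline
# instead of deferring all reasoning to a post-hoc chain of thought.
def read_with_self_notes(chunks: list[str], maybe_note) -> str:
    running = ""
    for chunk in chunks:
        running += chunk
        note = maybe_note(running)  # model decides whether to think aloud
        if note:
            running += f" [note: {note}] "
    return running
```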
arXiv Detail & Related papers (2023-05-01T14:02:48Z)
- Narrative Incoherence Detection [76.43894977558811]
We propose the task of narrative incoherence detection as a new arena for inter-sentential semantic understanding.
Given a multi-sentence narrative, the task is to decide whether there exist any semantic discrepancies in the narrative flow.
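The task admits a simple baseline sketch: flag a narrative when adjacent sentences are semantically too far apart. `embed` is a placeholder for any sentence encoder, and the threshold is an illustrative assumption; this is a naive baseline, not the paper's model.

```python
import math

def cosine(u: list[float], v: list[float]) -> float:
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv + 1e-9)

# Flag a narrative as incoherent if any adjacent sentence pair falls below
# a similarity threshold under the given sentence encoder.
def is_incoherent(sentences: list[str], embed, threshold: float = 0.2) -> bool:
    vecs = [embed(s) for s in sentences]
    return any(cosine(u, v) < threshold for u, v in zip(vecs, vecs[1:]))
```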
arXiv Detail & Related papers (2020-12-21T07:18:08Z)
This list is automatically generated from the titles and abstracts of the papers in this site.