Fugu-MT Paper Translation (Abstract): Thought Anchors: Which LLM Reasoning Steps Matter?

Paper overview: Thought Anchors: Which LLM Reasoning Steps Matter?

arxiv url: http://arxiv.org/abs/2506.19143v4
Date: Mon, 27 Oct 2025 12:36:23 GMT
Status: Translation complete
Last updated in system: 2025-10-28 17:41:21.671175
Title: Thought Anchors: Which LLM Reasoning Steps Matter?
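Title (reference translation): Thought Anchors: Which LLM Reasoning Steps Matter?
Authors: Paul C. Bogdan, Uzay Macar, Neel Nanda, Arthur Conmy
Abstract summary: We argue that analyzing reasoning traces at the sentence level is a promising approach to understanding reasoning processes. We propose a black-box method that measures the counterfactual importance of each sentence. We show that examining sentence-to-sentence causal links within a reasoning trace gives insight into a model's behavior.
Reference score (independently computed attention metric): 12.689309281941995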
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Current frontier large language models rely on reasoning to achieve state-of-the-art performance. Many existing interpretability methods are limited in this area, as standard methods have been designed to study single forward passes of a model rather than the multi-token computational steps that unfold during reasoning. We argue that analyzing reasoning traces at the sentence level is a promising approach to understanding reasoning processes. We introduce a black-box method that measures each sentence's counterfactual importance by repeatedly sampling replacement sentences from the model, filtering for semantically different ones, and continuing the chain of thought from that point onwards to quantify the sentence's impact on the distribution of final answers. We discover that certain sentences can have an outsized impact on the trajectory of the reasoning trace and final answer. We term these sentences "thought anchors". These are generally planning or uncertainty management sentences, and specialized attention heads consistently attend from subsequent sentences to thought anchors. We further show that examining sentence-sentence causal links within a reasoning trace gives insight into a model's behavior. Such information can be used to predict a problem's difficulty and the extent to which different question domains involve sequential or diffuse reasoning. As a proof-of-concept, we demonstrate that our techniques together provide a practical toolkit for analyzing reasoning models by conducting a detailed case study of how the model solves a difficult math problem, finding that our techniques yield a consistent picture of the reasoning trace's structure. We provide an open-source tool (thought-anchors.com) for visualizing the outputs of our methods on further problems. The convergence across our methods shows the potential of sentence-level analysis for a deeper understanding of reasoning models.
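Abstract (reference translation): Today's frontier large language models depend on reasoning to reach state-of-the-art performance. Many existing interpretability methods are limited in this area, because standard methods were designed to study a single forward pass of a model rather than the multi-token computational steps that unfold during reasoning. We argue that analyzing reasoning traces at the sentence level is a promising approach to understanding reasoning processes. We propose a black-box method that quantifies each sentence's effect on the distribution of final answers by repeatedly sampling replacement sentences from the model, filtering for semantically different ones, and continuing the chain of thought from that point. We find that certain sentences have a large effect on the trajectory of the reasoning and on the final answer. We call these sentences "thought anchors". They are generally planning or uncertainty management sentences, and specialized attention heads consistently attend from subsequent sentences to thought anchors. We further show that examining sentence-to-sentence causal links within a reasoning trace gives insight into a model's behavior. Such information can be used to predict a problem's difficulty and the extent to which different question domains involve sequential or diffuse reasoning. As a proof of concept, we conduct a detailed case study of how the model solves a difficult problem and find that our techniques yield a consistent picture of the structure of the reasoning trace, showing that together they provide a practical toolkit for analyzing reasoning models. We provide an open-source tool (thought-anchors.com) for visualizing the outputs of our methods on further problems. The convergence across our methods shows the potential of sentence-level analysis for a deeper understanding of reasoning models.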
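The resampling procedure described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the callables `sample_replacement`, `is_different`, and `continue_to_answer` are hypothetical stand-ins for model calls, the toy model at the bottom is invented for illustration, and total variation distance is an assumed choice for comparing answer distributions (the abstract does not name a specific measure).

```python
import itertools
from collections import Counter
from typing import Callable, Dict, List, Sequence


def answer_distribution(answers: Sequence[str]) -> Dict[str, float]:
    """Turn sampled final answers into empirical probabilities."""
    counts = Counter(answers)
    total = sum(counts.values())
    return {answer: count / total for answer, count in counts.items()}


def total_variation(p: Dict[str, float], q: Dict[str, float]) -> float:
    """Total variation distance between two answer distributions (assumed metric)."""
    keys = set(p) | set(q)
    return 0.5 * sum(abs(p.get(k, 0.0) - q.get(k, 0.0)) for k in keys)


def counterfactual_importance(
    prefix: List[str],
    sentence: str,
    sample_replacement: Callable[[List[str]], str],
    is_different: Callable[[str, str], bool],
    continue_to_answer: Callable[[List[str]], str],
    n_samples: int = 20,
) -> float:
    """Compare final answers with the sentence kept vs. replaced by a different one."""
    # Rollouts that keep the original sentence.
    baseline = [continue_to_answer(prefix + [sentence]) for _ in range(n_samples)]
    # Rollouts that swap in a resampled sentence, keeping only
    # replacements judged semantically different from the original.
    counterfactual = []
    for _ in range(n_samples):
        replacement = sample_replacement(prefix)
        if is_different(sentence, replacement):
            counterfactual.append(continue_to_answer(prefix + [replacement]))
    if not counterfactual:
        return 0.0  # no usable replacements; treated as no measurable effect
    return total_variation(
        answer_distribution(baseline), answer_distribution(counterfactual)
    )


# --- Toy stand-ins (hypothetical; a real setup would query a reasoning model) ---
PLAN = "Plan: convert to binary."


def toy_continue(trace: List[str]) -> str:
    """Toy model: the final answer depends only on whether the plan sentence is present."""
    return "66" if PLAN in trace else "20"


def cycling_sampler(candidates: List[str]) -> Callable[[List[str]], str]:
    """Deterministic sampler that cycles through fixed candidate sentences."""
    pool = itertools.cycle(candidates)
    return lambda prefix: next(pool)


def exact_mismatch(a: str, b: str) -> bool:
    """Crude placeholder for a semantic-difference filter."""
    return a != b


anchor_score = counterfactual_importance(
    [], PLAN, cycling_sampler(["Plan: count hex digits.", PLAN]),
    exact_mismatch, toy_continue,
)
filler_score = counterfactual_importance(
    [PLAN], "Okay.", cycling_sampler(["Hmm.", "Okay."]),
    exact_mismatch, toy_continue,
)
print(anchor_score, filler_score)
```

In this toy setting the planning sentence fully determines the final answer, so replacing it shifts the whole answer distribution, while replacing the filler sentence changes nothing. With a real model, `exact_mismatch` would presumably be replaced by a semantic similarity check and `toy_continue` by sampled chain-of-thought continuations.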

Related papers