Fugu-MT Paper Translation (Abstract): Reasoning Models Know What's Important, and Encode It in Their Activations

Paper overview: Reasoning Models Know What's Important, and Encode It in Their Activations

arxiv url: http://arxiv.org/abs/2604.18307v1
Date: Mon, 20 Apr 2026 14:15:57 GMT
Status: Translation complete
Last updated in system: 2026-04-21 21:52:52.925565
Title: Reasoning Models Know What's Important, and Encode It in Their Activations
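Title (reference translation): Reasoning models understand what is important and encode it in their activations
Authors: Yaniv Nikankin, Martin Tutek, Tomer Ashuach, Jonathan Rosenfeld, Yonatan Belinkov
Abstract summary: We find that model activations contain more information than tokens for identifying important reasoning steps. By training probes on model activations to predict importance, we show that models encode an internal representation of step importance. Our findings suggest that analyzing activations can reveal aspects of reasoning that surface-level approaches fundamentally miss.
Reference score (attention score computed independently by this site): 36.53191165682352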
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Language models often solve complex tasks by generating long reasoning chains, consisting of many steps with varying importance. While some steps are crucial for generating the final answer, others are removable. Determining which steps matter most, and why, remains an open question central to understanding how models process reasoning. We investigate if this question is best approached through model internals or through tokens of the reasoning chain itself. We find that model activations contain more information than tokens for identifying important reasoning steps. Crucially, by training probes on model activations to predict importance, we show that models encode an internal representation of step importance, even prior to the generation of subsequent steps. This internal representation of importance generalizes across models, is distributed across layers, and does not correlate with surface-level features, such as a step's relative position or its length. Our findings suggest that analyzing activations can reveal aspects of reasoning that surface-level approaches fundamentally miss, indicating that reasoning analyses should look into model internals.
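Abstract (reference translation): Language models often solve complex tasks by generating long reasoning chains made up of many steps of varying importance. Some steps are essential for producing the final answer, while others can be removed. Determining which steps matter most, and why, is an open question that is central to understanding how models process reasoning. We investigate whether this question is best approached through model internals or through the tokens of the reasoning chain itself. We find that model activations contain more information than tokens for identifying important reasoning steps. Importantly, by training probes on model activations to predict importance, we show that models encode an internal representation of step importance, even before subsequent steps are generated. This internal representation of importance generalizes across models, is distributed across layers, and does not correlate with surface-level features such as a step's relative position or its length. Our findings suggest that analyzing activations can reveal aspects of reasoning that surface-level approaches fundamentally miss, and that reasoning analyses should look into model internals.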
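The probing idea in the abstract (training a classifier on activations to predict step importance) can be illustrated with a small, self-contained sketch. This is not the authors' code or setup: the "activations" here are synthetic Gaussian vectors, the importance label is defined by a hypothetical hidden direction, and the names `make_synthetic_steps`, `train_probe`, and `DIM` are invented for illustration. It only shows the general shape of a linear (logistic-regression) probe evaluated on held-out data.

```python
import math
import random


def make_synthetic_steps(n, dim, direction, rng):
    """Build (activation, label) pairs. ASSUMPTION: synthetic data, where a
    step counts as "important" (label 1) if its vector projects positively
    onto a hidden direction. Real labels would come from the paper's setup."""
    data = []
    for _ in range(n):
        x = [rng.gauss(0.0, 1.0) for _ in range(dim)]
        y = 1 if sum(a * b for a, b in zip(x, direction)) > 0 else 0
        data.append((x, y))
    return data


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def train_probe(data, dim, lr=0.1, epochs=50):
    """Fit a logistic-regression probe with plain stochastic gradient descent."""
    w = [0.0] * dim
    b = 0.0
    for _ in range(epochs):
        for x, y in data:
            p = sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)
            g = p - y  # gradient of the log loss with respect to the logit
            w = [wi - lr * g * xi for wi, xi in zip(w, x)]
            b -= lr * g
    return w, b


def accuracy(w, b, data):
    """Fraction of examples whose thresholded probe output matches the label."""
    correct = 0
    for x, y in data:
        pred = 1 if sum(wi * xi for wi, xi in zip(w, x)) + b > 0 else 0
        correct += int(pred == y)
    return correct / len(data)


rng = random.Random(0)  # fixed seed so the run is reproducible
DIM = 8  # hypothetical activation width, far smaller than a real model's
direction = [rng.gauss(0.0, 1.0) for _ in range(DIM)]
train = make_synthetic_steps(400, DIM, direction, rng)
test = make_synthetic_steps(200, DIM, direction, rng)
w, b = train_probe(train, DIM)
probe_accuracy = accuracy(w, b, test)
print(f"held-out probe accuracy: {probe_accuracy:.2f}")
```

In an actual study the input vectors would be hidden states extracted from a reasoning model at each step, and high held-out accuracy would be compared against baselines built from surface features such as step position or length.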

Related papers

PRISM: A Dual View of LLM Reasoning through Semantic Flow and Latent Computation [15.91920027845529]
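PRISM (Probabilistic Reasoning Inspection through Semantic and Implicit Modeling) is a framework and diagnostic tool for jointly analyzing both levels. It reveals systematic patterns in the reasoning process and shows that failed trajectories tend to get trapped in unproductive verification loops. Rather than relying solely on final task accuracy, PRISM makes these behaviors observable and analyzable.
Paper reference translation (metadata) (2026-03-24T03:31:53Z)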
Dynamics Within Latent Chain-of-Thought: An Empirical Study of Causal Structure [58.89643769707751]
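We study latent chain-of-thought in representation space as a manipulable causal process. Latent-step budgets behave less like homogeneous extra depth and more like staged functionality with non-local routing. These results motivate mode-conditional and stability-aware analyses as reliable tools for interpreting and improving latent reasoning systems.
Paper reference translation (metadata) (2026-02-09T15:25:12Z)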
Schoenfeld's Anatomy of Mathematical Reasoning by Language Models [56.656180566692946]
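We adopt Schoenfeld's Episode Theory as an inductive, intermediate-scale lens and introduce ThinkARM (Anatomy of Reasoning in Models). ThinkARM explicitly abstracts reasoning traces into functional reasoning steps such as analysis, exploration, implementation, and verification. Episode-level representations make reasoning steps explicit and enable systematic analysis of how reasoning in modern language models is structured, stabilized, and altered.
Paper reference translation (metadata) (2025-12-23T02:44:25Z)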
A Survey of Inductive Reasoning for Large Language Models [55.23215679173251]
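The inductive mode is essential for knowledge generalization and aligns well with human cognition. Despite the importance of inductive reasoning, no systematic summary exists. This paper presents a comprehensive survey of inductive reasoning for large language models.
Paper reference translation (metadata) (2025-10-11T11:45:38Z)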
Internal states before wait modulate reasoning patterns [14.272989515787351]
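We train crosscoders at multiple layers of DeepSeek-R1-Distill-Llama-8B and introduce a latent attribution technique in the crosscoder setting. We find a small set of features involved in promoting or suppressing the probability of wait tokens. Many of the identified features are shown to be genuinely relevant to the reasoning process.
Paper reference translation (metadata) (2025-10-05T10:03:42Z)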
Beyond Memorization: Extending Reasoning Depth with Recurrence, Memory and Test-Time Compute Scaling [60.63703438729223]
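We show how different architectures and training methods affect models' multi-step reasoning capabilities. We confirm that increasing model depth plays a crucial role in sequential computation.
Paper reference translation (metadata) (2025-08-22T18:57:08Z)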
Stepwise Perplexity-Guided Refinement for Efficient Chain-of-Thought Reasoning in Large Language Models [56.37421741507468]
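CoT reasoning has significantly improved the performance of large language models (LLMs). This paper proposes a method for identifying critical reasoning steps, using perplexity as a measure of their importance.
Paper reference translation (metadata) (2025-02-18T20:04:51Z)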
Mechanistic Unveiling of Transformer Circuits: Self-Influence as a Key to Model Reasoning [9.795934690403374]
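It remains unclear what multi-step reasoning mechanisms language models use to solve such tasks. Using circuit analysis and self-influence functions, we evaluate how the importance of each token changes throughout the reasoning process. The proposed method reveals the human-interpretable reasoning process used by the model.
Paper reference translation (metadata) (2025-02-13T07:19:05Z)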

The related papers list is generated automatically from the titles and abstracts of papers on this site.
