Fugu-MT paper translation (summary): Mechanistic Interpretability of Large-Scale Counting in LLMs through a System-2 Strategy

Paper summary: Mechanistic Interpretability of Large-Scale Counting in LLMs through a System-2 Strategy

arxiv url: http://arxiv.org/abs/2601.02989v1
Date: Tue, 06 Jan 2026 12:58:27 GMT
Status: translation complete
System update date: 2026-01-07 17:02:12.936799
Title: Mechanistic Interpretability of Large-Scale Counting in LLMs through a System-2 Strategy
Authors: Hosein Hasani, Mohammadali Banayeeanzade, Ali Nafisi, Sadegh Mohammadian, Fatemeh Askari, Mobin Bagherian, Amirmohammad Izadi, Mahdieh Soleymani Baghshah
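Abstract summary: Large language models (LLMs) exhibit systematic limitations in counting tasks. This paper proposes a simple test-time strategy inspired by System-2 cognitive processes.
Reference score (independently computed attention score): 9.93179257715309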
License: http://creativecommons.org/licenses/by-nc-sa/4.0/
Abstract: Large language models (LLMs), despite strong performance on complex mathematical problems, exhibit systematic limitations in counting tasks. This issue arises from architectural limits of transformers, where counting is performed across layers, leading to degraded precision for larger counting problems due to depth constraints. To address this limitation, we propose a simple test-time strategy inspired by System-2 cognitive processes that decomposes large counting tasks into smaller, independent sub-problems that the model can reliably solve. We evaluate this approach using observational and causal mediation analyses to understand the underlying mechanism of this System-2-like strategy. Our mechanistic analysis identifies key components: latent counts are computed and stored in the final item representations of each part, transferred to intermediate steps via dedicated attention heads, and aggregated in the final stage to produce the total count. Experimental results demonstrate that this strategy enables LLMs to surpass architectural limitations and achieve high accuracy on large-scale counting tasks. This work provides mechanistic insight into System-2 counting in LLMs and presents a generalizable approach for improving and understanding their reasoning behavior.
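The decomposition described in the abstract (split a large counting task into small independent parts, count each part, then aggregate) can be sketched in a few lines. This is only an illustration of the general idea, not the authors' implementation: the function names, the `part_size` default, and the use of an exact counter as a stand-in for the model's per-part count are all assumptions made for this example.

```python
from typing import Callable, List, Sequence


def split_into_parts(items: Sequence[str], part_size: int) -> List[Sequence[str]]:
    """Partition the input into consecutive parts of at most part_size items."""
    if part_size <= 0:
        raise ValueError("part_size must be positive")
    return [items[i:i + part_size] for i in range(0, len(items), part_size)]


def exact_small_count(part: Sequence[str], target: str) -> int:
    """Hypothetical stand-in for the model's count on one small part.

    In the setting the abstract describes, this step would be performed by
    the LLM on a part small enough for it to count reliably.
    """
    return sum(1 for item in part if item == target)


def count_by_decomposition(
    items: Sequence[str],
    target: str,
    part_size: int = 8,  # assumed value, not taken from the paper
    count_part: Callable[[Sequence[str], str], int] = exact_small_count,
) -> int:
    """Count occurrences of target by summing independent per-part counts."""
    parts = split_into_parts(items, part_size)
    # Each part is counted independently of the others.
    partial_counts = [count_part(part, target) for part in parts]
    # Aggregation step: the total is the sum of the partial counts.
    return sum(partial_counts)
```

Passing a different `count_part` callable (for example, one that queries a model on each part) leaves the partitioning and aggregation logic unchanged.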

Related papers