Fugu-MT Paper Translation (Summary): Understanding and Steering the Cognitive Behaviors of Reasoning Models at Test-Time

Paper summary: Understanding and Steering the Cognitive Behaviors of Reasoning Models at Test-Time

arxiv url: http://arxiv.org/abs/2512.24574v1
Date: Wed, 31 Dec 2025 02:46:04 GMT
Status: Translation complete
Last updated in system: 2026-01-01 23:27:28.545031
Title: Understanding and Steering the Cognitive Behaviors of Reasoning Models at Test-Time
Authors: Zhenyu Zhang, Xiaoxia Wu, Zhongzhu Zhou, Qingyang Wu, Yineng Zhang, Pragaash Ponnusamy, Harikaran Subbaraj, Jue Wang, Shuaiwen Leon Song, Ben Athiwaratkun
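Abstract summary: This work studies the structure of reasoning trajectories and uncovers specialized attention heads that correlate with distinct cognitive behaviors. It proposes CREST, a training-free method for Cognitive REasoning Steering at Test-time. CREST adaptively suppresses unproductive reasoning behaviors, yielding both higher accuracy and lower computational cost.
Reference score (attention metric computed by the site): 22.9491443902816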
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Large Language Models (LLMs) often rely on long chain-of-thought (CoT) reasoning to solve complex tasks. While effective, these trajectories are frequently inefficient, leading to high latency from excessive token generation, or unstable reasoning that alternates between underthinking (shallow, inconsistent steps) and overthinking (repetitive, verbose reasoning). In this work, we study the structure of reasoning trajectories and uncover specialized attention heads that correlate with distinct cognitive behaviors such as verification and backtracking. By lightly intervening on these heads at inference time, we can steer the model away from inefficient modes. Building on this insight, we propose CREST, a training-free method for Cognitive REasoning Steering at Test-time. CREST has two components: (1) an offline calibration step that identifies cognitive heads and derives head-specific steering vectors, and (2) an inference-time procedure that rotates hidden representations to suppress components along those vectors. CREST adaptively suppresses unproductive reasoning behaviors, yielding both higher accuracy and lower computational cost. Across diverse reasoning benchmarks and models, CREST improves accuracy by up to 17.5% while reducing token usage by 37.6%, offering a simple and effective pathway to faster, more reliable LLM reasoning.
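The two CREST components named in the abstract (offline calibration of a head-specific steering vector, then an inference-time step that suppresses the component along it) can be illustrated with a minimal sketch. Everything specific here is an assumption rather than the paper's actual procedure: the abstract does not say how steering vectors are derived, so the sketch uses a difference of mean activations, and it reads "rotates" as removing the component along the vector and rescaling back to the original norm. The function names `steering_vector` and `suppress_along`, the `alpha` strength parameter, and the toy activations are all hypothetical.

```python
import math


def dot(a, b):
    """Inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def norm(a):
    """Euclidean norm."""
    return math.sqrt(dot(a, a))


def steering_vector(behavior_acts, baseline_acts):
    """Unit direction from baseline mean to behavior mean.

    ASSUMPTION: difference-of-means calibration; the abstract does not
    specify how head-specific steering vectors are derived.
    Raises ZeroDivisionError if the two means coincide.
    """
    dim = len(behavior_acts[0])
    mean_behavior = [sum(r[i] for r in behavior_acts) / len(behavior_acts)
                     for i in range(dim)]
    mean_baseline = [sum(r[i] for r in baseline_acts) / len(baseline_acts)
                     for i in range(dim)]
    diff = [b - z for b, z in zip(mean_behavior, mean_baseline)]
    n = norm(diff)
    return [d / n for d in diff]


def suppress_along(h, v, alpha=1.0):
    """Remove a fraction alpha of h's component along unit vector v,
    then rescale so the result keeps h's original norm.

    ASSUMPTION: one norm-preserving reading of "rotates hidden
    representations to suppress components along those vectors".
    """
    coeff = alpha * dot(h, v)
    reduced = [x - coeff * u for x, u in zip(h, v)]
    n_reduced = norm(reduced)
    if n_reduced == 0.0:
        # h lies entirely along v; leave it unchanged rather than divide by zero.
        return list(h)
    scale = norm(h) / n_reduced
    return [scale * x for x in reduced]


# Toy calibration data for a single hypothetical head (3-dim outputs).
behavior = [[2.0, 0.0, 1.0], [4.0, 0.0, 1.0]]   # steps showing the behavior
baseline = [[0.0, 0.0, 1.0], [0.0, 0.0, 1.0]]   # steps without it
v = steering_vector(behavior, baseline)

h = [3.0, 4.0, 0.0]            # a head output at inference time
steered = suppress_along(h, v)  # component along v removed, norm kept
```

In this toy case the steered vector has no component along `v` but the same length as `h`, which is what distinguishes a rotation-style intervention from plain projection. A value of `alpha` between 0 and 1 would suppress the direction only partially.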

Related papers