Fugu-MT Paper Translation (Abstract): Stop When Further Reasoning Won't Help: Attention-State Adaptive Generation in Reasoning Models

Paper overview: Stop When Further Reasoning Won't Help: Attention-State Adaptive Generation in Reasoning Models

arxiv url: http://arxiv.org/abs/2606.15070v1
Date: Sat, 13 Jun 2026 02:58:29 GMT
Status: Translation complete
System update date: 2026-06-16 16:21:32.762255
Title: Stop When Further Reasoning Won't Help: Attention-State Adaptive Generation in Reasoning Models
Title (reference translation): Extra Reasoning Does Not Help: Attention-State Adaptive Generation in Reasoning Models
Authors: Jiakai Li, Ke Qin, Rongzheng Wang, Yizhuo Ma, Qizhi Chen, Muquan Li, Shuang Liang
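Abstract summary: Large reasoning models (LRMs) can solve complex problems through explicit chain-of-thought reasoning processes. However, LRMs often suffer from overthinking, which produces redundant token output and degrades accuracy. This paper proposes ASAG, which infers the model's reasoning state and adaptively adjusts the generation strategy.
Reference score (independently computed attention metric): 11.158010513386666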
License: http://creativecommons.org/licenses/by/4.0/
Abstract: By incorporating test-time compute scaling, large reasoning models (LRMs) can solve complex problems through explicit chain-of-thought (CoT) reasoning processes. However, they often suffer from overthinking, resulting in redundant token outputs and degraded accuracy. Current methods to mitigate this issue remain limited: training-based approaches require substantial computational resources, while training-free methods rely on well-crafted prompts or unreliable confidence signals. In this work, we investigate early stopping from the perspective of attention distributions and propose a simple method, ASAG, which infers the model's reasoning state and adaptively adjusts the generation strategy. The proposed framework is training-free and plug-and-play, enabling seamless integration into existing LRMs. Extensive experiments on nine benchmarks demonstrate consistent improvements across mainstream LRMs with varying parameter scales, including the DeepSeek-R1-Distill and Qwen3 series. Specifically, ASAG improves average accuracy by 3.2% while reducing the number of generated tokens by nearly 40% across all reasoning tasks on Qwen3-8B.
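Abstract (reference translation): By incorporating test-time compute scaling, large reasoning models (LRMs) can solve complex problems through an explicit chain-of-thought (CoT) reasoning process. However, they often suffer from overthinking, which results in redundant token output and reduced accuracy. Training-based approaches require substantial computational resources, whereas training-free methods depend on carefully crafted prompts or unreliable confidence signals. This work examines early stopping from the perspective of attention distributions and proposes ASAG, a simple method that infers the model's reasoning state and adaptively adjusts the generation strategy. The proposed framework is training-free and plug-and-play, allowing seamless integration into existing LRMs. Extensive experiments on nine benchmarks show consistent improvements across mainstream LRMs of varying parameter scales, including the DeepSeek-R1-Distill and Qwen3 series. Specifically, ASAG improves average accuracy by 3.2% while reducing the number of generated tokens by nearly 40% across all reasoning tasks on Qwen3-8B.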
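
The abstract does not say how ASAG reads the attention distribution or what rule triggers a stop, so the following is only a minimal sketch of the general idea of attention-driven early stopping, not the authors' method. It assumes, purely for illustration, that the signal is the Shannon entropy of each step's attention weights and that low entropy held for several steps indicates a settled reasoning state. The names `attention_entropy`, `should_stop`, `generate_with_early_stop`, `threshold`, and `patience` are all hypothetical.

```python
import math
from typing import Iterable, List, Sequence, Tuple


def attention_entropy(weights: Sequence[float]) -> float:
    """Shannon entropy (in nats) of one attention distribution.

    Weights are renormalized first; zero weights contribute nothing.
    """
    total = sum(weights)
    probs = [w / total for w in weights if w > 0]
    return -sum(p * math.log(p) for p in probs)


def should_stop(entropies: Sequence[float], threshold: float, patience: int) -> bool:
    """Hypothetical rule (an assumption, not taken from the paper):
    stop once the last `patience` entropies are all below `threshold`."""
    if len(entropies) < patience:
        return False
    return all(e < threshold for e in entropies[-patience:])


def generate_with_early_stop(
    steps: Iterable[Tuple[str, Sequence[float]]],
    threshold: float = 0.5,
    patience: int = 2,
) -> List[str]:
    """Consume (token, attention_weights) pairs and halt when the rule fires.

    `steps` stands in for a real decoding loop that exposes attention weights.
    """
    tokens: List[str] = []
    entropies: List[float] = []
    for token, weights in steps:
        tokens.append(token)
        entropies.append(attention_entropy(weights))
        if should_stop(entropies, threshold, patience):
            break
    return tokens


# Toy trace: one diffuse step, then two sharply peaked steps, then more text.
trace = [
    ("a", [0.25, 0.25, 0.25, 0.25]),
    ("b", [0.97, 0.01, 0.01, 0.01]),
    ("c", [0.97, 0.01, 0.01, 0.01]),
    ("d", [0.25, 0.25, 0.25, 0.25]),
]
kept = generate_with_early_stop(trace)
```

On the toy trace the loop halts after the second peaked step, so `kept` holds three tokens and the fourth is never generated. A real integration would need to choose which layers and heads to read and to calibrate the threshold per model; none of those choices are stated in the abstract.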

Related papers