Fugu-MT Paper Translation (Summary): Efficient Reasoning with Balanced Thinking

Paper summary: Efficient Reasoning with Balanced Thinking

arxiv url: http://arxiv.org/abs/2603.12372v1
Date: Thu, 12 Mar 2026 18:48:07 GMT
Status: Translation complete
Last updated in system: 2026-03-16 17:38:11.728565
Title: Efficient Reasoning with Balanced Thinking
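Title (reference translation): Efficient Reasoning through Balanced Thinking
Authors: Yulin Li, Tengyao Tu, Li Ding, Junjie Wang, Huiling Zhen, Yixin Chen, Yong Li, Zhuotao Tian
Abstract summary: Large Reasoning Models (LRMs) show remarkable reasoning capabilities, but they often overthink, spending redundant computational steps on simple problems, or underthink. The authors propose ReBalance, a training-free framework that achieves efficient reasoning with balanced thinking.
Reference score (independently computed attention score): 31.690456174428068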
License: http://creativecommons.org/licenses/by-nc-nd/4.0/
Abstract: Large Reasoning Models (LRMs) have shown remarkable reasoning capabilities, yet they often suffer from overthinking, expending redundant computational steps on simple problems, or underthinking, failing to explore sufficient reasoning paths despite inherent capabilities. These issues lead to inefficiencies and potential inaccuracies, limiting practical deployment in resource-constrained settings. Existing methods to mitigate overthinking, such as suppressing reflective keywords or adjusting reasoning length, may inadvertently induce underthinking, compromising accuracy. Therefore, we propose ReBalance, a training-free framework that achieves efficient reasoning with balanced thinking. ReBalance leverages confidence as a continuous indicator of reasoning dynamics, identifying overthinking through high confidence variance and underthinking via consistent overconfidence. By aggregating hidden states from a small-scale dataset into reasoning mode prototypes, we compute a steering vector to guide LRMs' reasoning trajectories. A dynamic control function modulates this vector's strength and direction based on real-time confidence, pruning redundancy during overthinking, and promoting exploration during underthinking. Extensive experiments conducted on four models ranging from 0.5B to 32B, and across nine benchmarks in math reasoning, general question answering, and coding tasks demonstrate that ReBalance effectively reduces output redundancy while improving accuracy, offering a general, training-free, and plug-and-play strategy for efficient and robust LRM deployment. Code is available at https://github.com/yu-lin-li/ReBalance .
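Abstract (reference translation): Large Reasoning Models (LRMs) have shown remarkable reasoning capabilities, but they often overthink, spending redundant computational steps on simple problems, or underthink, failing to explore enough reasoning paths despite their inherent capabilities. These problems cause inefficiency and potential inaccuracy, and they limit practical deployment in resource-constrained settings. Existing methods for mitigating overthinking, such as suppressing reflective keywords or adjusting reasoning length, can inadvertently induce underthinking and reduce accuracy. The paper therefore proposes ReBalance, a training-free framework that achieves efficient reasoning with balanced thinking. ReBalance uses confidence as a continuous indicator of reasoning dynamics, identifying overthinking through high confidence variance and underthinking through consistent overconfidence. Hidden states from a small-scale dataset are aggregated into reasoning mode prototypes, from which a steering vector is computed to guide the LRM's reasoning trajectory. A dynamic control function modulates the strength and direction of this vector based on real-time confidence, pruning redundancy during overthinking and promoting exploration during underthinking. Extensive experiments on four models ranging from 0.5B to 32B, across nine benchmarks in math reasoning, general question answering, and coding tasks, show that ReBalance effectively reduces output redundancy while improving accuracy, offering a general, training-free, plug-and-play strategy for efficient and robust LRM deployment. Code is available at https://github.com/yu-lin-li/ReBalance.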
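
The mechanism described in the abstract (prototypes built from hidden states, a steering vector, and a confidence-driven control function) can be illustrated with a toy sketch. This is not the authors' implementation, which is in the repository linked above. All function names, the thresholds, the piecewise control rule, and the choice to define the vector as "underthinking prototype minus overthinking prototype" are assumptions made only for illustration; hidden states are plain Python lists here rather than real model activations.

```python
from statistics import mean, pvariance


def prototype(states):
    """Average a list of equal-length hidden-state vectors into one prototype."""
    return [sum(column) / len(states) for column in zip(*states)]


def steering_vector(overthinking_states, underthinking_states):
    """Assumed definition: direction from the overthinking prototype
    to the underthinking prototype."""
    over = prototype(overthinking_states)
    under = prototype(underthinking_states)
    return [u - o for o, u in zip(over, under)]


def control_coefficient(confidences, var_threshold=0.02,
                        conf_threshold=0.9, strength=1.0):
    """Map a window of recent step confidences to a signed steering strength.

    The thresholds and the piecewise rule are illustrative assumptions,
    not values from the paper.
    """
    if pvariance(confidences) > var_threshold:
        # Fluctuating confidence: treat as overthinking, steer away from it.
        return strength
    if mean(confidences) > conf_threshold:
        # Steady, very high confidence: treat as underthinking, steer back
        # toward more exploration.
        return -strength
    # Neither signal fires: leave the hidden state untouched.
    return 0.0


def steer(hidden_state, vector, coefficient):
    """Add the scaled steering vector to a hidden state."""
    return [h + coefficient * v for h, v in zip(hidden_state, vector)]


if __name__ == "__main__":
    vec = steering_vector([[1.0, 0.0], [3.0, 0.0]], [[0.0, 1.0], [0.0, 3.0]])
    coeff = control_coefficient([0.2, 0.9, 0.3, 0.95])
    print(vec, coeff, steer([1.0, 1.0], vec, coeff))
```

In this sketch the sign of the coefficient selects the direction (prune versus explore) and its magnitude selects the strength. A continuous control function, as the abstract describes, would replace the hard thresholds with a smooth mapping from confidence statistics to the coefficient.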

Related papers