Fugu-MT Paper Translation (Summary): A State-Transition Framework for Efficient LLM Reasoning

Paper summary: A State-Transition Framework for Efficient LLM Reasoning

arxiv url: http://arxiv.org/abs/2602.01198v1
Date: Sun, 01 Feb 2026 12:40:40 GMT
Status: Translation complete
Last updated in system: 2026-02-03 19:28:33.656162
Title: A State-Transition Framework for Efficient LLM Reasoning
Authors: Liang Zhang, Yu Zhao, Longyue Wang, Tianqi Shi, Weihua Luo, Kaifu Zhang, Jinsong Su
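Abstract summary: Long Chain-of-Thought (CoT) reasoning significantly improves the performance of Large Language Models (LLMs) on complex reasoning tasks. Existing studies usually improve the reasoning efficiency of LLMs by compressing CoT sequences. This paper proposes an efficient reasoning framework that models the reasoning process of LLMs as a state-transition process.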
Reference score (independently computed attention level): 58.18141262230392
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: While Long Chain-of-Thought (CoT) reasoning significantly improves the performance of Large Language Models (LLMs) on complex reasoning tasks, the substantial computational and memory costs of generating long CoT sequences limit their efficiency and practicality. Existing studies usually enhance the reasoning efficiency of LLMs by compressing CoT sequences. However, this approach conflicts with test-time scaling, limiting the reasoning capacity of LLMs. In this paper, we propose an efficient reasoning framework that models the reasoning process of LLMs as a state-transition process. Specifically, we first apply a linear attention mechanism to estimate the LLM's reasoning state, which records the historical reasoning information from previous reasoning steps. Then, based on the query prompt and the reasoning state, the LLM can efficiently perform the current reasoning step and update the state. With the linear attention, each token in the current reasoning step can directly retrieve relevant historical reasoning information from the reasoning state, without explicitly attending to tokens in previous reasoning steps. In this way, the computational complexity of attention is reduced from quadratic to linear, significantly improving the reasoning efficiency of LLMs. In addition, we propose a state-based reasoning strategy to mitigate the over-thinking issue caused by noisy reasoning steps. Extensive experiments across multiple datasets and model sizes demonstrate that our framework not only improves the reasoning efficiency of LLMs but also enhances their reasoning performance.
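The abstract describes a reasoning state that is updated step by step and read by each new token in place of attending to all earlier tokens. The following is a minimal sketch of generic linear attention under stated assumptions, not the authors' implementation: the feature map `phi` (elu + 1), the class name `LinearAttentionState`, and the toy vectors are all hypothetical choices made for illustration. It shows that reading from a fixed-size running state gives the same result as explicitly attending to every stored key and value, which is the property that lets the per-token cost stay constant as the history grows.

```python
import math

def phi(x):
    # Positive feature map elu(x) + 1. This is an assumption; the abstract
    # does not say which feature map the paper uses.
    return [v + 1.0 if v > 0 else math.exp(v) for v in x]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

class LinearAttentionState:
    """Hypothetical running state: S = sum_i phi(k_i) v_i^T, z = sum_i phi(k_i)."""

    def __init__(self, d_k, d_v):
        self.S = [[0.0] * d_v for _ in range(d_k)]  # d_k x d_v matrix
        self.z = [0.0] * d_k                        # normalizer vector

    def update(self, k, v):
        # Fold one key/value pair into the state; cost does not depend on history length.
        fk = phi(k)
        for i, fki in enumerate(fk):
            self.z[i] += fki
            for j, vj in enumerate(v):
                self.S[i][j] += fki * vj

    def read(self, q):
        # Retrieve history for a query without touching individual past tokens.
        fq = phi(q)
        denom = dot(fq, self.z)
        d_v = len(self.S[0])
        return [sum(fq[i] * self.S[i][j] for i in range(len(fq))) / denom
                for j in range(d_v)]

def quadratic_reference(q, keys, values):
    # Same kernelized attention, computed by explicitly visiting every past token.
    fq = phi(q)
    weights = [dot(fq, phi(k)) for k in keys]
    total = sum(weights)
    d_v = len(values[0])
    return [sum(w * v[j] for w, v in zip(weights, values)) / total
            for j in range(d_v)]

# Toy history of three tokens (illustrative numbers only).
keys = [[0.1, -0.3], [0.5, 0.2], [-0.7, 0.4]]
values = [[1.0, 0.0, 2.0], [0.0, 1.0, -1.0], [3.0, 1.0, 0.5]]
query = [0.2, -0.1]

state = LinearAttentionState(d_k=2, d_v=3)
for k, v in zip(keys, values):
    state.update(k, v)

from_state = state.read(query)
from_tokens = quadratic_reference(query, keys, values)
```

In this sketch `from_state` and `from_tokens` agree up to floating-point error, while `state` keeps the same size regardless of how many tokens have been folded in. How the paper segments reasoning steps, combines the state with the query prompt, or applies its state-based reasoning strategy is not specified in the abstract and is not modeled here.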


Related papers