Fugu-MT Paper Translation (Abstract): Reasoning Efficiently Through Adaptive Chain-of-Thought Compression: A Self-Optimizing Framework

Paper overview: Reasoning Efficiently Through Adaptive Chain-of-Thought Compression: A Self-Optimizing Framework

arxiv url: http://arxiv.org/abs/2509.14093v1
Date: Wed, 17 Sep 2025 15:33:44 GMT
Status: Translation complete
Last updated in system: 2025-09-18 18:41:50.904197
Title: Reasoning Efficiently Through Adaptive Chain-of-Thought Compression: A Self-Optimizing Framework
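Title (reference translation): Efficient Reasoning via Adaptive Chain-of-Thought Compression: A Self-Optimizing Framework
Authors: Kerui Huang, Shuhan Liu, Xing Hu, Tongtong Xu, Lingfeng Bao, Xin Xia
Abstract summary: Chain-of-Thought (CoT) reasoning enhances Large Language Models (LLMs), but longer outputs increase latency, memory usage, and KV-cache demands. The authors propose SEER (Self-Enhancing Efficient Reasoning), an adaptive framework that compresses CoT while preserving accuracy.
Reference score (independently computed attention score): 10.148124073650349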
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Chain-of-Thought (CoT) reasoning enhances Large Language Models (LLMs) by prompting intermediate steps, improving accuracy and robustness in arithmetic, logic, and commonsense tasks. However, this benefit comes with high computational costs: longer outputs increase latency, memory usage, and KV-cache demands. These issues are especially critical in software engineering tasks where concise and deterministic outputs are required. To investigate these trade-offs, we conduct an empirical study based on code generation benchmarks. The results reveal that longer CoT does not always help. Excessive reasoning often causes truncation, accuracy drops, and latency up to five times higher, with failed outputs consistently longer than successful ones. These findings challenge the assumption that longer reasoning is inherently better and highlight the need for adaptive CoT control. Motivated by this, we propose SEER (Self-Enhancing Efficient Reasoning), an adaptive framework that compresses CoT while preserving accuracy. SEER combines Best-of-N sampling with task-aware adaptive filtering, dynamically adjusting thresholds based on pre-inference outputs to reduce verbosity and computational overhead. We then evaluate SEER on three software engineering tasks and one math task. On average, SEER shortens CoT by 42.1%, improves accuracy by reducing truncation, and eliminates most infinite loops. These results demonstrate SEER as a practical method to make CoT-enhanced LLMs more efficient and robust, even under resource constraints.
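Abstract (reference translation): Chain-of-Thought (CoT) reasoning enhances Large Language Models (LLMs) by eliciting intermediate steps, improving accuracy and robustness on arithmetic, logic, and commonsense tasks. This benefit, however, carries a high computational cost: longer outputs increase latency, memory usage, and KV-cache demands. These issues matter most in software engineering tasks, which require concise and deterministic outputs. To examine these trade-offs, the authors conduct an empirical study on code generation benchmarks. The results show that longer CoT does not always help. Excessive reasoning often causes truncation, accuracy drops, and up to five times higher latency, and failed outputs are consistently longer than successful ones. These findings challenge the assumption that longer reasoning is inherently better and underline the need for adaptive CoT control. The authors therefore propose SEER (Self-Enhancing Efficient Reasoning), an adaptive framework that compresses CoT while preserving accuracy. SEER combines Best-of-N sampling with task-aware adaptive filtering, dynamically adjusting thresholds based on pre-inference outputs to reduce verbosity and computational overhead. SEER is then evaluated on three software engineering tasks and one math task. On average, SEER shortens CoT by 42.1%, improves accuracy by reducing truncation, and eliminates most infinite loops. These results indicate that SEER is a practical method for making CoT-enhanced LLMs more efficient and robust, even under resource constraints.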
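
The abstract describes SEER only at a high level: Best-of-N sampling combined with adaptive filtering whose threshold is derived from pre-inference outputs. The Python sketch below illustrates that general idea and is not the authors' implementation. The `Candidate` type, both function names, the median-based threshold rule, and the "shortest correct candidate" selection rule are all assumptions made for illustration; the abstract does not specify any of them.

```python
from dataclasses import dataclass
from statistics import median
from typing import List, Optional


@dataclass
class Candidate:
    """One sampled model output (hypothetical structure, not from the paper)."""
    text: str          # reasoning plus final answer
    num_tokens: int    # length of the chain of thought, in tokens
    is_correct: bool   # whether the final answer passed the task's check


def adaptive_threshold(pre_inference: List[Candidate], fallback: int) -> int:
    """Derive a length cap from pre-inference outputs.

    Assumed rule: the median token length of the correct outputs.
    Falls back to a fixed cap when no correct output is available.
    """
    correct_lengths = [c.num_tokens for c in pre_inference if c.is_correct]
    if not correct_lengths:
        return fallback
    return int(median(correct_lengths))


def best_of_n(candidates: List[Candidate], max_tokens: int) -> Optional[Candidate]:
    """Pick the shortest correct candidate within the cap, or None if none qualifies."""
    kept = [c for c in candidates if c.is_correct and c.num_tokens <= max_tokens]
    return min(kept, key=lambda c: c.num_tokens) if kept else None


# Toy data: lengths and correctness flags are made up for the example.
pre_inference = [
    Candidate("short and right", 100, True),
    Candidate("longer and right", 300, True),
    Candidate("very long and wrong", 900, False),
]
cap = adaptive_threshold(pre_inference, fallback=512)  # median of [100, 300] -> 200

samples = [
    Candidate("concise, correct", 150, True),
    Candidate("verbose, correct", 250, True),
    Candidate("brief, wrong", 80, False),
]
chosen = best_of_n(samples, cap)  # the 150-token candidate
```

In a self-improving setup, candidates selected this way could serve as compact training targets, which is one plausible reading of "Self-Enhancing"; the abstract itself does not describe that step.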

Related papers