Fugu-MT Paper Translation (Abstract): When Do LLMs Reason? A Dynamical Systems View via Entropy Phase Transitions

Paper overview: When Do LLMs Reason? A Dynamical Systems View via Entropy Phase Transitions

arxiv url: http://arxiv.org/abs/2605.22873v1
Date: Wed, 20 May 2026 03:15:46 GMT
Status: Translation complete
Last updated in system: 2026-05-25 17:29:20.013095
Title: When Do LLMs Reason? A Dynamical Systems View via Entropy Phase Transitions
Title (reference translation): When Do LLMs Reason? A Dynamical Systems View via Entropy Phase Transitions
Authors: Wei Xia, Haoqing Wang, Zhi-Hong Deng, Yehui Tang
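Abstract summary: Chain-of-thought (CoT) reasoning has become the default strategy for enhancing LLM capabilities, yet its application raises a fundamental question. CoT often yields marginal or even negative gains on factual and open-ended tasks while multiplying token consumption. The paper shows that LLM reasoning is not a static property of tasks or models but a dynamic decoding state that emerges during generation, and it proposes EDRM (Entropy Dynamics-based Reasoning Manifold), a lightweight, training-free routing framework.
Reference score (attention score computed independently by the site): 42.557327309177175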
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Chain-of-thought (CoT) reasoning has become the default strategy for enhancing LLM capabilities, yet its application raises a fundamental question: when is explicit reasoning actually beneficial? Empirical evidence reveals a striking paradox: CoT often provides marginal or even negative gains on factual and open-ended tasks while multiplying token consumption. In this work, we show that LLM reasoning is not a static property of tasks or models, but a dynamic decoding state that emerges during generation. Through systematic analysis, we find early-stage entropy dynamics provide a reliable signal of this state: tasks benefiting from CoT exhibit consistent entropy reduction, while others display unstable or increasing patterns. This behavior can be interpreted as a phase-transition-like shift from a high-entropy exploratory regime to a low-entropy structured reasoning regime. Based on these insights, we propose EDRM (Entropy Dynamics-based Reasoning Manifold), a lightweight and training-free routing framework that leverages early decoding entropy to adaptively select inference strategies. EDRM embeds entropy trajectories into a compact and interpretable manifold representation, enabling both zero-shot deployment and fine-grained instance-level adaptation. Across 15 benchmarks and 4 LLMs of varying scales and architectures, EDRM consistently outperforms static baselines. At the dataset level, EDRM achieves 41-55% token reduction while improving accuracy with as few as 50 calibration samples. At the instance level, it further improves accuracy by up to 4.7% while maintaining 27-45% token savings. These results suggest that reasoning should be invoked selectively rather than by default, and demonstrate the effectiveness of entropy-driven decoding control for efficient and adaptive LLM inference.
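Abstract (reference translation): Chain-of-thought (CoT) reasoning has become the default strategy for extending LLM capabilities, but its use raises a fundamental question: when does explicit reasoning actually help? CoT often brings marginal or even negative gains on factual and open-ended tasks while multiplying token consumption. This work shows that LLM reasoning is not a static property of a task or a model but a dynamic decoding state that emerges during generation. Systematic analysis finds that early-stage entropy dynamics are a reliable signal of this state: tasks that benefit from CoT show a consistent decrease in entropy, while other tasks show unstable or increasing patterns. This behavior can be interpreted as a phase-transition-like shift from a high-entropy exploratory regime to a low-entropy structured reasoning regime. Building on these findings, the authors propose EDRM (Entropy Dynamics-based Reasoning Manifold), a lightweight, training-free routing framework that uses early decoding entropy to select inference strategies adaptively. EDRM embeds entropy trajectories into a compact, interpretable manifold representation, which supports both zero-shot deployment and fine-grained instance-level adaptation. Across 15 benchmarks and 4 LLMs of varying scales and architectures, EDRM consistently outperforms static baselines. At the dataset level, EDRM achieves a 41-55% token reduction while improving accuracy with as few as 50 calibration samples. At the instance level, it further improves accuracy by up to 4.7% while maintaining 27-45% token savings. These results suggest that reasoning should be invoked selectively rather than by default, and they demonstrate the effectiveness of entropy-driven decoding control for efficient, adaptive LLM inference.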
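
The abstract describes routing on early decoding entropy but does not say how entropy is computed or how the routing decision is made. The following is a minimal Python sketch of the general idea only, not the authors' EDRM method: it assumes per-step Shannon entropy of the softmax over next-token logits, and it assumes a simple rule that picks CoT when the least-squares slope of early entropies is negative. The function names (`token_entropy`, `entropy_slope`, `route`) and the `slope_threshold` parameter are hypothetical, introduced here for illustration.

```python
import math
from typing import Sequence


def token_entropy(logits: Sequence[float]) -> float:
    """Shannon entropy (in nats) of softmax(logits) for one decoding step."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    z = sum(exps)
    probs = [e / z for e in exps]
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def entropy_slope(entropies: Sequence[float]) -> float:
    """Least-squares slope of entropy against the decoding step index."""
    n = len(entropies)
    if n < 2:
        raise ValueError("need at least two decoding steps")
    mean_t = (n - 1) / 2.0
    mean_h = sum(entropies) / n
    cov = sum((t - mean_t) * (h - mean_h) for t, h in enumerate(entropies))
    var = sum((t - mean_t) ** 2 for t in range(n))
    return cov / var


def route(early_logits: Sequence[Sequence[float]],
          slope_threshold: float = 0.0) -> str:
    """Assumed rule: falling early entropy -> "cot", otherwise "direct"."""
    entropies = [token_entropy(step) for step in early_logits]
    return "cot" if entropy_slope(entropies) < slope_threshold else "direct"


# Toy logits over a 4-token vocabulary: the distribution sharpens step by step,
# so entropy falls and the assumed rule selects CoT.
sharpening = [[0.0, 0.0, 0.0, 0.0], [2.0, 0.0, 0.0, 0.0], [5.0, 0.0, 0.0, 0.0]]
print(route(sharpening))
```

A single slope is a deliberately crude summary of the trajectory. The abstract mentions a manifold representation of entropy trajectories and calibration on as few as 50 samples, neither of which this sketch attempts to reproduce.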

Related papers