Fugu-MT Paper Translation (Abstract): LightThinker++: From Reasoning Compression to Memory Management

Paper overview: LightThinker++: From Reasoning Compression to Memory Management

arxiv url: http://arxiv.org/abs/2604.03679v1
Date: Sat, 04 Apr 2026 10:46:09 GMT
Status: Translation complete
Last updated in system: 2026-04-07 15:49:18.720044
Title: LightThinker++: From Reasoning Compression to Memory Management
Authors: Yuqi Zhu, Jintian Zhang, Zhenjie Wan, Yujie Luo, Shuofei Qiao, Zhengke Gui, Da Zheng, Lei Liang, Huajun Chen, Ningyu Zhang
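Abstract summary: Large language models (LLMs) excel at complex reasoning, but their efficiency is limited by the growing cognitive overhead of long thought traces. The authors propose LightThinker, a method that lets LLMs dynamically compress intermediate thoughts into compact semantic representations. They then evolve the framework into LightThinker++, which introduces Explicit Adaptive Memory Management.
Reference score (attention score computed independently by this site): 61.2260619973687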
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Large language models (LLMs) excel at complex reasoning, yet their efficiency is limited by the surging cognitive overhead of long thought traces. In this paper, we propose LightThinker, a method that enables LLMs to dynamically compress intermediate thoughts into compact semantic representations. However, static compression often struggles with complex reasoning where the irreversible loss of intermediate details can lead to logical bottlenecks. To address this, we evolve the framework into LightThinker++, introducing Explicit Adaptive Memory Management. This paradigm shifts to behavioral-level management by incorporating explicit memory primitives, supported by a specialized trajectory synthesis pipeline to train purposeful memory scheduling. Extensive experiments demonstrate the framework's versatility across three dimensions. (1) LightThinker reduces peak token usage by 70% and inference time by 26% with minimal accuracy loss. (2) In standard reasoning, LightThinker++ slashes peak token usage by 69.9% while yielding a +2.42% accuracy gain under the same context budget for maximum performance. (3) Most notably, in long-horizon agentic tasks, it maintains a stable footprint beyond 80 rounds (a 60%-70% reduction), achieving an average performance gain of 14.8% across different complex scenarios. Overall, our work provides a scalable direction for sustaining deep LLM reasoning over extended horizons with minimal overhead.
