Fugu-MT Paper Translation (Summary): A Self-Evolving Framework for Efficient Terminal Agents via Observational Context Compression

Paper summary: A Self-Evolving Framework for Efficient Terminal Agents via Observational Context Compression

arXiv URL: http://arxiv.org/abs/2604.19572v2
Date: Tue, 28 Apr 2026 18:09:49 GMT
Status: Translation complete
Last updated in system: 2026-04-30 13:51:53.853305
Title: A Self-Evolving Framework for Efficient Terminal Agents via Observational Context Compression
Authors: Jincheng Ren, Siwei Wu, Yizhi Li, Kang Zhu, Shu Xu, Boyu Feng, Ruibin Yuan, Wei Zhang, Riza Batista-Navarro, Jian Yang, Chenghua Lin
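Abstract summary: TACO is a plug-and-play, training-free, self-evolving Terminal Agent Compression framework for existing terminal agents. It discovers, refines, and reuses structured compression rules from interaction trajectories, and it consistently improves task performance and token efficiency across agent scaffolds and backbone models.
Reference score (independently computed attention metric): 39.60395856651371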
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: As terminal agents scale to long-horizon, multi-turn workflows, a key bottleneck is not merely limited context length, but the accumulation of noisy terminal observations in the interaction history. Retaining raw observations preserves useful environment feedback, but also leads to context saturation and high token cost; conversely, naive compression may discard task-critical signals needed for subsequent actions. Because terminal environments are highly heterogeneous across repositories, commands, and execution states, heuristic-based or fixed-prompt compression methods are difficult to generalize. We propose TACO, a plug-and-play, training-free, self-evolving Terminal Agent Compression framework for existing terminal agents. TACO automatically discovers, refines, and reuses structured compression rules from interaction trajectories, enabling workflow-adaptive filtering of low-value terminal outputs while preserving task-relevant observations. Experiments on TerminalBench (TB 1.0 and TB 2.0) and four additional terminal-related benchmarks, including SWE-Bench Lite, CompileBench, DevEval, and CRUST-Bench, show that TACO consistently improves task performance and token efficiency across agent scaffolds and backbone models. On TerminalBench, TACO yields 1%-4% accuracy gains across strong agentic models and improves accuracy by around 2%-3% under the same token budget. On additional terminal-related benchmarks, it reduces total token consumption while maintaining or improving task success rates. These results suggest that self-evolving, workflow-adaptive observation compression is an effective path toward more reliable and efficient long-horizon terminal agents. The code is publicly available at https://github.com/multimodal-art-projection/TACO.
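Illustrative sketch (not from the paper): the abstract describes reusing structured compression rules to filter low-value terminal output while keeping task-relevant observations. The Python below is a minimal, hypothetical illustration of that reuse step only. The names `CompressionRule` and `RulePool`, the regex-based rule format, and the `pip install` example are all assumptions made for illustration; they are not taken from the TACO codebase, and rule discovery and refinement are not shown.

```python
import re
from dataclasses import dataclass, field


@dataclass
class CompressionRule:
    """Hypothetical rule: for commands matching `command_pattern`,
    drop output lines that match `drop_pattern`."""
    command_pattern: str
    drop_pattern: str
    uses: int = 0  # number of times this rule has been applied

    def applies_to(self, command: str) -> bool:
        return re.search(self.command_pattern, command) is not None

    def compress(self, observation: str) -> str:
        # Keep only the lines that the rule does not mark as low-value.
        kept = [
            line
            for line in observation.splitlines()
            if not re.search(self.drop_pattern, line)
        ]
        self.uses += 1
        return "\n".join(kept)


@dataclass
class RulePool:
    """Hypothetical pool of rules that is reused across agent turns."""
    rules: list[CompressionRule] = field(default_factory=list)

    def compress(self, command: str, observation: str) -> str:
        # Apply every matching rule; unmatched commands pass through unchanged.
        for rule in self.rules:
            if rule.applies_to(command):
                observation = rule.compress(observation)
        return observation


# Assumed example rule: progress lines from `pip install` are low-value.
pool = RulePool([
    CompressionRule(
        command_pattern=r"^pip install",
        drop_pattern=r"^(Collecting|Downloading|Requirement already satisfied)",
    )
])

raw = "\n".join([
    "Collecting requests",
    "Downloading requests-2.32.0-py3-none-any.whl",
    "ERROR: No matching distribution found for foo",
])

compressed = pool.compress("pip install requests foo", raw)
print(compressed)
```

In this sketch the error line survives because no drop pattern matches it, while the progress lines are filtered out. Output from commands that no rule covers is returned unchanged.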

Related papers