Fugu-MT Paper Translation (Abstract): Agent-Omit: Training Efficient LLM Agents for Adaptive Thought and Observation Omission via Agentic Reinforcement Learning

Paper overview: Agent-Omit: Training Efficient LLM Agents for Adaptive Thought and Observation Omission via Agentic Reinforcement Learning

arxiv url: http://arxiv.org/abs/2602.04284v1
Date: Wed, 04 Feb 2026 07:26:23 GMT
Status: Translation complete
Last updated in system: 2026-02-05 19:45:11.420382
Title: Agent-Omit: Training Efficient LLM Agents for Adaptive Thought and Observation Omission via Agentic Reinforcement Learning
Title (reference translation): Agent-Omit: Training Efficient LLM Agents for Adaptive Thought and Observation Omission via Agentic Reinforcement Learning
Authors: Yansong Ning, Jun Fang, Naiqiang Tan, Hao Liu
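Abstract summary: Managing agent thought and observation during multi-turn agent-environment interactions is an emerging strategy for improving efficiency. This paper proposes Agent-Omit, a unified training framework that enables LLM agents to adaptively omit redundant thoughts and observations. Experimental results show that the constructed Agent-Omit-8B achieves performance comparable to seven frontier LLM agents.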
Reference score (independently computed attention score): 15.39565540937229
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Managing agent thought and observation during multi-turn agent-environment interactions is an emerging strategy to improve agent efficiency. However, existing studies treat entire interaction trajectories equally, overlooking that thought necessity and observation utility vary across turns. To this end, we first conduct quantitative investigations into how thought and observation affect agent effectiveness and efficiency. Based on our findings, we propose Agent-Omit, a unified training framework that empowers LLM agents to adaptively omit redundant thoughts and observations. Specifically, we first synthesize a small amount of cold-start data, including both single-turn and multi-turn omission scenarios, to fine-tune the agent for omission behaviors. Furthermore, we introduce an omit-aware agentic reinforcement learning approach, incorporating a dual sampling mechanism and a tailored omission reward to incentivize the agent's adaptive omission capability. Theoretically, we prove that the deviation of our omission policy is upper-bounded by the KL-divergence. Experimental results on five agent benchmarks show that our constructed Agent-Omit-8B obtains performance comparable to seven frontier LLM agents and achieves the best effectiveness-efficiency trade-off compared with seven efficient LLM agent methods. Our code and data are available at https://github.com/usail-hkust/Agent-Omit.
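Abstract (reference translation): Managing agent thought and observation in multi-turn agent-environment interactions is an emerging strategy for improving agent efficiency. However, existing studies treat all interaction trajectories equally, overlooking that thought necessity and observation utility vary across turns. To this end, we first quantitatively investigate how thought and observation affect agent effectiveness and efficiency. Based on our findings, we propose Agent-Omit, a unified training framework that enables LLM agents to adaptively omit redundant thoughts and observations. Specifically, we synthesize a small amount of cold-start data covering single-turn and multi-turn omission scenarios to fine-tune the agent for omission behaviors. Furthermore, we introduce an omit-aware agentic reinforcement learning approach that incorporates a dual sampling mechanism and a tailored omission reward to strengthen the agent's adaptive omission capability. Theoretically, we prove that the deviation of the omission policy is upper-bounded by the KL-divergence. Experimental results on five agent benchmarks show that the constructed Agent-Omit-8B achieves performance comparable to seven frontier LLM agents and a better effectiveness-efficiency trade-off than seven efficient LLM agent methods. Our code and data are available at https://github.com/usail-hkust/Agent-Omit.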

Related papers