Fugu-MT Paper Translation (Summary): EvolveR: Self-Evolving LLM Agents through an Experience-Driven Lifecycle

Paper summary: EvolveR: Self-Evolving LLM Agents through an Experience-Driven Lifecycle

arxiv url: http://arxiv.org/abs/2510.16079v1
Date: Fri, 17 Oct 2025 12:03:16 GMT
Status: Translation complete
Last updated in system: 2025-10-25 00:56:38.836111
Title: EvolveR: Self-Evolving LLM Agents through an Experience-Driven Lifecycle
Authors: Rong Wu, Xiaoman Wang, Jianbiao Mei, Pinlong Cai, Daocheng Fu, Cheng Yang, Licheng Wen, Xuemeng Yang, Yufan Shen, Yuxin Wang, Botian Shi
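Abstract summary: Current Large Language Model (LLM) agents perform well at tool use but lack the ability to learn systematically from their own experiences. EvolveR is a framework designed to let agents self-improve through a complete, closed-loop experience lifecycle. The paper demonstrates the effectiveness of EvolveR on complex multi-hop question-answering benchmarks.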
Reference score (independently computed attention score): 26.048906477714937
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Current Large Language Model (LLM) agents show strong performance in tool use, but lack the crucial capability to systematically learn from their own experiences. While existing frameworks mainly focus on mitigating external knowledge gaps, they fail to address a more fundamental limitation: the inability to iteratively refine problem-solving strategies. In this work, we introduce EvolveR, a framework designed to enable agents to self-improve through a complete, closed-loop experience lifecycle. This lifecycle comprises two key stages: (1) Offline Self-Distillation, where the agent's interaction trajectories are synthesized into a structured repository of abstract, reusable strategic principles; (2) Online Interaction, where the agent interacts with tasks and actively retrieves distilled principles to guide its decision-making, accumulating a diverse set of behavioral trajectories. This loop employs a policy reinforcement mechanism to iteratively update the agent based on its performance. We demonstrate the effectiveness of EvolveR on complex multi-hop question-answering benchmarks, where it achieves superior performance over strong agentic baselines. Our work presents a comprehensive blueprint for agents that learn not only from external data but also from the consequences of their own actions, paving the way for more autonomous and continuously improving systems. Code is available at https://github.com/Edaizi/EvolveR.
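The two-stage loop described in the abstract can be pictured with a small toy sketch. This is not the authors' implementation: the names `Trajectory`, `PrincipleStore`, `toy_policy`, and `run_lifecycle` are hypothetical, retrieval is simple keyword matching, distillation just records the action sequence of successful runs, and the policy reinforcement update mentioned in the abstract is omitted entirely.

```python
from dataclasses import dataclass, field


@dataclass
class Trajectory:
    """One recorded interaction: the task, the actions taken, and the outcome."""
    task: str
    actions: list[str]
    success: bool


@dataclass
class PrincipleStore:
    """Hypothetical repository mapping a keyword to a reusable strategy string."""
    principles: dict[str, str] = field(default_factory=dict)

    def retrieve(self, task: str) -> list[str]:
        # Toy retrieval: return principles whose keyword occurs in the task text.
        words = set(task.lower().split())
        return [p for key, p in self.principles.items() if key in words]

    def distill(self, trajectories: list[Trajectory]) -> None:
        # Toy stand-in for offline self-distillation: store the action sequence
        # of each successful trajectory, keyed by the first word of its task.
        for t in trajectories:
            if t.success and t.task.strip():
                key = t.task.lower().split()[0]
                self.principles[key] = " -> ".join(t.actions)


def toy_policy(task: str, hints: list[str]) -> Trajectory:
    # Stand-in for an LLM agent: reuse the first retrieved principle if there
    # is one, otherwise fall back to a fixed exploratory action sequence.
    actions = hints[0].split(" -> ") if hints else ["search", "read", "answer"]
    return Trajectory(task=task, actions=actions, success=True)


def run_lifecycle(tasks, store, policy, rounds=2):
    for _ in range(rounds):
        # Online interaction: act on each task, guided by retrieved principles.
        trajectories = [policy(task, store.retrieve(task)) for task in tasks]
        # Offline self-distillation: fold the new trajectories into the store.
        store.distill(trajectories)
    return store


store = run_lifecycle(["who directed the film"], PrincipleStore(), toy_policy)
print(store.retrieve("who wrote the book"))
```

In the first round the store is empty, so the toy policy explores. From the second round on, a task that shares a keyword with an earlier one retrieves the stored principle and reuses it, which is the closed loop the abstract refers to in miniature.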

Related papers