Fugu-MT paper translation (abstract): SE-Agent: Self-Evolution Trajectory Optimization in Multi-Step Reasoning with LLM-Based Agents

Paper summary: SE-Agent: Self-Evolution Trajectory Optimization in Multi-Step Reasoning with LLM-Based Agents

arxiv url: http://arxiv.org/abs/2508.02085v3
Date: Thu, 07 Aug 2025 16:46:44 GMT
Status: Translation complete
System update date: 2025-08-08 14:01:14.015177
Title: SE-Agent: Self-Evolution Trajectory Optimization in Multi-Step Reasoning with LLM-Based Agents
Title (reference translation): SE-Agent: Self-Evolution Trajectory Optimization in Multi-Step Reasoning Using LLM Agents
Authors: Jiaye Lin, Yifu Guo, Yuzhen Han, Sen Hu, Ziyi Ni, Licheng Wang, Mingguang Chen, Daxin Jiang, Binxing Jiao, Chen Hu, Huacan Wang
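Abstract summary: Large language model (LLM)-based agents have recently shown impressive capabilities in complex reasoning and tool use through multi-step interactions with their environments. The agents' interaction trajectories contain rich feedback that can steer them in the right direction to solve problems correctly. Prevailing approaches such as Monte Carlo Tree Search (MCTS) can effectively balance exploration and exploitation, but they ignore the interdependence among trajectories. We propose SE-Agent, a self-evolution framework that lets agents iteratively optimize their reasoning processes.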
Reference score (independently computed attention score): 43.74003959397812
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Large Language Model (LLM)-based agents have recently shown impressive capabilities in complex reasoning and tool use via multi-step interactions with their environments. While these agents have the potential to tackle complicated tasks, their problem-solving process, i.e., agents' interaction trajectory leading to task completion, remains underexploited. These trajectories contain rich feedback that can navigate agents toward the right directions for solving problems correctly. Although prevailing approaches, such as Monte Carlo Tree Search (MCTS), can effectively balance exploration and exploitation, they ignore the interdependence among various trajectories and lack the diversity of search spaces, which leads to redundant reasoning and suboptimal outcomes. To address these challenges, we propose SE-Agent, a Self-Evolution framework that enables Agents to optimize their reasoning processes iteratively. Our approach revisits and enhances former pilot trajectories through three key operations: revision, recombination, and refinement. This evolutionary mechanism enables two critical advantages: (1) it expands the search space beyond local optima by intelligently exploring diverse solution paths guided by previous trajectories, and (2) it leverages cross-trajectory inspiration to efficiently enhance performance while mitigating the impact of suboptimal reasoning paths. Through these mechanisms, SE-Agent achieves continuous self-evolution that incrementally improves reasoning quality. We evaluate SE-Agent on SWE-bench Verified to resolve real-world GitHub issues. Experimental results across five strong LLMs show that integrating SE-Agent delivers up to 55% relative improvement, achieving state-of-the-art performance among all open-source agents on SWE-bench Verified. Our code and demonstration materials are publicly available at https://github.com/JARVIS-Xs/SE-Agent.
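Abstract (reference translation): Large language model (LLM)-based agents have recently shown impressive capabilities in complex reasoning and tool use through multi-step interactions with their environments. Although these agents have the potential to tackle complicated tasks, their problem-solving process, i.e., the interaction trajectory that leads to task completion, remains underexploited. These trajectories contain rich feedback that can steer agents in the right direction to solve problems correctly. Prevailing approaches such as Monte Carlo Tree Search (MCTS) can effectively balance exploration and exploitation, but they ignore the interdependence among trajectories and lack diversity in the search space, which leads to redundant reasoning and suboptimal outcomes. To address these challenges, we propose SE-Agent, a Self-Evolution framework that enables agents to optimize their reasoning processes iteratively. The approach revisits and enhances earlier pilot trajectories through three key operations: revision, recombination, and refinement. This evolutionary mechanism brings two key advantages: (1) it expands the search space beyond local optima by intelligently exploring diverse solution paths guided by previous trajectories, and (2) it leverages cross-trajectory inspiration to improve performance efficiently while mitigating the impact of suboptimal reasoning paths. Through these mechanisms, SE-Agent achieves continuous self-evolution that incrementally improves reasoning quality. We evaluate SE-Agent on SWE-bench Verified, which requires resolving real-world GitHub issues. Experimental results across five strong LLMs show that integrating SE-Agent delivers up to 55% relative improvement and achieves state-of-the-art performance among all open-source agents on SWE-bench Verified. The code and demonstration materials are available at https://github.com/JARVIS-Xs/SE-Agent.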
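
Illustrative sketch (not from the paper): the abstract names three operations on trajectories (revision, recombination, refinement) but does not say how they are implemented. The Python below is only a toy stand-in, assuming a trajectory is a list of step labels and that a scoring function is available. In SE-Agent these operations are presumably carried out by an LLM. Every name here (`revise`, `recombine`, `refine`, `evolve`, `TARGET`, `ACTIONS`, `score`) is hypothetical and is not taken from the SE-Agent repository.

```python
import random
from typing import Callable, List

Trajectory = List[str]  # assumption: a trajectory is modeled as a list of step labels

# Hypothetical toy task: the "ideal" plan and the set of available actions.
TARGET: Trajectory = ["reproduce", "locate", "patch", "test"]
ACTIONS: List[str] = ["reproduce", "locate", "patch", "test"]


def score(traj: Trajectory) -> float:
    """Toy score: position-wise matches with TARGET, minus a length penalty."""
    matches = sum(1 for s, t in zip(traj, TARGET) if s == t)
    return matches - 0.1 * abs(len(traj) - len(TARGET))


def revise(traj: Trajectory, actions: List[str], rng: random.Random) -> Trajectory:
    """Toy 'revision': replace one randomly chosen step with a candidate action."""
    if not traj:
        return []
    out = list(traj)
    out[rng.randrange(len(out))] = rng.choice(actions)
    return out


def recombine(a: Trajectory, b: Trajectory, rng: random.Random) -> Trajectory:
    """Toy 'recombination': prefix of one trajectory joined to the suffix of another."""
    shortest = min(len(a), len(b))
    if shortest < 2:  # nothing to cut; fall back to a copy of the first parent
        return list(a)
    cut = rng.randrange(1, shortest)
    return a[:cut] + b[cut:]


def refine(traj: Trajectory) -> Trajectory:
    """Toy 'refinement': drop immediately repeated steps (redundant reasoning)."""
    if not traj:
        return []
    out = [traj[0]]
    for step in traj[1:]:
        if step != out[-1]:
            out.append(step)
    return out


def evolve(
    pool: List[Trajectory],
    score_fn: Callable[[Trajectory], float],
    actions: List[str],
    iterations: int = 20,
    seed: int = 0,
) -> List[Trajectory]:
    """Iteratively derive new trajectories from old ones and keep the best."""
    rng = random.Random(seed)
    pool = [list(t) for t in pool]
    size = len(pool)
    for _ in range(iterations):
        a, b = rng.sample(pool, 2)  # requires at least two trajectories
        candidates = [revise(a, actions, rng), recombine(a, b, rng)]
        candidates = [refine(c) for c in candidates]
        # Elitist selection: the pool size stays fixed and the best never gets worse.
        pool = sorted(pool + candidates, key=score_fn, reverse=True)[:size]
    return pool


initial_pool = [
    ["locate", "locate", "patch"],
    ["reproduce", "test", "test", "patch"],
    ["patch", "reproduce", "locate", "test"],
]
final_pool = evolve(initial_pool, score, ACTIONS, iterations=50, seed=1)
print(final_pool[0], score(final_pool[0]))
```

Because selection always keeps the top-scoring trajectories from the old pool plus the new candidates, the best score in the pool cannot decrease from one iteration to the next. This toy only illustrates the general idea of reusing earlier trajectories; it makes no claim about the paper's actual selection rule or results.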

Related papers