Fugu-MT Paper Translation (Abstract): $Agent^2$: An Agent-Generates-Agent Framework for Reinforcement Learning Automation

Paper overview: $Agent^2$: An Agent-Generates-Agent Framework for Reinforcement Learning Automation

arxiv url: http://arxiv.org/abs/2509.13368v1
Date: Tue, 16 Sep 2025 02:14:39 GMT
Status: Translation complete
Last updated in system: 2025-09-18 18:41:50.56675
Title: $Agent^2$: An Agent-Generates-Agent Framework for Reinforcement Learning Automation
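Title (reference translation): $Agent^2$: An Agent-Generates-Agent Framework for Reinforcement Learning Automation
Authors: Yuan Wei, Xiaohan Shan, Ran Miao, Jianmin Li
Abstract summary: $Agent^2$ is a novel agent-generates-agent framework that achieves fully automated RL agent design. The framework decomposes RL development into two distinct stages: MDP modeling and algorithmic optimization. Experiments on a wide range of benchmarks, including MuJoCo, MetaDrive, MPE, and SMAC, show that $Agent^2$ consistently outperforms manually designed solutions.
Reference score (independently computed attention score): 5.325886106098561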
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Reinforcement learning agent development traditionally requires extensive expertise and lengthy iterations, often resulting in high failure rates and limited accessibility. This paper introduces $Agent^2$, a novel agent-generates-agent framework that achieves fully automated RL agent design through intelligent LLM-driven generation. The system autonomously transforms natural language task descriptions and environment code into comprehensive, high-performance reinforcement learning solutions without human intervention. $Agent^2$ features a revolutionary dual-agent architecture. The Generator Agent serves as an autonomous AI designer that analyzes tasks and generates executable RL agents, while the Target Agent is the resulting automatically generated RL agent. The framework decomposes RL development into two distinct stages: MDP modeling and algorithmic optimization, enabling more targeted and effective agent generation. Built on the Model Context Protocol, $Agent^2$ provides a unified framework that standardizes intelligent agent creation across diverse environments and algorithms, while incorporating adaptive training management and intelligent feedback analysis for continuous improvement. Extensive experiments on a wide range of benchmarks, including MuJoCo, MetaDrive, MPE, and SMAC, demonstrate that $Agent^2$ consistently outperforms manually designed solutions across all tasks, achieving up to 55% performance improvement and substantial gains on average. By enabling truly end-to-end, closed-loop automation, this work establishes a new paradigm in which intelligent agents design and optimize other agents, marking a fundamental breakthrough for automated AI systems.
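Abstract (reference translation): Developing reinforcement learning agents has traditionally required extensive expertise and lengthy iterations, often leading to high failure rates and limited accessibility. This paper introduces $Agent^2$, an agent-generates-agent framework that achieves fully automated RL agent design through intelligent LLM-driven generation. The system autonomously transforms natural language task descriptions and environment code into comprehensive, high-performance reinforcement learning solutions without human intervention. $Agent^2$ features a revolutionary dual-agent architecture. The Generator Agent acts as an autonomous AI designer that analyzes tasks and generates executable RL agents, while the Target Agent is the automatically generated RL agent that results. The framework decomposes RL development into two distinct stages, MDP modeling and algorithmic optimization, enabling more targeted and effective agent generation. Built on the Model Context Protocol, $Agent^2$ provides a unified framework that standardizes intelligent agent creation across diverse environments and algorithms, while incorporating adaptive training management and intelligent feedback analysis for continuous improvement. Extensive experiments on a wide range of benchmarks, including MuJoCo, MetaDrive, MPE, and SMAC, show that $Agent^2$ consistently outperforms manually designed solutions across all tasks, achieving up to 55% performance improvement and substantial gains on average. By enabling truly end-to-end, closed-loop automation, this work establishes a new paradigm in which intelligent agents design and optimize other agents, marking a fundamental breakthrough for automated AI systems.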
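
The two-stage decomposition described in the abstract (MDP modeling first, algorithmic optimization second, each driven by feedback from evaluation) can be illustrated with a toy sketch. This is not the authors' implementation: the abstract does not give code, and in the paper the Generator Agent is LLM-driven. Here a simple greedy search stands in for the generator, and `toy_score`, `refine`, `run_two_stage`, and the parameters `reward_scale` and `log10_lr` are all hypothetical names chosen for illustration.

```python
from typing import Callable, Dict, List, Tuple

Design = Dict[str, float]


def toy_score(design: Design) -> float:
    """Hypothetical stand-in for training and evaluating a Target Agent.

    The optimum (reward_scale = 2.0, log10_lr = -3.0) is arbitrary and
    exists only so that the feedback loop has something to improve.
    """
    return -(design["reward_scale"] - 2.0) ** 2 - (design["log10_lr"] + 3.0) ** 2


def refine(
    design: Design,
    keys: List[str],
    evaluate: Callable[[Design], float],
    step: float = 0.5,
    rounds: int = 20,
) -> Tuple[Design, float]:
    """Greedy search over `keys` only; a change is kept only if the score improves.

    This plays the role of one generation stage with feedback analysis.
    It is an assumption for illustration, not the paper's LLM-based generator.
    """
    best = dict(design)
    best_score = evaluate(best)
    for _ in range(rounds):
        improved = False
        for key in keys:
            for delta in (step, -step):
                trial = dict(best)
                trial[key] += delta
                score = evaluate(trial)
                if score > best_score:
                    best, best_score = trial, score
                    improved = True
        if not improved:
            step /= 2  # no progress at this step size, so search more finely
    return best, best_score


def run_two_stage(
    evaluate: Callable[[Design], float] = toy_score,
) -> Tuple[Design, List[float]]:
    """Run stage 1 (MDP-side choices), then stage 2 (algorithm-side choices)."""
    initial: Design = {"reward_scale": 1.0, "log10_lr": -1.0}
    score_initial = evaluate(initial)
    # Stage 1: adjust only the MDP-modeling parameter.
    after_mdp, score_mdp = refine(initial, ["reward_scale"], evaluate)
    # Stage 2: keep the MDP design fixed and adjust only the algorithm parameter.
    final, score_final = refine(after_mdp, ["log10_lr"], evaluate)
    return final, [score_initial, score_mdp, score_final]
```

Calling `run_two_stage()` returns the final design together with the scores recorded before stage 1, after stage 1, and after stage 2, which makes the contribution of each stage visible separately. Splitting the search this way mirrors the abstract's claim that separating the two stages allows more targeted generation, but the sketch makes no claim about how the paper measures or achieves its reported gains.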


Related papers