Fugu-MT Paper Translation (Abstract): Scaling Environments for LLM Agents in the Era of Learning from Interaction: A Survey

Paper overview: Scaling Environments for LLM Agents in the Era of Learning from Interaction: A Survey

arxiv url: http://arxiv.org/abs/2511.09586v1
Date: Fri, 14 Nov 2025 01:00:47 GMT
Status: Translation complete
Last updated in system: 2025-11-14 22:53:22.352387
Title: Scaling Environments for LLM Agents in the Era of Learning from Interaction: A Survey
Authors: Yuchen Huang, Sijia Li, Minghao Liu, Wei Liu, Shijue Huang, Zhiyuan Fan, Hou Pong Chan, Yi R. Fung,
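Abstract summary: A growing consensus is that agents should interact directly with environments and learn from experience through reinforcement learning. The paper formalizes this iterative process as the GEF loop, in which environments generate tasks to challenge agents, return observations in response to agents' actions during task execution, and provide evaluative feedback on rollouts for subsequent learning. Under this paradigm, environments function as indispensable producers of experiential data, highlighting the need to scale them toward greater complexity, realism, and interactivity.
Reference score (independently computed attention score): 30.673419015614233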
License: http://creativecommons.org/licenses/by/4.0/
Abstract: LLM-based agents can autonomously accomplish complex tasks across various domains. However, to further cultivate capabilities such as adaptive behavior and long-term decision-making, training on static datasets built from human-level knowledge is insufficient. These datasets are costly to construct and lack both dynamism and realism. A growing consensus is that agents should instead interact directly with environments and learn from experience through reinforcement learning. We formalize this iterative process as the Generation-Execution-Feedback (GEF) loop, where environments generate tasks to challenge agents, return observations in response to agents' actions during task execution, and provide evaluative feedback on rollouts for subsequent learning. Under this paradigm, environments function as indispensable producers of experiential data, highlighting the need to scale them toward greater complexity, realism, and interactivity. In this survey, we systematically review representative methods for environment scaling from a pioneering environment-centric perspective and organize them along the stages of the GEF loop, namely task generation, task execution, and feedback. We further analyze benchmarks, implementation strategies, and applications, consolidating fragmented advances and outlining future research directions for agent intelligence.
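The three stages of the GEF loop described in the abstract can be illustrated with a toy sketch. This is not code from the survey: the number-guessing task and the names `ToyEnvironment`, `BisectionAgent`, and `run_episode` are all hypothetical, and a fixed bisection policy stands in for an LLM agent. The sketch only shows where task generation, execution, and feedback would sit in one episode; the reinforcement learning update that would consume the reward is omitted.

```python
import random
from dataclasses import dataclass, field


@dataclass
class Task:
    """Toy task (an assumption for illustration): guess a hidden integer."""
    low: int
    high: int
    target: int


@dataclass
class Rollout:
    """Trajectory of (action, observation) pairs collected for one task."""
    task: Task
    steps: list = field(default_factory=list)


class ToyEnvironment:
    """Hypothetical environment exposing the three GEF stages."""

    def __init__(self, seed=0):
        self.rng = random.Random(seed)

    def generate_task(self, difficulty):
        # Generation: a wider range gives a harder task.
        high = 10 * difficulty
        return Task(low=1, high=high, target=self.rng.randint(1, high))

    def step(self, task, action):
        # Execution: return an observation in response to the agent's action.
        if action < task.target:
            return "higher"
        if action > task.target:
            return "lower"
        return "correct"

    def feedback(self, rollout):
        # Feedback: score the whole rollout; fewer steps earn a higher reward.
        solved = bool(rollout.steps) and rollout.steps[-1][1] == "correct"
        return 1.0 / len(rollout.steps) if solved else 0.0


class BisectionAgent:
    """Stand-in for an LLM agent: a fixed policy with no learning."""

    def act(self, low, high):
        return (low + high) // 2


def run_episode(env, agent, difficulty, max_steps=20):
    task = env.generate_task(difficulty)           # Generation
    rollout = Rollout(task)
    low, high = task.low, task.high
    for _ in range(max_steps):                     # Execution
        action = agent.act(low, high)
        observation = env.step(task, action)
        rollout.steps.append((action, observation))
        if observation == "correct":
            break
        if observation == "higher":
            low = action + 1
        else:
            high = action - 1
    reward = env.feedback(rollout)                 # Feedback
    return rollout, reward


if __name__ == "__main__":
    rollout, reward = run_episode(ToyEnvironment(seed=0), BisectionAgent(), difficulty=3)
    print(len(rollout.steps), reward)
```

In a training setup, the reward returned by `run_episode` would be passed to a learning update, and `difficulty` could be raised over time so that the environment keeps generating tasks that challenge the agent.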

Related papers