Scaling Environments for LLM Agents in the Era of Learning from Interaction: A Survey
- URL: http://arxiv.org/abs/2511.09586v1
- Date: Fri, 14 Nov 2025 01:00:47 GMT
- Title: Scaling Environments for LLM Agents in the Era of Learning from Interaction: A Survey
- Authors: Yuchen Huang, Sijia Li, Minghao Liu, Wei Liu, Shijue Huang, Zhiyuan Fan, Hou Pong Chan, Yi R. Fung,
- Abstract summary: A growing consensus is that agents should interact directly with environments and learn from experience through reinforcement learning. We formalize this iterative process as the Generation-Execution-Feedback (GEF) loop, where environments generate tasks to challenge agents, return observations in response to agents' actions during task execution, and provide evaluative feedback on rollouts for subsequent learning. Under this paradigm, environments function as indispensable producers of experiential data, highlighting the need to scale them toward greater complexity, realism, and interactivity.
- Score: 30.673419015614233
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: LLM-based agents can autonomously accomplish complex tasks across various domains. However, to further cultivate capabilities such as adaptive behavior and long-term decision-making, training on static datasets built from human-level knowledge is insufficient. These datasets are costly to construct and lack both dynamism and realism. A growing consensus is that agents should instead interact directly with environments and learn from experience through reinforcement learning. We formalize this iterative process as the Generation-Execution-Feedback (GEF) loop, where environments generate tasks to challenge agents, return observations in response to agents' actions during task execution, and provide evaluative feedback on rollouts for subsequent learning. Under this paradigm, environments function as indispensable producers of experiential data, highlighting the need to scale them toward greater complexity, realism, and interactivity. In this survey, we systematically review representative methods for environment scaling from a pioneering environment-centric perspective and organize them along the stages of the GEF loop, namely task generation, task execution, and feedback. We further analyze benchmarks, implementation strategies, and applications, consolidating fragmented advances and outlining future research directions for agent intelligence.
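The GEF loop described in the abstract can be illustrated with a minimal sketch. Everything below is a hypothetical toy, not the survey's implementation: the names `ToyEnvironment`, `ToyAgent`, and `run_gef_loop`, the counting task, the binary reward, and the "increase step size on failure" update (a stand-in for a real reinforcement learning update) are all assumptions chosen only to show how task generation, task execution, and feedback fit together.

```python
from dataclasses import dataclass, field


@dataclass
class Task:
    """Hypothetical task: starting from 0, reach `target` within `max_steps`."""
    target: int
    max_steps: int


@dataclass
class Rollout:
    """Record of one attempt: the task plus the actions and observations seen."""
    task: Task
    actions: list = field(default_factory=list)
    observations: list = field(default_factory=list)


class ToyEnvironment:
    """Hypothetical environment covering the three GEF stages."""

    def __init__(self) -> None:
        self.difficulty = 1

    def generate_task(self) -> Task:
        # Generation: the target grows with the current difficulty level.
        return Task(target=3 * self.difficulty, max_steps=5)

    def step(self, state: int, action: int) -> int:
        # Execution: return the observation (new counter value) for an action.
        return state + action

    def feedback(self, rollout: Rollout) -> float:
        # Feedback: 1.0 if the last observation equals the target, else 0.0.
        final = rollout.observations[-1] if rollout.observations else 0
        return 1.0 if final == rollout.task.target else 0.0


class ToyAgent:
    """Hypothetical agent with a single learnable parameter, its step size."""

    def __init__(self, step_size: int = 1) -> None:
        self.step_size = step_size

    def act(self, state: int, task: Task) -> int:
        # Move toward the target without overshooting it.
        return min(self.step_size, task.target - state)

    def learn(self, reward: float) -> None:
        # Stand-in for an RL update (assumption): take bigger steps after failure.
        if reward == 0.0:
            self.step_size += 1


def run_gef_loop(env: ToyEnvironment, agent: ToyAgent, iterations: int) -> list:
    """Run generation, execution, and feedback for a number of iterations."""
    rewards = []
    for _ in range(iterations):
        task = env.generate_task()            # Generation
        rollout = Rollout(task)
        state = 0
        for _ in range(task.max_steps):       # Execution
            if state == task.target:
                break
            action = agent.act(state, task)
            state = env.step(state, action)
            rollout.actions.append(action)
            rollout.observations.append(state)
        reward = env.feedback(rollout)        # Feedback
        agent.learn(reward)
        if reward == 1.0:
            env.difficulty += 1               # Harder tasks once the agent succeeds
        rewards.append(reward)
    return rewards
```

In this toy, the environment raises difficulty only after a success, so task generation tracks the agent's current ability; that coupling is one simple reading of "environments generate tasks to challenge agents", not a method taken from the survey.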
Related papers
- Autonomous Continual Learning of Computer-Use Agents for Environment Adaptation [57.65688895630163]
We introduce ACuRL, an Autonomous Curriculum Reinforcement Learning framework that continually adapts agents to specific environments with zero human data. Our method effectively enables both intra-environment and cross-environment continual learning, yielding 4-22% performance gains without forgetting existing environments.
arXiv Detail & Related papers (2026-02-10T23:06:02Z)
- AgentLongBench: A Controllable Long Benchmark For Long-Contexts Agents via Environment Rollouts [78.33143446024485]
We introduce AgentLongBench, which evaluates agents through simulated environment rollouts based on Lateral Thinking Puzzles. This framework generates rigorous interaction trajectories across knowledge-intensive and knowledge-free scenarios.
arXiv Detail & Related papers (2026-01-28T16:05:44Z)
- Grounded Test-Time Adaptation for LLM Agents [75.62784644919803]
Large language model (LLM)-based agents struggle to generalize to novel and complex environments. We propose two strategies for adapting LLM agents by leveraging environment-specific information available during deployment.
arXiv Detail & Related papers (2025-11-06T22:24:35Z)
- The Landscape of Agentic Reinforcement Learning for LLMs: A Survey [103.32591749156416]
The emergence of agentic reinforcement learning (Agentic RL) marks a paradigm shift from conventional reinforcement learning applied to large language models (LLM RL). This survey formalizes this conceptual shift by contrasting the degenerate single-step Markov Decision Processes (MDPs) of LLM-RL with the temporally extended, partially observable Markov decision processes (POMDPs) that define Agentic RL.
arXiv Detail & Related papers (2025-09-02T17:46:26Z)
- InfoDeepSeek: Benchmarking Agentic Information Seeking for Retrieval-Augmented Generation [63.55258191625131]
InfoDeepSeek is a new benchmark for assessing agentic information seeking in real-world, dynamic web environments. We propose a systematic methodology for constructing challenging queries satisfying the criteria of determinacy, difficulty, and diversity. We develop the first evaluation framework tailored to dynamic agentic information seeking, including fine-grained metrics about the accuracy, utility, and compactness of information seeking outcomes.
arXiv Detail & Related papers (2025-05-21T14:44:40Z)
- MLE-Dojo: Interactive Environments for Empowering LLM Agents in Machine Learning Engineering [57.156093929365255]
MLE-Dojo is a Gym-style framework for systematically training (via reinforcement learning), evaluating, and improving autonomous large language model (LLM) agents. MLE-Dojo covers diverse, open-ended MLE tasks carefully curated to reflect realistic engineering scenarios. Its fully executable environment supports comprehensive agent training via both supervised fine-tuning and reinforcement learning.
arXiv Detail & Related papers (2025-05-12T17:35:43Z)
- Learn-by-interact: A Data-Centric Framework for Self-Adaptive Agents in Realistic Environments [33.83610929282721]
Learn-by-interact is a data-centric framework to adapt large language models (LLMs) to any given environment without human annotations. We assess the quality of our synthetic data by using them in both training-based scenarios and training-free in-context learning (ICL). Experiments on SWE-bench, WebArena, OSWorld and Spider2-V, spanning realistic coding, web, and desktop environments, show the effectiveness of Learn-by-interact.
arXiv Detail & Related papers (2025-01-18T22:34:41Z)
- Agentic Retrieval-Augmented Generation: A Survey on Agentic RAG [0.8463972278020965]
Large Language Models (LLMs) have revolutionized artificial intelligence (AI) by enabling human-like text generation and natural language understanding. Retrieval-Augmented Generation (RAG) has emerged as a solution, enhancing LLMs by integrating real-time data retrieval to provide contextually relevant responses. Agentic Retrieval-Augmented Generation (Agentic RAG) transcends these limitations by embedding autonomous AI agents into the RAG pipeline.
arXiv Detail & Related papers (2025-01-15T20:40:25Z)
- Logical Specifications-guided Dynamic Task Sampling for Reinforcement Learning Agents [9.529492371336286]
Reinforcement Learning (RL) has made significant strides in enabling artificial agents to learn diverse behaviors.
We propose a novel approach, called Logical Specifications-guided Dynamic Task Sampling (LSTS).
LSTS learns a set of RL policies to guide an agent from an initial state to a goal state based on a high-level task specification.
arXiv Detail & Related papers (2024-02-06T04:00:21Z)
This list is automatically generated from the titles and abstracts of the papers in this site.