Towards General Agentic Intelligence via Environment Scaling
- URL: http://arxiv.org/abs/2509.13311v1
- Date: Tue, 16 Sep 2025 17:57:20 GMT
- Title: Towards General Agentic Intelligence via Environment Scaling
- Authors: Runnan Fang, Shihao Cai, Baixuan Li, Jialong Wu, Guangyu Li, Wenbiao Yin, Xinyu Wang, Xiaobin Wang, Liangcai Su, Zhen Zhang, Shibin Wu, Zhengwei Tao, Yong Jiang, Pengjun Xie, Fei Huang, Jingren Zhou,
- Abstract summary: Advanced agentic intelligence is a prerequisite for deploying Large Language Models in real-world applications.<n>We design a scalable framework that automatically constructs heterogeneous environments that are fully simulated.<n>Experiments on agentic benchmarks, tau-bench, tau2-Bench, and ACEBench, demonstrate that our trained model, AgentScaler, significantly enhances the function-calling capability of models.
- Score: 78.66355092082253
- License: http://creativecommons.org/publicdomain/zero/1.0/
- Abstract: Advanced agentic intelligence is a prerequisite for deploying Large Language Models in practical, real-world applications. Diverse real-world APIs demand precise, robust function-calling intelligence, which needs agents to develop these capabilities through interaction in varied environments. The breadth of function-calling competence is closely tied to the diversity of environments in which agents are trained. In this work, we scale up environments as a step towards advancing general agentic intelligence. This gives rise to two central challenges: (i) how to scale environments in a principled manner, and (ii) how to effectively train agentic capabilities from experiences derived through interactions with these environments. To address these, we design a scalable framework that automatically constructs heterogeneous environments that are fully simulated, systematically broadening the space of function-calling scenarios. We further adapt a two-phase agent fine-tuning strategy: first endowing agents with fundamental agentic capabilities, then specializing them for domain-specific contexts. Extensive experiments on agentic benchmarks, tau-bench, tau2-Bench, and ACEBench, demonstrate that our trained model, AgentScaler, significantly enhances the function-calling capability of models.
Related papers
- Prompt-Driven Low-Altitude Edge Intelligence: Modular Agents and Generative Reasoning [20.552503613122067]
Large artificial intelligence models (LAMs) show strong capabilities in perception, reasoning, and multi-modal understanding.<n>The deployment of LAMs at the edge remains constrained by some fundamental limitations.<n>We propose a prompt-to-agent edge cognition framework (P2AECF) to enable the flexible, efficient, and adaptive edge intelligence.
arXiv Detail & Related papers (2026-02-15T06:09:04Z) - Agent World Model: Infinity Synthetic Environments for Agentic Reinforcement Learning [62.499592503950026]
Large language model (LLM) have empowered autonomous agents to perform complex tasks that require multi-turn interactions with tools and environments.<n>We propose Agent World Model (AWM), a fully synthetic environment generation pipeline.<n>We scale to 1,000 environments covering everyday scenarios, in which agents can interact with rich toolsets.
arXiv Detail & Related papers (2026-02-10T18:55:41Z) - TongSIM: A General Platform for Simulating Intelligent Machines [59.27575233453533]
Embodied intelligence focuses on training agents within realistic simulated environments.<n>TongSIM is a high-fidelity, general-purpose platform for training and evaluating embodied agents.
arXiv Detail & Related papers (2025-12-23T10:00:43Z) - ARE: Scaling Up Agent Environments and Evaluations [22.98982051873728]
We introduce Meta Agents Research Environments (ARE), a research platform for scalable creation of environments.<n>ARE provides simple abstractions to build complex and diverse environments, each with their own rules, tools, content, and verifiers.<n>We also propose Gaia2, a benchmark built in ARE and designed to measure general agent capabilities.
arXiv Detail & Related papers (2025-09-21T16:59:45Z) - AgentScope 1.0: A Developer-Centric Framework for Building Agentic Applications [95.42093979627703]
AgentScope supports flexible and efficient tool-based agent-environment interactions.<n>We ground agent behaviors in the ReAct paradigm and offer advanced agent-level infrastructure.<n>AgentScope also includes robust engineering support for developer-friendly experiences.
arXiv Detail & Related papers (2025-08-22T10:35:56Z) - A Survey on Agent Workflow -- Status and Future [2.817843718857682]
This survey provides a comprehensive review of agent workflow systems.<n>We classify existing systems along two key dimensions: functional capabilities and architectural features.<n>We highlight common patterns, potential technical challenges, and emerging trends.
arXiv Detail & Related papers (2025-08-02T04:15:30Z) - A Survey of Self-Evolving Agents: On Path to Artificial Super Intelligence [87.08051686357206]
Large Language Models (LLMs) have demonstrated strong capabilities but remain fundamentally static.<n>As LLMs are increasingly deployed in open-ended, interactive environments, this static nature has become a critical bottleneck.<n>This survey provides the first systematic and comprehensive review of self-evolving agents.
arXiv Detail & Related papers (2025-07-28T17:59:05Z) - Mind the Gap: Towards Generalizable Autonomous Penetration Testing via Domain Randomization and Meta-Reinforcement Learning [15.619925926862235]
GAP is a generalizable autonomous pentesting framework.<n>It aims to realizes efficient policy training in realistic environments.<n>It also trains agents capable of drawing inferences about other cases from one instance.
arXiv Detail & Related papers (2024-12-05T11:24:27Z) - HAZARD Challenge: Embodied Decision Making in Dynamically Changing
Environments [93.94020724735199]
HAZARD consists of three unexpected disaster scenarios, including fire, flood, and wind.
This benchmark enables us to evaluate autonomous agents' decision-making capabilities across various pipelines.
arXiv Detail & Related papers (2024-01-23T18:59:43Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.