Related papers: Agent0: Unleashing Self-Evolving Agents from Zero Data via Tool-Integrated Reasoning

URL: http://arxiv.org/abs/2511.16043v1
Date: Thu, 20 Nov 2025 05:01:57 GMT
Title: Agent0: Unleashing Self-Evolving Agents from Zero Data via Tool-Integrated Reasoning
Authors: Peng Xia, Kaide Zeng, Jiaqi Liu, Can Qin, Fang Wu, Yiyang Zhou, Caiming Xiong, Huaxiu Yao
Abstract summary: Large Language Model (LLM) Agents are constrained by a dependency on human-curated data. We introduce Agent0, a fully autonomous framework that evolves high-performing agents without external data. Agent0 substantially boosts reasoning capabilities, improving the Qwen3-8B-Base model by 18% on mathematical reasoning and 24% on general reasoning benchmarks.
Score: 84.70211451226835
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Large Language Model (LLM) Agents, often trained with Reinforcement Learning (RL), are constrained by a dependency on human-curated data, limiting scalability and tethering AI to human knowledge. Existing self-evolution frameworks offer an alternative but are typically restricted by the model's inherent capabilities and single-round interactions, hindering the development of complex curricula involving tool use or dynamic reasoning. We introduce Agent0, a fully autonomous framework that evolves high-performing agents without external data through multi-step co-evolution and seamless tool integration. Agent0 establishes a symbiotic competition between two agents initialized from the same base LLM: a curriculum agent that proposes increasingly challenging frontier tasks, and an executor agent that learns to solve them. We integrate external tools to enhance the executor's problem-solving capacity; this improvement, in turn, pressures the curriculum agent to construct more complex, tool-aware tasks. Through this iterative process, Agent0 establishes a self-reinforcing cycle that continuously produces high-quality curricula. Empirically, Agent0 substantially boosts reasoning capabilities, improving the Qwen3-8B-Base model by 18% on mathematical reasoning and 24% on general reasoning benchmarks. Code is available at https://github.com/aiming-lab/Agent0.

Related papers

Tool-R0: Self-Evolving LLM Agents for Tool-Learning from Zero Data [49.315842374696295]
Large language models (LLMs) are becoming the foundation for autonomous agents that can use tools to solve complex tasks. We propose the Tool-R0 framework for training general-purpose tool-calling agents from scratch with self-play RL. Our work further provides empirical insights into self-play LLM agents by analyzing co-evolution, curriculum dynamics, and scaling behavior.
arXiv Detail & Related papers (2026-02-24T19:41:18Z)
From Agentification to Self-Evolving Agentic AI for Wireless Networks: Concepts, Approaches, and Future Research Directions [70.72279728350763]
Self-evolving agentic artificial intelligence (AI) offers a new paradigm for future wireless systems. Unlike static AI models, self-evolving agents embed an autonomous evolution cycle that updates models, tools, and in response to environmental dynamics. This paper presents a comprehensive overview of self-evolving agentic AI, highlighting its layered architecture, life cycle, and key techniques.
arXiv Detail & Related papers (2025-10-07T05:45:25Z)
$Agent^2$: An Agent-Generates-Agent Framework for Reinforcement Learning Automation [5.325886106098561]
Reinforcement learning (RL) agent development traditionally requires substantial expertise and iterative effort. This paper introduces $Agent^2$, an LLM-driven agent-generates-agent framework for fully automated RL agent design. $Agent^2$ translates natural language task descriptions and environment code into executable RL solutions without human intervention.
arXiv Detail & Related papers (2025-09-16T02:14:39Z)
AgentGym-RL: Training LLM Agents for Long-Horizon Decision Making through Multi-Turn Reinforcement Learning [129.44038804430542]
We introduce AgentGym-RL, a new framework to train LLM agents for multi-turn interactive decision-making through RL. We propose ScalingInter-RL, a training approach designed for exploration-exploitation balance and stable RL optimization. Our agents match or surpass commercial models on 27 tasks across diverse environments.
arXiv Detail & Related papers (2025-09-10T16:46:11Z)
SEAgent: Self-Evolving Computer Use Agent with Autonomous Learning from Experience [71.82719117238307]
We propose SEAgent, an agentic self-evolving framework enabling computer-use agents to evolve through interactions with unfamiliar software. We validate the effectiveness of SEAgent across five novel software environments within OS-World. Our approach achieves a significant improvement of 23.2% in success rate, from 11.3% to 34.5%, over a competitive open-source CUA.
arXiv Detail & Related papers (2025-08-06T17:58:46Z)
Agent Lightning: Train ANY AI Agents with Reinforcement Learning [24.13422767414729]
We present Agent Lightning, a framework that enables Reinforcement Learning (RL)-based training of Large Language Models (LLMs) for any AI agent. By formulating agent execution as a Markov decision process, we define a unified data interface and propose a hierarchical RL algorithm, LightningRL, which contains a credit assignment module. For the system design, we introduce a Training-Agent Disaggregation architecture and bring agent observability frameworks into the agent runtime.
arXiv Detail & Related papers (2025-08-05T17:50:13Z)
Gödel Agent: A Self-Referential Agent Framework for Recursive Self-Improvement [112.04307762405669]
Gödel Agent is a self-evolving framework inspired by the Gödel machine. Gödel Agent can achieve continuous self-improvement, surpassing manually crafted agents in performance, efficiency, and generalizability.
arXiv Detail & Related papers (2024-10-06T10:49:40Z)

This list is automatically generated from the titles and abstracts of the papers on this site.
