Building Self-Evolving Agents via Experience-Driven Lifelong Learning: A Framework and Benchmark
- URL: http://arxiv.org/abs/2508.19005v4
- Date: Fri, 12 Sep 2025 05:22:00 GMT
- Title: Building Self-Evolving Agents via Experience-Driven Lifelong Learning: A Framework and Benchmark
- Authors: Yuxuan Cai, Yipeng Hao, Jie Zhou, Hang Yan, Zhikai Lei, Rui Zhen, Zhenhua Han, Yutao Yang, Junsong Li, Qianjun Pan, Tianyu Huai, Qin Chen, Xin Li, Kai Chen, Bo Zhang, Xipeng Qiu, Liang He
- Abstract summary: We introduce Experience-driven Lifelong Learning (ELL), a framework for building self-evolving agents. ELL is built on four core principles: Experience Exploration, Long-term Memory, Skill Learning, and Knowledge Internalization. We also introduce StuLife, a benchmark dataset for ELL that simulates a student's holistic college journey.
- Score: 57.59000694149105
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: As AI advances toward general intelligence, the focus is shifting from systems optimized for static tasks to creating open-ended agents that learn continuously. In this paper, we introduce Experience-driven Lifelong Learning (ELL), a framework for building self-evolving agents capable of continuous growth through real-world interaction. The framework is built on four core principles: (1) Experience Exploration: Agents learn through continuous, self-motivated interaction with dynamic environments, navigating interdependent tasks and generating rich experiential trajectories. (2) Long-term Memory: Agents preserve and structure historical knowledge, including personal experiences, domain expertise, and commonsense reasoning, into a persistent memory system. (3) Skill Learning: Agents autonomously improve by abstracting recurring patterns from experience into reusable skills, which are actively refined and validated for application in new tasks. (4) Knowledge Internalization: Agents internalize explicit and discrete experiences into implicit and intuitive capabilities as "second nature". We also introduce StuLife, a benchmark dataset for ELL that simulates a student's holistic college journey, from enrollment to academic and personal development, across three core phases and ten detailed sub-scenarios. StuLife is designed around three key paradigm
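The four principles read naturally as a repeating loop: explore to generate trajectories, persist them to memory, abstract recurring patterns into skills, and internalize heavily used skills as implicit capability. The paper does not prescribe an implementation, so the following is only a minimal Python sketch of that loop; all class names, thresholds, and the toy `explore` behavior are illustrative assumptions, not the authors' design.

```python
from dataclasses import dataclass, field

@dataclass
class Experience:
    """One interaction trajectory: a task, the actions taken, and the outcome."""
    task: str
    actions: list
    success: bool

@dataclass
class ELLAgent:
    """Toy agent organized around ELL's four principles."""
    memory: list = field(default_factory=list)       # (2) long-term memory
    skills: dict = field(default_factory=dict)       # (3) reusable skills
    internalized: set = field(default_factory=set)   # (4) "second nature"

    def explore(self, task: str) -> Experience:
        # (1) Experience exploration: interact with the environment.
        # Stubbed here; a real agent would act and observe.
        return Experience(task, [f"attempt:{task}"], success=True)

    def remember(self, exp: Experience) -> None:
        # (2) Long-term memory: persist the full trajectory.
        self.memory.append(exp)

    def learn_skills(self, min_repeats: int = 2) -> None:
        # (3) Skill learning: abstract patterns that recur in successes.
        counts: dict = {}
        for exp in self.memory:
            if exp.success:
                counts[exp.task] = counts.get(exp.task, 0) + 1
        for task, n in counts.items():
            if n >= min_repeats:
                self.skills[task] = f"skill:{task}"

    def internalize(self, min_uses: int = 3) -> None:
        # (4) Knowledge internalization: frequently exercised skills
        # become implicit, no longer requiring explicit recall.
        for task in self.skills:
            uses = sum(1 for e in self.memory if e.task == task)
            if uses >= min_uses:
                self.internalized.add(task)

agent = ELLAgent()
for _ in range(3):
    agent.remember(agent.explore("register_for_course"))
agent.learn_skills()
agent.internalize()
```

In this sketch the thresholds (`min_repeats`, `min_uses`) stand in for whatever validation and refinement criteria a real system would use before promoting an experience to a skill, or a skill to an internalized capability.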
Related papers
- The Agent's First Day: Benchmarking Learning, Exploration, and Scheduling in the Workplace Scenarios [34.25281365374991]
We introduce method, a dynamic evaluation environment that simulates a "trainee" agent continuously exploring a novel setting. Unlike traditional benchmarks, method evaluates agents along three dimensions: (1) context-aware scheduling for streaming tasks with varying priorities; (2) prudent information acquisition to reduce hallucination via active exploration; and (3) continuous evolution by distilling generalized strategies from rule-based, dynamically generated tasks. Our work establishes a framework for assessing agent reliability, shifting evaluation from static tests to realistic, production-oriented scenarios.
arXiv Detail & Related papers (2026-01-13T03:09:18Z)
- EvolveR: Self-Evolving LLM Agents through an Experience-Driven Lifecycle [26.048906477714937]
Current Large Language Model (LLM) agents show strong performance in tool use but lack the capability to systematically learn from their own experiences. We introduce EvolveR, a framework designed to enable agents to self-improve through a complete, closed-loop experience lifecycle. We demonstrate the effectiveness of EvolveR on complex multi-hop question-answering benchmarks, where it achieves superior performance over strong agentic baselines.
arXiv Detail & Related papers (2025-10-17T12:03:16Z)
- Evolution in Simulation: AI-Agent School with Dual Memory for High-Fidelity Educational Dynamics [10.185612854120627]
Large language model (LLM)-based agents are increasingly pivotal in simulating and understanding complex human systems and interactions. We propose the AI-Agent School (AAS) system, built around a self-evolving mechanism that leverages agents for simulating complex educational dynamics.
arXiv Detail & Related papers (2025-10-13T11:27:53Z)
- Agent Learning via Early Experience [93.83579011718858]
A long-term goal of language agents is to learn and improve through their own experience, ultimately outperforming humans in complex, real-world tasks. Most current agents rely on supervised fine-tuning on expert data, which is challenging to scale and generalizes poorly. We study two strategies for using agent-collected experience data: (1) Implicit world modeling, which uses collected states to ground the policy in environment dynamics; and (2) Self-reflection, where the agent learns from its suboptimal actions to improve reasoning and decision-making.
arXiv Detail & Related papers (2025-10-09T17:59:17Z)
- Learning on the Job: An Experience-Driven Self-Evolving Agent for Long-Horizon Tasks [42.78572295558531]
Large Language Models have demonstrated remarkable capabilities across diverse domains, yet significant challenges persist when deploying them as AI agents for real-world long-horizon tasks. Existing LLM agents suffer from a critical limitation: they are test-time static and cannot learn from experience, lacking the ability to accumulate knowledge and continuously improve on the job. We propose MUSE, a novel agent framework that introduces an experience-driven, self-evolving system centered around a hierarchical Memory Module.
arXiv Detail & Related papers (2025-10-09T09:40:34Z)
- ReasoningBank: Scaling Agent Self-Evolving with Reasoning Memory [57.517214479414726]
ReasoningBank is a memory framework that distills generalizable reasoning strategies from an agent's self-judged successful and failed experiences. At test time, an agent retrieves relevant memories from ReasoningBank to inform its interaction and then integrates new learnings back, enabling it to become more capable over time. We introduce memory-aware test-time scaling (MaTTS), which accelerates and diversifies this learning process by scaling up the agent's interaction experience.
arXiv Detail & Related papers (2025-09-29T17:51:03Z)
- LifelongAgentBench: Evaluating LLM Agents as Lifelong Learners [51.518410910148816]
Current large language model (LLM)-based agents remain stateless, unable to accumulate or transfer knowledge over time. We present LifelongAgentBench, the first unified benchmark designed to systematically assess the lifelong learning ability of LLM agents.
arXiv Detail & Related papers (2025-05-17T10:09:11Z)
- SPECI: Skill Prompts based Hierarchical Continual Imitation Learning for Robot Manipulation [3.1997825444285457]
Real-world robot manipulation in dynamic unstructured environments requires lifelong adaptability to evolving objects, scenes, and tasks. Traditional imitation learning relies on static training paradigms, which are ill-suited for lifelong adaptation. We propose Skill Prompts-based HiErarchical Continual Imitation Learning (SPECI), a novel end-to-end hierarchical CIL policy architecture for robot manipulation.
arXiv Detail & Related papers (2025-04-22T03:30:38Z)
- Knowledge Retention for Continual Model-Based Reinforcement Learning [11.5581880507344]
DRAGO is a novel approach for continual model-based reinforcement learning. DRAGO comprises two key components: Synthetic Experience Rehearsal and Regaining Memories Through Exploration. Empirical evaluations demonstrate that DRAGO is able to preserve knowledge across tasks, achieving superior performance in various continual learning scenarios.
arXiv Detail & Related papers (2025-03-06T09:38:14Z)
- Towards LifeSpan Cognitive Systems [94.8985839251011]
Building a human-like system that continuously interacts with complex environments presents several key challenges. We refer to this envisioned system as the LifeSpan Cognitive System (LSCS). A critical feature of LSCS is its ability to engage in incremental and rapid updates while retaining and accurately recalling past experiences.
arXiv Detail & Related papers (2024-09-20T06:54:00Z)
- Online Continual Learning For Interactive Instruction Following Agents [20.100312650193228]
We argue that such a learning scenario is less realistic since a robotic agent is supposed to learn the world continuously as it explores and perceives it.
We propose two continual learning setups for embodied agents; learning new behaviors and new environments.
arXiv Detail & Related papers (2024-03-12T11:33:48Z)
- Recall-Oriented Continual Learning with Generative Adversarial Meta-Model [5.710971447109951]
We propose a recall-oriented continual learning framework to address the stability-plasticity dilemma.
Inspired by the human brain's ability to separate the mechanisms responsible for stability and plasticity, our framework consists of a two-level architecture.
We show that our framework not only effectively learns new knowledge without any disruption but also achieves high stability of previous knowledge.
arXiv Detail & Related papers (2024-03-05T16:08:59Z)
- LIBERO: Benchmarking Knowledge Transfer for Lifelong Robot Learning [64.55001982176226]
LIBERO is a novel benchmark of lifelong learning for robot manipulation.
We focus on how to efficiently transfer declarative knowledge, procedural knowledge, or the mixture of both.
We develop an extendible procedural generation pipeline that can in principle generate infinitely many tasks.
arXiv Detail & Related papers (2023-06-05T23:32:26Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.