ReasoningBank: Scaling Agent Self-Evolving with Reasoning Memory
- URL: http://arxiv.org/abs/2509.25140v1
- Date: Mon, 29 Sep 2025 17:51:03 GMT
- Title: ReasoningBank: Scaling Agent Self-Evolving with Reasoning Memory
- Authors: Siru Ouyang, Jun Yan, I-Hung Hsu, Yanfei Chen, Ke Jiang, Zifeng Wang, Rujun Han, Long T. Le, Samira Daruki, Xiangru Tang, Vishy Tirumalashetty, George Lee, Mahsan Rofouei, Hangfei Lin, Jiawei Han, Chen-Yu Lee, Tomas Pfister
- Abstract summary: ReasoningBank is a memory framework that distills generalizable reasoning strategies from an agent's self-judged successful and failed experiences. At test time, an agent retrieves relevant memories from ReasoningBank to inform its interaction and then integrates new learnings back, enabling it to become more capable over time. We introduce memory-aware test-time scaling (MaTTS), which accelerates and diversifies this learning process by scaling up the agent's interaction experience.
- Score: 57.517214479414726
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: With the growing adoption of large language model agents in persistent real-world roles, they naturally encounter continuous streams of tasks. A key limitation, however, is their failure to learn from the accumulated interaction history, forcing them to discard valuable insights and repeat past errors. We propose ReasoningBank, a novel memory framework that distills generalizable reasoning strategies from an agent's self-judged successful and failed experiences. At test time, an agent retrieves relevant memories from ReasoningBank to inform its interaction and then integrates new learnings back, enabling it to become more capable over time. Building on this powerful experience learner, we further introduce memory-aware test-time scaling (MaTTS), which accelerates and diversifies this learning process by scaling up the agent's interaction experience. By allocating more compute to each task, the agent generates abundant, diverse experiences that provide rich contrastive signals for synthesizing higher-quality memory. The better memory in turn guides more effective scaling, establishing a powerful synergy between memory and test-time scaling. Across web browsing and software engineering benchmarks, ReasoningBank consistently outperforms existing memory mechanisms that store raw trajectories or only successful task routines, improving both effectiveness and efficiency; MaTTS further amplifies these gains. These findings establish memory-driven experience scaling as a new scaling dimension, enabling agents to self-evolve, with emergent behaviors arising naturally.
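The retrieve, act, self-judge, distill loop described in the abstract can be sketched as follows. This is a toy illustration under stated assumptions, not the authors' implementation: the names `MemoryItem`, `ToyReasoningBank`, `jaccard`, and `run_task` are hypothetical, word-overlap similarity stands in for whatever retrieval method the paper actually uses, and the `act`, `judge`, and `distill` callables are placeholders for the agent, its self-judgment, and its strategy distillation.

```python
from dataclasses import dataclass, field


@dataclass
class MemoryItem:
    """One distilled reasoning strategy (hypothetical schema)."""
    title: str          # short name of the strategy
    content: str        # the strategy itself, in natural language
    from_success: bool  # True if distilled from a self-judged success


def jaccard(a: str, b: str) -> float:
    """Word-overlap similarity; an assumed stand-in for real retrieval."""
    ta, tb = set(a.lower().split()), set(b.lower().split())
    union = ta | tb
    return len(ta & tb) / len(union) if union else 0.0


@dataclass
class ToyReasoningBank:
    items: list[MemoryItem] = field(default_factory=list)

    def retrieve(self, task: str, k: int = 1) -> list[MemoryItem]:
        """Return the k stored strategies most similar to the task text."""
        ranked = sorted(
            self.items,
            key=lambda m: jaccard(task, m.title + " " + m.content),
            reverse=True,
        )
        return ranked[:k]

    def add(self, item: MemoryItem) -> None:
        self.items.append(item)


def run_task(bank, task, act, judge, distill, k=1):
    """One pass of the loop: retrieve, act, self-judge, distill, store."""
    hints = bank.retrieve(task, k)         # memories inform the interaction
    trajectory = act(task, hints)          # agent attempts the task
    success = judge(task, trajectory)      # self-judged outcome, no ground truth
    bank.add(distill(task, trajectory, success))  # failures are stored too
    return success
```

Because `run_task` stores an item regardless of outcome, the bank accumulates lessons from failed attempts as well as successful ones, which is the property the abstract contrasts with memories that keep only successful routines. A MaTTS-style variant would call `act` several times per task and distill from the contrast between trajectories; that step is omitted here.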
Related papers
- Controllable Memory Usage: Balancing Anchoring and Innovation in Long-Term Human-Agent Interaction [35.20324450282101]
We show that an agent's reliance on memory can be modeled as an explicit and user-controllable dimension. We propose Steerable Memory Agent (SteeM), a framework that allows users to dynamically regulate memory reliance.
arXiv Detail & Related papers (2026-01-08T16:54:30Z) - Memory in the Age of AI Agents [217.9368190980982]
This work aims to provide an up-to-date landscape of current agent memory research. We identify three dominant realizations of agent memory, namely token-level, parametric, and latent memory. To support practical development, we compile a comprehensive summary of memory benchmarks and open-source frameworks.
arXiv Detail & Related papers (2025-12-15T17:22:34Z) - MemVerse: Multimodal Memory for Lifelong Learning Agents [35.218549149012844]
We introduce MemVerse, a model-agnostic, plug-and-play memory framework. MemVerse bridges fast parametric recall with hierarchical retrieval-based memory. It enables scalable and adaptive multimodal intelligence.
arXiv Detail & Related papers (2025-12-03T10:06:14Z) - Evo-Memory: Benchmarking LLM Agent Test-time Learning with Self-Evolving Memory [89.65731902036669]
Evo-Memory is a streaming benchmark and framework for evaluating self-evolving memory in large language model (LLM) agents. We evaluate more than ten representative memory modules across 10 diverse multi-turn goal-oriented and single-turn reasoning and QA datasets.
arXiv Detail & Related papers (2025-11-25T21:08:07Z) - MemGen: Weaving Generative Latent Memory for Self-Evolving Agents [57.1835920227202]
We propose MemGen, a dynamic generative memory framework that equips agents with a human-esque cognitive faculty. MemGen enables agents to recall and augment latent memory throughout reasoning, producing a tightly interwoven cycle of memory and cognition.
arXiv Detail & Related papers (2025-09-29T12:33:13Z) - Memory Management and Contextual Consistency for Long-Running Low-Code Agents [0.0]
This paper proposes a novel hybrid memory system designed specifically for LCNC agents. Inspired by cognitive science, our architecture combines episodic and semantic memory components with a proactive "Intelligent Decay" mechanism. A key innovation is a user-centric visualization interface, aligned with the LCNC paradigm, which allows non-technical users to manage the agent's memory directly.
arXiv Detail & Related papers (2025-09-27T08:01:26Z) - Memp: Exploring Agent Procedural Memory [72.41472703974935]
Large language model (LLM)-based agents excel at diverse tasks, yet they suffer from brittle procedural memory that is manually engineered or entangled in static parameters. We propose Memp, which distills past agent trajectories into both fine-grained, step-by-step instructions and higher-level, script-like abstractions. We show that as the memory repository is refined, agents achieve steadily higher success rates and greater efficiency on analogous tasks.
arXiv Detail & Related papers (2025-08-08T16:20:56Z) - How Memory Management Impacts LLM Agents: An Empirical Study of Experience-Following Behavior [49.62361184944454]
Memory is a critical component in large language model (LLM)-based agents. We study how memory management choices impact the LLM agents' behavior, especially their long-term performance.
arXiv Detail & Related papers (2025-05-21T22:35:01Z) - From RAG to Memory: Non-Parametric Continual Learning for Large Language Models [6.380729797938521]
Retrieval-augmented generation (RAG) has become the dominant way to introduce new information. Recent RAG approaches augment vector embeddings with various structures like knowledge graphs to address some gaps, namely sense-making and associativity. We propose HippoRAG 2, a framework that outperforms standard RAG comprehensively on factual, sense-making, and associative memory tasks.
arXiv Detail & Related papers (2025-02-20T18:26:02Z) - Memory Sharing for Large Language Model based Agents [43.53494041932615]
This paper introduces Memory Sharing (MS), a framework that integrates real-time memory filtering, storage, and retrieval to enhance the in-context learning process.
The experimental results demonstrate that the MS framework significantly improves the agents' performance in addressing open-ended questions.
arXiv Detail & Related papers (2024-04-15T17:57:30Z) - Recurrent Action Transformer with Memory [39.58317527488534]
This paper proposes a novel model architecture that incorporates a recurrent memory mechanism designed to regulate information retention.
We conduct experiments on memory-intensive environments (ViZDoom-Two-Colors, T-Maze, Memory Maze, Minigrid-Memory), classic Atari games, and MuJoCo control environments.
The results show that using memory can significantly improve performance in memory-intensive environments, while maintaining or improving results in classic environments.
arXiv Detail & Related papers (2023-06-15T19:29:08Z) - Quantum adaptive agents with efficient long-term memories [0.0]
The more information the agent must recall from its past experiences, the more memory it will need.
We uncover the most general form a quantum agent need adopt to maximise memory compression advantages.
We show these encodings can exhibit extremely favourable scaling advantages relative to memory-minimal classical agents.
arXiv Detail & Related papers (2021-08-24T17:57:05Z) - Augmented Replay Memory in Reinforcement Learning With Continuous Control [1.6752182911522522]
Online reinforcement learning agents are currently able to process an increasing amount of data by converting it into higher-order value functions.
This expansion increases the agent's state space, enabling it to scale up to more complex problems, but also increases the risk of forgetting by learning on redundant or conflicting data.
To improve the approximation of a large amount of data, a random mini-batch of the past experiences that are stored in the replay memory buffer is often replayed at each learning step.
arXiv Detail & Related papers (2019-12-29T20:07:18Z)