Related papers: Memento 2: Learning by Stateful Reflective Memory

Memento 2: Learning by Stateful Reflective Memory

URL: http://arxiv.org/abs/2512.22716v2
Date: Wed, 31 Dec 2025 23:24:05 GMT
Title: Memento 2: Learning by Stateful Reflective Memory
Authors: Jun Wang,
Abstract summary: We study continual learning in large language model (LLM) based agents that integrate episodic memory with reinforcement learning.<n>We focus on reflection, the ability of an agent to revisit past experience and adjust how it selects future actions.<n>We introduce the Stateful Reflective Decision Process (SRDP), in which an agent maintains and updates episodic memory and alternates between writing new experiences to memory and reading relevant cases to guide decisions.
Score: 4.7052412989773975
License: http://creativecommons.org/licenses/by-nc-sa/4.0/
Abstract: We study continual learning in large language model (LLM) based agents that integrate episodic memory with reinforcement learning. We focus on reflection, the ability of an agent to revisit past experience and adjust how it selects future actions, as the central mechanism for continual adaptation without fine tuning model weights. To formalise this, we introduce the Stateful Reflective Decision Process (SRDP), in which an agent maintains and updates episodic memory and alternates between writing new experiences to memory and reading relevant cases to guide decisions. This framework casts reflective memory dynamics as part of the decision process itself and makes them amenable to control and learning analysis. Building on this formulation, we develop a Read-Write Reflective Learning algorithm that incorporates memory retrieval into a soft policy iteration procedure and prove that it converges. We further show that as memory grows and more densely covers the task environment, the resulting policy approaches optimality. Our framework unifies memory based reasoning with reinforcement learning and provides a formal foundation for LLM agents capable of continual, experience driven learning.

Related papers

Memory in the Age of AI Agents [217.9368190980982]
This work aims to provide an up-to-date landscape of current agent memory research.<n>We identify three dominant realizations of agent memory, namely token-level, parametric, and latent memory.<n>To support practical development, we compile a comprehensive summary of memory benchmarks and open-source frameworks.
arXiv Detail & Related papers (2025-12-15T17:22:34Z)
From Experience to Strategy: Empowering LLM Agents with Trainable Graph Memory [48.22750809620306]
Large Language Models (LLMs) based agents have demonstrated remarkable potential in autonomous task-solving.<n>In this paper, we introduce a novel agent-centric, trainable, multi-layered graph memory framework.<n>We show how context memory enhances the ability of LLMs to utilize information.
arXiv Detail & Related papers (2025-11-11T03:36:33Z)
Evaluating Long-Term Memory for Long-Context Question Answering [100.1267054069757]
We present a systematic evaluation of memory-augmented methods using LoCoMo, a benchmark of synthetic long-context dialogues annotated for question-answering tasks.<n>Our findings show that memory-augmented approaches reduce token usage by over 90% while maintaining competitive accuracy.
arXiv Detail & Related papers (2025-10-27T18:03:50Z)
Learning from Supervision with Semantic and Episodic Memory: A Reflective Approach to Agent Adaptation [11.819481846962447]
We investigate how agents built on pretrained large language models can learn target classification functions from labeled examples without parameter updates.<n>Our framework uses episodic memory to store instance-level critiques and distill these into reusable, task-level guidance.<n>Our findings highlight the promise of memory-driven, reflective learning for building more adaptive and interpretable LLM agents.
arXiv Detail & Related papers (2025-10-22T17:58:03Z)
Memo: Training Memory-Efficient Embodied Agents with Reinforcement Learning [53.72709564555407]
Memo is a transformer-based architecture and training recipe for reinforcement learning.<n>It incorporates the creation and retrieval of memory by interleaving periodic summarization tokens with the inputs of a model during training.<n>We demonstrate Memo's effectiveness on a gridworld meta-RL benchmark and a multi-object navigation task in photo-realistic indoor settings.
arXiv Detail & Related papers (2025-10-22T16:24:47Z)
Memory as Action: Autonomous Context Curation for Long-Horizon Agentic Tasks [23.201035830828726]
Large Language Models face challenges in long-horizon agentic tasks.<n>Existing working memory methods rely on external mechanisms that are decoupled from the agent's core policy.<n>We propose a novel framework, Memory-as-Action, where an agent actively manages its working memory by executing explicit editing operations as part of a unified policy.
arXiv Detail & Related papers (2025-10-14T15:29:57Z)
CAM: A Constructivist View of Agentic Memory for LLM-Based Reading Comprehension [55.29309306566238]
Current Large Language Models (LLMs) are confronted with overwhelming information volume when comprehending long-form documents.<n>This challenge raises the imperative of a cohesive memory module, which can elevate vanilla LLMs into autonomous reading agents.<n>We draw inspiration from Jean Piaget's Constructivist Theory, illuminating three traits of the agentic memory -- structured schemata, flexible assimilation, and dynamic accommodation.
arXiv Detail & Related papers (2025-10-07T02:16:30Z)
AURORA: Augmented Understanding via Structured Reasoning and Reinforcement Learning for Reference Audio-Visual Segmentation [113.75682363364004]
AURORA is a framework designed to enhance genuine reasoning and language comprehension in reference audio-visual segmentation.<n>AURORA achieves state-of-the-art performance on Ref-AVS benchmarks and generalizes effectively to unreferenced segmentation.
arXiv Detail & Related papers (2025-08-04T07:47:38Z)
Memory Allocation in Resource-Constrained Reinforcement Learning [8.866141780407903]
Resource constraints can fundamentally change both learning and decision-making.<n>We explore how memory constraints influence an agent's performance when navigating unknown environments using standard reinforcement learning algorithms.<n>Specifically, memory-constrained agents face a dilemma: how much of their limited memory should be allocated to each of the agent's internal processes, such as estimating a world model, as opposed to forming a plan using that model?
arXiv Detail & Related papers (2025-06-09T21:15:37Z)
Exploring Synaptic Resonance in Large Language Models: A Novel Approach to Contextual Memory Integration [0.0]
A novel mechanism, Synaptic Resonance, is introduced to dynamically reinforce relevant memory pathways during training and inference.<n> Evaluations conducted on an open-source language model demonstrate reductions in perplexity, enhancements in contextual coherence, and increased robustness against input noise.
arXiv Detail & Related papers (2025-02-15T07:06:10Z)
Stable Hadamard Memory: Revitalizing Memory-Augmented Agents for Reinforcement Learning [64.93848182403116]
Current deep-learning memory models struggle in reinforcement learning environments that are partially observable and long-term. We introduce the Stable Hadamard Memory, a novel memory model for reinforcement learning agents. Our approach significantly outperforms state-of-the-art memory-based methods on challenging partially observable benchmarks.
arXiv Detail & Related papers (2024-10-14T03:50:17Z)
In-Memory Learning: A Declarative Learning Framework for Large Language Models [56.62616975119192]
We propose a novel learning framework that allows agents to align with their environment without relying on human-labeled data. This entire process transpires within the memory components and is implemented through natural language. We demonstrate the effectiveness of our framework and provide insights into this problem.
arXiv Detail & Related papers (2024-03-05T08:25:11Z)
Shattering the Agent-Environment Interface for Fine-Tuning Inclusive Language Models [24.107358120517336]
In this work, we adopt a novel perspective wherein a pre-trained language model is itself simultaneously a policy, reward function, and transition function. An immediate consequence of this is that reward learning and language model fine-tuning can be performed jointly and directly, without requiring any further downstream policy optimization.
arXiv Detail & Related papers (2023-05-19T06:21:15Z)
On the Relationship Between Variational Inference and Auto-Associative Memory [68.8204255655161]
We study how different neural network approaches to variational inference can be applied in this framework. We evaluate the obtained algorithms on the CIFAR10 and CLEVR image datasets and compare them with other associative memory models.
arXiv Detail & Related papers (2022-10-14T14:18:47Z)
Generalized Reinforcement Learning: Experience Particles, Action Operator, Reinforcement Field, Memory Association, and Decision Concepts [2.398608007786179]
This paper proposes a Bayesian-flavored generalized reinforcement learning framework. We first establish the notion of parametric action model to better cope with uncertainty and fluid action behaviors. We then introduce the notion of reinforcement field as a physics-inspired construct established through "polarized experience particles" maintained in the learning agent's working memory.
arXiv Detail & Related papers (2022-08-09T15:05:15Z)
Pin the Memory: Learning to Generalize Semantic Segmentation [68.367763672095]
We present a novel memory-guided domain generalization method for semantic segmentation based on meta-learning framework. Our method abstracts the conceptual knowledge of semantic classes into categorical memory which is constant beyond the domains.
arXiv Detail & Related papers (2022-04-07T17:34:01Z)
Improving Meta-learning for Low-resource Text Classification and Generation via Memory Imitation [87.98063273826702]
We propose a memory imitation meta-learning (MemIML) method that enhances the model's reliance on support sets for task adaptation. A theoretical analysis is provided to prove the effectiveness of our method.
arXiv Detail & Related papers (2022-03-22T12:41:55Z)
Learning What to Memorize: Using Intrinsic Motivation to Form Useful Memory in Partially Observable Reinforcement Learning [0.0]
In order to learn in an ambiguous environment, an agent has to keep previous perceptions in a memory. In this study, we follow the idea of giving the control of the memory to the agent by allowing it to have memory-changing actions. This learning mechanism is supported by an intrinsic motivation to memorize rare observations that can help the agent to disambiguate its state in the environment.
arXiv Detail & Related papers (2021-10-25T11:15:54Z)

This list is automatically generated from the titles and abstracts of the papers in this site.