AdaMemento: Adaptive Memory-Assisted Policy Optimization for Reinforcement Learning
- URL: http://arxiv.org/abs/2410.04498v1
- Date: Sun, 6 Oct 2024 14:39:39 GMT
- Title: AdaMemento: Adaptive Memory-Assisted Policy Optimization for Reinforcement Learning
- Authors: Renye Yan, Yaozhong Gan, You Wu, Junliang Xing, Ling Liang, Yeshang Zhu, Yimao Cai,
- Abstract summary: We propose AdaMemento, an adaptive memory-enhanced reinforcement learning framework.
AdaMemento exploits both positive and negative experiences by learning to predict known local optimal policies.
We show that AdaMemento can distinguish subtle states for better exploration and effectively exploit past experiences in memory.
- Score: 15.317710077291245
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In sparse reward scenarios of reinforcement learning (RL), the memory mechanism provides promising shortcuts to policy optimization by reflecting on past experiences like humans. However, current memory-based RL methods simply store and reuse high-value policies, lacking a deeper refining and filtering of diverse past experiences and hence limiting the capability of memory. In this paper, we propose AdaMemento, an adaptive memory-enhanced RL framework. Instead of just memorizing positive past experiences, we design a memory-reflection module that exploits both positive and negative experiences by learning to predict known local optimal policies based on real-time states. To effectively gather informative trajectories for the memory, we further introduce a fine-grained intrinsic motivation paradigm, where nuances in similar states can be precisely distinguished to guide exploration. The exploitation of past experiences and exploration of new policies are then adaptively coordinated by ensemble learning to approach the global optimum. Furthermore, we theoretically prove the superiority of our new intrinsic motivation and ensemble mechanism. From 59 quantitative and visualization experiments, we confirm that AdaMemento can distinguish subtle states for better exploration and effectively exploit past experiences in memory, achieving significant improvement over previous methods.
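The abstract's three ingredients, a memory-reflection module over positive and negative experiences, a novelty signal to guide exploration, and adaptive coordination between the two, can be caricatured in a few lines. The sketch below is an illustration of the general idea only, not the paper's method; every name (`MemoryReflection`, `suggest`, `act`, the `novelty` score, the `threshold`) is invented for this example.

```python
class MemoryReflection:
    """Toy memory-reflection module: stores (state, action, return) tuples,
    keeping both positive and negative experiences, and suggests the
    best-scoring known action for a queried state (illustrative only)."""

    def __init__(self):
        self.entries = []  # list of (state, action, return)

    def store(self, state, action, ret):
        self.entries.append((state, action, ret))

    def suggest(self, state):
        # Score each action by similarity-weighted return; negative
        # experiences lower an action's score instead of being discarded.
        if not self.entries:
            return None
        scored = {}
        for s, a, r in self.entries:
            sim = 1.0 / (1.0 + abs(s - state))  # crude state similarity
            scored[a] = scored.get(a, 0.0) + sim * r
        return max(scored, key=scored.get)


def act(state, memory, explore_policy, novelty, threshold=0.5):
    """Adaptively coordinate memory exploitation and exploration:
    fall back to the exploratory policy when the state looks novel."""
    suggestion = memory.suggest(state)
    if suggestion is None or novelty(state) > threshold:
        return explore_policy(state)
    return suggestion
```

In this toy form, a state with one good and one bad remembered action yields the good one, while a sufficiently novel state routes to the exploratory policy; the paper's actual coordination is a learned ensemble, not a fixed threshold.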
Related papers
- MetaMem: Evolving Meta-Memory for Knowledge Utilization through Self-Reflective Symbolic Optimization [57.17751568928966]
We propose MetaMem, a framework that augments memory systems with a self-evolving meta-memory.
During meta-memory optimization, MetaMem iteratively distills transferable knowledge utilization experiences across different tasks.
Extensive experiments demonstrate the effectiveness of MetaMem, which significantly outperforms strong baselines by over 3.6%.
arXiv Detail & Related papers (2026-01-27T04:46:23Z) - Memento 2: Learning by Stateful Reflective Memory [4.7052412989773975]
We study continual learning in large language model (LLM) based agents that integrate episodic memory with reinforcement learning.
We focus on reflection, the ability of an agent to revisit past experience and adjust how it selects future actions.
We introduce the Stateful Reflective Decision Process (SRDP), in which an agent maintains and updates episodic memory and alternates between writing new experiences to memory and reading relevant cases to guide decisions.
arXiv Detail & Related papers (2025-12-27T22:15:03Z) - Retrieval-augmented Prompt Learning for Pre-trained Foundation Models [101.13972024610733]
We present RetroPrompt, which aims to achieve a balance between memorization and generalization.
Unlike traditional prompting methods, RetroPrompt incorporates a retrieval mechanism throughout the input, training, and inference stages.
We conduct comprehensive experiments on a variety of datasets across natural language processing and computer vision tasks to demonstrate the superior performance of our proposed approach.
arXiv Detail & Related papers (2025-12-23T08:15:34Z) - Memory in the Age of AI Agents [217.9368190980982]
This work aims to provide an up-to-date landscape of current agent memory research.
We identify three dominant realizations of agent memory, namely token-level, parametric, and latent memory.
To support practical development, we compile a comprehensive summary of memory benchmarks and open-source frameworks.
arXiv Detail & Related papers (2025-12-15T17:22:34Z) - Remember Me, Refine Me: A Dynamic Procedural Memory Framework for Experience-Driven Agent Evolution [52.76038908826961]
We propose ReMe (Remember Me, Refine Me) to bridge the gap between static storage and dynamic reasoning.
ReMe innovates across the memory lifecycle via three mechanisms, including multi-faceted distillation, which extracts fine-grained experiences.
Experiments on BFCL-V3 and AppWorld demonstrate that ReMe establishes a new state of the art in agent memory systems.
arXiv Detail & Related papers (2025-12-11T14:40:01Z) - Evaluating Long-Term Memory for Long-Context Question Answering [100.1267054069757]
We present a systematic evaluation of memory-augmented methods using LoCoMo, a benchmark of synthetic long-context dialogues annotated for question-answering tasks.
Our findings show that memory-augmented approaches reduce token usage by over 90% while maintaining competitive accuracy.
arXiv Detail & Related papers (2025-10-27T18:03:50Z) - Learning from Supervision with Semantic and Episodic Memory: A Reflective Approach to Agent Adaptation [11.819481846962447]
We investigate how agents built on pretrained large language models can learn target classification functions from labeled examples without parameter updates.
Our framework uses episodic memory to store instance-level critiques and distill these into reusable, task-level guidance.
Our findings highlight the promise of memory-driven, reflective learning for building more adaptive and interpretable LLM agents.
arXiv Detail & Related papers (2025-10-22T17:58:03Z) - Memory-enhanced Retrieval Augmentation for Long Video Understanding [57.371543819761555]
We introduce a novel RAG-based LVU approach inspired by the cognitive memory of human beings, which is called MemVid.
Our approach operates with four basic steps: memorizing holistic video information, reasoning about the task's information needs based on the memory, retrieving critical moments based on the information needs, and focusing on the retrieved moments to produce the final answer.
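The four steps can be sketched as a pipeline of injected callables. All function and parameter names here are hypothetical, since this summary does not describe the paper's actual models; each stage is passed in as a stub.

```python
def answer_video_question(video_segments, question,
                          memorize, infer_needs, retrieve, generate):
    """Illustrative four-step MemVid-style pipeline; each stage is an
    injected callable standing in for a learned component."""
    memory = memorize(video_segments)          # 1. holistic video memory
    needs = infer_needs(memory, question)      # 2. task's information needs
    moments = retrieve(video_segments, needs)  # 3. critical moments
    return generate(moments, question)         # 4. focused final answer
```

With trivial stubs (keyword matching for retrieval, echoing the first retrieved moment as the answer), the pipeline composes end to end; the real system would replace each stub with a model.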
arXiv Detail & Related papers (2025-03-12T08:23:32Z) - Stable Hadamard Memory: Revitalizing Memory-Augmented Agents for Reinforcement Learning [64.93848182403116]
Current deep-learning memory models struggle in partially observable, long-horizon reinforcement learning environments.
We introduce the Stable Hadamard Memory, a novel memory model for reinforcement learning agents.
Our approach significantly outperforms state-of-the-art memory-based methods on challenging partially observable benchmarks.
arXiv Detail & Related papers (2024-10-14T03:50:17Z) - In-Memory Learning: A Declarative Learning Framework for Large Language Models [56.62616975119192]
We propose a novel learning framework that allows agents to align with their environment without relying on human-labeled data.
This entire process transpires within the memory components and is implemented through natural language.
We demonstrate the effectiveness of our framework and provide insights into this problem.
arXiv Detail & Related papers (2024-03-05T08:25:11Z) - Spatially-Aware Transformer for Embodied Agents [20.498778205143477]
This paper explores the use of Spatially-Aware Transformer models that incorporate spatial information.
We demonstrate that memory utilization efficiency can be improved, leading to enhanced accuracy in various place-centric downstream tasks.
We also propose the Adaptive Memory Allocator, a memory management method based on reinforcement learning.
arXiv Detail & Related papers (2024-02-23T07:46:30Z) - Saliency-Guided Hidden Associative Replay for Continual Learning [13.551181595881326]
Continual Learning is a burgeoning domain in next-generation AI, focusing on training neural networks over a sequence of tasks akin to human learning.
This paper presents Saliency-Guided Hidden Associative Replay for Continual Learning (SHARC).
This novel framework synergizes associative memory with replay-based strategies. SHARC primarily archives salient data segments via sparse memory encoding.
arXiv Detail & Related papers (2023-10-06T15:54:12Z) - EMO: Episodic Memory Optimization for Few-Shot Meta-Learning [69.50380510879697]
Episodic memory optimization for meta-learning, which we call EMO, is inspired by the human ability to recall past learning experiences from the brain's memory.
EMO nudges parameter updates in the right direction, even when the gradients provided by a limited number of examples are uninformative.
EMO scales well with most few-shot classification benchmarks and improves the performance of optimization-based meta-learning methods.
arXiv Detail & Related papers (2023-06-08T13:39:08Z) - Saliency-Augmented Memory Completion for Continual Learning [8.243137410556495]
How to forget is a problem continual learning must address.
Our paper proposes a new saliency-augmented memory completion framework for continual learning.
arXiv Detail & Related papers (2022-12-26T18:06:39Z) - Reward Uncertainty for Exploration in Preference-based Reinforcement
Learning [88.34958680436552]
We present an exploration method specifically for preference-based reinforcement learning algorithms.
Our main idea is to design an intrinsic reward by measuring the novelty based on learned reward.
Our experiments show that exploration bonus from uncertainty in learned reward improves both feedback- and sample-efficiency of preference-based RL algorithms.
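One common way to measure "novelty based on learned reward" is the disagreement of an ensemble of learned reward models. The sketch below assumes that reading; the names `uncertainty_bonus` and `beta` are illustrative, not from the paper.

```python
import statistics


def uncertainty_bonus(state_action, reward_models, beta=1.0):
    """Intrinsic exploration bonus computed as the disagreement (population
    standard deviation) among an ensemble of learned reward models, scaled
    by beta. High disagreement marks under-explored state-action pairs."""
    preds = [model(state_action) for model in reward_models]
    return beta * statistics.pstdev(preds)
```

When the ensemble members agree the bonus vanishes, so the agent is only steered toward regions where the learned reward is still uncertain.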
arXiv Detail & Related papers (2022-05-24T23:22:10Z) - Replay For Safety [51.11953997546418]
In experience replay, past transitions are stored in a memory buffer and re-used during learning.
We show that using an appropriate biased sampling scheme can allow us to achieve a safe policy.
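A biased sampling scheme over a replay buffer can be sketched generically. Weighting replay toward low-reward (presumably unsafe) transitions is only one plausible instantiation of the idea, and `biased_sample` and `weight` are names invented for this sketch.

```python
import random


def biased_sample(buffer, k, weight):
    """Draw k transitions from the replay buffer with probabilities
    proportional to weight(t). A safety-oriented weight function can
    up-weight low-reward transitions so the learner revisits them."""
    weights = [weight(t) for t in buffer]
    return random.choices(buffer, weights=weights, k=k)
```

For example, with transitions stored as `(tag, reward)` pairs and a weight of 1 for negative-reward transitions and 0 otherwise, every sample comes from the unsafe experience.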
arXiv Detail & Related papers (2021-12-08T11:10:57Z) - Saliency Guided Experience Packing for Replay in Continual Learning [6.417011237981518]
We propose a new approach for experience replay, where we select the past experiences by looking at the saliency maps.
While learning a new task, we replay these memory patches with appropriate zero-padding to remind the model about its past decisions.
arXiv Detail & Related papers (2021-09-10T15:54:58Z) - Learning to Learn Variational Semantic Memory [132.39737669936125]
We introduce variational semantic memory into meta-learning to acquire long-term knowledge for few-shot learning.
The semantic memory is grown from scratch and gradually consolidated by absorbing information from tasks it experiences.
We formulate memory recall as the variational inference of a latent memory variable from addressed contents.
arXiv Detail & Related papers (2020-10-20T15:05:26Z) - The act of remembering: a study in partially observable reinforcement learning [24.945756871291348]
Reinforcement Learning (RL) agents typically learn memoryless policies that only consider the last observation when selecting actions.
We provide the agent with an external memory and additional actions to control what, if anything, is written to the memory.
Our novel forms of memory outperform binary and LSTM-based memory in well-established partially observable domains.
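An external memory controlled through extra actions can be caricatured as below. The `("write", value)` convention and the function `step_with_memory` are assumptions of this sketch, not the paper's interface: the policy sees both the observation and the memory, and may either act in the environment or overwrite the memory.

```python
def step_with_memory(obs, memory, policy):
    """Augment the action space with explicit memory writes: the policy maps
    (observation, memory) to either an environment action or a
    ('write', value) control action that overwrites the external memory."""
    action = policy(obs, memory)
    if isinstance(action, tuple) and action[0] == "write":
        return None, action[1]   # no environment action; memory updated
    return action, memory        # environment action; memory unchanged
```

A policy can thus first record a salient observation and later condition its environment actions on that stored value, which is the essence of controlling "what, if anything, is written to the memory."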
arXiv Detail & Related papers (2020-10-05T02:56:43Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.