AdaMemento: Adaptive Memory-Assisted Policy Optimization for Reinforcement Learning
- URL: http://arxiv.org/abs/2410.04498v1
- Date: Sun, 6 Oct 2024 14:39:39 GMT
- Title: AdaMemento: Adaptive Memory-Assisted Policy Optimization for Reinforcement Learning
- Authors: Renye Yan, Yaozhong Gan, You Wu, Junliang Xing, Ling Liang, Yeshang Zhu, Yimao Cai,
- Abstract summary: We propose AdaMemento, an adaptive memory-enhanced reinforcement learning framework.
AdaMemento exploits both positive and negative experiences by learning to predict known local optimal policies.
We show that AdaMemento can distinguish subtle states for better exploration and effectively exploit past experiences in memory.
- Score: 15.317710077291245
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In sparse reward scenarios of reinforcement learning (RL), the memory mechanism provides promising shortcuts to policy optimization by reflecting on past experiences like humans. However, current memory-based RL methods simply store and reuse high-value policies, lacking a deeper refining and filtering of diverse past experiences and hence limiting the capability of memory. In this paper, we propose AdaMemento, an adaptive memory-enhanced RL framework. Instead of just memorizing positive past experiences, we design a memory-reflection module that exploits both positive and negative experiences by learning to predict known local optimal policies based on real-time states. To effectively gather informative trajectories for the memory, we further introduce a fine-grained intrinsic motivation paradigm, where nuances in similar states can be precisely distinguished to guide exploration. The exploitation of past experiences and exploration of new policies are then adaptively coordinated by ensemble learning to approach the global optimum. Furthermore, we theoretically prove the superiority of our new intrinsic motivation and ensemble mechanism. From 59 quantitative and visualization experiments, we confirm that AdaMemento can distinguish subtle states for better exploration and effectively exploit past experiences in memory, achieving significant improvement over previous methods.
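The abstract's three ingredients, a memory-reflection module over positive and negative experiences, a novelty signal to guide exploration, and adaptive coordination between the two, can be caricatured in a few lines. The sketch below is an illustration of the general idea only, not the paper's method; every name (`MemoryReflection`, `suggest`, `act`, the `novelty` score, the `threshold`) is invented for this example.

```python
class MemoryReflection:
    """Toy memory-reflection module: stores (state, action, return) tuples,
    keeping both positive and negative experiences, and suggests the
    best-scoring known action for a queried state (illustrative only)."""

    def __init__(self):
        self.entries = []  # list of (state, action, return)

    def store(self, state, action, ret):
        self.entries.append((state, action, ret))

    def suggest(self, state):
        # Score each action by similarity-weighted return; negative
        # experiences lower an action's score instead of being discarded.
        if not self.entries:
            return None
        scored = {}
        for s, a, r in self.entries:
            sim = 1.0 / (1.0 + abs(s - state))  # crude state similarity
            scored[a] = scored.get(a, 0.0) + sim * r
        return max(scored, key=scored.get)


def act(state, memory, explore_policy, novelty, threshold=0.5):
    """Adaptively coordinate memory exploitation and exploration:
    fall back to the exploratory policy when the state looks novel."""
    suggestion = memory.suggest(state)
    if suggestion is None or novelty(state) > threshold:
        return explore_policy(state)
    return suggestion
```

In this toy form, a state with one good and one bad remembered action yields the good one, while a sufficiently novel state routes to the exploratory policy; the paper's actual coordination is a learned ensemble, not a fixed threshold.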
Related papers
- MetaMem: Evolving Meta-Memory for Knowledge Utilization through Self-Reflective Symbolic Optimization [57.17751568928966]
We propose MetaMem, a framework that augments memory systems with a self-evolving meta-memory.
During meta-memory optimization, MetaMem iteratively distills transferable knowledge utilization experiences across different tasks.
Extensive experiments demonstrate the effectiveness of MetaMem, which significantly outperforms strong baselines by over 3.6%.
arXiv Detail & Related papers (2026-01-27T04:46:23Z) - Memento 2: Learning by Stateful Reflective Memory [4.7052412989773975]
We study continual learning in large language model (LLM) based agents that integrate episodic memory with reinforcement learning.
We focus on reflection, the ability of an agent to revisit past experience and adjust how it selects future actions.
We introduce the Stateful Reflective Decision Process (SRDP), in which an agent maintains and updates episodic memory and alternates between writing new experiences to memory and reading relevant cases to guide decisions.
arXiv Detail & Related papers (2025-12-27T22:15:03Z) - Retrieval-augmented Prompt Learning for Pre-trained Foundation Models [101.13972024610733]
We present RetroPrompt, which aims to achieve a balance between memorization and generalization.
Unlike traditional prompting methods, RetroPrompt incorporates a retrieval mechanism throughout the input, training, and inference stages.
We conduct comprehensive experiments on a variety of datasets across natural language processing and computer vision tasks to demonstrate the superior performance of our proposed approach.
arXiv Detail & Related papers (2025-12-23T08:15:34Z) - Memory in the Age of AI Agents [217.9368190980982]
This work aims to provide an up-to-date landscape of current agent memory research.
We identify three dominant realizations of agent memory, namely token-level, parametric, and latent memory.
To support practical development, we compile a comprehensive summary of memory benchmarks and open-source frameworks.
arXiv Detail & Related papers (2025-12-15T17:22:34Z) - Remember Me, Refine Me: A Dynamic Procedural Memory Framework for Experience-Driven Agent Evolution [52.76038908826961]
We propose ReMe (Remember Me, Refine Me) to bridge the gap between static storage and dynamic reasoning.
ReMe innovates across the memory lifecycle via three mechanisms, including multi-faceted distillation, which extracts fine-grained experiences.
Experiments on BFCL-V3 and AppWorld demonstrate that ReMe establishes a new state of the art in agent memory systems.
arXiv Detail & Related papers (2025-12-11T14:40:01Z) - Evaluating Long-Term Memory for Long-Context Question Answering [100.1267054069757]
We present a systematic evaluation of memory-augmented methods using LoCoMo, a benchmark of synthetic long-context dialogues annotated for question-answering tasks.
Our findings show that memory-augmented approaches reduce token usage by over 90% while maintaining competitive accuracy.
arXiv Detail & Related papers (2025-10-27T18:03:50Z) - Learning from Supervision with Semantic and Episodic Memory: A Reflective Approach to Agent Adaptation [11.819481846962447]
We investigate how agents built on pretrained large language models can learn target classification functions from labeled examples without parameter updates.
Our framework uses episodic memory to store instance-level critiques and distill these into reusable, task-level guidance.
Our findings highlight the promise of memory-driven, reflective learning for building more adaptive and interpretable LLM agents.
arXiv Detail & Related papers (2025-10-22T17:58:03Z) - Memory-enhanced Retrieval Augmentation for Long Video Understanding [57.371543819761555]
We introduce a novel RAG-based LVU approach inspired by the cognitive memory of human beings, which is called MemVid.
Our approach operates with four basic steps: memorizing holistic video information, reasoning about the task's information needs based on the memory, retrieving critical moments based on the information needs, and focusing on the retrieved moments to produce the final answer.
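The four steps can be sketched as a pipeline of injected callables. All function and parameter names here are hypothetical, since this summary does not describe the paper's actual models; each stage is passed in as a stub.

```python
def answer_video_question(video_segments, question,
                          memorize, infer_needs, retrieve, generate):
    """Illustrative four-step MemVid-style pipeline; each stage is an
    injected callable standing in for a learned component."""
    memory = memorize(video_segments)          # 1. holistic video memory
    needs = infer_needs(memory, question)      # 2. task's information needs
    moments = retrieve(video_segments, needs)  # 3. critical moments
    return generate(moments, question)         # 4. focused final answer
```

With trivial stubs (keyword matching for retrieval, echoing the first retrieved moment as the answer), the pipeline composes end to end; the real system would replace each stub with a model.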
arXiv Detail & Related papers (2025-03-12T08:23:32Z) - Stable Hadamard Memory: Revitalizing Memory-Augmented Agents for Reinforcement Learning [64.93848182403116]
Current deep-learning memory models struggle in partially observable, long-horizon reinforcement learning environments.
We introduce the Stable Hadamard Memory, a novel memory model for reinforcement learning agents.
Our approach significantly outperforms state-of-the-art memory-based methods on challenging partially observable benchmarks.
arXiv Detail & Related papers (2024-10-14T03:50:17Z) - In-Memory Learning: A Declarative Learning Framework for Large Language Models [56.62616975119192]
We propose a novel learning framework that allows agents to align with their environment without relying on human-labeled data.
This entire process transpires within the memory components and is implemented through natural language.
We demonstrate the effectiveness of our framework and provide insights into this problem.
arXiv Detail & Related papers (2024-03-05T08:25:11Z) - Spatially-Aware Transformer for Embodied Agents [20.498778205143477]
This paper explores the use of Spatially-Aware Transformer models that incorporate spatial information.
We demonstrate that memory utilization efficiency can be improved, leading to enhanced accuracy in various place-centric downstream tasks.
We also propose the Adaptive Memory Allocator, a memory management method based on reinforcement learning.
arXiv Detail & Related papers (2024-02-23T07:46:30Z) - Saliency-Guided Hidden Associative Replay for Continual Learning [13.551181595881326]
Continual Learning is a burgeoning domain in next-generation AI, focusing on training neural networks over a sequence of tasks akin to human learning.
This paper presents Saliency-Guided Hidden Associative Replay for Continual Learning (SHARC).
This novel framework synergizes associative memory with replay-based strategies. SHARC primarily archives salient data segments via sparse memory encoding.
arXiv Detail & Related papers (2023-10-06T15:54:12Z) - EMO: Episodic Memory Optimization for Few-Shot Meta-Learning [69.50380510879697]
Episodic memory optimization for meta-learning, which we call EMO, is inspired by the human ability to recall past learning experiences from the brain's memory.
EMO nudges parameter updates in the right direction, even when the gradients provided by a limited number of examples are uninformative.
EMO scales well with most few-shot classification benchmarks and improves the performance of optimization-based meta-learning methods.
arXiv Detail & Related papers (2023-06-08T13:39:08Z) - Saliency-Augmented Memory Completion for Continual Learning [8.243137410556495]
How to forget is a problem continual learning must address.
Our paper proposes a new saliency-augmented memory completion framework for continual learning.
arXiv Detail & Related papers (2022-12-26T18:06:39Z) - Reward Uncertainty for Exploration in Preference-based Reinforcement
Learning [88.34958680436552]
We present an exploration method specifically for preference-based reinforcement learning algorithms.
Our main idea is to design an intrinsic reward by measuring the novelty based on learned reward.
Our experiments show that exploration bonus from uncertainty in learned reward improves both feedback- and sample-efficiency of preference-based RL algorithms.
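One common way to measure "novelty based on learned reward" is the disagreement of an ensemble of learned reward models. The sketch below assumes that reading; the names `uncertainty_bonus` and `beta` are illustrative, not from the paper.

```python
import statistics


def uncertainty_bonus(state_action, reward_models, beta=1.0):
    """Intrinsic exploration bonus computed as the disagreement (population
    standard deviation) among an ensemble of learned reward models, scaled
    by beta. High disagreement marks under-explored state-action pairs."""
    preds = [model(state_action) for model in reward_models]
    return beta * statistics.pstdev(preds)
```

When the ensemble members agree the bonus vanishes, so the agent is only steered toward regions where the learned reward is still uncertain.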
arXiv Detail & Related papers (2022-05-24T23:22:10Z) - Replay For Safety [51.11953997546418]
In experience replay, past transitions are stored in a memory buffer and re-used during learning.
We show that using an appropriate biased sampling scheme can allow us to achieve a safe policy.
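A biased sampling scheme over a replay buffer can be sketched generically. Weighting replay toward low-reward (presumably unsafe) transitions is only one plausible instantiation of the idea, and `biased_sample` and `weight` are names invented for this sketch.

```python
import random


def biased_sample(buffer, k, weight):
    """Draw k transitions from the replay buffer with probabilities
    proportional to weight(t). A safety-oriented weight function can
    up-weight low-reward transitions so the learner revisits them."""
    weights = [weight(t) for t in buffer]
    return random.choices(buffer, weights=weights, k=k)
```

For example, with transitions stored as `(tag, reward)` pairs and a weight of 1 for negative-reward transitions and 0 otherwise, every sample comes from the unsafe experience.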
arXiv Detail & Related papers (2021-12-08T11:10:57Z) - Saliency Guided Experience Packing for Replay in Continual Learning [6.417011237981518]
We propose a new approach for experience replay, where we select the past experiences by looking at the saliency maps.
While learning a new task, we replay these memory patches with appropriate zero-padding to remind the model about its past decisions.
arXiv Detail & Related papers (2021-09-10T15:54:58Z) - Learning to Learn Variational Semantic Memory [132.39737669936125]
We introduce variational semantic memory into meta-learning to acquire long-term knowledge for few-shot learning.
The semantic memory is grown from scratch and gradually consolidated by absorbing information from tasks it experiences.
We formulate memory recall as the variational inference of a latent memory variable from addressed contents.
arXiv Detail & Related papers (2020-10-20T15:05:26Z) - The act of remembering: a study in partially observable reinforcement learning [24.945756871291348]
Reinforcement Learning (RL) agents typically learn memoryless policies that only consider the last observation when selecting actions.
We provide the agent with an external memory and additional actions to control what, if anything, is written to the memory.
Our novel forms of memory outperform binary and LSTM-based memory in well-established partially observable domains.
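An external memory controlled through extra actions can be caricatured as below. The `("write", value)` convention and the function `step_with_memory` are assumptions of this sketch, not the paper's interface: the policy sees both the observation and the memory, and may either act in the environment or overwrite the memory.

```python
def step_with_memory(obs, memory, policy):
    """Augment the action space with explicit memory writes: the policy maps
    (observation, memory) to either an environment action or a
    ('write', value) control action that overwrites the external memory."""
    action = policy(obs, memory)
    if isinstance(action, tuple) and action[0] == "write":
        return None, action[1]   # no environment action; memory updated
    return action, memory        # environment action; memory unchanged
```

A policy can thus first record a salient observation and later condition its environment actions on that stored value, which is the essence of controlling "what, if anything, is written to the memory."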
arXiv Detail & Related papers (2020-10-05T02:56:43Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.