The act of remembering: a study in partially observable reinforcement
learning
- URL: http://arxiv.org/abs/2010.01753v1
- Date: Mon, 5 Oct 2020 02:56:43 GMT
- Title: The act of remembering: a study in partially observable reinforcement
learning
- Authors: Rodrigo Toro Icarte, Richard Valenzano, Toryn Q. Klassen, Phillip
Christoffersen, Amir-massoud Farahmand, Sheila A. McIlraith
- Abstract summary: Reinforcement Learning (RL) agents typically learn memoryless policies that only consider the last observation when selecting actions.
We provide the agent with an external memory and additional actions to control what, if anything, is written to the memory.
Our novel forms of memory outperform binary and LSTM-based memory in well-established partially observable domains.
- Score: 24.945756871291348
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Reinforcement Learning (RL) agents typically learn memoryless
policies, that is, policies that only consider the last observation when selecting
actions. Learning memoryless policies is efficient and optimal in fully
observable environments. However, some form of memory is necessary when RL
agents are faced with partial observability. In this paper, we study a
lightweight approach to tackle partial observability in RL. We provide the
agent with an external memory and additional actions to control what, if
anything, is written to the memory. At every step, the current memory state is
part of the agent's observation, and the agent selects a tuple of actions: one
action that modifies the environment and another that modifies the memory. When
the external memory is sufficiently expressive, optimal memoryless policies
yield globally optimal solutions. Unfortunately, previous attempts to use
external memory in the form of binary memory have produced poor results in
practice. Here, we investigate alternative forms of memory in support of
learning effective memoryless policies. Our novel forms of memory outperform
binary and LSTM-based memory in well-established partially observable domains.
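The mechanism described in the abstract (the memory state is part of the observation, and the agent picks one environment action plus one memory action per step) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the `CueEnv` toy task, the `ExternalMemoryWrapper` class, the `overwrite` update rule, and the hand-written policy are hypothetical and are not the paper's code or its proposed forms of memory.

```python
from typing import Any, Callable, Tuple


class CueEnv:
    """Hypothetical toy task: a cue ("L" or "R") is visible only at reset."""

    def __init__(self, cue: str):
        self.cue = cue
        self.t = 0

    def reset(self) -> str:
        self.t = 0
        return self.cue

    def step(self, action: str) -> Tuple[str, float, bool]:
        self.t += 1
        if self.t == 1:
            return "blank", 0.0, False  # the cue is no longer observable
        # Second step: reward 1.0 only if the action matches the hidden cue.
        return "end", (1.0 if action == self.cue else 0.0), True


class ExternalMemoryWrapper:
    """Adds an external memory that the agent both observes and controls."""

    def __init__(self, env: Any, initial_memory: Any,
                 update: Callable[[Any, Any], Any]):
        self.env = env
        self.initial_memory = initial_memory
        self.update = update  # (memory, memory_action) -> new memory
        self.memory = initial_memory

    def reset(self) -> Tuple[Any, Any]:
        self.memory = self.initial_memory
        return (self.env.reset(), self.memory)

    def step(self, action_pair: Tuple[Any, Any]):
        env_action, mem_action = action_pair
        self.memory = self.update(self.memory, mem_action)
        obs, reward, done = self.env.step(env_action)
        return (obs, self.memory), reward, done


def overwrite(memory: Any, mem_action: Any) -> Any:
    """Assumed update rule: None leaves the memory unchanged, else overwrite."""
    return memory if mem_action is None else mem_action


def policy(augmented_obs: Tuple[Any, Any]) -> Tuple[Any, Any]:
    """Memoryless policy: depends only on the current (observation, memory)."""
    obs, memory = augmented_obs
    if obs in ("L", "R"):
        return ("wait", obs)  # store the cue in memory
    return (memory, None)     # act on the stored cue, write nothing


def run_episode(cue: str) -> float:
    env = ExternalMemoryWrapper(CueEnv(cue), None, overwrite)
    aug_obs, total, done = env.reset(), 0.0, False
    while not done:
        aug_obs, reward, done = env.step(policy(aug_obs))
        total += reward
    return total
```

In this sketch the policy never looks at past observations, yet it solves the task because the information it needs has been written into the memory component of its current observation. A learned policy would have to discover both what to write and how to act on it.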
Related papers
- Stable Hadamard Memory: Revitalizing Memory-Augmented Agents for Reinforcement Learning [64.93848182403116]
Current deep-learning memory models struggle in reinforcement learning environments that are partially observable and long-term.
We introduce the Stable Hadamard Memory, a novel memory model for reinforcement learning agents.
Our approach significantly outperforms state-of-the-art memory-based methods on challenging partially observable benchmarks.
arXiv Detail & Related papers (2024-10-14T03:50:17Z)
- AdaMemento: Adaptive Memory-Assisted Policy Optimization for Reinforcement Learning [15.317710077291245]
We propose AdaMemento, an adaptive memory-enhanced reinforcement learning framework.
AdaMemento exploits both positive and negative experiences by learning to predict known local optimal policies.
We show that AdaMemento can distinguish subtle states for better exploration and effectively exploit past experiences in memory.
arXiv Detail & Related papers (2024-10-06T14:39:39Z)
- Reducing Fine-Tuning Memory Overhead by Approximate and Memory-Sharing Backpropagation [29.139579820699495]
This work strives to reduce memory overhead in fine-tuning from the perspectives of activation functions and layer normalization.
We apply our Approx-BP theory to backpropagation training and derive memory-efficient alternatives to the GELU and SiLU activation functions.
In addition, we introduce a Memory-Sharing Backpropagation strategy, which enables the activation memory to be shared by two adjacent layers.
arXiv Detail & Related papers (2024-06-24T03:09:15Z)
- Saliency-Guided Hidden Associative Replay for Continual Learning [13.551181595881326]
Continual Learning is a burgeoning domain in next-generation AI, focusing on training neural networks over a sequence of tasks akin to human learning.
This paper presents Saliency-Guided Hidden Associative Replay for Continual Learning (SHARC).
This novel framework synergizes associative memory with replay-based strategies. SHARC primarily archives salient data segments via sparse memory encoding.
arXiv Detail & Related papers (2023-10-06T15:54:12Z)
- Lift Yourself Up: Retrieval-augmented Text Generation with Self Memory [72.36736686941671]
We propose a novel framework, selfmem, for improving retrieval-augmented generation models.
Selfmem iteratively employs a retrieval-augmented generator to create an unbounded memory pool and uses a memory selector to choose one output as memory for the subsequent generation round.
We evaluate the effectiveness of selfmem on three distinct text generation tasks.
arXiv Detail & Related papers (2023-05-03T21:40:54Z)
- RMM: Reinforced Memory Management for Class-Incremental Learning [102.20140790771265]
Class-Incremental Learning (CIL) trains classifiers under a strict memory budget.
Existing methods use a static and ad hoc strategy for memory allocation, which is often sub-optimal.
We propose a dynamic memory management strategy that is optimized for the incremental phases and different object classes.
arXiv Detail & Related papers (2023-01-14T00:07:47Z)
- Saliency-Augmented Memory Completion for Continual Learning [8.243137410556495]
How to forget is a problem continual learning must address.
Our paper proposes a new saliency-augmented memory completion framework for continual learning.
arXiv Detail & Related papers (2022-12-26T18:06:39Z)
- A Model or 603 Exemplars: Towards Memory-Efficient Class-Incremental
Learning [56.450090618578]
Class-Incremental Learning (CIL) aims to train a model with limited memory size.
We show that when the model size is counted in the total budget and methods are compared with aligned memory size, saving models does not consistently work.
We propose a simple yet effective baseline, denoted as MEMO for Memory-efficient Expandable MOdel.
arXiv Detail & Related papers (2022-05-26T08:24:01Z)
- Learning What to Memorize: Using Intrinsic Motivation to Form Useful
Memory in Partially Observable Reinforcement Learning [0.0]
In order to learn in an ambiguous environment, an agent has to keep previous perceptions in a memory.
In this study, we follow the idea of giving the control of the memory to the agent by allowing it to have memory-changing actions.
This learning mechanism is supported by an intrinsic motivation to memorize rare observations that can help the agent to disambiguate its state in the environment.
arXiv Detail & Related papers (2021-10-25T11:15:54Z)
- Kanerva++: extending The Kanerva Machine with differentiable, locally
block allocated latent memory [75.65949969000596]
Episodic and semantic memory are critical components of the human memory model.
We develop a new principled Bayesian memory allocation scheme that bridges the gap between episodic and semantic memory.
We demonstrate that this allocation scheme improves performance in memory conditional image generation.
arXiv Detail & Related papers (2021-02-20T18:40:40Z)
- Self-Attentive Associative Memory [69.40038844695917]
We propose to separate the storage of individual experiences (item memory) and their occurring relationships (relational memory).
We achieve competitive results with our proposed two-memory model in a diversity of machine learning tasks.
arXiv Detail & Related papers (2020-02-10T03:27:48Z)
This list is automatically generated from the titles and abstracts of the papers on this site.