Towards mental time travel: a hierarchical memory for reinforcement
learning agents
- URL: http://arxiv.org/abs/2105.14039v1
- Date: Fri, 28 May 2021 18:12:28 GMT
- Authors: Andrew Kyle Lampinen, Stephanie C.Y. Chan, Andrea Banino, Felix Hill
- Abstract summary: Reinforcement learning agents often forget details of the past, especially after delays or distractor tasks.
We propose a Hierarchical Transformer Memory (HTM) which helps agents to remember the past in detail.
Agents with HTM can extrapolate to task sequences an order of magnitude longer than they were trained on, and can even generalize zero-shot from a meta-learning setting to maintaining knowledge across episodes.
- Score: 9.808027857786781
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Reinforcement learning agents often forget details of the past, especially
after delays or distractor tasks. Agents with common memory architectures
struggle to recall and integrate across multiple timesteps of a past event, or
even to recall the details of a single timestep that is followed by distractor
tasks. To address these limitations, we propose a Hierarchical Transformer
Memory (HTM), which helps agents to remember the past in detail. HTM stores
memories by dividing the past into chunks, and recalls by first performing
high-level attention over coarse summaries of the chunks, and then performing
detailed attention within only the most relevant chunks. An agent with HTM can
therefore "mentally time-travel" -- remember past events in detail without
attending to all intervening events. We show that agents with HTM substantially
outperform agents with other memory architectures at tasks requiring long-term
recall, retention, or reasoning over memory. These include recalling where an
object is hidden in a 3D environment, rapidly learning to navigate efficiently
in a new neighborhood, and rapidly learning and retaining new object names.
Agents with HTM can extrapolate to task sequences an order of magnitude longer
than they were trained on, and can even generalize zero-shot from a
meta-learning setting to maintaining knowledge across episodes. HTM improves
agent sample efficiency, generalization, and generality (by solving tasks that
previously required specialized architectures). Our work is a step towards
agents that can learn, interact, and adapt in complex and temporally-extended
environments.
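As a concrete illustration of the recall mechanism described in the abstract, here is a minimal NumPy sketch of chunked two-level attention. It is not the authors' implementation: the chunk size, mean-pooled summaries, top-k selection, and single query vector are assumptions made for clarity, whereas the real HTM learns its keys, values, and summaries end-to-end within a Transformer agent.

```python
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def htm_recall(query, memory, chunk_size=8, top_k=2):
    """Two-level recall: coarse attention over chunk summaries, then
    detailed attention within only the top-k most relevant chunks."""
    T, d = memory.shape
    n_chunks = T // chunk_size
    chunks = memory[: n_chunks * chunk_size].reshape(n_chunks, chunk_size, d)

    # High-level pass: score each chunk by a coarse summary of its contents
    # (mean-pooling is an assumption; any learned summary would do).
    summaries = chunks.mean(axis=1)                       # (n_chunks, d)
    chunk_scores = softmax(summaries @ query / np.sqrt(d))

    # Keep only the most relevant chunks -- "mental time travel" to a past
    # event without attending to every intervening timestep.
    top = np.argsort(chunk_scores)[-top_k:]

    # Low-level pass: detailed attention within the selected chunks only.
    selected = chunks[top].reshape(-1, d)                 # (top_k*chunk_size, d)
    weights = softmax(selected @ query / np.sqrt(d))
    return weights @ selected                             # (d,) memory readout

# Usage: 64 remembered timesteps of 16-d embeddings, one random query.
rng = np.random.default_rng(0)
memory = rng.normal(size=(64, 16))
readout = htm_recall(rng.normal(size=16), memory)
print(readout.shape)  # (16,)
```

The point of the two-level structure is cost: the coarse pass touches n_chunks summaries rather than all T timesteps, and the detailed pass touches only top_k * chunk_size memories, so recall stays cheap even as the stored past grows.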
Related papers
- KARMA: Augmenting Embodied AI Agents with Long-and-short Term Memory Systems [12.461941212597877]
Embodied AI agents often face difficulties with in-context memory, leading to inefficiencies and errors in task execution.
We introduce KARMA, an innovative memory system that integrates long-term and short-term memory modules.
This dual-memory structure allows agents to retrieve relevant past scene experiences, thereby improving the accuracy and efficiency of task planning.
arXiv Detail & Related papers (2024-09-23T11:02:46Z)
- A Survey on the Memory Mechanism of Large Language Model based Agents [66.4963345269611]
Large language model (LLM) based agents have recently attracted much attention from the research and industry communities.
LLM-based agents are distinguished by their self-evolving capability, which is the basis for solving real-world problems.
The key component to support agent-environment interactions is the memory of the agents.
arXiv Detail & Related papers (2024-04-21T01:49:46Z)
- Exploring Memorization in Fine-tuned Language Models [53.52403444655213]
We conduct the first comprehensive analysis to explore language models' memorization during fine-tuning across tasks.
Our studies of open-source LMs and our own fine-tuned LMs across various tasks indicate that memorization varies strongly across fine-tuning tasks.
We provide an intuitive explanation of this task disparity via sparse coding theory and unveil a strong correlation between memorization and attention score distribution.
arXiv Detail & Related papers (2023-10-10T15:41:26Z)
- Semantic HELM: A Human-Readable Memory for Reinforcement Learning [9.746397419479445]
We propose a novel memory mechanism that represents past events in human language.
We train our memory mechanism on a set of partially observable environments and find that it excels on tasks that require a memory component.
Since our memory mechanism is human-readable, we can peek at an agent's memory and check whether crucial pieces of information have been stored.
arXiv Detail & Related papers (2023-06-15T17:47:31Z)
- Evaluating Long-Term Memory in 3D Mazes [10.224858246626171]
Memory Maze is a 3D domain of randomized mazes designed for evaluating long-term memory in agents.
Unlike existing benchmarks, Memory Maze measures long-term memory in isolation from confounding agent abilities.
We find that current algorithms benefit from training with truncated backpropagation through time and succeed on small mazes, but fall short of human performance on the large mazes.
arXiv Detail & Related papers (2022-10-24T16:32:28Z)
- Memory-Guided Semantic Learning Network for Temporal Sentence Grounding [55.31041933103645]
We propose a memory-augmented network, MGSL-Net, that learns and memorizes rarely appearing content in temporal sentence grounding (TSG) tasks.
MGSL-Net consists of three main parts: a cross-modal interaction module, a memory augmentation module, and a heterogeneous attention module.
arXiv Detail & Related papers (2022-01-03T02:32:06Z)
- Learning to Rehearse in Long Sequence Memorization [107.14601197043308]
Existing reasoning tasks often assume that the input contents can always be accessed while reasoning.
Memory augmented neural networks introduce a human-like write-read memory to compress and memorize the long input sequence in one pass.
But they have two serious drawbacks: 1) they continually update the memory with current information and inevitably forget earlier content; 2) they do not distinguish which information is important and treat all contents equally.
We propose the Rehearsal Memory to enhance long-sequence memorization by self-supervised rehearsal with a history sampler.
arXiv Detail & Related papers (2021-06-02T11:58:30Z)
- Not All Memories are Created Equal: Learning to Forget by Expiring [49.053569908417636]
We propose Expire-Span, a method that learns to retain the most important information and expire the irrelevant information.
This forgetting of memories enables Transformers to scale to attend over tens of thousands of previous timesteps efficiently.
We show that Expire-Span can scale to memories tens of thousands of timesteps long, setting a new state of the art on very long context tasks (a minimal sketch of this span-based forgetting appears after this list).
arXiv Detail & Related papers (2021-05-13T20:50:13Z)
- Continual Learning in Low-rank Orthogonal Subspaces [86.36417214618575]
In continual learning (CL), a learner is faced with a sequence of tasks, arriving one after the other, and the goal is to remember all the tasks once the learning experience is finished.
The prior art in CL uses episodic memory, parameter regularization or network structures to reduce interference among tasks, but in the end, all the approaches learn different tasks in a joint vector space.
We propose to learn tasks in different (low-rank) vector subspaces that are kept orthogonal to each other in order to minimize interference.
arXiv Detail & Related papers (2020-10-22T12:07:43Z)
- Perception-Prediction-Reaction Agents for Deep Reinforcement Learning [12.566380944901816]
We introduce a new recurrent agent architecture which improves reinforcement learning in tasks requiring long-term memory.
The architecture partitions the agent's recurrent processing into perception, prediction, and reaction cores; a new auxiliary loss regularizes the policies drawn from all three cores against each other, enacting the prior that the policy should be expressible from either recent or long-term memory.
arXiv Detail & Related papers (2020-06-26T21:53:47Z)
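The span-based forgetting in "Not All Memories are Created Equal" (above) can also be sketched compactly. The following is a hedged, illustrative version, not the paper's code: the linear span predictor (w, b), the maximum span, and the ramp length are all assumptions made for clarity.

```python
import numpy as np

def expire_mask(hidden, w, b, t, max_span=1000.0, ramp=32.0):
    """hidden: (T, d) past states at times 0..T-1; t: current timestep.
    Returns one soft retention weight in [0, 1] per memory."""
    T = hidden.shape[0]
    # Each memory predicts how long it should live from its own content
    # (a sigmoid of a linear readout, scaled to a maximum span).
    spans = max_span / (1.0 + np.exp(-(hidden @ w + b)))   # (T,) learned spans
    age = t - np.arange(T)                                 # (T,) age of each memory
    # Soft linear ramp so the cutoff stays differentiable: weight 1 while the
    # memory is younger than its span, decaying to 0 over `ramp` extra steps.
    return np.clip(1.0 + (spans - age) / ramp, 0.0, 1.0)

# Usage: 100 remembered states of width 8; old, short-span memories get weight 0.
rng = np.random.default_rng(0)
h = rng.normal(size=(100, 8))
mask = expire_mask(h, w=rng.normal(size=8), b=-2.0, t=100)
print(mask.shape, float(mask.min()), float(mask.max()))
```

Attention weights to each memory are multiplied by this mask, and memories whose weight has reached zero can be dropped outright; that pruning is what lets attention scale to tens of thousands of past timesteps.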
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of this information and is not responsible for any consequences arising from its use.