Towards mental time travel: a hierarchical memory for reinforcement
learning agents
- URL: http://arxiv.org/abs/2105.14039v1
- Date: Fri, 28 May 2021 18:12:28 GMT
- Authors: Andrew Kyle Lampinen, Stephanie C.Y. Chan, Andrea Banino, Felix Hill
- Abstract summary: Reinforcement learning agents often forget details of the past, especially after delays or distractor tasks.
We propose a Hierarchical Transformer Memory (HTM) which helps agents to remember the past in detail.
Agents with HTM can extrapolate to task sequences an order of magnitude longer than they were trained on, and can even generalize zero-shot from a meta-learning setting to maintaining knowledge across episodes.
- Score: 9.808027857786781
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Reinforcement learning agents often forget details of the past, especially
after delays or distractor tasks. Agents with common memory architectures
struggle to recall and integrate across multiple timesteps of a past event, or
even to recall the details of a single timestep that is followed by distractor
tasks. To address these limitations, we propose a Hierarchical Transformer
Memory (HTM), which helps agents to remember the past in detail. HTM stores
memories by dividing the past into chunks, and recalls by first performing
high-level attention over coarse summaries of the chunks, and then performing
detailed attention within only the most relevant chunks. An agent with HTM can
therefore "mentally time-travel" -- remember past events in detail without
attending to all intervening events. We show that agents with HTM substantially
outperform agents with other memory architectures at tasks requiring long-term
recall, retention, or reasoning over memory. These include recalling where an
object is hidden in a 3D environment, rapidly learning to navigate efficiently
in a new neighborhood, and rapidly learning and retaining new object names.
Agents with HTM can extrapolate to task sequences an order of magnitude longer
than they were trained on, and can even generalize zero-shot from a
meta-learning setting to maintaining knowledge across episodes. HTM improves
agent sample efficiency, generalization, and generality (by solving tasks that
previously required specialized architectures). Our work is a step towards
agents that can learn, interact, and adapt in complex and temporally-extended
environments.
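As a concrete illustration of the recall mechanism described in the abstract, here is a minimal NumPy sketch of chunked two-level attention. It is not the authors' implementation: the chunk size, mean-pooled summaries, top-k selection, and single query vector are assumptions made for clarity, whereas the real HTM learns its keys, values, and summaries end-to-end within a Transformer agent.

```python
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def htm_recall(query, memory, chunk_size=8, top_k=2):
    """Two-level recall: coarse attention over chunk summaries, then
    detailed attention within only the top-k most relevant chunks."""
    T, d = memory.shape
    n_chunks = T // chunk_size
    chunks = memory[: n_chunks * chunk_size].reshape(n_chunks, chunk_size, d)

    # High-level pass: score each chunk by a coarse summary of its contents
    # (mean-pooling is an assumption; any learned summary would do).
    summaries = chunks.mean(axis=1)                       # (n_chunks, d)
    chunk_scores = softmax(summaries @ query / np.sqrt(d))

    # Keep only the most relevant chunks -- "mental time travel" to a past
    # event without attending to every intervening timestep.
    top = np.argsort(chunk_scores)[-top_k:]

    # Low-level pass: detailed attention within the selected chunks only.
    selected = chunks[top].reshape(-1, d)                 # (top_k*chunk_size, d)
    weights = softmax(selected @ query / np.sqrt(d))
    return weights @ selected                             # (d,) memory readout

# Usage: 64 remembered timesteps of 16-d embeddings, one random query.
rng = np.random.default_rng(0)
memory = rng.normal(size=(64, 16))
readout = htm_recall(rng.normal(size=16), memory)
print(readout.shape)  # (16,)
```

The point of the two-level structure is cost: the coarse pass touches n_chunks summaries rather than all T timesteps, and the detailed pass touches only top_k * chunk_size memories, so recall stays cheap even as the stored past grows.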
Related papers
- KARMA: Augmenting Embodied AI Agents with Long-and-short Term Memory Systems [12.461941212597877]
Embodied AI agents often face difficulties with in-context memory, leading to inefficiencies and errors in task execution.
We introduce KARMA, an innovative memory system that integrates long-term and short-term memory modules.
This dual-memory structure allows agents to retrieve relevant past scene experiences, thereby improving the accuracy and efficiency of task planning.
arXiv Detail & Related papers (2024-09-23T11:02:46Z)
- A Survey on the Memory Mechanism of Large Language Model based Agents [66.4963345269611]
Large language model (LLM) based agents have recently attracted much attention from the research and industry communities.
LLM-based agents are distinguished by their self-evolving capability, which is the basis for solving real-world problems.
The key component to support agent-environment interactions is the memory of the agents.
arXiv Detail & Related papers (2024-04-21T01:49:46Z)
- Exploring Memorization in Fine-tuned Language Models [53.52403444655213]
We conduct the first comprehensive analysis to explore language models' memorization during fine-tuning across tasks.
Our studies of open-source LMs and our own fine-tuned LMs across various tasks indicate that memorization varies strongly across fine-tuning tasks.
We provide an intuitive explanation of this task disparity via sparse coding theory and unveil a strong correlation between memorization and attention score distribution.
arXiv Detail & Related papers (2023-10-10T15:41:26Z)
- Semantic HELM: A Human-Readable Memory for Reinforcement Learning [9.746397419479445]
We propose a novel memory mechanism that represents past events in human language.
We train our memory mechanism on a set of partially observable environments and find that it excels on tasks that require a memory component.
Since our memory mechanism is human-readable, we can peek at an agent's memory and check whether crucial pieces of information have been stored.
arXiv Detail & Related papers (2023-06-15T17:47:31Z)
- Evaluating Long-Term Memory in 3D Mazes [10.224858246626171]
Memory Maze is a 3D domain of randomized mazes designed for evaluating long-term memory in agents.
Unlike existing benchmarks, Memory Maze measures long-term memory in isolation from confounding agent abilities.
We find that current algorithms benefit from training with truncated backpropagation through time and succeed on small mazes, but fall short of human performance on the large mazes.
arXiv Detail & Related papers (2022-10-24T16:32:28Z)
- Memory-Guided Semantic Learning Network for Temporal Sentence Grounding [55.31041933103645]
We propose a memory-augmented network, MGSL-Net, that learns and memorizes rarely appearing content in temporal sentence grounding (TSG) tasks.
MGSL-Net consists of three main parts: a cross-modal interaction module, a memory augmentation module, and a heterogeneous attention module.
arXiv Detail & Related papers (2022-01-03T02:32:06Z)
- Learning to Rehearse in Long Sequence Memorization [107.14601197043308]
Existing reasoning tasks often assume that the input contents can always be accessed while reasoning.
Memory augmented neural networks introduce a human-like write-read memory to compress and memorize the long input sequence in one pass.
But they have two serious drawbacks: 1) they continually update the memory with current information and inevitably forget earlier content; 2) they do not distinguish which information is important and treat all contents equally.
We propose the Rehearsal Memory to enhance long-sequence memorization by self-supervised rehearsal with a history sampler.
arXiv Detail & Related papers (2021-06-02T11:58:30Z)
- Not All Memories are Created Equal: Learning to Forget by Expiring [49.053569908417636]
We propose Expire-Span, a method that learns to retain the most important information and expire the irrelevant information.
This forgetting of memories enables Transformers to scale to attend over tens of thousands of previous timesteps efficiently.
We show that Expire-Span can scale to memories tens of thousands of timesteps long, setting a new state of the art on very long context tasks (a minimal sketch of this span-based forgetting appears after this list).
arXiv Detail & Related papers (2021-05-13T20:50:13Z)
- Continual Learning in Low-rank Orthogonal Subspaces [86.36417214618575]
In continual learning (CL), a learner is faced with a sequence of tasks, arriving one after the other, and the goal is to remember all the tasks once the learning experience is finished.
The prior art in CL uses episodic memory, parameter regularization or network structures to reduce interference among tasks, but in the end, all the approaches learn different tasks in a joint vector space.
We propose to learn tasks in different (low-rank) vector subspaces that are kept orthogonal to each other in order to minimize interference.
arXiv Detail & Related papers (2020-10-22T12:07:43Z)
- Perception-Prediction-Reaction Agents for Deep Reinforcement Learning [12.566380944901816]
We introduce a new recurrent agent architecture which improves reinforcement learning in tasks requiring long-term memory.
The architecture partitions the agent's recurrent processing into perception, prediction, and reaction cores; a new auxiliary loss regularizes the policies drawn from all three cores against each other, enacting the prior that the policy should be expressible from either recent or long-term memory.
arXiv Detail & Related papers (2020-06-26T21:53:47Z)
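The span-based forgetting in "Not All Memories are Created Equal" (above) can also be sketched compactly. The following is a hedged, illustrative version, not the paper's code: the linear span predictor (w, b), the maximum span, and the ramp length are all assumptions made for clarity.

```python
import numpy as np

def expire_mask(hidden, w, b, t, max_span=1000.0, ramp=32.0):
    """hidden: (T, d) past states at times 0..T-1; t: current timestep.
    Returns one soft retention weight in [0, 1] per memory."""
    T = hidden.shape[0]
    # Each memory predicts how long it should live from its own content
    # (a sigmoid of a linear readout, scaled to a maximum span).
    spans = max_span / (1.0 + np.exp(-(hidden @ w + b)))   # (T,) learned spans
    age = t - np.arange(T)                                 # (T,) age of each memory
    # Soft linear ramp so the cutoff stays differentiable: weight 1 while the
    # memory is younger than its span, decaying to 0 over `ramp` extra steps.
    return np.clip(1.0 + (spans - age) / ramp, 0.0, 1.0)

# Usage: 100 remembered states of width 8; old, short-span memories get weight 0.
rng = np.random.default_rng(0)
h = rng.normal(size=(100, 8))
mask = expire_mask(h, w=rng.normal(size=8), b=-2.0, t=100)
print(mask.shape, float(mask.min()), float(mask.max()))
```

Attention weights to each memory are multiplied by this mask, and memories whose weight has reached zero can be dropped outright; that pruning is what lets attention scale to tens of thousands of past timesteps.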
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of this information and is not responsible for any consequences arising from its use.