Recurrent Action Transformer with Memory
- URL: http://arxiv.org/abs/2306.09459v5
- Date: Mon, 14 Oct 2024 14:33:48 GMT
- Title: Recurrent Action Transformer with Memory
- Authors: Egor Cherepanov, Alexey Staroverov, Dmitry Yudin, Alexey K. Kovalev, Aleksandr I. Panov,
- Abstract summary: This paper proposes a novel model architecture that incorporates a recurrent memory mechanism designed to regulate information retention.
We conduct experiments on memory-intensive environments (ViZDoom-Two-Colors, T-Maze, Memory Maze, Minigrid-Memory), classic Atari games, and MuJoCo control environments.
The results show that using memory can significantly improve performance in memory-intensive environments, while maintaining or improving results in classic environments.
- Score: 39.58317527488534
- License:
- Abstract: Recently, the use of transformers in offline reinforcement learning has become a rapidly developing area. This is due to their ability to treat the agent's trajectory in the environment as a sequence, thereby reducing the policy learning problem to sequence modeling. In environments where the agent's decisions depend on past events (POMDPs), it is essential to capture both the event itself and the decision point in the context of the model. However, the quadratic complexity of the attention mechanism limits the potential for context expansion. One solution to this problem is to extend transformers with memory mechanisms. This paper proposes a Recurrent Action Transformer with Memory (RATE), a novel model architecture that incorporates a recurrent memory mechanism designed to regulate information retention. To evaluate our model, we conducted extensive experiments on memory-intensive environments (ViZDoom-Two-Colors, T-Maze, Memory Maze, Minigrid-Memory), classic Atari games, and MuJoCo control environments. The results show that using memory can significantly improve performance in memory-intensive environments, while maintaining or improving results in classic environments. We believe that our results will stimulate research on memory mechanisms for transformers applicable to offline reinforcement learning.
Related papers
- Stable Hadamard Memory: Revitalizing Memory-Augmented Agents for Reinforcement Learning [64.93848182403116]
Current deep-learning memory models struggle in reinforcement learning environments that are partially observable and long-term.
We introduce the Stable Hadamard Memory, a novel memory model for reinforcement learning agents.
Our approach significantly outperforms state-of-the-art memory-based methods on challenging partially observable benchmarks.
arXiv Detail & Related papers (2024-10-14T03:50:17Z) - SHERL: Synthesizing High Accuracy and Efficient Memory for Resource-Limited Transfer Learning [63.93193829913252]
We propose an innovative METL strategy called SHERL for resource-limited scenarios.
In the early route, intermediate outputs are consolidated via an anti-redundancy operation.
In the late route, utilizing minimal late pre-trained layers could alleviate the peak demand on memory overhead.
arXiv Detail & Related papers (2024-07-10T10:22:35Z) - Spatially-Aware Transformer for Embodied Agents [20.498778205143477]
This paper explores the use of Spatially-Aware Transformer models that incorporate spatial information.
We demonstrate that memory utilization efficiency can be improved, leading to enhanced accuracy in various place-centric downstream tasks.
We also propose the Adaptive Memory Allocator, a memory management method based on reinforcement learning.
arXiv Detail & Related papers (2024-02-23T07:46:30Z) - Cached Transformers: Improving Transformers with Differentiable Memory
Cache [71.28188777209034]
This work introduces a new Transformer model called Cached Transformer.
It uses Gated Recurrent Cached (GRC) attention to extend the self-attention mechanism with a differentiable memory cache of tokens.
arXiv Detail & Related papers (2023-12-20T03:30:51Z) - Memory-efficient Stochastic methods for Memory-based Transformers [3.360916255196531]
Memory-based transformers can require a large amount of memory and can be quite inefficient.
We propose a novel two-phase training mechanism and a novel regularization technique to improve the training efficiency of memory-based transformers.
arXiv Detail & Related papers (2023-11-14T12:37:25Z) - Think Before You Act: Decision Transformers with Working Memory [44.18926449252084]
Decision Transformer-based decision-making agents have shown the ability to generalize across multiple tasks.
We argue that this inefficiency stems from the forgetting phenomenon, in which a model memorizes its behaviors in parameters throughout training.
We propose a working memory module to store, blend, and retrieve information for different downstream tasks.
arXiv Detail & Related papers (2023-05-24T01:20:22Z) - Transformers are Meta-Reinforcement Learners [0.060917028769172814]
We present TrMRL, a meta-RL agent that mimics the memory reinstatement mechanism using the transformer architecture.
We show that the self-attention computes a consensus representation that minimizes the Bayes Risk at each layer.
Results show that TrMRL presents comparable or superior performance, sample efficiency, and out-of-distribution generalization.
arXiv Detail & Related papers (2022-06-14T06:21:13Z) - Mesa: A Memory-saving Training Framework for Transformers [58.78933015299703]
We present Mesa, a memory-saving training framework for Transformers.
Mesa uses exact activations during forward pass while storing a low-precision version of activations to reduce memory consumption during training.
Experiments on ImageNet, CIFAR-100 and ADE20K demonstrate that Mesa can reduce half of the memory footprints during training.
arXiv Detail & Related papers (2021-11-22T11:23:01Z) - Memformer: A Memory-Augmented Transformer for Sequence Modeling [55.780849185884996]
We present Memformer, an efficient neural network for sequence modeling.
Our model achieves linear time complexity and constant memory space complexity when processing long sequences.
arXiv Detail & Related papers (2020-10-14T09:03:36Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.