Spatially-Aware Transformer for Embodied Agents
- URL: http://arxiv.org/abs/2402.15160v3
- Date: Fri, 1 Mar 2024 00:58:50 GMT
- Title: Spatially-Aware Transformer for Embodied Agents
- Authors: Junmo Cho, Jaesik Yoon, Sungjin Ahn
- Abstract summary: This paper explores the use of Spatially-Aware Transformer models that incorporate spatial information.
We demonstrate that memory utilization efficiency can be improved, leading to enhanced accuracy in various place-centric downstream tasks.
We also propose the Adaptive Memory Allocator, a memory management method based on reinforcement learning.
- Score: 20.498778205143477
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Episodic memory plays a crucial role in various cognitive processes, such as
the ability to mentally recall past events. While cognitive science emphasizes
the significance of spatial context in the formation and retrieval of episodic
memory, the current primary approach to implementing episodic memory in AI
systems is through transformers that store temporally ordered experiences,
which overlooks the spatial dimension. As a result, it is unclear how the
underlying structure could be extended to incorporate the spatial axis beyond
temporal order alone, and what benefits could thereby be obtained. To address
this gap, this paper explores the use of Spatially-Aware Transformer models that
incorporate spatial information. These models enable the creation of
place-centric episodic memory that considers both temporal and spatial
dimensions. Adopting this approach, we demonstrate that memory utilization
efficiency can be improved, leading to enhanced accuracy in various
place-centric downstream tasks. Additionally, we propose the Adaptive Memory
Allocator, a memory management method based on reinforcement learning that aims
to optimize the efficiency of memory utilization. Our experiments demonstrate the
advantages of our proposed model in various environments and across multiple
downstream tasks, including prediction, generation, reasoning, and
reinforcement learning. The source code for our models and experiments will be
available at https://github.com/junmokane/spatially-aware-transformer.
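To make the idea concrete, here is a minimal sketch of a place-centric episodic memory in Python: each entry is tagged with both a timestep and a discrete place identifier, so a downstream model can retrieve memories grouped by place rather than by temporal order alone. The class name `PlaceCentricMemory` and the fixed per-place FIFO budget are illustrative assumptions for this sketch; the paper's Adaptive Memory Allocator instead learns the allocation strategy with reinforcement learning, and the authors' actual implementation is the one in their repository.

```python
# A minimal, illustrative place-centric episodic memory (not the authors' code).
# Each entry records both when (timestep) and where (discrete place id) an
# observation was made, so memories can be retrieved grouped by place rather
# than by temporal order alone. The fixed per-place FIFO budget stands in for
# the paper's RL-based Adaptive Memory Allocator, which learns the allocation.
from collections import defaultdict, deque
from dataclasses import dataclass
from typing import Any, Deque, Dict, List


@dataclass
class Episode:
    t: int       # timestep at which the observation was made
    place: int   # discrete place id (e.g., a room or grid cell)
    obs: Any     # observation or its embedding


class PlaceCentricMemory:
    def __init__(self, capacity_per_place: int = 16):
        self.capacity = capacity_per_place
        # One bounded FIFO buffer per place; the oldest entry is evicted first.
        self.buffers: Dict[int, Deque[Episode]] = defaultdict(
            lambda: deque(maxlen=self.capacity)
        )

    def write(self, t: int, place: int, obs: Any) -> None:
        self.buffers[place].append(Episode(t, place, obs))

    def read_place(self, place: int) -> List[Episode]:
        # All memories formed at a given place, in temporal order.
        return list(self.buffers[place])

    def read_all(self) -> List[Episode]:
        # Flatten memory place by place; a spatially-aware transformer would
        # consume these with place embeddings in addition to temporal ones.
        return [e for p in sorted(self.buffers) for e in self.buffers[p]]


if __name__ == "__main__":
    mem = PlaceCentricMemory(capacity_per_place=2)
    for t, place in enumerate([0, 0, 1, 0, 1]):
        mem.write(t, place, obs=f"obs_{t}")
    print([e.t for e in mem.read_place(0)])  # [1, 3]: the oldest entry at place 0 was evicted
```

In this toy run, the per-place budget forces older observations from a frequently revisited place to be evicted first; deciding how much capacity each place deserves is exactly the kind of allocation decision the paper's RL-based allocator is meant to optimize.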
Related papers
- Stable Hadamard Memory: Revitalizing Memory-Augmented Agents for Reinforcement Learning [64.93848182403116]
Current deep-learning memory models struggle in reinforcement learning environments that are partially observable and long-horizon.
We introduce the Stable Hadamard Memory, a novel memory model for reinforcement learning agents.
Our approach significantly outperforms state-of-the-art memory-based methods on challenging partially observable benchmarks.
arXiv Detail & Related papers (2024-10-14T03:50:17Z)
- Remember and Recall: Associative-Memory-based Trajectory Prediction [25.349986959111757]
We propose the Fragmented-Memory-based Trajectory Prediction (FMTP) model, inspired by the remarkable learning capabilities of humans.
The FMTP model employs discrete representations to enhance computational efficiency by reducing information redundancy.
We develop an advanced reasoning engine based on language models to deeply learn the associative rules among these discrete representations.
arXiv Detail & Related papers (2024-10-03T04:32:21Z)
- Cached Transformers: Improving Transformers with Differentiable Memory Cache [71.28188777209034]
This work introduces a new Transformer model called Cached Transformer.
It uses Gated Recurrent Cached (GRC) attention to extend the self-attention mechanism with a differentiable memory cache of tokens; a rough sketch of this kind of gated token cache is given after this list.
arXiv Detail & Related papers (2023-12-20T03:30:51Z)
- Memory-and-Anticipation Transformer for Online Action Understanding [52.24561192781971]
We propose a novel memory-anticipation-based paradigm to model an entire temporal structure, including the past, present, and future.
We present Memory-and-Anticipation Transformer (MAT), a memory-anticipation-based approach, to address the online action detection and anticipation tasks.
arXiv Detail & Related papers (2023-08-15T17:34:54Z)
- Recurrent Action Transformer with Memory [39.58317527488534]
This paper proposes a novel model architecture that incorporates a recurrent memory mechanism designed to regulate information retention.
We conduct experiments on memory-intensive environments (ViZDoom-Two-Colors, T-Maze, Memory Maze, Minigrid-Memory), classic Atari games, and MuJoCo control environments.
The results show that using memory can significantly improve performance in memory-intensive environments, while maintaining or improving results in classic environments.
arXiv Detail & Related papers (2023-06-15T19:29:08Z)
- Think Before You Act: Decision Transformers with Working Memory [44.18926449252084]
Decision Transformer-based decision-making agents have shown the ability to generalize across multiple tasks, but their performance relies on massive amounts of data and computation.
We argue that this inefficiency stems from the forgetting phenomenon, in which a model memorizes its behaviors in parameters throughout training.
We propose a working memory module to store, blend, and retrieve information for different downstream tasks.
arXiv Detail & Related papers (2023-05-24T01:20:22Z)
- Kanerva++: Extending the Kanerva Machine with Differentiable, Locally Block Allocated Latent Memory [75.65949969000596]
Episodic and semantic memory are critical components of the human memory model.
We develop a new principled Bayesian memory allocation scheme that bridges the gap between episodic and semantic memory.
We demonstrate that this allocation scheme improves performance in memory conditional image generation.
arXiv Detail & Related papers (2021-02-20T18:40:40Z)
- End-to-End Egospheric Spatial Memory [32.42361470456194]
We propose a parameter-free module, Egospheric Spatial Memory (ESM), which encodes the memory in an ego-sphere around the agent.
ESM can be trained end-to-end via either imitation or reinforcement learning.
We show applications to semantic segmentation on the ScanNet dataset, where ESM naturally combines image-level and map-level inference modalities.
arXiv Detail & Related papers (2021-02-15T18:59:07Z)
- Learning to Learn Variational Semantic Memory [132.39737669936125]
We introduce variational semantic memory into meta-learning to acquire long-term knowledge for few-shot learning.
The semantic memory is grown from scratch and gradually consolidated by absorbing information from tasks it experiences.
We formulate memory recall as the variational inference of a latent memory variable from addressed contents.
arXiv Detail & Related papers (2020-10-20T15:05:26Z)
- Memformer: A Memory-Augmented Transformer for Sequence Modeling [55.780849185884996]
We present Memformer, an efficient neural network for sequence modeling.
Our model achieves linear time complexity and constant memory space complexity when processing long sequences.
arXiv Detail & Related papers (2020-10-14T09:03:36Z)
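The Cached Transformers entry above describes Gated Recurrent Cached (GRC) attention, which extends self-attention with a differentiable memory cache of tokens. The sketch below illustrates that general idea under stated assumptions: the module name `GatedTokenCache`, the sigmoid gating formula, the mean-pooled cache update, and the detach between calls are guesses made for illustration, not the paper's actual formulation.

```python
# Illustrative sketch of attention over a gated, recurrently updated token
# cache, in the spirit of GRC attention (the gating and update rules here are
# assumptions made for this example, not the paper's formulation).
import torch
import torch.nn as nn


class GatedTokenCache(nn.Module):
    def __init__(self, d_model: int = 64, n_heads: int = 4, cache_len: int = 32):
        super().__init__()
        self.cache_len = cache_len
        self.attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        # Gate deciding how strongly each cache slot is overwritten.
        self.gate = nn.Linear(2 * d_model, d_model)
        self.register_buffer("cache", torch.zeros(1, cache_len, d_model))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, seq, d_model). Queries attend over cached tokens + current tokens.
        cache = self.cache.expand(x.size(0), -1, -1)
        keys_values = torch.cat([cache, x], dim=1)
        out, _ = self.attn(x, keys_values, keys_values)

        # Gated update: blend a summary of the new outputs into each cache slot.
        summary = out.mean(dim=1, keepdim=True).expand(-1, self.cache_len, -1)
        g = torch.sigmoid(self.gate(torch.cat([cache, summary], dim=-1)))
        new_cache = g * summary + (1.0 - g) * cache
        # Detach so the stored cache does not grow the autograd graph across calls;
        # within a single forward pass the cache read and update remain differentiable.
        self.cache = new_cache.mean(dim=0, keepdim=True).detach()
        return out


if __name__ == "__main__":
    layer = GatedTokenCache()
    y = layer(torch.randn(2, 10, 64))
    print(y.shape)  # torch.Size([2, 10, 64])
```

Detaching the cache between calls is a design choice for this sketch; a variant that carries the cache as part of the training state would keep gradients flowing through the cache across longer spans.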
This list is automatically generated from the titles and abstracts of the papers on this site.