TransDreamer: Reinforcement Learning with Transformer World Models
- URL: http://arxiv.org/abs/2202.09481v2
- Date: Tue, 19 Nov 2024 16:55:55 GMT
- Title: TransDreamer: Reinforcement Learning with Transformer World Models
- Authors: Chang Chen, Yi-Fu Wu, Jaesik Yoon, Sungjin Ahn
- Abstract summary: We propose a transformer-based Model-Based Reinforcement Learning agent, called TransDreamer.
We first introduce the Transformer State-Space Model, a world model that leverages a transformer for dynamics predictions. We then share this world model with a transformer-based policy network and obtain stability in training a transformer-based RL agent.
In experiments, we apply the proposed model to 2D visual RL and 3D first-person visual RL tasks both requiring long-range memory access for memory-based reasoning. We show that the proposed model outperforms Dreamer in these complex tasks.
- Score: 33.34909288732319
- License:
- Abstract: The Dreamer agent provides various benefits of Model-Based Reinforcement Learning (MBRL) such as sample efficiency, reusable knowledge, and safe planning. However, its world model and policy networks inherit the limitations of recurrent neural networks and thus an important question is how an MBRL framework can benefit from the recent advances of transformers and what the challenges are in doing so. In this paper, we propose a transformer-based MBRL agent, called TransDreamer. We first introduce the Transformer State-Space Model, a world model that leverages a transformer for dynamics predictions. We then share this world model with a transformer-based policy network and obtain stability in training a transformer-based RL agent. In experiments, we apply the proposed model to 2D visual RL and 3D first-person visual RL tasks both requiring long-range memory access for memory-based reasoning. We show that the proposed model outperforms Dreamer in these complex tasks.
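The key architectural move, replacing Dreamer's recurrent hidden state with attention over the whole history, can be illustrated with a minimal sketch. This is a single-head causal self-attention in NumPy, not the paper's Transformer State-Space Model: the query/key/value projections, stochastic latents, and action conditioning are all omitted, and the token embeddings are hypothetical.

```python
import numpy as np

def causal_self_attention(x):
    """Single-head causal self-attention over a sequence of latent tokens.

    x: (T, D) array of embedded history tokens. Each position may attend
    only to itself and earlier positions, mirroring how a transformer
    dynamics model conditions on the full history rather than on a
    single recurrent hidden state.
    """
    T, D = x.shape
    scores = x @ x.T / np.sqrt(D)             # (T, T) attention logits
    mask = np.triu(np.ones((T, T)), k=1)      # 1 above the diagonal = future
    scores = np.where(mask == 1, -np.inf, scores)
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ x                        # (T, D) history-conditioned features

rng = np.random.default_rng(0)
tokens = rng.standard_normal((5, 8))   # 5 timesteps of hypothetical latent tokens
features = causal_self_attention(tokens)
print(features.shape)  # (5, 8)
```

Because of the causal mask, the first timestep can attend only to itself, so its output equals its input; later timesteps mix in progressively more history.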
Related papers
- Meta-DT: Offline Meta-RL as Conditional Sequence Modeling with World Model Disentanglement [41.7426496795769]
We propose Meta Decision Transformer (Meta-DT) to achieve efficient generalization in offline meta-RL.
We pretrain a context-aware world model to learn a compact task representation, and inject it as a contextual condition to guide task-oriented sequence generation.
We show that Meta-DT exhibits superior few and zero-shot generalization capacity compared to strong baselines.
arXiv Detail & Related papers (2024-10-15T09:51:30Z)
- Comprehensive Performance Modeling and System Design Insights for Foundation Models [1.4455936781559149]
Generative AI, in particular large transformer models, are increasingly driving HPC system design in science and industry.
We analyze performance characteristics of such transformer models and discuss their sensitivity to the transformer type, parallelization strategy, and HPC system features.
Our analysis emphasizes the need for closer performance modeling of different transformer types keeping system features in mind.
arXiv Detail & Related papers (2024-09-30T22:56:42Z)
- Transformers in Reinforcement Learning: A Survey [7.622978576824539]
Transformers have impacted domains like natural language processing, computer vision, and robotics, where they improve performance compared to other neural networks.
This survey explores how transformers are used in reinforcement learning (RL), where they are seen as a promising solution for addressing challenges such as unstable training, credit assignment, lack of interpretability, and partial observability.
arXiv Detail & Related papers (2023-07-12T07:51:12Z)
- Emergent Agentic Transformer from Chain of Hindsight Experience [96.56164427726203]
We show, for the first time, that a simple transformer-based model performs competitively with both temporal-difference and imitation-learning-based approaches.
arXiv Detail & Related papers (2023-03-02T04:24:29Z)
- Preference Transformer: Modeling Human Preferences using Transformers for RL [165.33887165572128]
Preference Transformer is a neural architecture that models human preferences using transformers.
We show that Preference Transformer can solve a variety of control tasks using real human preferences, while prior approaches fail to work.
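The standard way this line of work turns per-step reward estimates into a preference probability is the Bradley-Terry model; a minimal sketch follows. Preference Transformer itself learns a weighted, non-Markovian aggregation of rewards via attention, so the equal-weight sum here is only the classic baseline formulation, and the reward values are hypothetical.

```python
import math

def preference_probability(rewards_a, rewards_b):
    """Bradley-Terry probability that trajectory segment A is preferred
    over segment B, given per-step reward estimates for each segment."""
    score_a = sum(rewards_a)
    score_b = sum(rewards_b)
    # Logistic of the score difference: equal scores give exactly 0.5.
    return 1.0 / (1.0 + math.exp(-(score_a - score_b)))

p = preference_probability([1.0, 0.5, 0.2], [0.3, 0.3, 0.3])
print(round(p, 3))  # 0.69 -- A's higher summed reward implies preference above 0.5
```

In preference-based RL the reward model is trained by maximizing the log-likelihood of this probability on human-labeled segment comparisons.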
arXiv Detail & Related papers (2023-05-26T00:43:02Z)
- A Survey on Transformers in Reinforcement Learning [66.23773284875843]
Transformer has been considered the dominating neural architecture in NLP and CV, mostly under supervised settings.
Recently, a similar surge of using Transformers has appeared in the domain of reinforcement learning (RL), but it is faced with unique design choices and challenges brought by the nature of RL.
This paper systematically reviews motivations and progress on using Transformers in RL, provides a taxonomy of existing works, discusses each sub-field, and summarizes future prospects.
arXiv Detail & Related papers (2023-01-08T14:04:26Z)
- On Transforming Reinforcement Learning by Transformer: The Development Trajectory [97.79247023389445]
Transformer, originally devised for natural language processing, has also achieved significant success in computer vision.
We group existing developments in two categories: architecture enhancement and trajectory optimization.
We examine the main applications of TRL in robotic manipulation, text-based games, navigation and autonomous driving.
arXiv Detail & Related papers (2022-12-29T03:15:59Z)
- Decision Transformer: Reinforcement Learning via Sequence Modeling [102.86873656751489]
We present a framework that abstracts Reinforcement Learning (RL) as a sequence modeling problem.
We present Decision Transformer, an architecture that casts the problem of RL as conditional sequence modeling.
Despite its simplicity, Decision Transformer matches or exceeds the performance of state-of-the-art offline RL baselines on Atari, OpenAI Gym, and Key-to-Door tasks.
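The conditioning signal Decision Transformer trains on is the return-to-go: the reward still remaining from each timestep onward. A minimal sketch of this relabeling step (undiscounted, as in the offline sequence-modeling setting; the reward values below are hypothetical):

```python
def returns_to_go(rewards):
    """Suffix sums of rewards: rtg[t] = sum(rewards[t:]).

    Decision Transformer conditions each action prediction on the return
    still to be collected, so offline trajectories are relabeled with
    these values before training.
    """
    rtg = [0.0] * len(rewards)
    running = 0.0
    for t in reversed(range(len(rewards))):
        running += rewards[t]
        rtg[t] = running
    return rtg

print(returns_to_go([1.0, 2.0, 3.0]))  # [6.0, 5.0, 3.0]
```

At evaluation time the agent is seeded with a target return and the sequence model autoregressively generates actions consistent with it.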
arXiv Detail & Related papers (2021-06-02T17:53:39Z)
- AutoTrans: Automating Transformer Design via Reinforced Architecture Search [52.48985245743108]
This paper empirically explores how to set layer normalization, whether to scale, the number of layers, the number of heads, the activation function, and other design choices, so that one can obtain a transformer architecture that better suits the task at hand.
Experiments on CoNLL03, Multi-30k, IWSLT14, and WMT-14 show that the searched transformer models can outperform standard transformers.
arXiv Detail & Related papers (2020-09-04T08:46:22Z)
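The search problem AutoTrans tackles can be made concrete with a toy sketch. This uses plain random search over a hypothetical configuration space mirroring the choices listed in the abstract; the paper itself uses a reinforced (RL-driven) search, and the scoring function here is only a stand-in for training and validating each candidate.

```python
import random

# Hypothetical search space mirroring the abstract's design choices:
# layer-norm placement, attention scaling, depth, heads, activation.
SEARCH_SPACE = {
    "layer_norm": ["pre", "post"],
    "scale_attention": [True, False],
    "num_layers": [2, 4, 6],
    "num_heads": [4, 8],
    "activation": ["relu", "gelu"],
}

def sample_config(rng):
    """Draw one architecture configuration uniformly from the space."""
    return {key: rng.choice(options) for key, options in SEARCH_SPACE.items()}

def random_search(objective, trials=20, seed=0):
    """Keep the best-scoring configuration seen across random draws."""
    rng = random.Random(seed)
    best_config, best_score = None, float("-inf")
    for _ in range(trials):
        config = sample_config(rng)
        score = objective(config)
        if score > best_score:
            best_config, best_score = config, score
    return best_config, best_score

# Stand-in objective; a real search would train and validate each model.
def dummy_objective(config):
    return config["num_layers"] + (1 if config["layer_norm"] == "pre" else 0)

best, score = random_search(dummy_objective)
print(best, score)
```

Replacing the uniform sampler with a learned controller whose parameters are updated from the observed scores turns this into the reinforced search the paper describes.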
This list is automatically generated from the titles and abstracts of the papers in this site.