Stabilizing Transformer-Based Action Sequence Generation For Q-Learning
- URL: http://arxiv.org/abs/2010.12698v2
- Date: Fri, 18 Dec 2020 17:16:38 GMT
- Title: Stabilizing Transformer-Based Action Sequence Generation For Q-Learning
- Authors: Gideon Stein, Andrey Filchenkov, Arip Asadulaev
- Abstract summary: The goal is a simple Transformer-based Deep Q-Learning method that is stable over several environments.
The proposed method can match the performance of classic Q-learning on control environments while showing potential on some selected Atari benchmarks.
- Score: 5.707122938235432
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Since the publication of the original Transformer architecture (Vaswani et
al. 2017), Transformers have revolutionized the field of Natural Language
Processing, mainly due to their ability to capture temporal dependencies
better than competing RNN-based architectures. Surprisingly, this architectural
shift has not yet affected the field of Reinforcement Learning (RL), even though
RNNs are quite popular in RL and temporal dependencies are very common there.
Recently, Parisotto et al. (2019) conducted the first promising research on
Transformers in RL. To support the findings of this work, this paper seeks to
provide an additional example of a Transformer-based RL method. Specifically,
the goal is a simple Transformer-based Deep Q-Learning method that is stable
over several environments. Due to the unstable nature of Transformers and RL,
an extensive method search was conducted to arrive at a final method that
leverages developments around Transformers as well as Q-learning. The proposed
method can match the performance of classic Q-learning on control environments
while showing potential on some selected Atari benchmarks. Furthermore, it was
critically evaluated to give additional insights into the relation between
Transformers and RL.
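The abstract describes a Transformer-based Deep Q-Learning method without giving implementation details. As a rough, hypothetical illustration of the general idea only (a Transformer-style attention encoder over a window of past observations feeding a Q-value head, trained toward a standard one-step Q-learning target), a minimal NumPy sketch might look like the following. All function names, dimensions, and parameters here are invented for illustration and are not the authors' architecture.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def self_attention(X, Wq, Wk, Wv):
    """Single-head self-attention over a window of past observations."""
    Q, K, V = X @ Wq, X @ Wk, X @ Wv
    scores = Q @ K.T / np.sqrt(K.shape[-1])   # scaled dot-product scores
    return softmax(scores) @ V

def q_values(obs_window, params):
    """Encode the observation window with attention and read Q-values
    from the last position (a stand-in for a Transformer Q-network)."""
    H = self_attention(obs_window, params["Wq"], params["Wk"], params["Wv"])
    return H[-1] @ params["Wout"]             # shape: (n_actions,)

def dqn_target(reward, gamma, next_q, done):
    """Standard one-step Q-learning target: y = r + gamma * max_a Q(s', a)."""
    return reward + (0.0 if done else gamma * np.max(next_q))

rng = np.random.default_rng(0)
obs_dim, d_model, n_actions, window = 4, 8, 2, 5
params = {
    "Wq": rng.normal(size=(obs_dim, d_model)),
    "Wk": rng.normal(size=(obs_dim, d_model)),
    "Wv": rng.normal(size=(obs_dim, d_model)),
    "Wout": rng.normal(size=(d_model, n_actions)),
}
window_obs = rng.normal(size=(window, obs_dim))
q = q_values(window_obs, params)
y = dqn_target(reward=1.0, gamma=0.99, next_q=q, done=False)
print(q.shape, y)
```

In an actual DQN-style setup, `y` would serve as the regression target for the Q-value of the action taken, with gradients flowing back through the attention encoder; the paper additionally reports stabilization techniques that this sketch omits.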
Related papers
- Rethinking Transformers in Solving POMDPs [47.14499685668683]
This paper scrutinizes the effectiveness of a popular architecture, namely Transformers, in Partially Observable Markov Decision Processes (POMDPs).
Regular languages, which Transformers struggle to model, are reducible to POMDPs.
This poses a significant challenge for Transformers in learning POMDP-specific inductive biases, due to their lack of inherent recurrence found in other models like RNNs.
arXiv Detail & Related papers (2024-05-27T17:02:35Z) - iTransformer: Inverted Transformers Are Effective for Time Series Forecasting [62.40166958002558]
We propose iTransformer, which simply applies the attention and feed-forward network on the inverted dimensions.
The iTransformer model achieves state-of-the-art on challenging real-world datasets.
arXiv Detail & Related papers (2023-10-10T13:44:09Z) - Transformers in Reinforcement Learning: A Survey [7.622978576824539]
Transformers have impacted domains like natural language processing, computer vision, and robotics, where they improve performance compared to other neural networks.
This survey explores how transformers are used in reinforcement learning (RL), where they are seen as a promising solution for addressing challenges such as unstable training, credit assignment, lack of interpretability, and partial observability.
arXiv Detail & Related papers (2023-07-12T07:51:12Z) - Emergent Agentic Transformer from Chain of Hindsight Experience [96.56164427726203]
We show that a simple transformer-based model performs competitively with both temporal-difference and imitation-learning-based approaches, which is the first time this has been demonstrated.
arXiv Detail & Related papers (2023-05-26T00:43:02Z) - A Survey on Transformers in Reinforcement Learning [66.23773284875843]
Transformer has been considered the dominating neural architecture in NLP and CV, mostly under supervised settings.
Recently, a similar surge of using Transformers has appeared in the domain of reinforcement learning (RL), but it is faced with unique design choices and challenges brought by the nature of RL.
This paper systematically reviews motivations and progress on using Transformers in RL, provides a taxonomy of existing works, discusses each sub-field, and summarizes future prospects.
arXiv Detail & Related papers (2023-01-08T14:04:26Z) - On Transforming Reinforcement Learning by Transformer: The Development Trajectory [97.79247023389445]
Transformer, originally devised for natural language processing, has also achieved significant success in computer vision.
We group existing developments into two categories: architecture enhancement and trajectory optimization.
We examine the main applications of TRL in robotic manipulation, text-based games, navigation and autonomous driving.
arXiv Detail & Related papers (2022-12-29T03:15:59Z) - Transformers learn in-context by gradient descent [58.24152335931036]
Training Transformers on auto-regressive objectives is closely related to gradient-based meta-learning formulations.
We show how trained Transformers become mesa-optimizers, i.e., they learn models by gradient descent in their forward pass.
arXiv Detail & Related papers (2022-12-15T09:21:21Z) - Decision Transformer: Reinforcement Learning via Sequence Modeling [102.86873656751489]
We present a framework that abstracts Reinforcement Learning (RL) as a sequence modeling problem.
We present Decision Transformer, an architecture that casts the problem of RL as conditional sequence modeling.
Despite its simplicity, Decision Transformer matches or exceeds the performance of state-of-the-art offline RL baselines on Atari, OpenAI Gym, and Key-to-Door tasks.
arXiv Detail & Related papers (2021-06-02T17:53:39Z)
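The Decision Transformer entry above casts RL as conditional sequence modeling, conditioning action prediction on a desired return. A minimal sketch of the core data layout described in that framing (interleaving return-to-go, state, and action tokens) might look like this; the helper name and trajectory format are hypothetical, not the authors' code.

```python
def decision_transformer_tokens(trajectory, target_return):
    """Interleave (return-to-go, state, action) triples, following the
    Decision Transformer framing of RL as conditional sequence modeling.
    `trajectory` is a list of (state, action, reward) tuples."""
    tokens, rtg = [], target_return
    for (state, action, reward) in trajectory:
        tokens.append(("rtg", rtg))      # return-to-go conditions the model
        tokens.append(("state", state))
        tokens.append(("action", action))
        rtg -= reward                    # remaining return after this step
    return tokens

traj = [("s0", "a0", 1.0), ("s1", "a1", 0.0), ("s2", "a2", 2.0)]
tokens = decision_transformer_tokens(traj, target_return=3.0)
print(tokens[:3])  # [('rtg', 3.0), ('state', 's0'), ('action', 'a0')]
```

A causal Transformer trained on such sequences can then be prompted at test time with a high target return to generate return-conditioned actions, which is how the paper reports matching offline RL baselines.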
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.