How Crucial is Transformer in Decision Transformer?
- URL: http://arxiv.org/abs/2211.14655v1
- Date: Sat, 26 Nov 2022 20:13:22 GMT
- Title: How Crucial is Transformer in Decision Transformer?
- Authors: Max Siebenborn, Boris Belousov, Junning Huang, Jan Peters
- Abstract summary: Decision Transformer (DT) is a recently proposed architecture for Reinforcement Learning that frames the decision-making process as an auto-regressive sequence modeling problem.
We analyze how crucial the Transformer model is in the complete DT architecture on continuous control tasks.
- Score: 29.228813063916206
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Decision Transformer (DT) is a recently proposed architecture for
Reinforcement Learning that frames the decision-making process as an
auto-regressive sequence modeling problem and uses a Transformer model to
predict the next action in a sequence of states, actions, and rewards. In this
paper, we analyze how crucial the Transformer model is in the complete DT
architecture on continuous control tasks. Namely, we replace the Transformer by
an LSTM model while keeping the other parts unchanged to obtain what we call a
Decision LSTM model. We compare it to DT on continuous control tasks, including
pendulum swing-up and stabilization, in simulation and on physical hardware.
Our experiments show that DT struggles with continuous control problems, such
as inverted pendulum and Furuta pendulum stabilization. On the other hand, the
proposed Decision LSTM is able to achieve expert-level performance on these
tasks, in addition to learning a swing-up controller on the real system. These
results suggest that the strength of the Decision Transformer for continuous
control tasks may lie in the overall sequential modeling architecture and not
in the Transformer per se.
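To make the architectural change concrete, below is a minimal sketch of the Decision LSTM idea, assuming a PyTorch-style implementation; the class name, embedding layout, and hyperparameters are illustrative placeholders, not the authors' actual code. Returns-to-go, states, and actions are embedded per timestep as in Decision Transformer, but the causal Transformer backbone is swapped for an LSTM, and actions are read out from the hidden state at each state token.

```python
import torch
import torch.nn as nn

class DecisionLSTM(nn.Module):
    """Hypothetical sketch: a DT-style sequence model whose Transformer
    backbone is replaced by an LSTM, with the other parts kept unchanged."""

    def __init__(self, state_dim, act_dim, hidden_size=128, num_layers=2):
        super().__init__()
        # Per-modality embeddings, as in Decision Transformer.
        self.embed_return = nn.Linear(1, hidden_size)
        self.embed_state = nn.Linear(state_dim, hidden_size)
        self.embed_action = nn.Linear(act_dim, hidden_size)
        # The LSTM replaces the causal Transformer over the interleaved tokens.
        self.lstm = nn.LSTM(hidden_size, hidden_size, num_layers, batch_first=True)
        self.predict_action = nn.Linear(hidden_size, act_dim)

    def forward(self, returns_to_go, states, actions):
        # returns_to_go: (B, T, 1), states: (B, T, state_dim), actions: (B, T, act_dim)
        B, T, _ = states.shape
        r = self.embed_return(returns_to_go)
        s = self.embed_state(states)
        a = self.embed_action(actions)
        # Interleave tokens as (R_1, s_1, a_1, ..., R_T, s_T, a_T).
        tokens = torch.stack((r, s, a), dim=2).reshape(B, 3 * T, -1)
        hidden, _ = self.lstm(tokens)
        # Predict each action from the hidden state at its state token.
        return self.predict_action(hidden[:, 1::3])
```

Training would proceed as in DT, e.g. by regressing the predicted actions onto the dataset actions with an MSE loss; only the sequence backbone differs.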
Related papers
- QT-TDM: Planning with Transformer Dynamics Model and Autoregressive Q-Learning [17.914580097058106]
We investigate the use of Transformers in Reinforcement Learning (RL).
We learn an autoregressive discrete Q-function using a separate Q-Transformer model to estimate long-term returns beyond the short planning horizon.
Our proposed method, QT-TDM, integrates the robust predictive capabilities of Transformers as dynamics models with the efficacy of a model-free Q-Transformer to mitigate the computational burden associated with real-time planning.
arXiv Detail & Related papers (2024-07-26T16:05:26Z) - PIDformer: Transformer Meets Control Theory [28.10913642120948]
We unveil self-attention as an autonomous state-space model that inherently promotes smoothness in its solutions.
We incorporate a Proportional-Integral-Derivative (PID) closed-loop feedback control system with a reference point into the model to improve robustness and representation capacity.
Motivated by this control framework, we derive a novel class of transformers, the PID-controlled Transformer (PIDformer).
arXiv Detail & Related papers (2024-02-25T05:04:51Z) - Emergent Agentic Transformer from Chain of Hindsight Experience [96.56164427726203]
We show that a simple transformer-based model performs competitively with both temporal-difference and imitation-learning-based approaches, the first time this has been demonstrated.
arXiv Detail & Related papers (2023-05-26T00:43:02Z) - Optimizing Non-Autoregressive Transformers with Contrastive Learning [74.46714706658517]
Non-autoregressive Transformers (NATs) reduce the inference latency of Autoregressive Transformers (ATs) by predicting words all at once rather than in sequential order.
In this paper, we propose to ease the difficulty of modality learning via sampling from the model distribution instead of the data distribution.
arXiv Detail & Related papers (2023-05-23T04:20:13Z) - Continuous Spatiotemporal Transformers [2.485182034310304]
We present the Continuous Spatiotemporal Transformer (CST), a new transformer architecture designed to model continuous systems.
This new framework guarantees a continuous representation and output via optimization in Sobolev space.
We benchmark CST against traditional transformers as well as other spatiotemporal dynamics modeling methods and achieve superior performance in a number of tasks on synthetic and real systems.
arXiv Detail & Related papers (2023-01-31T00:06:56Z) - Learning Bounded Context-Free-Grammar via LSTM and the Transformer: Difference and Explanations [51.77000472945441]
Long Short-Term Memory (LSTM) and Transformers are two popular neural architectures used for natural language processing tasks.
In practice, it is often observed that Transformer models have better representation power than LSTM.
We study such practical differences between LSTM and Transformer and propose an explanation based on their latent space decomposition patterns.
arXiv Detail & Related papers (2021-12-16T19:56:44Z) - Decision Transformer: Reinforcement Learning via Sequence Modeling [102.86873656751489]
We present a framework that abstracts Reinforcement Learning (RL) as a sequence modeling problem.
We present Decision Transformer, an architecture that casts the problem of RL as conditional sequence modeling.
Despite its simplicity, Decision Transformer matches or exceeds the performance of state-of-the-art offline RL baselines on Atari, OpenAI Gym, and Key-to-Door tasks (a sketch of the return-to-go conditioning it relies on appears after this list).
arXiv Detail & Related papers (2021-06-02T17:53:39Z) - Bayesian Transformer Language Models for Speech Recognition [59.235405107295655]
State-of-the-art neural language models (LMs) represented by Transformers are highly complex.
This paper proposes a full Bayesian learning framework for Transformer LM estimation.
arXiv Detail & Related papers (2021-02-09T10:55:27Z) - Understanding the Difficulty of Training Transformers [120.99980924577787]
We show that unbalanced gradients are not the root cause of the instability of training.
We propose Admin to stabilize the early stage's training and unleash its full potential in the late stage.
arXiv Detail & Related papers (2020-04-17T13:59:07Z)
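As referenced in the Decision Transformer entry above, casting RL as conditional sequence modeling hinges on conditioning each action on the return-to-go, i.e. the sum of rewards from the current timestep to the end of the episode. Below is a minimal, self-contained sketch of that trajectory construction; the function names and the flat token layout are illustrative assumptions, not taken from the original papers.

```python
def returns_to_go(rewards):
    """Suffix sums: R_t = sum of rewards from timestep t to the episode end."""
    rtg, running = [], 0.0
    for r in reversed(rewards):
        running += r
        rtg.append(running)
    return list(reversed(rtg))

def to_token_sequence(states, actions, rewards):
    """Interleave (R_t, s_t, a_t) triples, the input order used by DT-style models."""
    rtg = returns_to_go(rewards)
    tokens = []
    for R, s, a in zip(rtg, states, actions):
        tokens.extend([R, s, a])
    return tokens

# Example: a 3-step episode with scalar states/actions for illustration.
print(to_token_sequence(states=[0.1, 0.2, 0.3],
                        actions=[1.0, -1.0, 0.5],
                        rewards=[0.0, 0.0, 1.0]))
# -> [1.0, 0.1, 1.0, 1.0, 0.2, -1.0, 1.0, 0.3, 0.5]
```

The resulting interleaved sequence is what a DT-style backbone, whether a Transformer or an LSTM as in the paper above, consumes during training and rollout.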
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences of its use.