Deep Transformer Q-Networks for Partially Observable Reinforcement
Learning
- URL: http://arxiv.org/abs/2206.01078v1
- Date: Thu, 2 Jun 2022 15:04:18 GMT
- Title: Deep Transformer Q-Networks for Partially Observable Reinforcement
Learning
- Authors: Kevin Esslinger, Robert Platt, Christopher Amato
- Abstract summary: Deep Transformer Q-Networks (DTQN) is a novel architecture utilizing transformers and self-attention to encode an agent's history.
Our experiments demonstrate that the transformer can solve partially observable tasks faster and more stably than previous recurrent approaches.
- Score: 14.126617899983097
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Real-world reinforcement learning tasks often involve some form of partial
observability where the observations only give a partial or noisy view of the
true state of the world. Such tasks typically require some form of memory,
where the agent has access to multiple past observations, in order to perform
well. One popular way to incorporate memory is by using a recurrent neural
network to access the agent's history. However, recurrent neural networks in
reinforcement learning are often fragile and difficult to train; they are
susceptible to catastrophic forgetting and sometimes fail completely as a result. In this
work, we propose Deep Transformer Q-Networks (DTQN), a novel architecture
utilizing transformers and self-attention to encode an agent's history. DTQN is
designed modularly, and we compare results against several modifications to our
base model. Our experiments demonstrate that the transformer can solve partially
observable tasks faster and more stably than previous recurrent approaches.
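As a concrete illustration of the architecture, a DTQN-style model can be sketched in PyTorch as a transformer applied to a window of recent observations, with Q-values predicted at every position in the history. The sketch below is a minimal reading of the abstract, not the authors' implementation; the layer sizes, the learned positional embeddings, and the use of standard encoder layers with a causal mask are all assumptions.

```python
import torch
import torch.nn as nn

class DTQNSketch(nn.Module):
    """Minimal DTQN-style sketch: a transformer over the last `history_len`
    observations, with a Q-value head applied at every timestep."""

    def __init__(self, obs_dim, num_actions, d_model=64, nhead=4,
                 num_layers=2, history_len=50):
        super().__init__()
        self.obs_embed = nn.Linear(obs_dim, d_model)         # embed raw observations
        self.pos_embed = nn.Embedding(history_len, d_model)  # learned positions (assumed)
        layer = nn.TransformerEncoderLayer(d_model=d_model, nhead=nhead,
                                           batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=num_layers)
        self.q_head = nn.Linear(d_model, num_actions)        # Q-values per timestep

    def forward(self, obs_history):
        # obs_history: (batch, history_len, obs_dim)
        hist = obs_history.size(1)
        pos = torch.arange(hist, device=obs_history.device)
        x = self.obs_embed(obs_history) + self.pos_embed(pos)
        # causal mask: each timestep attends only to itself and its past
        mask = torch.triu(torch.full((hist, hist), float('-inf'),
                                     device=obs_history.device), diagonal=1)
        x = self.encoder(x, mask=mask)
        return self.q_head(x)  # (batch, history_len, num_actions)

# Acting greedily from the most recent timestep's Q-values:
# q_last = model(history)[:, -1]      # (batch, num_actions)
# action = q_last.argmax(dim=-1)
```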
Related papers
- Improving the Trainability of Deep Neural Networks through Layerwise
Batch-Entropy Regularization [1.3999481573773072]
We introduce and evaluate the batch entropy, which quantifies the flow of information through each layer of a neural network.
We show that we can train a "vanilla" fully connected network and a convolutional neural network with 500 layers by simply adding the batch-entropy regularization term to the loss function, as sketched below.
arXiv Detail & Related papers (2022-08-01T20:31:58Z)
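As a rough sketch of the idea above: estimate an entropy-like quantity from each layer's batch of activations and add a penalty on it to the loss. Both the Gaussian entropy proxy and the squared-distance-to-target penalty below are assumptions made for illustration, not necessarily the paper's estimator.

```python
import math
import torch

def batch_entropy(activations, eps=1e-6):
    """Gaussian proxy for a layer's batch entropy: per-unit variance over
    the batch, turned into a differential entropy and averaged over units.
    The exact estimator is an assumption, not necessarily the paper's."""
    var = activations.var(dim=0) + eps
    return 0.5 * torch.log(2 * math.pi * math.e * var).mean()

def regularized_loss(task_loss, layer_activations, alpha=0.01, target=1.0):
    """Task loss plus a term pulling each layer's batch entropy toward a
    target value (the exact form of the penalty is assumed here)."""
    penalty = sum((batch_entropy(a) - target) ** 2 for a in layer_activations)
    return task_loss + alpha * penalty
```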
- A new hope for network model generalization [66.5377859849467]
Generalizing machine learning models for network traffic dynamics tends to be considered a lost cause.
An ML architecture called _Transformer_ has enabled previously unimaginable generalization in other domains.
We propose a Network Traffic Transformer (NTT) to learn network dynamics from packet traces.
arXiv Detail & Related papers (2022-07-12T21:16:38Z)
- Continual Learning with Transformers for Image Classification [12.028617058465333]
In computer vision, neural network models struggle to continually learn new concepts without forgetting what has been learnt in the past.
We develop a solution called Adaptive Distillation of Adapters (ADA) to perform continual learning.
We empirically demonstrate on different classification tasks that this method maintains a good predictive performance without retraining the model.
arXiv Detail & Related papers (2022-06-28T15:30:10Z)
- Transfer Learning via Test-Time Neural Networks Aggregation [11.42582922543676]
It has been demonstrated that deep neural networks outperform traditional machine learning approaches.
However, deep networks lack generalisability: they will not perform as well on a new (test) set drawn from a different distribution.
arXiv Detail & Related papers (2022-06-27T15:46:05Z)
- Learning Fast and Slow for Online Time Series Forecasting [76.50127663309604]
Fast and Slow learning Networks (FSNet) is a holistic framework for online time-series forecasting.
FSNet balances fast adaptation to recent changes and retrieving similar old knowledge.
Our code will be made publicly available.
arXiv Detail & Related papers (2022-02-23T18:23:07Z)
- Less is More: Pay Less Attention in Vision Transformers [61.05787583247392]
The Less attention vIsion Transformer (LIT) builds upon the fact that convolutions, fully-connected layers, and self-attentions have almost equivalent mathematical expressions for processing image patch sequences.
The proposed LIT achieves promising performance on image recognition tasks, including image classification, object detection and instance segmentation.
arXiv Detail & Related papers (2021-05-29T05:26:07Z)
- Least Redundant Gated Recurrent Neural Network [0.0]
We introduce a recurrent neural architecture called Deep Memory Update (DMU).
It is based on updating the previous memory state with a deep transformation of the lagged state and the network input, as sketched below.
Its training is stable and fast because its learning rate is tied to the size of the module.
arXiv Detail & Related papers (2021-05-28T20:24:00Z)
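Read literally, the DMU summary above describes updating the previous memory state with a deep transformation of the lagged state and the network input. A minimal PyTorch sketch under that reading follows; the gate is an added assumption, not taken from the abstract.

```python
import torch
import torch.nn as nn

class DMUCellSketch(nn.Module):
    """Illustrative cell: update the previous memory state with a deep
    transformation of the lagged state and the current input."""

    def __init__(self, input_dim, hidden_dim):
        super().__init__()
        self.transform = nn.Sequential(            # the "deep transformation"
            nn.Linear(input_dim + hidden_dim, hidden_dim),
            nn.Tanh(),
            nn.Linear(hidden_dim, hidden_dim),
            nn.Tanh(),
        )
        self.gate = nn.Sequential(                 # assumed gate, for stability
            nn.Linear(input_dim + hidden_dim, hidden_dim),
            nn.Sigmoid(),
        )

    def forward(self, x, h_prev):
        z = torch.cat([x, h_prev], dim=-1)
        update = self.transform(z)                 # deep function of lagged state + input
        g = self.gate(z)
        return (1 - g) * h_prev + g * update       # gated memory update
```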
- Thinking Deeply with Recurrence: Generalizing from Easy to Hard Sequential Reasoning Problems [51.132938969015825]
We observe that recurrent networks have the uncanny ability to closely emulate the behavior of non-recurrent deep models.
We show that recurrent networks trained to solve simple mazes with few recurrent steps can solve much more complex problems simply by performing additional recurrences during inference, as sketched below.
arXiv Detail & Related papers (2021-02-22T14:09:20Z)
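The mechanism above is easy to state in code: a weight-tied block is applied repeatedly, and the iteration count can simply be increased at inference. In the sketch below the block's internals are placeholder assumptions.

```python
import torch.nn as nn

class RecurrentReasoner(nn.Module):
    """Weight-tied recurrent block; the number of iterations can be raised
    at inference time to tackle harder instances."""

    def __init__(self, dim, train_iters=5):
        super().__init__()
        self.block = nn.Sequential(nn.Linear(dim, dim), nn.ReLU())  # shared weights
        self.train_iters = train_iters

    def forward(self, x, iters=None):
        n = iters if iters is not None else self.train_iters
        for _ in range(n):            # the same block, applied n times
            x = self.block(x)
        return x

# model(x)            # few recurrences, as in training
# model(x, iters=50)  # harder instance: extra recurrences at inference
```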
- Implicit recurrent networks: A novel approach to stationary input processing with recurrent neural networks in deep learning [0.0]
In this work, we introduce and test a novel implementation of recurrent neural networks in deep learning.
We provide an algorithm that implements the backpropagation algorithm for an implicit implementation of recurrent networks.
A single-layer implicit recurrent network is able to solve the XOR problem, while a feed-forward network with a monotonically increasing activation function fails at this task.
arXiv Detail & Related papers (2020-10-20T18:55:32Z)
- Unsupervised Transfer Learning for Spatiotemporal Predictive Networks [90.67309545798224]
We study how to transfer knowledge from a zoo of models learned without supervision to another network.
Our motivation is that models are expected to understand complex dynamics from different sources.
Our approach yields significant improvements on three benchmarks for spatiotemporal prediction, and benefits the target network even from less relevant source models.
arXiv Detail & Related papers (2020-09-24T15:40:55Z)
- Incremental Training of a Recurrent Neural Network Exploiting a Multi-Scale Dynamic Memory [79.42778415729475]
We propose a novel incrementally trained recurrent architecture explicitly targeting multi-scale learning.
We show how to extend the architecture of a simple RNN by separating its hidden state into different modules.
We discuss a training algorithm where new modules are iteratively added to the model to learn progressively longer dependencies, as sketched below.
arXiv Detail & Related papers (2020-06-29T08:35:49Z)
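A hedged sketch of the structure described above: the hidden state is separated into modules, new modules can be appended incrementally, and each module updates at its own time scale. The power-of-two update schedule is an illustrative assumption.

```python
import torch
import torch.nn as nn

class MultiScaleMemorySketch(nn.Module):
    """Hidden state separated into modules; new modules can be appended
    during training to capture progressively longer dependencies."""

    def __init__(self, input_dim, module_dim):
        super().__init__()
        self.input_dim, self.module_dim = input_dim, module_dim
        self.cells = nn.ModuleList([nn.RNNCell(input_dim, module_dim)])

    def grow(self):
        # incrementally add a module (older ones could be frozen)
        self.cells.append(nn.RNNCell(self.input_dim, self.module_dim))

    def forward(self, x_seq):
        # x_seq: (batch, time, input_dim)
        batch = x_seq.size(0)
        states = [x_seq.new_zeros(batch, self.module_dim) for _ in self.cells]
        for t in range(x_seq.size(1)):
            for i, cell in enumerate(self.cells):
                if t % (2 ** i) == 0:   # assumed: module i updates every 2**i steps
                    states[i] = cell(x_seq[:, t], states[i])
        return torch.cat(states, dim=-1)   # concatenated multi-scale state
```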
This list is automatically generated from the titles and abstracts of the papers on this site.