Deep Transformer Q-Networks for Partially Observable Reinforcement
Learning
- URL: http://arxiv.org/abs/2206.01078v1
- Date: Thu, 2 Jun 2022 15:04:18 GMT
- Title: Deep Transformer Q-Networks for Partially Observable Reinforcement
Learning
- Authors: Kevin Esslinger, Robert Platt, Christopher Amato
- Abstract summary: Deep Transformer Q-Networks (DTQN) is a novel architecture utilizing transformers and self-attention to encode an agent's history.
Our experiments demonstrate that the transformer can solve partially observable tasks faster and more stably than previous recurrent approaches.
- Score: 14.126617899983097
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Real-world reinforcement learning tasks often involve some form of partial
observability where the observations only give a partial or noisy view of the
true state of the world. Such tasks typically require some form of memory,
where the agent has access to multiple past observations, in order to perform
well. One popular way to incorporate memory is by using a recurrent neural
network to access the agent's history. However, recurrent neural networks in
reinforcement learning are often fragile and difficult to train; they are
susceptible to catastrophic forgetting and sometimes fail completely as a result. In this
work, we propose Deep Transformer Q-Networks (DTQN), a novel architecture
utilizing transformers and self-attention to encode an agent's history. DTQN is
designed modularly, and we compare results against several modifications to our
base model. Our experiments demonstrate that the transformer can solve partially
observable tasks faster and more stably than previous recurrent approaches.
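As a concrete illustration of the architecture, a DTQN-style model can be sketched in PyTorch as a transformer applied to a window of recent observations, with Q-values predicted at every position in the history. The sketch below is a minimal reading of the abstract, not the authors' implementation; the layer sizes, the learned positional embeddings, and the use of standard encoder layers with a causal mask are all assumptions.

```python
import torch
import torch.nn as nn

class DTQNSketch(nn.Module):
    """Minimal DTQN-style sketch: a transformer over the last `history_len`
    observations, with a Q-value head applied at every timestep."""

    def __init__(self, obs_dim, num_actions, d_model=64, nhead=4,
                 num_layers=2, history_len=50):
        super().__init__()
        self.obs_embed = nn.Linear(obs_dim, d_model)         # embed raw observations
        self.pos_embed = nn.Embedding(history_len, d_model)  # learned positions (assumed)
        layer = nn.TransformerEncoderLayer(d_model=d_model, nhead=nhead,
                                           batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=num_layers)
        self.q_head = nn.Linear(d_model, num_actions)        # Q-values per timestep

    def forward(self, obs_history):
        # obs_history: (batch, history_len, obs_dim)
        hist = obs_history.size(1)
        pos = torch.arange(hist, device=obs_history.device)
        x = self.obs_embed(obs_history) + self.pos_embed(pos)
        # causal mask: each timestep attends only to itself and its past
        mask = torch.triu(torch.full((hist, hist), float('-inf'),
                                     device=obs_history.device), diagonal=1)
        x = self.encoder(x, mask=mask)
        return self.q_head(x)  # (batch, history_len, num_actions)

# Acting greedily from the most recent timestep's Q-values:
# q_last = model(history)[:, -1]      # (batch, num_actions)
# action = q_last.argmax(dim=-1)
```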
Related papers
- Improving the Trainability of Deep Neural Networks through Layerwise
Batch-Entropy Regularization [1.3999481573773072]
We introduce and evaluate the batch entropy, which quantifies the flow of information through each layer of a neural network.
We show that we can train a "vanilla" fully connected network and a convolutional neural network with 500 layers by simply adding the batch-entropy regularization term to the loss function, as sketched below.
arXiv Detail & Related papers (2022-08-01T20:31:58Z)
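As a rough sketch of the idea above: estimate an entropy-like quantity from each layer's batch of activations and add a penalty on it to the loss. Both the Gaussian entropy proxy and the squared-distance-to-target penalty below are assumptions made for illustration, not necessarily the paper's estimator.

```python
import math
import torch

def batch_entropy(activations, eps=1e-6):
    """Gaussian proxy for a layer's batch entropy: per-unit variance over
    the batch, turned into a differential entropy and averaged over units.
    The exact estimator is an assumption, not necessarily the paper's."""
    var = activations.var(dim=0) + eps
    return 0.5 * torch.log(2 * math.pi * math.e * var).mean()

def regularized_loss(task_loss, layer_activations, alpha=0.01, target=1.0):
    """Task loss plus a term pulling each layer's batch entropy toward a
    target value (the exact form of the penalty is assumed here)."""
    penalty = sum((batch_entropy(a) - target) ** 2 for a in layer_activations)
    return task_loss + alpha * penalty
```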
- A new hope for network model generalization [66.5377859849467]
Generalizing machine learning models for network traffic dynamics tends to be considered a lost cause.
An ML architecture called _Transformer_ has enabled previously unimaginable generalization in other domains.
We propose a Network Traffic Transformer (NTT) to learn network dynamics from packet traces.
arXiv Detail & Related papers (2022-07-12T21:16:38Z)
- Continual Learning with Transformers for Image Classification [12.028617058465333]
In computer vision, neural network models struggle to continually learn new concepts without forgetting what has been learnt in the past.
We develop a solution called Adaptive Distillation of Adapters (ADA) to perform continual learning.
We empirically demonstrate on different classification tasks that this method maintains a good predictive performance without retraining the model.
arXiv Detail & Related papers (2022-06-28T15:30:10Z)
- Transfer Learning via Test-Time Neural Networks Aggregation [11.42582922543676]
It has been demonstrated that deep neural networks outperform traditional machine learning approaches.
However, deep networks lack generalisability: they will not perform as well on a new (test) set drawn from a different distribution.
arXiv Detail & Related papers (2022-06-27T15:46:05Z)
- Learning Fast and Slow for Online Time Series Forecasting [76.50127663309604]
Fast and Slow learning Networks (FSNet) is a holistic framework for online time-series forecasting.
FSNet balances fast adaptation to recent changes and retrieving similar old knowledge.
Our code will be made publicly available.
arXiv Detail & Related papers (2022-02-23T18:23:07Z)
- Less is More: Pay Less Attention in Vision Transformers [61.05787583247392]
The Less attention vIsion Transformer (LIT) builds upon the fact that convolutions, fully-connected layers, and self-attentions have almost equivalent mathematical expressions for processing image patch sequences.
The proposed LIT achieves promising performance on image recognition tasks, including image classification, object detection and instance segmentation.
arXiv Detail & Related papers (2021-05-29T05:26:07Z)
- Least Redundant Gated Recurrent Neural Network [0.0]
We introduce a recurrent neural architecture called Deep Memory Update (DMU).
It is based on updating the previous memory state with a deep transformation of the lagged state and the network input, as sketched below.
Its training is stable and fast because its learning rate is tied to the size of the module.
arXiv Detail & Related papers (2021-05-28T20:24:00Z)
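Read literally, the DMU summary above describes updating the previous memory state with a deep transformation of the lagged state and the network input. A minimal PyTorch sketch under that reading follows; the gate is an added assumption, not taken from the abstract.

```python
import torch
import torch.nn as nn

class DMUCellSketch(nn.Module):
    """Illustrative cell: update the previous memory state with a deep
    transformation of the lagged state and the current input."""

    def __init__(self, input_dim, hidden_dim):
        super().__init__()
        self.transform = nn.Sequential(            # the "deep transformation"
            nn.Linear(input_dim + hidden_dim, hidden_dim),
            nn.Tanh(),
            nn.Linear(hidden_dim, hidden_dim),
            nn.Tanh(),
        )
        self.gate = nn.Sequential(                 # assumed gate, for stability
            nn.Linear(input_dim + hidden_dim, hidden_dim),
            nn.Sigmoid(),
        )

    def forward(self, x, h_prev):
        z = torch.cat([x, h_prev], dim=-1)
        update = self.transform(z)                 # deep function of lagged state + input
        g = self.gate(z)
        return (1 - g) * h_prev + g * update       # gated memory update
```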
- Thinking Deeply with Recurrence: Generalizing from Easy to Hard Sequential Reasoning Problems [51.132938969015825]
We observe that recurrent networks have the uncanny ability to closely emulate the behavior of non-recurrent deep models.
We show that recurrent networks trained to solve simple mazes with few recurrent steps can solve much more complex problems simply by performing additional recurrences during inference, as sketched below.
arXiv Detail & Related papers (2021-02-22T14:09:20Z)
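The mechanism above is easy to state in code: a weight-tied block is applied repeatedly, and the iteration count can simply be increased at inference. In the sketch below the block's internals are placeholder assumptions.

```python
import torch.nn as nn

class RecurrentReasoner(nn.Module):
    """Weight-tied recurrent block; the number of iterations can be raised
    at inference time to tackle harder instances."""

    def __init__(self, dim, train_iters=5):
        super().__init__()
        self.block = nn.Sequential(nn.Linear(dim, dim), nn.ReLU())  # shared weights
        self.train_iters = train_iters

    def forward(self, x, iters=None):
        n = iters if iters is not None else self.train_iters
        for _ in range(n):            # the same block, applied n times
            x = self.block(x)
        return x

# model(x)            # few recurrences, as in training
# model(x, iters=50)  # harder instance: extra recurrences at inference
```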
- Implicit recurrent networks: A novel approach to stationary input processing with recurrent neural networks in deep learning [0.0]
In this work, we introduce and test a novel implementation of recurrent neural networks in deep learning.
We provide an algorithm that implements the backpropagation algorithm for an implicit implementation of recurrent networks.
A single-layer implicit recurrent network is able to solve the XOR problem, while a feed-forward network with a monotonically increasing activation function fails at this task.
arXiv Detail & Related papers (2020-10-20T18:55:32Z)
- Unsupervised Transfer Learning for Spatiotemporal Predictive Networks [90.67309545798224]
We study how to transfer knowledge from a zoo of models learned without supervision to another network.
Our motivation is that models are expected to understand complex dynamics from different sources.
Our approach yields significant improvements on three benchmarks for spatiotemporal prediction, and benefits the target network even from less relevant source models.
arXiv Detail & Related papers (2020-09-24T15:40:55Z)
- Incremental Training of a Recurrent Neural Network Exploiting a Multi-Scale Dynamic Memory [79.42778415729475]
We propose a novel incrementally trained recurrent architecture explicitly targeting multi-scale learning.
We show how to extend the architecture of a simple RNN by separating its hidden state into different modules.
We discuss a training algorithm where new modules are iteratively added to the model to learn progressively longer dependencies, as sketched below.
arXiv Detail & Related papers (2020-06-29T08:35:49Z)
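A hedged sketch of the structure described above: the hidden state is separated into modules, new modules can be appended incrementally, and each module updates at its own time scale. The power-of-two update schedule is an illustrative assumption.

```python
import torch
import torch.nn as nn

class MultiScaleMemorySketch(nn.Module):
    """Hidden state separated into modules; new modules can be appended
    during training to capture progressively longer dependencies."""

    def __init__(self, input_dim, module_dim):
        super().__init__()
        self.input_dim, self.module_dim = input_dim, module_dim
        self.cells = nn.ModuleList([nn.RNNCell(input_dim, module_dim)])

    def grow(self):
        # incrementally add a module (older ones could be frozen)
        self.cells.append(nn.RNNCell(self.input_dim, self.module_dim))

    def forward(self, x_seq):
        # x_seq: (batch, time, input_dim)
        batch = x_seq.size(0)
        states = [x_seq.new_zeros(batch, self.module_dim) for _ in self.cells]
        for t in range(x_seq.size(1)):
            for i, cell in enumerate(self.cells):
                if t % (2 ** i) == 0:   # assumed: module i updates every 2**i steps
                    states[i] = cell(x_seq[:, t], states[i])
        return torch.cat(states, dim=-1)   # concatenated multi-scale state
```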
This list is automatically generated from the titles and abstracts of the papers on this site.