Faster Deep Reinforcement Learning with Slower Online Network
- URL: http://arxiv.org/abs/2112.05848v3
- Date: Mon, 17 Apr 2023 19:17:37 GMT
- Title: Faster Deep Reinforcement Learning with Slower Online Network
- Authors: Kavosh Asadi, Rasool Fakoor, Omer Gottesman, Taesup Kim, Michael L.
Littman, Alexander J. Smola
- Abstract summary: We endow two popular deep reinforcement learning algorithms, namely DQN and Rainbow, with updates that incentivize the online network to remain in the proximity of the target network.
The resultant agents, called DQN Pro and Rainbow Pro, exhibit significant performance improvements over their original counterparts on the Atari benchmark.
- Score: 90.34900072689618
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Deep reinforcement learning algorithms often use two networks for value
function optimization: an online network, and a target network that tracks the
online network with some delay. Using two separate networks enables the agent
to hedge against issues that arise when performing bootstrapping. In this paper
we endow two popular deep reinforcement learning algorithms, namely DQN and
Rainbow, with updates that incentivize the online network to remain in the
proximity of the target network. This improves the robustness of deep
reinforcement learning in the presence of noisy updates. The resultant agents,
called DQN Pro and Rainbow Pro, exhibit significant performance improvements
over their original counterparts on the Atari benchmark, demonstrating the
effectiveness of this simple idea in deep reinforcement learning. The code for
our paper is available here:
Github.com/amazon-research/fast-rl-with-slow-updates.
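Concretely, the abstract's "remain in the proximity of the target network" update can be read as adding a proximal penalty to the standard TD loss. Below is a minimal PyTorch sketch of that idea; the function name, the penalty weight `prox_coef`, and its value are illustrative assumptions, and the repository above is the authoritative implementation.

```python
import torch
import torch.nn.functional as F

def dqn_pro_loss(online_net, target_net, batch, gamma=0.99, prox_coef=0.05):
    """TD loss plus a proximal penalty that keeps the online network
    near the slowly updated target network. `prox_coef` is a
    hypothetical name/value; see the paper's repository for the
    reference implementation."""
    s, a, r, s_next, done = batch

    # Standard DQN target: bootstrap from the target network.
    with torch.no_grad():
        target_q = target_net(s_next).max(dim=1).values
        y = r + gamma * (1.0 - done) * target_q

    q = online_net(s).gather(1, a.unsqueeze(1)).squeeze(1)
    td_loss = F.smooth_l1_loss(q, y)

    # Proximal term: incentivize the online parameters to stay close
    # to the target parameters, improving robustness to noisy updates.
    prox = sum(
        (p_on - p_tg).pow(2).sum()
        for p_on, p_tg in zip(online_net.parameters(), target_net.parameters())
    )
    return td_loss + prox_coef * prox
```

In this reading, the target network is still synchronized to the online network on the usual fixed schedule; between synchronizations, the penalty pulls the online parameters back toward the target, which is the robustness mechanism the abstract describes.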
Related papers
- Simplifying Deep Temporal Difference Learning [3.458933902627673]
We investigate whether it is possible to accelerate and simplify TD training while maintaining its stability.
Our key theoretical result demonstrates for the first time that regularisation techniques such as LayerNorm can yield provably convergent TD algorithms.
Motivated by these findings, we propose PQN, our simplified deep online Q-Learning algorithm.
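The concrete ingredient the summary names is LayerNorm; a minimal sketch of a LayerNorm-regularised Q-network follows (layer sizes, depth, and the placement of the normalisation are illustrative assumptions, not the paper's exact architecture).

```python
import torch.nn as nn

class LayerNormQNetwork(nn.Module):
    """Q-network with LayerNorm after each hidden layer, the kind of
    regularisation the paper identifies as stabilising TD learning.
    Sizes and depth are illustrative assumptions."""
    def __init__(self, obs_dim, n_actions, hidden=256):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(obs_dim, hidden), nn.LayerNorm(hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.LayerNorm(hidden), nn.ReLU(),
            nn.Linear(hidden, n_actions),
        )

    def forward(self, obs):
        return self.net(obs)
```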
arXiv Detail & Related papers (2024-07-05T18:49:07Z)
- Tempo: Confidentiality Preservation in Cloud-Based Neural Network Training [8.187538747666203]
Cloud deep learning platforms provide cost-effective deep neural network (DNN) training for customers who lack computation resources.
Recently, researchers have sought to protect data privacy in deep learning by leveraging CPU trusted execution environments (TEEs).
This paper presents Tempo, the first cloud-based deep learning system that pairs TEEs with distributed GPUs.
arXiv Detail & Related papers (2024-01-21T15:57:04Z)
- Dynamic Sparse Training for Deep Reinforcement Learning [36.66889208433228]
We propose for the first time to dynamically train deep reinforcement learning agents with sparse neural networks from scratch.
Our approach is easy to integrate into existing deep reinforcement learning algorithms.
We evaluate our approach on OpenAI gym continuous control tasks.
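The summary does not spell out the sparsification schedule, so the sketch below assumes a common drop-and-grow scheme (prune the smallest-magnitude active weights, then regrow the same number of random connections); the paper's actual criterion may differ.

```python
import torch

def drop_and_grow(weight, mask, drop_frac=0.1):
    """One dynamic-sparsity step on a weight matrix and its binary mask:
    drop the smallest-magnitude active weights, regrow the same number
    of randomly chosen inactive connections. The schedule is an
    assumption, not necessarily the paper's."""
    active = mask.bool()
    n_drop = int(drop_frac * active.sum().item())
    if n_drop == 0:
        return mask

    # Drop: deactivate the n_drop smallest-magnitude active weights.
    scores = weight.abs().masked_fill(~active, float("inf"))
    drop_idx = torch.topk(scores.view(-1), n_drop, largest=False).indices
    mask.view(-1)[drop_idx] = 0.0

    # Grow: activate n_drop connections chosen uniformly among inactive ones.
    inactive_idx = (mask.view(-1) == 0).nonzero().squeeze(1)
    grow_idx = inactive_idx[torch.randperm(len(inactive_idx))[:n_drop]]
    mask.view(-1)[grow_idx] = 1.0
    return mask
```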
arXiv Detail & Related papers (2021-06-08T09:57:20Z)
- Training Larger Networks for Deep Reinforcement Learning [18.193180866998333]
We show that naively increasing network capacity does not improve performance.
We propose a novel method that consists of 1) wider networks with DenseNet connections, 2) decoupling representation learning from RL training, and 3) a distributed training method to mitigate overfitting.
Using this three-fold technique, we show that we can train very large networks that result in significant performance gains.
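Of the three ingredients, the DenseNet-style connectivity is the most self-contained to illustrate: each layer consumes the concatenation of the input and all earlier layers' outputs. The widths and depth below are illustrative assumptions, not the paper's configuration.

```python
import torch
import torch.nn as nn

class DenseMLP(nn.Module):
    """MLP with DenseNet-style connectivity: every layer sees the
    concatenation of the input and all previous layers' outputs."""
    def __init__(self, in_dim, out_dim, hidden=512, depth=3):
        super().__init__()
        self.layers = nn.ModuleList()
        dim = in_dim
        for _ in range(depth):
            self.layers.append(nn.Linear(dim, hidden))
            dim += hidden  # the next layer also sees this layer's output
        self.head = nn.Linear(dim, out_dim)

    def forward(self, x):
        feats = [x]
        for layer in self.layers:
            feats.append(torch.relu(layer(torch.cat(feats, dim=-1))))
        return self.head(torch.cat(feats, dim=-1))
```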
arXiv Detail & Related papers (2021-02-16T02:16:54Z)
- Sparsity in Deep Learning: Pruning and growth for efficient inference and training in neural networks [78.47459801017959]
Sparsity can reduce the memory footprint of regular networks so that they fit on mobile devices.
We describe approaches to remove and add elements of neural networks, different training strategies to achieve model sparsity, and mechanisms to exploit sparsity in practice.
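As a minimal example of the "remove elements" strategies the survey covers, here is one-shot magnitude pruning; the sparsity level is an arbitrary illustration.

```python
import torch

def magnitude_prune(weight, sparsity=0.9):
    """Zero out the smallest-magnitude fraction of weights, returning
    the pruned tensor and its binary mask. One of the simplest removal
    strategies the survey describes."""
    k = max(1, int(sparsity * weight.numel()))
    threshold = weight.abs().view(-1).kthvalue(k).values
    mask = (weight.abs() > threshold).to(weight.dtype)
    return weight * mask, mask
```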
arXiv Detail & Related papers (2021-01-31T22:48:50Z)
- ShiftAddNet: A Hardware-Inspired Deep Network [87.18216601210763]
ShiftAddNet is an energy-efficient multiplication-less deep neural network.
It leads to both energy-efficient inference and training, without compromising expressive capacity.
ShiftAddNet aggressively reduces the hardware-quantified energy cost of DNN training and inference by over 80%, while offering comparable or better accuracy.
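A multiplication-less layer can be simulated by rounding weights to signed powers of two, so each multiply becomes a bit-shift in hardware. The sketch below simulates this in floating point; the exponent range is an assumption, and ShiftAddNet itself pairs such shift operations with add-based ones rather than using shifts alone.

```python
import torch

def quantize_to_powers_of_two(w, min_exp=-8, max_exp=0):
    """Round weights to sign * 2^k so multiplication by a weight can be
    implemented as a bit-shift. Simulated here in floating point; the
    exponent range is an illustrative assumption."""
    sign = torch.sign(w)
    exp = torch.round(torch.log2(w.abs().clamp(min=1e-12)))
    exp = exp.clamp(min_exp, max_exp)
    return sign * torch.exp2(exp)
```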
arXiv Detail & Related papers (2020-10-24T05:09:14Z)
- Hardware Accelerator for Adversarial Attacks on Deep Learning Neural Networks [7.20382137043754]
A class of adversarial attack algorithms has been proposed to generate robust physical perturbations.
In this paper, we propose the first hardware accelerator for adversarial attacks based on memristor crossbar arrays.
arXiv Detail & Related papers (2020-08-03T21:55:41Z)
- Fully Convolutional Networks for Continuous Sign Language Recognition [83.85895472824221]
Continuous sign language recognition is a challenging task that requires learning on both spatial and temporal dimensions.
We propose a fully convolutional network (FCN) for online sign language recognition (SLR) to concurrently learn spatial and temporal features from weakly annotated video sequences.
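A fully convolutional design for this task typically stacks per-frame spatial convolutions with temporal convolutions, so predictions can be emitted frame by frame (online). The sketch below follows that pattern; all layer sizes are illustrative assumptions, not the paper's architecture.

```python
import torch
import torch.nn as nn

class OnlineSLRNet(nn.Module):
    """Sketch of a fully convolutional SLR model: 2D convs extract
    per-frame spatial features, 1D temporal convs fuse them across
    time, and a per-step classifier emits gloss scores. Having no
    recurrence is what permits online, frame-by-frame use."""
    def __init__(self, n_glosses, feat=64):
        super().__init__()
        self.spatial = nn.Sequential(
            nn.Conv2d(3, feat, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(feat, feat, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),
        )
        self.temporal = nn.Sequential(
            nn.Conv1d(feat, feat, kernel_size=5, padding=2), nn.ReLU(),
            nn.Conv1d(feat, n_glosses, kernel_size=1),
        )

    def forward(self, video):                   # video: (B, T, 3, H, W)
        b, t = video.shape[:2]
        x = self.spatial(video.flatten(0, 1))   # (B*T, feat, 1, 1)
        x = x.view(b, t, -1).transpose(1, 2)    # (B, feat, T)
        return self.temporal(x).transpose(1, 2) # (B, T, n_glosses)
```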
arXiv Detail & Related papers (2020-07-24T08:16:37Z)
- Learning to Hash with Graph Neural Networks for Recommender Systems [103.82479899868191]
Graph representation learning has attracted much attention for supporting high-quality candidate search at scale.
Despite its effectiveness in learning embedding vectors for objects in the user-item interaction network, the computational costs to infer users' preferences in continuous embedding space are tremendous.
We propose a simple yet effective discrete representation learning framework to jointly learn continuous and discrete codes.
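One standard way to learn discrete codes jointly with continuous embeddings is to binarize with sign() and use a straight-through estimator for gradients; the summary does not pin down the mechanism, so the sketch below is an assumption along those lines.

```python
import torch

class SignSTE(torch.autograd.Function):
    """Binarize with sign() in the forward pass; pass gradients straight
    through in the backward pass. A common trick for learning hash codes
    jointly with continuous embeddings (assumed here; the paper may use
    a different relaxation)."""
    @staticmethod
    def forward(ctx, x):
        return torch.sign(x)

    @staticmethod
    def backward(ctx, grad_output):
        return grad_output  # straight-through: identity gradient

def to_hash_code(embedding):
    # Continuous embedding -> {-1, +1} code usable for fast Hamming search.
    return SignSTE.apply(embedding)
```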
arXiv Detail & Related papers (2020-03-04T06:59:56Z)
- Learn2Perturb: an End-to-end Feature Perturbation Learning to Improve Adversarial Robustness [79.47619798416194]
Learn2Perturb is an end-to-end feature perturbation learning approach for improving the adversarial robustness of deep neural networks.
Inspired by Expectation-Maximization, an alternating back-propagation training algorithm is introduced to train the network weights and the noise parameters in turn.
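A minimal sketch of a perturbation-injection module with learnable noise parameters follows; the per-channel Gaussian parameterization is an assumption, and training would alternate updates between these parameters and the network weights, as the summary describes.

```python
import torch
import torch.nn as nn

class PerturbationInjection(nn.Module):
    """Adds zero-mean Gaussian noise with a learnable per-channel scale
    to intermediate features. The parameterization is an illustrative
    assumption, not the paper's exact module."""
    def __init__(self, num_features):
        super().__init__()
        # Log-parameterized so the noise scale stays positive.
        self.log_sigma = nn.Parameter(torch.full((num_features,), -3.0))

    def forward(self, x):                       # x: (B, C, ...) features
        sigma = self.log_sigma.exp().view(1, -1, *([1] * (x.dim() - 2)))
        return x + sigma * torch.randn_like(x)
```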
arXiv Detail & Related papers (2020-03-02T18:27:35Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences of its use.