Improving the Diversity of Bootstrapped DQN by Replacing Priors With Noise
- URL: http://arxiv.org/abs/2203.01004v3
- Date: Mon, 24 Jun 2024 15:09:35 GMT
- Title: Improving the Diversity of Bootstrapped DQN by Replacing Priors With Noise
- Authors: Li Meng, Morten Goodwin, Anis Yazidi, Paal Engelstad
- Abstract summary: This article explores the possibility of replacing priors with noise, sampling the noise from a Gaussian distribution to introduce more diversity into this algorithm.
We find that our modification of the Bootstrapped Deep Q-Learning algorithm achieves significantly higher evaluation scores across different types of Atari games.
- Score: 8.938418994111716
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Q-learning is one of the most well-known Reinforcement Learning algorithms. There have been tremendous efforts to develop this algorithm using neural networks. Bootstrapped Deep Q-Learning Network is amongst them. It utilizes multiple neural network heads to introduce diversity into Q-learning. Diversity can sometimes be viewed as the number of reasonable moves an agent can take at a given state, analogous to the definition of the exploration ratio in RL. Thus, the performance of Bootstrapped Deep Q-Learning Network is deeply connected with the level of diversity within the algorithm. In the original research, it was pointed out that a random prior could improve the performance of the model. In this article, we further explore the possibility of replacing priors with noise, sampling the noise from a Gaussian distribution to introduce more diversity into this algorithm. We conduct our experiment on the Atari benchmark and compare our algorithm to both the original and other related algorithms. The results show that our modification of the Bootstrapped Deep Q-Learning algorithm achieves significantly higher evaluation scores across different types of Atari games. Thus, we conclude that replacing priors with noise can improve Bootstrapped Deep Q-Learning's performance by ensuring the integrity of diversity.
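As a rough illustration of the idea in the abstract, the following minimal sketch shows a multi-head Q-function in which one head is sampled to drive action selection and zero-mean Gaussian noise is added to that head's Q-values. The abstract does not say where the noise enters or how it is scaled, so adding it directly to the Q-values is an assumption, as are the linear heads and every name used here (`BootstrappedHeads`, `noise_std`, `sample_head`, `act`). It is not the authors' implementation.

```python
import random


class BootstrappedHeads:
    """Toy K-head Q-function with linear heads (hypothetical, for illustration)."""

    def __init__(self, num_heads, num_features, num_actions, noise_std=0.1, seed=0):
        self.rng = random.Random(seed)
        self.num_actions = num_actions
        self.noise_std = noise_std  # assumed scale of the Gaussian noise
        # weights[k][a][i]: head k, action a, feature i.
        # Independent random initialisation is what makes the heads differ.
        self.weights = [
            [[self.rng.gauss(0.0, 1.0) for _ in range(num_features)]
             for _ in range(num_actions)]
            for _ in range(num_heads)
        ]

    def q_values(self, head, features, add_noise=True):
        """Q-values of one head; optionally perturbed by Gaussian noise."""
        q = [sum(w * x for w, x in zip(row, features)) for row in self.weights[head]]
        if add_noise:
            # Assumption: the noise takes the place of a random-prior term
            # and is added directly to the head's output.
            q = [v + self.rng.gauss(0.0, self.noise_std) for v in q]
        return q

    def sample_head(self):
        """Pick the head that will act, e.g. once per episode."""
        return self.rng.randrange(len(self.weights))

    def act(self, head, features, add_noise=True):
        """Greedy action with respect to the chosen head's Q-values."""
        q = self.q_values(head, features, add_noise)
        return max(range(self.num_actions), key=q.__getitem__)


heads = BootstrappedHeads(num_heads=5, num_features=4, num_actions=3)
state = [0.5, -1.0, 0.25, 2.0]
k = heads.sample_head()
action = heads.act(k, state)
```

Training, replay buffers, bootstrap masks, and target networks are omitted; the sketch only covers how sampling a head and perturbing its outputs can lead different heads to prefer different actions in the same state.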
Related papers
- Layering and subpool exploration for adaptive Variational Quantum Eigensolvers: Reducing circuit depth, runtime, and susceptibility to noise [0.0]
Adaptive variational quantum eigensolvers (ADAPT-VQEs) are promising candidates for simulations of strongly correlated systems.
Recent efforts have been directed towards compactifying, or layering, their ansatz circuits.
We show that layering leads to an improved noise resilience with respect to amplitude-damping and dephasing noise.
arXiv Detail & Related papers (2023-08-22T18:00:02Z) - The Cascaded Forward Algorithm for Neural Network Training [61.06444586991505]
We propose a new learning framework for neural networks, the Cascaded Forward (CaFo) algorithm, which, like the Forward-Forward (FF) algorithm, does not rely on backpropagation (BP) optimization.
Unlike FF, our framework directly outputs label distributions at each cascaded block, which does not require generation of additional negative samples.
In our framework each block can be trained independently, so it can be easily deployed into parallel acceleration systems.
arXiv Detail & Related papers (2023-03-17T02:01:11Z) - Learning with Differentiable Algorithms [6.47243430672461]
This thesis explores combining classic algorithms and machine learning systems like neural networks.
The thesis formalizes the idea of algorithmic supervision, which allows a neural network to learn from or in conjunction with an algorithm.
In addition, this thesis proposes differentiable algorithms, such as differentiable sorting networks, differentiable sorting gates, and differentiable logic gate networks.
arXiv Detail & Related papers (2022-09-01T17:30:00Z) - A Continuous Optimisation Benchmark Suite from Neural Network Regression [0.0]
Training neural networks is an optimisation task that has gained prominence with the recent successes of deep learning.
Gradient descent variants are by far the most common choice, given their trusted good performance on large-scale machine learning tasks.
We contribute CORNN, a suite for benchmarking the performance of any continuous black-box algorithm on neural network training problems.
arXiv Detail & Related papers (2021-09-12T20:24:11Z) - A robust approach for deep neural networks in presence of label noise: relabelling and filtering instances during training [14.244244290954084]
We propose a robust training strategy against label noise, called RAFNI, that can be used with any CNN.
RAFNI consists of three mechanisms: two mechanisms that filter instances and one mechanism that relabels instances.
We evaluated our algorithm using different data sets of several sizes and characteristics.
arXiv Detail & Related papers (2021-09-08T16:11:31Z) - Evolving Reinforcement Learning Algorithms [186.62294652057062]
We propose a method for meta-learning reinforcement learning algorithms.
The learned algorithms are domain-agnostic and can generalize to new environments not seen during training.
We highlight two learned algorithms which obtain good generalization performance over other classical control tasks, gridworld type tasks, and Atari games.
arXiv Detail & Related papers (2021-01-08T18:55:07Z) - TorchDyn: A Neural Differential Equations Library [16.43439140464003]
We introduce TorchDyn, a PyTorch library dedicated to continuous-depth learning.
It is designed to elevate neural differential equations to be as accessible as regular plug-and-play deep learning primitives.
arXiv Detail & Related papers (2020-09-20T03:45:49Z) - SUNRISE: A Simple Unified Framework for Ensemble Learning in Deep Reinforcement Learning [102.78958681141577]
We present SUNRISE, a simple unified ensemble method, which is compatible with various off-policy deep reinforcement learning algorithms.
SUNRISE integrates two key ingredients: (a) ensemble-based weighted Bellman backups, which re-weight target Q-values based on uncertainty estimates from a Q-ensemble, and (b) an inference method that selects actions using the highest upper-confidence bounds for efficient exploration.
arXiv Detail & Related papers (2020-07-09T17:08:44Z) - Learning to Stop While Learning to Predict [85.7136203122784]
Many algorithm-inspired deep models are restricted to a "fixed-depth" for all inputs.
Similar to algorithms, the optimal depth of a deep architecture may be different for different input instances.
In this paper, we tackle this varying depth problem using a steerable architecture.
We show that the learned deep model along with the stopping policy improves the performances on a diverse set of tasks.
arXiv Detail & Related papers (2020-06-09T07:22:01Z) - Communication-Efficient Distributed Stochastic AUC Maximization with Deep Neural Networks [50.42141893913188]
We study distributed algorithms for large-scale AUC maximization with a deep neural network.
Our model requires far fewer communication rounds in theory.
Our experiments on several datasets show the effectiveness of our algorithm and confirm our theory.
arXiv Detail & Related papers (2020-05-05T18:08:23Z) - MSE-Optimal Neural Network Initialization via Layer Fusion [68.72356718879428]
Deep neural networks achieve state-of-the-art performance for a range of classification and inference tasks.
The use of gradient descent combined with nonconvexity renders learning susceptible to initialization.
We propose fusing neighboring layers of deeper networks that are trained with random variables.
arXiv Detail & Related papers (2020-01-28T18:25:15Z)