Langevin DQN
- URL: http://arxiv.org/abs/2002.07282v2
- Date: Tue, 23 Feb 2021 06:09:20 GMT
- Title: Langevin DQN
- Authors: Vikranth Dwaracherla, Benjamin Van Roy
- Abstract summary: We develop an incremental reinforcement learning algorithm that tracks a single point estimate.
We demonstrate through a computational study that the presented algorithm achieves deep exploration.
We also present a modification of the Langevin DQN algorithm to improve the computational efficiency.
- Score: 15.807243762876901
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Algorithms that tackle deep exploration -- an important challenge in
reinforcement learning -- have relied on epistemic uncertainty representation
through ensembles or other hypermodels, exploration bonuses, or visitation
count distributions. An open question is whether deep exploration can be
achieved by an incremental reinforcement learning algorithm that tracks a
single point estimate, without additional complexity required to account for
epistemic uncertainty. We answer this question in the affirmative. In
particular, we develop Langevin DQN, a variation of DQN that differs only in
perturbing parameter updates with Gaussian noise and demonstrate through a
computational study that the presented algorithm achieves deep exploration. We
also offer some intuition for how Langevin DQN achieves deep exploration. In
addition, we present a modification of the Langevin DQN algorithm to improve
the computational efficiency.
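The abstract's core idea, a DQN-style update perturbed with Gaussian noise, can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a linear Q-function in place of a deep network, a single transition in place of a replay buffer, and the standard Langevin noise scale sqrt(2 * lr). The names `langevin_td_step` and `noise_scale` are hypothetical and chosen only for illustration.

```python
import numpy as np


def langevin_td_step(theta, phi_s, a, r, phi_next, done, rng,
                     lr=0.01, gamma=0.99, noise_scale=1.0):
    """One noisy TD update for a linear Q-function Q(s, a) = theta[a] @ phi(s).

    theta has shape (n_actions, n_features). All names and defaults here are
    illustrative assumptions, not values taken from the paper.
    """
    # Bootstrapped target, treated as a constant (semi-gradient, as in DQN).
    target = r if done else r + gamma * np.max(theta @ phi_next)
    td_error = target - theta[a] @ phi_s

    # Gradient of 0.5 * td_error**2 w.r.t. theta; only row `a` is nonzero.
    grad = np.zeros_like(theta)
    grad[a] = -td_error * phi_s

    # Langevin-style perturbation: Gaussian noise with std sqrt(2 * lr) * noise_scale.
    noise = np.sqrt(2.0 * lr) * noise_scale * rng.standard_normal(theta.shape)
    return theta - lr * grad + noise


# Usage: one update on a toy problem with 2 actions and 3 features.
rng = np.random.default_rng(0)
theta = np.zeros((2, 3))
theta = langevin_td_step(theta, np.array([1.0, 0.0, 0.0]), a=1, r=1.0,
                         phi_next=np.array([0.0, 1.0, 0.0]), done=False, rng=rng)
```

Setting `noise_scale=0.0` recovers an ordinary semi-gradient Q-learning step, which makes the single difference described in the abstract easy to see.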
Related papers
- Value of Information-Enhanced Exploration in Bootstrapped DQN [2.6173443955754903]
In this paper, we integrate the notion of (expected) value of information (EVOI) within the well-known Bootstrapped DQN algorithmic framework. Specifically, we develop two novel algorithms that incorporate the expected gain from learning the value of information into Bootstrapped DQN. Our experiments in complex, sparse-reward Atari games demonstrate increased performance, all the while making better use of uncertainty.
arXiv Detail & Related papers (2025-11-04T20:22:58Z) - Uncertainty quantification for deeponets with ensemble kalman inversion [0.8158530638728501]
In this work, we propose a novel inference approach for efficient uncertainty quantification (UQ) for operator learning by harnessing the power of the Ensemble Kalman Inversion (EKI) approach.
EKI is known for being derivative-free, noise-robust, and highly parallelizable, and has demonstrated its advantages for UQ for physics-informed neural networks.
We deploy a mini-batch variant of EKI to accommodate larger datasets, mitigating the computational demand due to large datasets during the training stage.
arXiv Detail & Related papers (2024-03-06T04:02:30Z) - STEERING: Stein Information Directed Exploration for Model-Based
Reinforcement Learning [111.75423966239092]
We propose an exploration incentive in terms of the integral probability metric (IPM) between a current estimate of the transition model and the unknown optimal.
Based on KSD, we develop a novel algorithm, STEERING: STEin information dirEcted exploration for model-based Reinforcement LearnING.
arXiv Detail & Related papers (2023-01-28T00:49:28Z) - Rewarding Episodic Visitation Discrepancy for Exploration in
Reinforcement Learning [64.8463574294237]
We propose Rewarding Episodic Visitation Discrepancy (REVD) as an efficient and quantified exploration method.
REVD provides intrinsic rewards by evaluating the Rényi divergence-based visitation discrepancy between episodes.
It is tested on PyBullet Robotics Environments and Atari games.
arXiv Detail & Related papers (2022-09-19T08:42:46Z) - Density Regression and Uncertainty Quantification with Bayesian Deep
Noise Neural Networks [4.376565880192482]
Deep neural network (DNN) models have achieved state-of-the-art predictive accuracy in a wide range of supervised learning applications.
Accurately quantifying the uncertainty in DNN predictions remains a challenging task.
We propose the Bayesian Deep Noise Neural Network (B-DeepNoise), which generalizes standard Bayesian DNNs by extending the random noise variable to all hidden layers.
We evaluate B-DeepNoise against existing methods on benchmark regression datasets, demonstrating its superior performance in terms of prediction accuracy, uncertainty quantification accuracy, and uncertainty quantification efficiency.
arXiv Detail & Related papers (2022-06-12T02:47:29Z) - k-Means Maximum Entropy Exploration [55.81894038654918]
Exploration in continuous spaces with sparse rewards is an open problem in reinforcement learning.
We introduce an artificial curiosity algorithm based on lower bounding an approximation to the entropy of the state visitation distribution.
We show that our approach is both computationally efficient and competitive on benchmarks for exploration in high-dimensional, continuous spaces.
arXiv Detail & Related papers (2022-05-31T09:05:58Z) - Improving the Diversity of Bootstrapped DQN by Replacing Priors With Noise [8.938418994111716]
This article explores the possibility of replacing priors with noise, sampling the noise from a Gaussian distribution to introduce more diversity into this algorithm.
We find that our modification of the Bootstrapped Deep Q-Learning algorithm achieves significantly higher evaluation scores across different types of Atari games.
arXiv Detail & Related papers (2022-03-02T10:28:14Z) - Online Limited Memory Neural-Linear Bandits with Likelihood Matching [53.18698496031658]
We study neural-linear bandits for solving problems where both exploration and representation learning play an important role.
We propose a likelihood matching algorithm that is resilient to catastrophic forgetting and is completely online.
arXiv Detail & Related papers (2021-02-07T14:19:07Z) - Variance Reduction for Deep Q-Learning using Stochastic Recursive
Gradient [51.880464915253924]
Deep Q-learning algorithms often suffer from poor gradient estimations with an excessive variance.
This paper introduces a framework for updating the gradient estimates in deep Q-learning, yielding a novel algorithm called SRG-DQN.
arXiv Detail & Related papers (2020-07-25T00:54:20Z) - Rectified Linear Postsynaptic Potential Function for Backpropagation in
Deep Spiking Neural Networks [55.0627904986664]
Spiking Neural Networks (SNNs) use temporal spike patterns to represent and transmit information, which is not only biologically realistic but also suitable for ultra-low-power event-driven neuromorphic implementation.
This paper investigates the contribution of spike timing dynamics to information encoding, synaptic plasticity and decision making, providing a new perspective on the design of future DeepSNNs and neuromorphic hardware systems.
arXiv Detail & Related papers (2020-03-26T11:13:07Z)
This list is automatically generated from the titles and abstracts of the papers in this site.