Non-decreasing Quantile Function Network with Efficient Exploration for
Distributional Reinforcement Learning
- URL: http://arxiv.org/abs/2105.06696v1
- Date: Fri, 14 May 2021 08:12:51 GMT
- Title: Non-decreasing Quantile Function Network with Efficient Exploration for
Distributional Reinforcement Learning
- Authors: Fan Zhou, Zhoufan Zhu, Qi Kuang, Liwen Zhang
- Abstract summary: We first propose a non-decreasing quantile function network (NDQFN) to guarantee the monotonicity of the obtained quantile estimates.
We then design a general exploration framework called distributional prediction error (DPE) which utilizes the entire distribution of the quantile function.
- Score: 14.967168108174466
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Although distributional reinforcement learning (DRL) has been widely examined
in the past few years, two open questions remain. One is how to ensure the
validity of the learned quantile function; the other is how to efficiently
utilize the distribution information. This paper attempts to provide new
perspectives to encourage future in-depth studies of these two questions. We
first propose a non-decreasing quantile function network (NDQFN) to guarantee
the monotonicity of the obtained quantile estimates, and then design a general
exploration framework for DRL called distributional prediction error (DPE),
which utilizes the entire distribution of the quantile function. We not only
discuss the theoretical necessity of our method but also demonstrate the
performance gain it achieves in practice against several competitors on Atari
2600 games, especially on some hard-exploration games.
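The two ideas in the abstract can be sketched concretely. Below is a minimal, hypothetical illustration, not the authors' exact parameterization: monotone quantile estimates are obtained by accumulating non-negative (softplus) increments over unconstrained network outputs, and a DPE-style exploration bonus is taken here as the 1-Wasserstein distance between predicted and target quantile vectors at matched quantile fractions.

```python
import numpy as np

def monotone_quantiles(raw_outputs, q0=0.0):
    """Map unconstrained outputs to non-decreasing quantile estimates.

    Softplus makes every increment non-negative, and the cumulative sum
    guarantees monotonicity by construction, in the spirit of NDQFN
    (the exact parameterization in the paper may differ).
    """
    increments = np.log1p(np.exp(raw_outputs))  # softplus, always >= 0
    return q0 + np.cumsum(increments)

def dpe_bonus(pred_quantiles, target_quantiles):
    """A hypothetical distributional prediction error.

    The mean absolute gap between two quantile vectors at matched
    fractions approximates the 1-Wasserstein distance between the two
    distributions; it could serve as an intrinsic exploration signal.
    """
    return float(np.mean(np.abs(pred_quantiles - target_quantiles)))

# Usage: any raw network output yields a valid (non-decreasing) quantile curve.
raw = np.array([0.5, -1.0, 2.0, 0.0])
q = monotone_quantiles(raw)
assert np.all(np.diff(q) >= 0)  # monotonicity holds regardless of raw values
```

The design choice worth noting is that monotonicity is enforced architecturally rather than by a penalty term, so no extra loss weighting is needed and the guarantee holds at every training step.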
Related papers
- Generalized Cauchy-Schwarz Divergence and Its Deep Learning Applications [37.349358118385155]
Divergence measures play a central and increasingly essential role in deep learning.
We introduce a new measure tailored for multiple distributions, named the generalized Cauchy-Schwarz divergence (GCSD).
arXiv Detail & Related papers (2024-05-07T07:07:44Z)
- Probabilistic Contrastive Learning for Long-Tailed Visual Recognition [78.70453964041718]
Long-tailed distributions frequently emerge in real-world data, where a large number of minority categories contain a limited number of samples.
Recent investigations have revealed that supervised contrastive learning exhibits promising potential in alleviating the data imbalance.
We propose a novel probabilistic contrastive (ProCo) learning algorithm that estimates the data distribution of the samples from each class in the feature space.
arXiv Detail & Related papers (2024-03-11T13:44:49Z)
- Active search and coverage using point-cloud reinforcement learning [50.741409008225766]
This paper presents an end-to-end deep reinforcement learning solution for target search and coverage.
We show that deep hierarchical feature learning works for RL and that by using farthest point sampling (FPS) we can reduce the number of points.
We also show that multi-head attention for point-clouds helps the agent learn faster, though it converges to the same outcome.
arXiv Detail & Related papers (2023-12-18T18:16:30Z)
- Learning General World Models in a Handful of Reward-Free Deployments [53.06205037827802]
Building generally capable agents is a grand challenge for deep reinforcement learning (RL).
We present CASCADE, a novel approach for self-supervised exploration in this new setting.
We show that CASCADE collects diverse task-agnostic datasets and learns agents that generalize zero-shot to novel, unseen downstream tasks.
arXiv Detail & Related papers (2022-10-23T12:38:03Z)
- On Generalizing Beyond Domains in Cross-Domain Continual Learning [91.56748415975683]
Deep neural networks often suffer from catastrophic forgetting of previously learned knowledge after learning a new task.
Our proposed approach learns new tasks under domain shift with accuracy boosts of up to 10% on challenging datasets such as DomainNet and OfficeHome.
arXiv Detail & Related papers (2022-03-08T09:57:48Z)
- Exploration with Multi-Sample Target Values for Distributional Reinforcement Learning [20.680417111485305]
We introduce multi-sample target values (MTV) for distributional RL, as a principled replacement for single-sample target value estimation.
The improved distributional estimates lend themselves to UCB-based exploration.
We evaluate our approach on a range of continuous control tasks and demonstrate state-of-the-art model-free performance on difficult tasks such as Humanoid control.
arXiv Detail & Related papers (2022-02-06T03:27:05Z)
- Transformers Can Do Bayesian Inference [56.99390658880008]
We present Prior-Data Fitted Networks (PFNs)
PFNs leverage in-context learning with large-scale machine learning techniques to approximate a large set of posteriors.
We demonstrate that PFNs can near-perfectly mimic Gaussian processes and also enable efficient Bayesian inference for intractable problems.
arXiv Detail & Related papers (2021-12-20T13:07:39Z)
- KL Guided Domain Adaptation [88.19298405363452]
Domain adaptation is an important problem and often needed for real-world applications.
A common approach in the domain adaptation literature is to learn a representation of the input that has the same distributions over the source and the target domain.
We show that with a probabilistic representation network, the KL term can be estimated efficiently via minibatch samples.
arXiv Detail & Related papers (2021-06-14T22:24:23Z)
- Distributed Deep Reinforcement Learning: An Overview [0.0]
In this article, we provide a survey of the role of the distributed approaches in DRL.
We overview the state of the field, by studying the key research works that have a significant impact on how we can use distributed methods in DRL.
Also, we evaluate these methods on different tasks and compare their performance with each other and with single actor and learner agents.
arXiv Detail & Related papers (2020-11-22T13:24:35Z)
- Diversity Helps: Unsupervised Few-shot Learning via Distribution Shift-based Data Augmentation [21.16237189370515]
Few-shot learning aims to learn a new concept when only a few training examples are available.
In this paper, we develop a novel framework called Unsupervised Few-shot Learning via Distribution Shift-based Data Augmentation (ULDA).
In experiments, few-shot models learned by ULDA can achieve superior generalization performance.
arXiv Detail & Related papers (2020-04-13T07:41:56Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.