Making Curiosity Explicit in Vision-based RL
- URL: http://arxiv.org/abs/2109.13588v1
- Date: Tue, 28 Sep 2021 09:50:37 GMT
- Title: Making Curiosity Explicit in Vision-based RL
- Authors: Elie Aljalbout and Maximilian Ulmer and Rudolph Triebel
- Abstract summary: Vision-based reinforcement learning (RL) is a promising technique to solve control tasks involving images as the main observation.
State-of-the-art RL algorithms still struggle in terms of sample efficiency.
We present an approach to improve the sample diversity.
- Score: 12.829056201510994
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Vision-based reinforcement learning (RL) is a promising technique to solve
control tasks involving images as the main observation. State-of-the-art RL
algorithms still struggle in terms of sample efficiency, especially when using
image observations. This has led to increased attention on integrating state
representation learning (SRL) techniques into the RL pipeline. Work in this
field demonstrates a substantial improvement in sample efficiency among other
benefits. However, to take full advantage of this paradigm, the quality of
samples used for training plays a crucial role. More importantly, the diversity
of these samples could affect the sample efficiency of vision-based RL, but
also its generalization capability. In this work, we present an approach to
improve the sample diversity. Our method enhances the exploration capability of
the RL algorithms by taking advantage of the SRL setup. Our experiments show
that the presented approach outperforms the baseline for all tested
environments. These results are most apparent for environments where the
baseline method struggles. Even in simple environments, our method stabilizes
the training, reduces the reward variance and boosts sample efficiency.
Related papers
- Sample Efficient Myopic Exploration Through Multitask Reinforcement Learning with Diverse Tasks [53.44714413181162]
This paper shows that when an agent is trained on a sufficiently diverse set of tasks, a generic policy-sharing algorithm with myopic exploration design can be sample-efficient.
To the best of our knowledge, this is the first theoretical demonstration of the "exploration benefits" of MTRL.
arXiv Detail & Related papers (2024-03-03T22:57:44Z)
- Learning Better with Less: Effective Augmentation for Sample-Efficient Visual Reinforcement Learning [57.83232242068982]
Data augmentation (DA) is a crucial technique for enhancing the sample efficiency of visual reinforcement learning (RL) algorithms.
It remains unclear which attributes of DA account for its effectiveness in achieving sample-efficient visual RL.
This work conducts comprehensive experiments to assess the impact of DA's attributes on its efficacy.
arXiv Detail & Related papers (2023-05-25T15:46:20Z)
- A Comprehensive Survey of Data Augmentation in Visual Reinforcement Learning [53.35317176453194]
Data augmentation (DA) has become a widely used technique in visual RL for acquiring sample-efficient and generalizable policies.
We present a principled taxonomy of the existing augmentation techniques used in visual RL and conduct an in-depth discussion on how to better leverage augmented data.
As the first comprehensive survey of DA in visual RL, this work is expected to offer valuable guidance to this emerging field.
arXiv Detail & Related papers (2022-10-10T11:01:57Z)
- Light-weight probing of unsupervised representations for Reinforcement Learning [20.638410483549706]
We study whether linear probing can be a proxy evaluation task for the quality of unsupervised RL representation.
We show that the probing tasks are strongly rank correlated with the downstream RL performance on the Atari100k Benchmark.
This provides a more efficient method for exploring the space of pretraining algorithms and identifying promising pretraining recipes.
arXiv Detail & Related papers (2022-08-25T21:08:01Z)
- Contrastive UCB: Provably Efficient Contrastive Self-Supervised Learning in Online Reinforcement Learning [92.18524491615548]
Contrastive self-supervised learning has been successfully integrated into the practice of (deep) reinforcement learning (RL).
We study how RL can be empowered by contrastive learning in a class of Markov decision processes (MDPs) and Markov games (MGs) with low-rank transitions.
Under the online setting, we propose novel upper confidence bound (UCB)-type algorithms that incorporate such a contrastive loss with online RL algorithms for MDPs or MGs.
arXiv Detail & Related papers (2022-07-29T17:29:08Z)
- CCLF: A Contrastive-Curiosity-Driven Learning Framework for Sample-Efficient Reinforcement Learning [56.20123080771364]
We develop a model-agnostic Contrastive-Curiosity-Driven Learning Framework (CCLF) for reinforcement learning.
CCLF fully exploits sample importance and improves learning efficiency in a self-supervised manner.
We evaluate this approach on the DeepMind Control Suite, Atari, and MiniGrid benchmarks.
arXiv Detail & Related papers (2022-05-02T14:42:05Z)
- Mask-based Latent Reconstruction for Reinforcement Learning [58.43247393611453]
Mask-based Latent Reconstruction (MLR) is proposed to predict the complete state representations in the latent space from the observations with spatially and temporally masked pixels.
Extensive experiments show that our MLR significantly improves the sample efficiency in deep reinforcement learning.
arXiv Detail & Related papers (2022-01-28T13:07:11Z)
- Seeking Visual Discomfort: Curiosity-driven Representations for Reinforcement Learning [12.829056201510994]
We present an approach to improve sample diversity for state representation learning.
Our proposed approach boosts the visitation of problematic states, improves the learned state representation, and outperforms the baselines for all tested environments.
arXiv Detail & Related papers (2021-10-02T11:15:04Z)
- Stabilizing Deep Q-Learning with ConvNets and Vision Transformers under Data Augmentation [25.493902939111265]
We investigate causes of instability when using data augmentation in off-policy Reinforcement Learning algorithms.
We propose a simple yet effective technique for stabilizing this class of algorithms under augmentation.
Our method greatly improves stability and sample efficiency of ConvNets under augmentation, and achieves generalization results competitive with state-of-the-art methods for image-based RL.
arXiv Detail & Related papers (2021-07-01T17:58:05Z)
This list is automatically generated from the titles and abstracts of the papers on this site.