Learning Better with Less: Effective Augmentation for Sample-Efficient
Visual Reinforcement Learning
- URL: http://arxiv.org/abs/2305.16379v2
- Date: Fri, 27 Oct 2023 10:13:50 GMT
- Title: Learning Better with Less: Effective Augmentation for Sample-Efficient
Visual Reinforcement Learning
- Authors: Guozheng Ma, Linrui Zhang, Haoyu Wang, Lu Li, Zilin Wang, Zhen Wang,
Li Shen, Xueqian Wang, Dacheng Tao
- Abstract summary: Data augmentation (DA) is a crucial technique for enhancing the sample efficiency of visual reinforcement learning (RL) algorithms.
It remains unclear which attributes of DA account for its effectiveness in achieving sample-efficient visual RL.
This work conducts comprehensive experiments to assess the impact of DA's attributes on its efficacy.
- Score: 57.83232242068982
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Data augmentation (DA) is a crucial technique for enhancing the sample
efficiency of visual reinforcement learning (RL) algorithms. Notably, employing
simple observation transformations alone can yield outstanding performance
without extra auxiliary representation tasks or pre-trained encoders. However,
it remains unclear which attributes of DA account for its effectiveness in
achieving sample-efficient visual RL. To investigate this issue and further
explore the potential of DA, this work conducts comprehensive experiments to
assess the impact of DA's attributes on its efficacy and provides the following
insights and improvements: (1) For individual DA operations, we reveal that
both ample spatial diversity and slight hardness are indispensable. Building on
this finding, we introduce Random PadResize (Rand PR), a new DA operation that
offers abundant spatial diversity with minimal hardness. (2) For multi-type DA
fusion schemes, the increased DA hardness and unstable data distribution result
in the current fusion schemes being unable to achieve higher sample efficiency
than their corresponding individual operations. Taking the non-stationary
nature of RL into account, we propose an RL-tailored multi-type DA fusion scheme
called Cycling Augmentation (CycAug), which performs periodic cycles of
different DA operations to increase type diversity while maintaining data
distribution consistency. Extensive evaluations on the DeepMind Control suite
and CARLA driving simulator demonstrate that our methods achieve superior
sample efficiency compared with the prior state-of-the-art methods.
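Since the abstract describes Rand PR and CycAug only at a high level, the two Python sketches below illustrate the ideas under stated assumptions; the function names, the `max_pad` and `cycle_len` hyperparameters, and the use of NumPy/OpenCV are illustrative choices, not the paper's implementation.

```python
import numpy as np
import cv2  # assumed available; any bilinear resize would work


def rand_pad_resize(obs: np.ndarray, max_pad: int = 12) -> np.ndarray:
    """Pad each side by an independent random amount, then resize back.

    Random padding supplies spatial diversity, while resizing back to the
    original resolution preserves the full image content, keeping the
    transformation's hardness low. `max_pad` is an assumed value.
    """
    h, w = obs.shape[:2]
    top, bottom, left, right = map(int, np.random.randint(0, max_pad + 1, size=4))
    padded = cv2.copyMakeBorder(obs, top, bottom, left, right,
                                borderType=cv2.BORDER_CONSTANT, value=0)
    return cv2.resize(padded, (w, h), interpolation=cv2.INTER_LINEAR)
```

CycAug, as described, applies one DA operation at a time and switches periodically rather than mixing operations within a batch. A minimal scheduler sketch, assuming the switch is keyed to an environment-step counter:

```python
from typing import Callable, Sequence

import numpy as np

AugOp = Callable[[np.ndarray], np.ndarray]


def make_cyc_aug(ops: Sequence[AugOp], cycle_len: int = 10_000):
    """Return an augmentation that cycles through `ops`, one per phase.

    A single operation per phase keeps the training data distribution
    locally consistent; cycling across phases adds type diversity.
    `cycle_len` (steps per phase) is an assumed hyperparameter.
    """
    def augment(obs: np.ndarray, step: int) -> np.ndarray:
        op = ops[(step // cycle_len) % len(ops)]
        return op(obs)
    return augment


# Usage sketch: alternate Rand PR with a hypothetical random-shift op.
# cyc_aug = make_cyc_aug([rand_pad_resize, random_shift])
# aug_obs = cyc_aug(obs, step=global_step)
```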
Related papers
- EntAugment: Entropy-Driven Adaptive Data Augmentation Framework for Image Classification [10.334396596691048]
We propose EntAugment, a tuning-free and adaptive DA framework.
It dynamically assesses and adjusts the augmentation magnitudes for each sample during training.
We also introduce a novel entropy regularization term, EntLoss, which complements the EntAugment approach.
arXiv Detail & Related papers (2024-09-10T07:42:47Z)
- Sample Efficient Myopic Exploration Through Multitask Reinforcement Learning with Diverse Tasks [53.44714413181162]
This paper shows that when an agent is trained on a sufficiently diverse set of tasks, a generic policy-sharing algorithm with myopic exploration design can be sample-efficient.
To the best of our knowledge, this is the first theoretical demonstration of the "exploration benefits" of MTRL.
arXiv Detail & Related papers (2024-03-03T22:57:44Z)
- Towards Efficient Deep Hashing Retrieval: Condensing Your Data via Feature-Embedding Matching [7.908244841289913]
The cost of training state-of-the-art deep hashing retrieval models has been rising steadily.
Existing state-of-the-art dataset distillation methods cannot be extended to all deep hashing retrieval methods.
We propose an efficient condensation framework that addresses these limitations by matching feature embeddings between the synthetic and real sets.
arXiv Detail & Related papers (2023-05-29T13:23:55Z)
- Latent Variable Representation for Reinforcement Learning [131.03944557979725]
It remains unclear theoretically and empirically how latent variable models may facilitate learning, planning, and exploration to improve the sample efficiency of model-based reinforcement learning.
We provide a representation view of latent variable models for state-action value functions, which enables both a tractable variational learning algorithm and an effective implementation of the optimism/pessimism principle.
In particular, we propose a computationally efficient planning algorithm with UCB exploration by incorporating kernel embeddings of latent variable models.
arXiv Detail & Related papers (2022-12-17T00:26:31Z)
- Improving GANs with A Dynamic Discriminator [106.54552336711997]
We argue that a discriminator with an on-the-fly adjustment on its capacity can better accommodate such a time-varying task.
A comprehensive empirical study confirms that the proposed training strategy, termed DynamicD, improves the synthesis performance without incurring any additional cost or training objectives.
arXiv Detail & Related papers (2022-09-20T17:57:33Z)
- Deceive D: Adaptive Pseudo Augmentation for GAN Training with Limited Data [125.7135706352493]
Generative adversarial networks (GANs) typically require ample data for training in order to synthesize high-fidelity images.
Recent studies have shown that training GANs with limited data remains formidable due to discriminator overfitting.
This paper introduces a novel strategy called Adaptive Pseudo Augmentation (APA) to encourage healthy competition between the generator and the discriminator.
arXiv Detail & Related papers (2021-11-12T18:13:45Z)
- Seeking Visual Discomfort: Curiosity-driven Representations for Reinforcement Learning [12.829056201510994]
We present an approach to improve sample diversity for state representation learning.
Our proposed approach boosts the visitation of problematic states, improves the learned state representation, and outperforms the baselines for all tested environments.
arXiv Detail & Related papers (2021-10-02T11:15:04Z)
- Making Curiosity Explicit in Vision-based RL [12.829056201510994]
Vision-based reinforcement learning (RL) is a promising technique to solve control tasks involving images as the main observation.
State-of-the-art RL algorithms still struggle in terms of sample efficiency.
We present an approach to improve sample diversity.
arXiv Detail & Related papers (2021-09-28T09:50:37Z)
- DEALIO: Data-Efficient Adversarial Learning for Imitation from Observation [57.358212277226315]
In imitation learning from observation (IfO), a learning agent seeks to imitate a demonstrating agent using only observations of the demonstrated behavior, without access to the control signals generated by the demonstrator.
Recent methods based on adversarial imitation learning have led to state-of-the-art performance on IfO problems, but they typically suffer from high sample complexity due to a reliance on data-inefficient, model-free reinforcement learning algorithms.
This issue makes them impractical to deploy in real-world settings, where gathering samples can incur high costs in terms of time, energy, and risk.
We propose a more data-efficient IfO algorithm.
arXiv Detail & Related papers (2021-03-31T23:46:32Z)
- Generalization in Reinforcement Learning by Soft Data Augmentation [11.752595047069505]
SOft Data Augmentation (SODA) is a method that decouples augmentation from policy learning.
We find SODA to significantly advance sample efficiency, generalization, and stability in training over state-of-the-art vision-based RL methods.
arXiv Detail & Related papers (2020-11-26T17:00:34Z)