Generalization in Reinforcement Learning by Soft Data Augmentation
- URL: http://arxiv.org/abs/2011.13389v2
- Date: Fri, 9 Apr 2021 02:29:17 GMT
- Title: Generalization in Reinforcement Learning by Soft Data Augmentation
- Authors: Nicklas Hansen, Xiaolong Wang
- Abstract summary: SOft Data Augmentation (SODA) is a method that decouples augmentation from policy learning.
We find SODA to significantly advance sample efficiency, generalization, and stability in training over state-of-the-art vision-based RL methods.
- Score: 11.752595047069505
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Extensive efforts have been made to improve the generalization ability of
Reinforcement Learning (RL) methods via domain randomization and data
augmentation. However, as more factors of variation are introduced during
training, optimization becomes increasingly challenging, and empirically may
result in lower sample efficiency and unstable training. Instead of learning
policies directly from augmented data, we propose SOft Data Augmentation
(SODA), a method that decouples augmentation from policy learning.
Specifically, SODA imposes a soft constraint on the encoder that aims to
maximize the mutual information between latent representations of augmented and
non-augmented data, while the RL optimization process uses strictly
non-augmented data. Empirical evaluations are performed on diverse tasks from
the DeepMind Control Suite as well as a robotic manipulation task, and we find SODA
to significantly advance sample efficiency, generalization, and stability in
training over state-of-the-art vision-based RL methods.
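At its core, SODA's auxiliary objective ties the latent representation of an augmented observation to that of its non-augmented counterpart, while the RL loss sees only clean data. The sketch below illustrates that idea with toy linear stand-ins for the encoder and projection head; all shapes, names, and the noise-based augmentation here are illustrative assumptions, not the paper's actual architecture or augmentations.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical linear stand-ins for the image encoder and projection head;
# the actual method uses convolutional networks trained by gradient descent.
W_enc = rng.normal(size=(84 * 84, 32)) / 84.0
W_proj = rng.normal(size=(32, 32)) / np.sqrt(32)

def augment(obs):
    # Stand-in augmentation: additive noise (the paper considers much
    # stronger augmentations, e.g. random convolution).
    return obs + 0.1 * rng.normal(size=obs.shape)

def embed(obs):
    flat = obs.reshape(obs.shape[0], -1)
    return flat @ W_enc @ W_proj

def l2_normalize(z):
    return z / np.linalg.norm(z, axis=1, keepdims=True)

def soda_aux_loss(obs):
    z_clean = l2_normalize(embed(obs))          # non-augmented latent
    z_aug = l2_normalize(embed(augment(obs)))   # augmented latent
    # Minimizing the distance between the two normalized latents acts as a
    # soft constraint aligning augmented and clean representations, a proxy
    # for maximizing their mutual information; the RL objective itself is
    # computed on non-augmented observations only.
    return float(np.mean(np.sum((z_aug - z_clean) ** 2, axis=1)))

obs = rng.normal(size=(8, 84, 84))  # batch of 8 toy 84x84 "frames"
print(0.0 <= soda_aux_loss(obs) <= 4.0)  # prints True
```

Because both latents are unit-normalized, the per-sample squared distance is bounded by 4, so the auxiliary loss stays well-scaled regardless of how aggressive the augmentation is.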
Related papers
- AdaAugment: A Tuning-Free and Adaptive Approach to Enhance Data Augmentation [12.697608744311122]
AdaAugment is a tuning-free, adaptive data augmentation method.
It dynamically adjusts augmentation magnitudes for individual training samples based on real-time feedback from the target network.
It consistently outperforms other state-of-the-art DA methods in effectiveness while maintaining remarkable efficiency.
arXiv Detail & Related papers (2024-05-19T06:54:03Z)
- Understanding when Dynamics-Invariant Data Augmentations Benefit Model-Free Reinforcement Learning Updates [3.5253513747455303]
We identify general aspects of data augmentation (DA) responsible for observed learning improvements.
Our study focuses on sparse-reward tasks with dynamics-invariant data augmentation functions.
arXiv Detail & Related papers (2023-10-26T21:28:50Z)
- Incorporating Supervised Domain Generalization into Data Augmentation [4.14360329494344]
We propose a contrastive semantic alignment (CSA) loss to improve the robustness and training efficiency of data augmentation.
Experiments on the CIFAR-100 and CUB datasets show that the proposed method improves the robustness and training efficiency of typical data augmentations.
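Contrastive alignment losses of this kind are commonly implemented as an InfoNCE objective over paired views. The sketch below is a generic version of that objective, not necessarily the paper's exact CSA formulation: each augmented embedding must match its own clean counterpart against the rest of the batch.

```python
import numpy as np

def info_nce(z_a, z_b, temperature=0.1):
    # Cross-entropy over cosine similarities: row i of z_a should be most
    # similar to row i of z_b, treating the other rows as negatives.
    z_a = z_a / np.linalg.norm(z_a, axis=1, keepdims=True)
    z_b = z_b / np.linalg.norm(z_b, axis=1, keepdims=True)
    logits = (z_a @ z_b.T) / temperature
    logits -= logits.max(axis=1, keepdims=True)  # numerical stability
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return float(-np.mean(np.diag(log_probs)))

rng = np.random.default_rng(2)
z_clean = rng.normal(size=(8, 16))                # clean-view embeddings
z_aug = z_clean + 0.05 * rng.normal(size=(8, 16)) # slightly perturbed views
print(round(info_nce(z_aug, z_clean), 4))
```

Well-aligned pairs drive the loss toward zero, while mismatched pairs are penalized, which is what pushes the encoder toward augmentation-invariant features.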
arXiv Detail & Related papers (2023-10-02T09:20:12Z)
- Learning Better with Less: Effective Augmentation for Sample-Efficient Visual Reinforcement Learning [57.83232242068982]
Data augmentation (DA) is a crucial technique for enhancing the sample efficiency of visual reinforcement learning (RL) algorithms.
It remains unclear which attributes of DA account for its effectiveness in achieving sample-efficient visual RL.
This work conducts comprehensive experiments to assess the impact of DA's attributes on its efficacy.
arXiv Detail & Related papers (2023-05-25T15:46:20Z)
- Automatic Data Augmentation via Invariance-Constrained Learning [94.27081585149836]
Underlying data structures are often exploited to improve the solution of learning tasks.
Data augmentation induces these symmetries during training by applying multiple transformations to the input data.
This work tackles these issues by automatically adapting the data augmentation while solving the learning task.
arXiv Detail & Related papers (2022-09-29T18:11:01Z)
- Improving GANs with A Dynamic Discriminator [106.54552336711997]
We argue that a discriminator with an on-the-fly adjustment on its capacity can better accommodate such a time-varying task.
A comprehensive empirical study confirms that the proposed training strategy, termed as DynamicD, improves the synthesis performance without incurring any additional cost or training objectives.
arXiv Detail & Related papers (2022-09-20T17:57:33Z)
- CCLF: A Contrastive-Curiosity-Driven Learning Framework for Sample-Efficient Reinforcement Learning [56.20123080771364]
We develop a model-agnostic Contrastive-Curiosity-Driven Learning Framework (CCLF) for reinforcement learning.
CCLF fully exploits sample importance and improves learning efficiency in a self-supervised manner.
We evaluate this approach on the DeepMind Control Suite, Atari, and MiniGrid benchmarks.
arXiv Detail & Related papers (2022-05-02T14:42:05Z)
- Generalization of Reinforcement Learning with Policy-Aware Adversarial Data Augmentation [32.70482982044965]
We propose a novel policy-aware adversarial data augmentation method to augment the standard policy learning method with automatically generated trajectory data.
We conduct experiments on a number of RL tasks to investigate the generalization performance of the proposed method.
The results show our method can generalize well with limited training diversity, and achieve the state-of-the-art generalization test performance.
arXiv Detail & Related papers (2021-06-29T17:21:59Z)
- Data Augmentation for Opcode Sequence Based Malware Detection [2.335152769484957]
We study different methods of data augmentation starting with basic methods using fixed transformations and moving to methods that adapt to the data.
We propose a novel data augmentation method based on using an opcode embedding layer within the network and its corresponding opcode embedding matrix.
To the best of our knowledge, this is the first paper to carry out a systematic study of different augmentation methods applied to opcode-sequence-based malware classification.
arXiv Detail & Related papers (2021-06-22T14:36:35Z)
- Dynamics Generalization via Information Bottleneck in Deep Reinforcement Learning [90.93035276307239]
We propose an information theoretic regularization objective and an annealing-based optimization method to achieve better generalization ability in RL agents.
We demonstrate the extreme generalization benefits of our approach in different domains ranging from maze navigation to robotic tasks.
This work provides a principled way to improve generalization in RL by gradually removing information that is redundant for task-solving.
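In variational form, an information-bottleneck regularizer adds a KL term between the latent posterior and a fixed prior to the task loss, with the coefficient annealed over training so that redundant information is removed gradually. The sketch below illustrates that reading with a Gaussian posterior and a linear annealing schedule; both are illustrative assumptions, not the paper's exact objective.

```python
import numpy as np

def gaussian_kl(mu, log_var):
    # KL( N(mu, diag(sigma^2)) || N(0, I) ), averaged over the batch;
    # this term penalizes information kept in the latent code.
    return float(np.mean(0.5 * np.sum(
        np.exp(log_var) + mu ** 2 - 1.0 - log_var, axis=1)))

def annealed_beta(step, total_steps, beta_max=1e-3):
    # Linearly ramp the bottleneck strength from 0 to beta_max so early
    # training is unconstrained and compression pressure grows gradually.
    return beta_max * min(1.0, step / total_steps)

rng = np.random.default_rng(1)
mu = rng.normal(scale=0.1, size=(16, 8))  # latent means for a batch of 16
log_var = np.full((16, 8), -1.0)          # fixed log-variances for the sketch

task_loss = 0.5  # placeholder for the RL objective on this batch
for step in (0, 500, 1000):
    total = task_loss + annealed_beta(step, 1000) * gaussian_kl(mu, log_var)
    print(step, total >= task_loss)  # KL >= 0, so each line prints True
```

Because the KL term is non-negative, the regularized loss never falls below the task loss, and the annealed coefficient controls how strongly task-irrelevant information is squeezed out of the representation.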
arXiv Detail & Related papers (2020-08-03T02:24:20Z)
- Generative Data Augmentation for Commonsense Reasoning [75.26876609249197]
G-DAUGC is a novel generative data augmentation method that aims to achieve more accurate and robust learning in the low-resource setting.
G-DAUGC consistently outperforms existing data augmentation methods based on back-translation.
Our analysis demonstrates that G-DAUGC produces a diverse set of fluent training examples, and that its selection and training approaches are important for performance.
arXiv Detail & Related papers (2020-04-24T06:12:10Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this information and is not responsible for any consequences of its use.