Reinforcement Learning with Augmented Data
- URL: http://arxiv.org/abs/2004.14990v5
- Date: Thu, 5 Nov 2020 06:04:50 GMT
- Title: Reinforcement Learning with Augmented Data
- Authors: Michael Laskin, Kimin Lee, Adam Stooke, Lerrel Pinto, Pieter Abbeel,
and Aravind Srinivas
- Abstract summary: We present Reinforcement Learning with Augmented Data (RAD), a simple plug-and-play module that can enhance most RL algorithms.
We show that augmentations such as random translate, crop, color jitter, patch cutout, random convolutions, and amplitude scale can enable simple RL algorithms to outperform complex state-of-the-art methods.
- Score: 97.42819506719191
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Learning from visual observations is a fundamental yet challenging problem in
Reinforcement Learning (RL). Although algorithmic advances combined with
convolutional neural networks have proved to be a recipe for success, current
methods are still lacking on two fronts: (a) data-efficiency of learning and
(b) generalization to new environments. To this end, we present Reinforcement
Learning with Augmented Data (RAD), a simple plug-and-play module that can
enhance most RL algorithms. We perform the first extensive study of general
data augmentations for RL on both pixel-based and state-based inputs, and
introduce two new data augmentations - random translate and random amplitude
scale. We show that augmentations such as random translate, crop, color jitter,
patch cutout, random convolutions, and amplitude scale can enable simple RL
algorithms to outperform complex state-of-the-art methods across common
benchmarks. RAD sets a new state-of-the-art in terms of data-efficiency and
final performance on the DeepMind Control Suite benchmark for pixel-based
control as well as OpenAI Gym benchmark for state-based control. We further
demonstrate that RAD significantly improves test-time generalization over
existing methods on several OpenAI ProcGen benchmarks. Our RAD module and
training code are available at https://www.github.com/MishaLaskin/rad.
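As an illustration of the plug-and-play idea (not the released implementation), the NumPy sketch below applies two of the augmentations named above: random translate for pixel observations and random amplitude scale for state vectors. Function names, the canvas size, and the scaling range are illustrative assumptions.

```python
import numpy as np

def random_translate(imgs, canvas_size=108):
    # Place each (C, H, W) frame at a random offset inside a larger zero
    # canvas. The batch has shape (B, C, H, W); canvas_size is an
    # illustrative default, not necessarily the paper's exact setting.
    b, c, h, w = imgs.shape
    assert canvas_size >= h and canvas_size >= w
    out = np.zeros((b, c, canvas_size, canvas_size), dtype=imgs.dtype)
    tops = np.random.randint(0, canvas_size - h + 1, size=b)
    lefts = np.random.randint(0, canvas_size - w + 1, size=b)
    for i, (top, left) in enumerate(zip(tops, lefts)):
        out[i, :, top:top + h, left:left + w] = imgs[i]
    return out

def random_amplitude_scale(states, low=0.5, high=1.5):
    # Multiply each state vector (shape (B, D)) by one uniform random
    # scalar per sample; the [low, high] range is an assumption.
    scale = np.random.uniform(low, high, size=(states.shape[0], 1))
    return states * scale

# Plug-and-play usage: augment a sampled batch before the usual RL update.
pixel_batch = np.random.randint(0, 256, size=(32, 9, 100, 100), dtype=np.uint8)
state_batch = np.random.randn(32, 17).astype(np.float32)
aug_pixels = random_translate(pixel_batch)        # (32, 9, 108, 108)
aug_states = random_amplitude_scale(state_batch)  # (32, 17)
```

Because the augmentation touches only the observations, it can be placed in front of an existing actor-critic or Q-learning update without changing the underlying algorithm.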
Related papers
- Reinforcement Learning with Token-level Feedback for Controllable Text Generation [16.117006822479407]
We propose a novel reinforcement learning algorithm named TOLE which formulates TOken-LEvel rewards for controllable text generation.
Experimental results show that our algorithm can achieve superior performance on both single-attribute and multi-attribute control tasks.
arXiv Detail & Related papers (2024-03-18T08:18:37Z)
- M2CURL: Sample-Efficient Multimodal Reinforcement Learning via Self-Supervised Representation Learning for Robotic Manipulation [0.7564784873669823]
We propose Multimodal Contrastive Unsupervised Reinforcement Learning (M2CURL).
Our approach employs a novel multimodal self-supervised learning technique that learns efficient representations and contributes to faster convergence of RL algorithms.
We evaluate M2CURL on the Tactile Gym 2 simulator and we show that it significantly enhances the learning efficiency in different manipulation tasks.
arXiv Detail & Related papers (2024-01-30T14:09:35Z)
- Improving and Benchmarking Offline Reinforcement Learning Algorithms [87.67996706673674]
This work aims to bridge the gaps caused by low-level choices and datasets.
We empirically investigate 20 implementation choices using three representative algorithms.
We find that two variants, CRR+ and CQL+, achieve a new state of the art on D4RL.
arXiv Detail & Related papers (2023-06-01T17:58:46Z)
- Jump-Start Reinforcement Learning [68.82380421479675]
We present a meta algorithm that can use offline data, demonstrations, or a pre-existing policy to initialize an RL policy.
In particular, we propose Jump-Start Reinforcement Learning (JSRL), an algorithm that employs two policies to solve tasks.
We show via experiments that JSRL is able to significantly outperform existing imitation and reinforcement learning algorithms.
arXiv Detail & Related papers (2022-04-05T17:25:22Z)
- Weakly Supervised Change Detection Using Guided Anisotropic Diffusion [97.43170678509478]
We propose original ideas that help us to leverage such datasets in the context of change detection.
First, we propose the guided anisotropic diffusion (GAD) algorithm, which improves semantic segmentation results.
We then show its potential in two weakly-supervised learning strategies tailored for change detection.
arXiv Detail & Related papers (2021-12-31T10:03:47Z)
- Stabilizing Deep Q-Learning with ConvNets and Vision Transformers under Data Augmentation [25.493902939111265]
We investigate causes of instability when using data augmentation in off-policy Reinforcement Learning algorithms.
We propose a simple yet effective technique for stabilizing this class of algorithms under augmentation.
Our method greatly improves stability and sample efficiency of ConvNets under augmentation, and achieves generalization results competitive with state-of-the-art methods for image-based RL.
arXiv Detail & Related papers (2021-07-01T17:58:05Z)
- Text Generation with Efficient (Soft) Q-Learning [91.47743595382758]
Reinforcement learning (RL) offers a more flexible solution by allowing users to plug in arbitrary task metrics as reward.
We introduce a new RL formulation for text generation from the soft Q-learning perspective.
We apply the approach to a wide range of tasks, including learning from noisy/negative examples, adversarial attacks, and prompt generation.
arXiv Detail & Related papers (2021-06-14T18:48:40Z)
- RL-DARTS: Differentiable Architecture Search for Reinforcement Learning [62.95469460505922]
We introduce RL-DARTS, one of the first applications of Differentiable Architecture Search (DARTS) in reinforcement learning (RL).
By replacing the image encoder with a DARTS supernet, our search method is sample-efficient, requires minimal extra compute resources, and is also compatible with off-policy and on-policy RL algorithms, needing only minor changes in preexisting code.
We show that the supernet gradually learns better cells, leading to alternative architectures that can be highly competitive with manually designed policies, and we also verify previous design choices for RL policies.
arXiv Detail & Related papers (2021-06-04T03:08:43Z)
- Can Increasing Input Dimensionality Improve Deep Reinforcement Learning? [15.578423102700764]
We propose an online feature extractor network (OFENet) that uses neural nets to produce good representations to be used as inputs to deep RL algorithms.
We show that the RL agents learn more efficiently with the high-dimensional representation than with the lower-dimensional state observations.
arXiv Detail & Related papers (2020-03-03T16:52:05Z)
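To make the OFENet entry above more concrete, here is a minimal PyTorch sketch of feeding an RL agent a higher-dimensional learned representation by concatenating MLP features with the raw low-dimensional observation; the layer sizes and the auxiliary next-observation prediction head are assumptions for illustration, not the paper's exact architecture.

```python
import torch
import torch.nn as nn

class OnlineFeatureExtractor(nn.Module):
    """Expands a low-dimensional state into a higher-dimensional RL input.

    A minimal sketch of the idea described above: learned features are
    concatenated with the raw observation, so the agent keeps the original
    state. Layer sizes and the auxiliary head are illustrative assumptions.
    """

    def __init__(self, obs_dim: int, hidden_dim: int = 240):
        super().__init__()
        self.body = nn.Sequential(
            nn.Linear(obs_dim, hidden_dim),
            nn.ReLU(),
            nn.Linear(hidden_dim, hidden_dim),
            nn.ReLU(),
        )
        # Auxiliary head (assumption): predict the next observation, giving
        # the extractor its own self-supervised training signal.
        self.aux_head = nn.Linear(obs_dim + hidden_dim, obs_dim)

    def forward(self, obs: torch.Tensor) -> torch.Tensor:
        features = self.body(obs)
        # The RL agent consumes the concatenation, which is higher-dimensional
        # than the raw observation.
        return torch.cat([obs, features], dim=-1)

    def aux_loss(self, obs: torch.Tensor, next_obs: torch.Tensor) -> torch.Tensor:
        pred = self.aux_head(self.forward(obs))
        return nn.functional.mse_loss(pred, next_obs)

# Usage: a 17-dim state becomes a 257-dim input for the policy/critic.
extractor = OnlineFeatureExtractor(obs_dim=17)
batch = torch.randn(32, 17)
rl_input = extractor(batch)  # shape (32, 257)
```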
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.