Automatic Data Augmentation for Generalization in Deep Reinforcement
Learning
- URL: http://arxiv.org/abs/2006.12862v2
- Date: Sat, 20 Feb 2021 12:32:59 GMT
- Title: Automatic Data Augmentation for Generalization in Deep Reinforcement
Learning
- Authors: Roberta Raileanu, Max Goldstein, Denis Yarats, Ilya Kostrikov, Rob
Fergus
- Abstract summary: Deep reinforcement learning (RL) agents often fail to generalize to unseen scenarios.
Data augmentation has recently been shown to improve the sample efficiency and generalization of RL agents.
We show that our agent learns policies and representations that are more robust to changes in the environment that do not affect the agent.
- Score: 39.477038093585726
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Deep reinforcement learning (RL) agents often fail to generalize to unseen
scenarios, even when they are trained on many instances of semantically similar
environments. Data augmentation has recently been shown to improve the sample
efficiency and generalization of RL agents. However, different tasks tend to
benefit from different kinds of data augmentation. In this paper, we compare
three approaches for automatically finding an appropriate augmentation. These
are combined with two novel regularization terms for the policy and value
function, required to make the use of data augmentation theoretically sound for
certain actor-critic algorithms. We evaluate our methods on the Procgen
benchmark, which consists of 16 procedurally generated environments, and show
that they improve test performance by ~40% relative to standard RL algorithms.
Our agent outperforms other baselines specifically designed to improve
generalization in RL. In addition, we show that our agent learns policies and
representations that are more robust to changes in the environment that do not
affect the agent, such as the background. Our implementation is available at
https://github.com/rraileanu/auto-drac.
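The abstract's two regularization terms and the automatic augmentation search can be sketched as follows. This is a hedged reconstruction in PyTorch, not the authors' released code (see the linked repository for that); the function and class names, the coefficient alpha_r, and the UCB details are illustrative assumptions.

```python
import math
import torch
import torch.nn.functional as F

def drac_regularizers(policy, value_fn, obs, f, alpha_r=0.1):
    """Sketch of the two regularization terms: G_pi keeps the policy
    invariant under an augmentation f, and G_V does the same for the
    value function, so augmented observations can be used without
    biasing the actor-critic objective."""
    aug_obs = f(obs)
    with torch.no_grad():                        # targets come from clean inputs
        pi_clean = F.softmax(policy(obs), dim=-1)
        v_clean = value_fn(obs)
    log_pi_aug = F.log_softmax(policy(aug_obs), dim=-1)
    g_pi = F.kl_div(log_pi_aug, pi_clean, reduction="batchmean")
    g_v = F.mse_loss(value_fn(aug_obs), v_clean)
    return alpha_r * (g_pi + g_v)                # added to the usual PPO loss

class UCBAugmentationSelector:
    """Sketch of one way to pick an augmentation automatically: treat
    each candidate augmentation as a bandit arm and select with UCB."""
    def __init__(self, n_augs, c=0.1):
        self.counts = [0] * n_augs
        self.returns = [0.0] * n_augs
        self.c = c

    def select(self):
        for i, n in enumerate(self.counts):
            if n == 0:                           # try each arm once first
                return i
        total = sum(self.counts)
        ucb = [self.returns[i] / self.counts[i]
               + self.c * math.sqrt(math.log(total) / self.counts[i])
               for i in range(len(self.counts))]
        return max(range(len(ucb)), key=ucb.__getitem__)

    def update(self, arm, episode_return):
        self.counts[arm] += 1
        self.returns[arm] += episode_return
```

In a full agent, the returned penalty would simply be added to the PPO loss, with the selector queried at each rollout and updated with the resulting episode return.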
Related papers
- Prioritized Generative Replay [121.83947140497655]
We propose a prioritized, parametric version of an agent's memory, using generative models to capture online experience.
This paradigm enables densification of past experience, with new generations that benefit from the generative model's generalization capacity.
We show this recipe can be instantiated using conditional diffusion models and simple relevance functions.
arXiv Detail & Related papers (2024-10-23T17:59:52Z)
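A rough sketch of the prioritization idea from the entry above: conditioning variables are sampled in proportion to a relevance score, and a conditional generative model synthesizes new transitions around them. The `generator` and `relevance` callables are assumed stand-ins for the paper's conditional diffusion model and relevance functions, not its implementation.

```python
import numpy as np

def prioritized_generation(generator, relevance, conditions, n_samples=64):
    """Sample conditions proportionally to relevance, then densify
    experience by generating fresh transitions near relevant regions."""
    scores = np.array([relevance(c) for c in conditions], dtype=np.float64)
    probs = scores / scores.sum()                # relevance -> sampling prior
    picked = np.random.choice(len(conditions), size=n_samples, p=probs)
    return [generator.sample(conditions[i]) for i in picked]
```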
- A Recipe for Unbounded Data Augmentation in Visual Reinforcement Learning [12.889687274108248]
Q-learning algorithms are prone to overfitting and training instabilities when trained from visual observations.
We propose a generalized recipe, SADA, that works with a wider variety of augmentations.
We find that SADA greatly improves the training stability and generalization of RL agents across a diverse set of augmentations.
arXiv Detail & Related papers (2024-05-27T17:58:23Z)
- Supplementing Gradient-Based Reinforcement Learning with Simple Evolutionary Ideas [4.873362301533824]
We present a simple, sample-efficient algorithm for introducing large but directed learning steps in reinforcement learning (RL).
The methodology uses a population of RL agents training with a common experience buffer, with occasional crossovers and mutations of the agents in order to search efficiently through the policy space.
arXiv Detail & Related papers (2023-05-10T09:46:53Z)
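A minimal sketch of the population-plus-shared-buffer idea from the entry above. The crossover and mutation operators, the agent interface, and the replacement rule are all illustrative assumptions, not the paper's implementation.

```python
import copy
import random
import numpy as np

def crossover(params_a, params_b):
    # child takes each whole weight tensor from one parent at random
    return {k: copy.deepcopy(random.choice((params_a[k], params_b[k])))
            for k in params_a}

def mutate(params, sigma=0.01):
    # small Gaussian noise on every weight (illustrative mutation)
    return {k: v + sigma * np.random.randn(*v.shape) for k, v in params.items()}

def evolve_one_generation(population, shared_buffer, fitness):
    """All agents take ordinary gradient steps on the common experience
    buffer; then the weakest agent is replaced by a mutated crossover of
    the two fittest -- a large but directed jump through policy space."""
    for agent in population:
        agent.gradient_update(shared_buffer)     # assumed RL update method
    ranked = sorted(population, key=fitness, reverse=True)
    ranked[-1].params = mutate(crossover(ranked[0].params, ranked[1].params))
    return population
```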
- Mastering the Unsupervised Reinforcement Learning Benchmark from Pixels [112.63440666617494]
Reinforcement learning algorithms can succeed, but they require large numbers of interactions between the agent and the environment.
We propose a new method that uses unsupervised model-based RL to pre-train the agent.
We show robust performance on the Real-World RL benchmark, hinting at resilience to environment perturbations during adaptation.
arXiv Detail & Related papers (2022-09-24T14:22:29Z)
- Retrieval-Augmented Reinforcement Learning [63.32076191982944]
We train a network to map a dataset of past experiences to optimal behavior.
The retrieval process is trained to retrieve information from the dataset that may be useful in the current context.
We show that retrieval-augmented R2D2 learns significantly faster than the baseline R2D2 agent and achieves higher scores.
arXiv Detail & Related papers (2022-02-17T02:44:05Z)
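A sketch of the retrieval step described in the entry above: embed past transitions and fetch the k most similar ones for the current context. The paper trains its retrieval process end-to-end; the fixed embedding and cosine similarity used here are simplifying assumptions.

```python
import numpy as np

class ExperienceRetriever:
    """Store embedded past transitions; return the k nearest to a query."""
    def __init__(self, embed, k=8):
        self.embed = embed                # maps an observation to a vector
        self.k = k
        self.keys, self.values = [], []

    def add(self, obs, transition):
        self.keys.append(self.embed(obs))
        self.values.append(transition)

    def retrieve(self, obs):
        query = self.embed(obs)
        keys = np.stack(self.keys)
        # cosine similarity between the query and every stored key
        sims = keys @ query / (np.linalg.norm(keys, axis=1)
                               * np.linalg.norm(query) + 1e-8)
        top = np.argsort(sims)[-self.k:][::-1]   # indices of the k best
        return [self.values[i] for i in top]
```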
- Stabilizing Deep Q-Learning with ConvNets and Vision Transformers under Data Augmentation [25.493902939111265]
We investigate causes of instability when using data augmentation in off-policy Reinforcement Learning algorithms.
We propose a simple yet effective technique for stabilizing this class of algorithms under augmentation.
Our method greatly improves stability and sample efficiency of ConvNets under augmentation, and achieves generalization results competitive with state-of-the-art methods for image-based RL.
arXiv Detail & Related papers (2021-07-01T17:58:05Z)
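The stabilization technique summarized in the entry above can be sketched as follows, assuming a discrete-action DQN-style setup. The core idea, computing bootstrap targets only from un-augmented frames while training the online Q-network on both views, follows the paper's description; the code itself is an illustrative reconstruction with assumed names.

```python
import torch
import torch.nn.functional as F

def stabilized_q_loss(q_net, target_net, batch, augment, gamma=0.99):
    """Bootstrap targets come from clean observations only; the online
    Q-network regresses both the clean and the augmented view to that
    same stable target."""
    obs, act, rew, next_obs, done = batch
    with torch.no_grad():
        next_q = target_net(next_obs).max(dim=-1).values   # clean target
        target = rew + gamma * (1.0 - done) * next_q
    q_clean = q_net(obs).gather(-1, act.unsqueeze(-1)).squeeze(-1)
    q_aug = q_net(augment(obs)).gather(-1, act.unsqueeze(-1)).squeeze(-1)
    return F.mse_loss(q_clean, target) + F.mse_loss(q_aug, target)
```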
- Generalization of Reinforcement Learning with Policy-Aware Adversarial Data Augmentation [32.70482982044965]
We propose a novel policy-aware adversarial data augmentation method to augment the standard policy learning method with automatically generated trajectory data.
We conduct experiments on a number of RL tasks to investigate the generalization performance of the proposed method.
The results show that our method generalizes well with limited training diversity and achieves state-of-the-art generalization test performance.
arXiv Detail & Related papers (2021-06-29T17:21:59Z)
- Robust Deep Reinforcement Learning through Adversarial Loss [74.20501663956604]
Recent studies have shown that deep reinforcement learning agents are vulnerable to small adversarial perturbations on the agent's inputs.
We propose RADIAL-RL, a principled framework to train reinforcement learning agents with improved robustness against adversarial attacks.
arXiv Detail & Related papers (2020-08-05T07:49:42Z)
- Dynamics Generalization via Information Bottleneck in Deep Reinforcement Learning [90.93035276307239]
We propose an information theoretic regularization objective and an annealing-based optimization method to achieve better generalization ability in RL agents.
We demonstrate the extreme generalization benefits of our approach in different domains ranging from maze navigation to robotic tasks.
This work provides a principled way to improve generalization in RL by gradually removing information that is redundant for task-solving.
arXiv Detail & Related papers (2020-08-03T02:24:20Z)
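An information-theoretic regularizer with annealing, as described in the entry above, can be sketched like this: a KL penalty between a Gaussian encoding and a standard-normal prior, with a coefficient that grows over training so the agent gradually discards task-irrelevant information. The linear warm-up schedule and beta_max are illustrative assumptions, not the paper's exact objective.

```python
import torch

def ib_regularizer(mu, log_var, step, total_steps, beta_max=1e-3):
    """KL( N(mu, sigma^2) || N(0, I) ) scaled by an annealed coefficient."""
    kl = 0.5 * torch.sum(mu.pow(2) + log_var.exp() - log_var - 1.0, dim=-1)
    beta = beta_max * min(1.0, step / (0.5 * total_steps))   # linear warm-up
    return beta * kl.mean()
```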