A Survey of Generalisation in Deep Reinforcement Learning
- URL: http://arxiv.org/abs/2111.09794v1
- Date: Thu, 18 Nov 2021 16:53:02 GMT
- Title: A Survey of Generalisation in Deep Reinforcement Learning
- Authors: Robert Kirk, Amy Zhang, Edward Grefenstette, Tim Rocktäschel
- Abstract summary: The study of generalisation in deep Reinforcement Learning aims to produce RL algorithms whose policies generalise well to novel, unseen situations at deployment time.
Tackling this is vital if we are to deploy reinforcement learning algorithms in real world scenarios.
This survey is an overview of this nascent field.
- Score: 18.098133342169646
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The study of generalisation in deep Reinforcement Learning (RL) aims to
produce RL algorithms whose policies generalise well to novel unseen situations
at deployment time, avoiding overfitting to their training environments.
Tackling this is vital if we are to deploy reinforcement learning algorithms in
real world scenarios, where the environment will be diverse, dynamic and
unpredictable. This survey is an overview of this nascent field. We provide a
unifying formalism and terminology for discussing different generalisation
problems, building upon previous works. We go on to categorise existing
benchmarks for generalisation, as well as current methods for tackling the
generalisation problem. Finally, we provide a critical discussion of the
current state of the field, including recommendations for future work. Among
other conclusions, we argue that taking a purely procedural content generation
approach to benchmark design is not conducive to progress in generalisation, we
suggest fast online adaptation and tackling RL-specific problems as some areas
for future work on methods for generalisation, and we recommend building
benchmarks in underexplored problem settings such as offline RL generalisation
and reward-function variation.
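To make the problem statement concrete, a common formalisation in this area (sketched here with illustrative notation; the survey's exact definitions are not quoted) treats the environment as a collection of MDPs indexed by a context c and measures generalisation as the gap between training and testing performance:

```latex
% Illustrative formalisation (notation assumed, not quoted from the survey):
% M_c is the MDP induced by context c, with separate training and testing
% context distributions p_train and p_test.
R(\pi, c) = \mathbb{E}_{\pi,\, \mathcal{M}_c}\Big[ \textstyle\sum_{t=0}^{T} \gamma^{t} r_t \Big],
\qquad
\mathrm{GenGap}(\pi) =
\mathbb{E}_{c \sim p_{\mathrm{train}}}\big[ R(\pi, c) \big]
- \mathbb{E}_{c \sim p_{\mathrm{test}}}\big[ R(\pi, c) \big].
```

A policy generalises well when its test-time return is high and this gap is small.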
Related papers
- Inferring Behavior-Specific Context Improves Zero-Shot Generalization in Reinforcement Learning [4.902544998453533]
We argue that understanding and utilizing contextual cues, such as the gravity level of the environment, is critical for robust generalization.
Our algorithm demonstrates improved generalization on various simulated domains, outperforming prior context-learning techniques in zero-shot settings.
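As a rough illustration of this idea (not the paper's architecture; every module name and dimension below is made up), a context encoder can be trained to infer latent environment parameters such as gravity from a short window of transitions, with the policy conditioned on the inferred context:

```python
# Hypothetical sketch: infer a latent context from recent transitions and
# condition the policy on it. Module names, shapes and sizes are illustrative.
import torch
import torch.nn as nn

class ContextEncoder(nn.Module):
    """Maps a window of (state, action, next_state) transitions to a context vector."""
    def __init__(self, state_dim, action_dim, context_dim=8, hidden=64):
        super().__init__()
        self.gru = nn.GRU(2 * state_dim + action_dim, hidden, batch_first=True)
        self.head = nn.Linear(hidden, context_dim)

    def forward(self, transitions):              # (batch, window, 2*state_dim + action_dim)
        _, h = self.gru(transitions)             # h: (1, batch, hidden)
        return self.head(h.squeeze(0))           # (batch, context_dim)

class ContextConditionedPolicy(nn.Module):
    """Policy that sees the current state concatenated with the inferred context."""
    def __init__(self, state_dim, action_dim, context_dim=8, hidden=64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(state_dim + context_dim, hidden), nn.Tanh(),
            nn.Linear(hidden, action_dim),
        )

    def forward(self, state, context):
        return self.net(torch.cat([state, context], dim=-1))
```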
arXiv Detail & Related papers (2024-04-15T07:31:48Z)
- Discovering General Reinforcement Learning Algorithms with Adversarial Environment Design [54.39859618450935]
We show that it is possible to meta-learn update rules, with the hope of discovering algorithms that can perform well on a wide range of RL tasks.
Despite impressive initial results from algorithms such as Learned Policy Gradient (LPG), there remains a gap when these algorithms are applied to unseen environments.
In this work, we examine how characteristics of the meta-supervised-training distribution impact the performance of these algorithms.
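A minimal sketch of the environment-design side of this line of work (generic regret-prioritised sampling of environment parameters; this is not the paper's meta-training procedure, and all names and numbers are illustrative):

```python
# Hypothetical sketch: sample training environments in proportion to an estimated
# regret, so the learner is shown configurations it currently handles poorly.
import random

def sample_env_params(candidates, regret_estimates, temperature=1.0):
    """candidates: list of env-parameter dicts; regret_estimates: matching list of floats."""
    weights = [max(r, 0.0) ** (1.0 / temperature) + 1e-6 for r in regret_estimates]
    return random.choices(candidates, weights=weights, k=1)[0]

# Usage: regret can be approximated as (best observed return - current policy's return).
levels = [{"gravity": g} for g in (3.0, 9.8, 15.0)]
regrets = [0.1, 0.7, 0.4]
next_level = sample_env_params(levels, regrets)
```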
arXiv Detail & Related papers (2023-10-04T12:52:56Z)
- RL-ViGen: A Reinforcement Learning Benchmark for Visual Generalization [23.417092819516185]
We introduce RL-ViGen: a novel Reinforcement Learning Benchmark for Visual Generalization.
RL-ViGen contains diverse tasks and a wide spectrum of generalization types, thereby facilitating the derivation of more reliable conclusions.
Our aspiration is that RL-ViGen will serve as a catalyst in the future creation of universal visual generalization RL agents.
arXiv Detail & Related papers (2023-07-15T05:45:37Z)
- On the Importance of Exploration for Generalization in Reinforcement Learning [89.63074327328765]
We propose EDE: Exploration via Distributional Ensemble, a method that encourages exploration of states with high uncertainty.
Our algorithm is the first value-based approach to achieve state-of-the-art on both Procgen and Crafter.
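A hedged sketch of the general mechanism (an ensemble of Q-networks whose disagreement supplies an exploration bonus; this is a generic illustration rather than EDE's exact distributional estimator or hyperparameters):

```python
# Hypothetical sketch: act greedily on Q plus an epistemic-uncertainty bonus,
# where uncertainty is the standard deviation across an ensemble of Q-networks.
import torch

def ucb_action(q_ensemble, state, beta=1.0):
    """q_ensemble: list of modules, each mapping a state tensor to (num_actions,) Q-values."""
    with torch.no_grad():
        qs = torch.stack([q(state) for q in q_ensemble])   # (ensemble_size, num_actions)
    mean_q = qs.mean(dim=0)
    std_q = qs.std(dim=0)                                   # disagreement ~ epistemic uncertainty
    return int(torch.argmax(mean_q + beta * std_q).item())
```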
arXiv Detail & Related papers (2023-06-08T18:07:02Z)
- A Survey of Meta-Reinforcement Learning [69.76165430793571]
We cast the development of better RL algorithms as a machine learning problem itself in a process called meta-RL.
We discuss how, at a high level, meta-RL research can be clustered based on the presence of a task distribution and the learning budget available for each individual task.
We conclude by presenting the open problems on the path to making meta-RL part of the standard toolbox for a deep RL practitioner.
arXiv Detail & Related papers (2023-01-19T12:01:41Z)
- A Survey on Deep Reinforcement Learning-based Approaches for Adaptation and Generalization [3.307203784120634]
Deep Reinforcement Learning (DRL) aims to create intelligent agents that can learn to solve complex problems efficiently in a real-world environment.
This paper presents a survey on the recent developments in DRL-based approaches for adaptation and generalization.
arXiv Detail & Related papers (2022-02-17T04:29:08Z)
- Contextualize Me -- The Case for Context in Reinforcement Learning [49.794253971446416]
Contextual Reinforcement Learning (cRL) provides a framework to model such changes in a principled manner.
We show how cRL contributes to improving zero-shot generalization in RL through meaningful benchmarks and structured reasoning about generalization tasks.
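A minimal sketch of the contextual-MDP setup that cRL formalises (the wrapper below is hypothetical and the gravity ranges are invented; it only illustrates resampling a context each episode, with disjoint train and test ranges for zero-shot evaluation):

```python
# Hypothetical sketch: an environment whose dynamics depend on a context (here,
# gravity) that is resampled each episode from a train or held-out test range.
import random

class ContextualEnv:
    TRAIN_RANGE = (5.0, 12.0)     # contexts seen during training
    TEST_RANGE = (12.0, 20.0)     # held-out contexts for zero-shot evaluation

    def __init__(self, base_env_factory, mode="train"):
        self.base_env_factory = base_env_factory   # builds the MDP for a given context
        self.mode = mode
        self.env = None
        self.context = None

    def reset(self):
        low, high = self.TRAIN_RANGE if self.mode == "train" else self.TEST_RANGE
        self.context = {"gravity": random.uniform(low, high)}
        self.env = self.base_env_factory(**self.context)
        return self.env.reset()

    def step(self, action):
        return self.env.step(action)
```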
arXiv Detail & Related papers (2022-02-09T15:01:59Z)
- Policy Mirror Descent for Regularized Reinforcement Learning: A Generalized Framework with Linear Convergence [60.20076757208645]
This paper proposes a general policy mirror descent (GPMD) algorithm for solving regularized RL.
We demonstrate that our algorithm converges linearly over an entire range of learning rates, in a dimension-free fashion, to the global solution.
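To give a feel for the update being analysed, here is a schematic mirror-descent step for the entropy-regularised case (illustrative only; GPMD covers general convex regularisers through Bregman divergences, and the paper's exact statement is not reproduced):

```latex
% Schematic policy mirror descent step (regulariser h, step size \eta,
% regularisation weight \tau); GPMD generalises this to arbitrary convex h.
\pi_{k+1}(\cdot \mid s) = \arg\max_{p \in \Delta(\mathcal{A})}
\Big\{ \langle Q^{\pi_k}(s,\cdot),\, p \rangle - \tau\, h(p)
- \tfrac{1}{\eta} D_h\big(p,\ \pi_k(\cdot \mid s)\big) \Big\}.
% With the negative-entropy regulariser h(p) = \sum_a p(a)\log p(a),
% this step has the closed form
\pi_{k+1}(a \mid s) \propto \pi_k(a \mid s)^{\frac{1}{1+\eta\tau}}
\exp\!\Big( \tfrac{\eta\, Q^{\pi_k}(s,a)}{1+\eta\tau} \Big).
```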
arXiv Detail & Related papers (2021-05-24T02:21:34Z)
- Dynamics Generalization via Information Bottleneck in Deep Reinforcement Learning [90.93035276307239]
We propose an information theoretic regularization objective and an annealing-based optimization method to achieve better generalization ability in RL agents.
We demonstrate the extreme generalization benefits of our approach in different domains ranging from maze navigation to robotic tasks.
This work provides a principled way to improve generalization in RL by gradually removing information that is redundant for task-solving.
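A minimal sketch of the kind of objective involved (a generic variational information-bottleneck penalty added to an RL loss, with a linear annealing schedule; the paper's specific objective and schedule are not reproduced here):

```python
# Hypothetical sketch: encode observations into a stochastic latent z, penalise the
# KL between q(z|s) and a standard-normal prior, and anneal the penalty weight.
import torch
import torch.nn as nn

class StochasticEncoder(nn.Module):
    def __init__(self, obs_dim, latent_dim=16, hidden=64):
        super().__init__()
        self.body = nn.Sequential(nn.Linear(obs_dim, hidden), nn.ReLU())
        self.mu = nn.Linear(hidden, latent_dim)
        self.log_std = nn.Linear(hidden, latent_dim)

    def forward(self, obs):
        h = self.body(obs)
        mu, log_std = self.mu(h), self.log_std(h).clamp(-5, 2)
        z = mu + log_std.exp() * torch.randn_like(mu)          # reparameterised sample
        # KL( N(mu, sigma^2) || N(0, I) ), summed over latent dims, averaged over batch
        kl = 0.5 * (mu.pow(2) + (2 * log_std).exp() - 2 * log_std - 1).sum(-1).mean()
        return z, kl

def regularised_loss(rl_loss, kl, step, beta_max=1e-3, anneal_steps=100_000):
    """Anneal the bottleneck weight beta from 0 to beta_max over anneal_steps updates."""
    beta = beta_max * min(1.0, step / anneal_steps)
    return rl_loss + beta * kl
```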
arXiv Detail & Related papers (2020-08-03T02:24:20Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.