A Survey of Generalisation in Deep Reinforcement Learning
- URL: http://arxiv.org/abs/2111.09794v1
- Date: Thu, 18 Nov 2021 16:53:02 GMT
- Title: A Survey of Generalisation in Deep Reinforcement Learning
- Authors: Robert Kirk, Amy Zhang, Edward Grefenstette, Tim Rocktäschel
- Abstract summary: The study of generalisation in deep Reinforcement Learning aims to produce RL algorithms whose policies generalise well to novel, unseen situations at deployment time.
Tackling this is vital if we are to deploy reinforcement learning algorithms in real world scenarios.
This survey is an overview of this nascent field.
- Score: 18.098133342169646
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The study of generalisation in deep Reinforcement Learning (RL) aims to
produce RL algorithms whose policies generalise well to novel unseen situations
at deployment time, avoiding overfitting to their training environments.
Tackling this is vital if we are to deploy reinforcement learning algorithms in
real world scenarios, where the environment will be diverse, dynamic and
unpredictable. This survey is an overview of this nascent field. We provide a
unifying formalism and terminology for discussing different generalisation
problems, building upon previous works. We go on to categorise existing
benchmarks for generalisation, as well as current methods for tackling the
generalisation problem. Finally, we provide a critical discussion of the
current state of the field, including recommendations for future work. Among
other conclusions, we argue that taking a purely procedural content generation
approach to benchmark design is not conducive to progress in generalisation, we
suggest fast online adaptation and tackling RL-specific problems as some areas
for future work on methods for generalisation, and we recommend building
benchmarks in underexplored problem settings such as offline RL generalisation
and reward-function variation.
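To make the problem statement concrete, a common formalisation in this area (sketched here with illustrative notation; the survey's exact definitions are not quoted) treats the environment as a collection of MDPs indexed by a context c and measures generalisation as the gap between training and testing performance:

```latex
% Illustrative formalisation (notation assumed, not quoted from the survey):
% M_c is the MDP induced by context c, with separate training and testing
% context distributions p_train and p_test.
R(\pi, c) = \mathbb{E}_{\pi,\, \mathcal{M}_c}\Big[ \textstyle\sum_{t=0}^{T} \gamma^{t} r_t \Big],
\qquad
\mathrm{GenGap}(\pi) =
\mathbb{E}_{c \sim p_{\mathrm{train}}}\big[ R(\pi, c) \big]
- \mathbb{E}_{c \sim p_{\mathrm{test}}}\big[ R(\pi, c) \big].
```

A policy generalises well when its test-time return is high and this gap is small.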
Related papers
- Inferring Behavior-Specific Context Improves Zero-Shot Generalization in Reinforcement Learning [4.902544998453533]
We argue that understanding and utilizing contextual cues, such as the gravity level of the environment, is critical for robust generalization.
Our algorithm demonstrates improved generalization on various simulated domains, outperforming prior context-learning techniques in zero-shot settings.
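As a rough illustration of this idea (not the paper's architecture; every module name and dimension below is made up), a context encoder can be trained to infer latent environment parameters such as gravity from a short window of transitions, with the policy conditioned on the inferred context:

```python
# Hypothetical sketch: infer a latent context from recent transitions and
# condition the policy on it. Module names, shapes and sizes are illustrative.
import torch
import torch.nn as nn

class ContextEncoder(nn.Module):
    """Maps a window of (state, action, next_state) transitions to a context vector."""
    def __init__(self, state_dim, action_dim, context_dim=8, hidden=64):
        super().__init__()
        self.gru = nn.GRU(2 * state_dim + action_dim, hidden, batch_first=True)
        self.head = nn.Linear(hidden, context_dim)

    def forward(self, transitions):              # (batch, window, 2*state_dim + action_dim)
        _, h = self.gru(transitions)             # h: (1, batch, hidden)
        return self.head(h.squeeze(0))           # (batch, context_dim)

class ContextConditionedPolicy(nn.Module):
    """Policy that sees the current state concatenated with the inferred context."""
    def __init__(self, state_dim, action_dim, context_dim=8, hidden=64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(state_dim + context_dim, hidden), nn.Tanh(),
            nn.Linear(hidden, action_dim),
        )

    def forward(self, state, context):
        return self.net(torch.cat([state, context], dim=-1))
```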
arXiv Detail & Related papers (2024-04-15T07:31:48Z)
- Discovering General Reinforcement Learning Algorithms with Adversarial Environment Design [54.39859618450935]
We show that it is possible to meta-learn update rules, with the hope of discovering algorithms that can perform well on a wide range of RL tasks.
Despite impressive initial results from algorithms such as Learned Policy Gradient (LPG), there remains a gap when these algorithms are applied to unseen environments.
In this work, we examine how characteristics of the meta-supervised-training distribution impact the performance of these algorithms.
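A minimal sketch of the environment-design side of this line of work (generic regret-prioritised sampling of environment parameters; this is not the paper's meta-training procedure, and all names and numbers are illustrative):

```python
# Hypothetical sketch: sample training environments in proportion to an estimated
# regret, so the learner is shown configurations it currently handles poorly.
import random

def sample_env_params(candidates, regret_estimates, temperature=1.0):
    """candidates: list of env-parameter dicts; regret_estimates: matching list of floats."""
    weights = [max(r, 0.0) ** (1.0 / temperature) + 1e-6 for r in regret_estimates]
    return random.choices(candidates, weights=weights, k=1)[0]

# Usage: regret can be approximated as (best observed return - current policy's return).
levels = [{"gravity": g} for g in (3.0, 9.8, 15.0)]
regrets = [0.1, 0.7, 0.4]
next_level = sample_env_params(levels, regrets)
```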
arXiv Detail & Related papers (2023-10-04T12:52:56Z)
- RL-ViGen: A Reinforcement Learning Benchmark for Visual Generalization [23.417092819516185]
We introduce RL-ViGen: a novel Reinforcement Learning Benchmark for Visual Generalization.
RL-ViGen contains diverse tasks and a wide spectrum of generalization types, thereby facilitating the derivation of more reliable conclusions.
Our aspiration is that RL-ViGen will serve as a catalyst in the future creation of universal visual generalization RL agents.
arXiv Detail & Related papers (2023-07-15T05:45:37Z)
- On the Importance of Exploration for Generalization in Reinforcement Learning [89.63074327328765]
We propose EDE: Exploration via Distributional Ensemble, a method that encourages exploration of states with high uncertainty.
Our algorithm is the first value-based approach to achieve state-of-the-art on both Procgen and Crafter.
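A hedged sketch of the general mechanism (an ensemble of Q-networks whose disagreement supplies an exploration bonus; this is a generic illustration rather than EDE's exact distributional estimator or hyperparameters):

```python
# Hypothetical sketch: act greedily on Q plus an epistemic-uncertainty bonus,
# where uncertainty is the standard deviation across an ensemble of Q-networks.
import torch

def ucb_action(q_ensemble, state, beta=1.0):
    """q_ensemble: list of modules, each mapping a state tensor to (num_actions,) Q-values."""
    with torch.no_grad():
        qs = torch.stack([q(state) for q in q_ensemble])   # (ensemble_size, num_actions)
    mean_q = qs.mean(dim=0)
    std_q = qs.std(dim=0)                                   # disagreement ~ epistemic uncertainty
    return int(torch.argmax(mean_q + beta * std_q).item())
```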
arXiv Detail & Related papers (2023-06-08T18:07:02Z)
- A Survey of Meta-Reinforcement Learning [69.76165430793571]
We cast the development of better RL algorithms as a machine learning problem itself in a process called meta-RL.
We discuss how, at a high level, meta-RL research can be clustered based on the presence of a task distribution and the learning budget available for each individual task.
We conclude by presenting the open problems on the path to making meta-RL part of the standard toolbox for a deep RL practitioner.
arXiv Detail & Related papers (2023-01-19T12:01:41Z)
- A Survey on Deep Reinforcement Learning-based Approaches for Adaptation and Generalization [3.307203784120634]
Deep Reinforcement Learning (DRL) aims to create intelligent agents that can learn to solve complex problems efficiently in a real-world environment.
This paper presents a survey on the recent developments in DRL-based approaches for adaptation and generalization.
arXiv Detail & Related papers (2022-02-17T04:29:08Z)
- Contextualize Me -- The Case for Context in Reinforcement Learning [49.794253971446416]
Contextual Reinforcement Learning (cRL) provides a framework to model such changes in a principled manner.
We show how cRL contributes to improving zero-shot generalization in RL through meaningful benchmarks and structured reasoning about generalization tasks.
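A minimal sketch of the contextual-MDP setup that cRL formalises (the wrapper below is hypothetical and the gravity ranges are invented; it only illustrates resampling a context each episode, with disjoint train and test ranges for zero-shot evaluation):

```python
# Hypothetical sketch: an environment whose dynamics depend on a context (here,
# gravity) that is resampled each episode from a train or held-out test range.
import random

class ContextualEnv:
    TRAIN_RANGE = (5.0, 12.0)     # contexts seen during training
    TEST_RANGE = (12.0, 20.0)     # held-out contexts for zero-shot evaluation

    def __init__(self, base_env_factory, mode="train"):
        self.base_env_factory = base_env_factory   # builds the MDP for a given context
        self.mode = mode
        self.env = None
        self.context = None

    def reset(self):
        low, high = self.TRAIN_RANGE if self.mode == "train" else self.TEST_RANGE
        self.context = {"gravity": random.uniform(low, high)}
        self.env = self.base_env_factory(**self.context)
        return self.env.reset()

    def step(self, action):
        return self.env.step(action)
```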
arXiv Detail & Related papers (2022-02-09T15:01:59Z)
- Policy Mirror Descent for Regularized Reinforcement Learning: A Generalized Framework with Linear Convergence [60.20076757208645]
This paper proposes a general policy mirror descent (GPMD) algorithm for solving regularized RL.
We demonstrate that our algorithm converges linearly over an entire range of learning rates, in a dimension-free fashion, to the global solution.
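To give a feel for the update being analysed, here is a schematic mirror-descent step for the entropy-regularised case (illustrative only; GPMD covers general convex regularisers through Bregman divergences, and the paper's exact statement is not reproduced):

```latex
% Schematic policy mirror descent step (regulariser h, step size \eta,
% regularisation weight \tau); GPMD generalises this to arbitrary convex h.
\pi_{k+1}(\cdot \mid s) = \arg\max_{p \in \Delta(\mathcal{A})}
\Big\{ \langle Q^{\pi_k}(s,\cdot),\, p \rangle - \tau\, h(p)
- \tfrac{1}{\eta} D_h\big(p,\ \pi_k(\cdot \mid s)\big) \Big\}.
% With the negative-entropy regulariser h(p) = \sum_a p(a)\log p(a),
% this step has the closed form
\pi_{k+1}(a \mid s) \propto \pi_k(a \mid s)^{\frac{1}{1+\eta\tau}}
\exp\!\Big( \tfrac{\eta\, Q^{\pi_k}(s,a)}{1+\eta\tau} \Big).
```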
arXiv Detail & Related papers (2021-05-24T02:21:34Z)
- Dynamics Generalization via Information Bottleneck in Deep Reinforcement Learning [90.93035276307239]
We propose an information theoretic regularization objective and an annealing-based optimization method to achieve better generalization ability in RL agents.
We demonstrate the extreme generalization benefits of our approach in different domains ranging from maze navigation to robotic tasks.
This work provides a principled way to improve generalization in RL by gradually removing information that is redundant for task-solving.
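A minimal sketch of the kind of objective involved (a generic variational information-bottleneck penalty added to an RL loss, with a linear annealing schedule; the paper's specific objective and schedule are not reproduced here):

```python
# Hypothetical sketch: encode observations into a stochastic latent z, penalise the
# KL between q(z|s) and a standard-normal prior, and anneal the penalty weight.
import torch
import torch.nn as nn

class StochasticEncoder(nn.Module):
    def __init__(self, obs_dim, latent_dim=16, hidden=64):
        super().__init__()
        self.body = nn.Sequential(nn.Linear(obs_dim, hidden), nn.ReLU())
        self.mu = nn.Linear(hidden, latent_dim)
        self.log_std = nn.Linear(hidden, latent_dim)

    def forward(self, obs):
        h = self.body(obs)
        mu, log_std = self.mu(h), self.log_std(h).clamp(-5, 2)
        z = mu + log_std.exp() * torch.randn_like(mu)          # reparameterised sample
        # KL( N(mu, sigma^2) || N(0, I) ), summed over latent dims, averaged over batch
        kl = 0.5 * (mu.pow(2) + (2 * log_std).exp() - 2 * log_std - 1).sum(-1).mean()
        return z, kl

def regularised_loss(rl_loss, kl, step, beta_max=1e-3, anneal_steps=100_000):
    """Anneal the bottleneck weight beta from 0 to beta_max over anneal_steps updates."""
    beta = beta_max * min(1.0, step / anneal_steps)
    return rl_loss + beta * kl
```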
arXiv Detail & Related papers (2020-08-03T02:24:20Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.