Dynamics Generalization via Information Bottleneck in Deep Reinforcement
Learning
- URL: http://arxiv.org/abs/2008.00614v1
- Date: Mon, 3 Aug 2020 02:24:20 GMT
- Title: Dynamics Generalization via Information Bottleneck in Deep Reinforcement
Learning
- Authors: Xingyu Lu, Kimin Lee, Pieter Abbeel, Stas Tiomkin
- Abstract summary: We propose an information theoretic regularization objective and an annealing-based optimization method to achieve better generalization ability in RL agents.
We demonstrate the extreme generalization benefits of our approach in different domains ranging from maze navigation to robotic tasks.
This work provides a principled way to improve generalization in RL by gradually removing information that is redundant for task-solving.
- Score: 90.93035276307239
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Despite the significant progress of deep reinforcement learning (RL) in
solving sequential decision making problems, RL agents often overfit to
training environments and struggle to adapt to new, unseen environments. This
prevents robust applications of RL in real world situations, where system
dynamics may deviate wildly from the training settings. In this work, our
primary contribution is to propose an information theoretic regularization
objective and an annealing-based optimization method to achieve better
generalization ability in RL agents. We demonstrate the extreme generalization
benefits of our approach in different domains ranging from maze navigation to
robotic tasks; for the first time, we show that agents can generalize to test
parameters more than 10 standard deviations away from the training parameter
distribution. This work provides a principled way to improve generalization in
RL by gradually removing information that is redundant for task-solving; it
opens doors for the systematic study of generalization from training to
extremely different testing settings, focusing on the established connections
between information theory and machine learning.
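The abstract does not spell out the objective, so the following is only a minimal sketch of one common way to combine an information-bottleneck penalty with annealing, not the authors' implementation. It assumes a stochastic Gaussian encoder whose KL divergence to a standard normal prior serves as a variational upper bound on the information the representation carries about the state, and it assumes the penalty weight grows linearly over training so that information is removed gradually. All names (`gaussian_kl_to_standard_normal`, `annealed_beta`, `regularized_loss`) and the default values of `beta_start` and `beta_end` are hypothetical.

```python
import math


def gaussian_kl_to_standard_normal(mu, log_var):
    """KL( N(mu, diag(exp(log_var))) || N(0, I) ), summed over dimensions."""
    return 0.5 * sum(
        m * m + math.exp(lv) - lv - 1.0 for m, lv in zip(mu, log_var)
    )


def annealed_beta(step, total_steps, beta_start=0.0, beta_end=1e-2):
    """Linear schedule from beta_start to beta_end (assumed; clamped to [0, 1])."""
    frac = min(max(step / total_steps, 0.0), 1.0)
    return beta_start + frac * (beta_end - beta_start)


def regularized_loss(task_loss, mu, log_var, step, total_steps):
    """Task loss plus an annealed KL penalty on the encoder output."""
    beta = annealed_beta(step, total_steps)
    return task_loss + beta * gaussian_kl_to_standard_normal(mu, log_var)
```

Under this assumed schedule the agent first learns the task with a nearly unconstrained representation, and the growing penalty then pushes the encoder to discard whatever the task loss does not need.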
Related papers
- Supplementing Gradient-Based Reinforcement Learning with Simple Evolutionary Ideas [4.873362301533824]
We present a simple, sample-efficient algorithm for introducing large but directed learning steps in reinforcement learning (RL).
The methodology uses a population of RL agents training with a common experience buffer, with occasional crossovers and mutations of the agents in order to search efficiently through the policy space.
arXiv Detail & Related papers (2023-05-10T09:46:53Z)
- A Survey of Meta-Reinforcement Learning [69.76165430793571]
We cast the development of better RL algorithms as a machine learning problem itself in a process called meta-RL.
We discuss how, at a high level, meta-RL research can be clustered based on the presence of a task distribution and the learning budget available for each individual task.
We conclude by presenting the open problems on the path to making meta-RL part of the standard toolbox for a deep RL practitioner.
arXiv Detail & Related papers (2023-01-19T12:01:41Z)
- Generalization Through the Lens of Learning Dynamics [11.009483845261958]
A machine learning (ML) system must learn to generalize to novel situations in order to yield accurate predictions at deployment.
The impressive generalization performance of deep neural networks has stymied theoreticians.
This thesis will study the learning dynamics of deep neural networks in both supervised and reinforcement learning tasks.
arXiv Detail & Related papers (2022-12-11T00:07:24Z)
- Mastering the Unsupervised Reinforcement Learning Benchmark from Pixels [112.63440666617494]
Reinforcement learning algorithms can succeed but require large amounts of interactions between the agent and the environment.
We propose a new method to address this sample inefficiency, using unsupervised model-based RL to pre-train the agent.
We show robust performance on the Real-World RL benchmark, hinting at resiliency to environment perturbations during adaptation.
arXiv Detail & Related papers (2022-09-24T14:22:29Z)
- Learning Dynamics and Generalization in Reinforcement Learning [59.530058000689884]
We show theoretically that temporal difference learning encourages agents to fit non-smooth components of the value function early in training.
We show that neural networks trained using temporal difference algorithms on dense reward tasks exhibit weaker generalization between states than randomly initialized networks and networks trained with policy gradient methods.
arXiv Detail & Related papers (2022-06-05T08:49:16Z)
- A Survey on Deep Reinforcement Learning-based Approaches for Adaptation and Generalization [3.307203784120634]
Deep Reinforcement Learning (DRL) aims to create intelligent agents that can learn to solve complex problems efficiently in a real-world environment.
This paper presents a survey on the recent developments in DRL-based approaches for adaptation and generalization.
arXiv Detail & Related papers (2022-02-17T04:29:08Z)
- Contextualize Me -- The Case for Context in Reinforcement Learning [49.794253971446416]
Contextual Reinforcement Learning (cRL) provides a framework to model changes in the environment in a principled manner.
We show how cRL contributes to improving zero-shot generalization in RL through meaningful benchmarks and structured reasoning about generalization tasks.
arXiv Detail & Related papers (2022-02-09T15:01:59Z)
- Generalization of Reinforcement Learning with Policy-Aware Adversarial Data Augmentation [32.70482982044965]
We propose a novel policy-aware adversarial data augmentation method to augment the standard policy learning method with automatically generated trajectory data.
We conduct experiments on a number of RL tasks to investigate the generalization performance of the proposed method.
The results show that our method generalizes well with limited training diversity and achieves state-of-the-art generalization test performance.
arXiv Detail & Related papers (2021-06-29T17:21:59Z)
- Transient Non-Stationarity and Generalisation in Deep Reinforcement Learning [67.34810824996887]
Non-stationarity can arise in Reinforcement Learning (RL) even in stationary environments.
We propose Iterated Relearning (ITER) to improve generalisation of deep RL agents.
arXiv Detail & Related papers (2020-06-10T13:26:31Z)
This list is automatically generated from the titles and abstracts of the papers in this site.