A Game-Theoretic Perspective of Generalization in Reinforcement Learning
- URL: http://arxiv.org/abs/2208.03650v1
- Date: Sun, 7 Aug 2022 06:17:15 GMT
- Title: A Game-Theoretic Perspective of Generalization in Reinforcement Learning
- Authors: Chang Yang, Ruiyu Wang, Xinrun Wang, Zhen Wang
- Abstract summary: Generalization in reinforcement learning (RL) is important for the real-world deployment of RL algorithms.
We propose a game-theoretic framework for the generalization in reinforcement learning, named GiRL.
- Score: 9.402272029807316
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Generalization in reinforcement learning (RL) is important for the real-world
deployment of RL algorithms. Various schemes have been proposed to address the
generalization issue, including transfer learning, multi-task learning and
meta learning, as well as robust and adversarial reinforcement learning.
However, there is neither a unified formulation of these schemes nor a
comprehensive comparison of methods across them. In this work, we propose a
game-theoretic framework for generalization in reinforcement learning, named
GiRL, in which an RL agent is trained against an adversary over a set of tasks,
and the adversary can manipulate the distribution over tasks within a given
threshold. With different configurations, GiRL reduces to the various schemes
mentioned above. To solve GiRL, we adapt a widely-used method in game theory,
the policy space response oracle (PSRO), with three important modifications:
i) we use model-agnostic meta learning (MAML) as the best-response oracle, ii) we
propose a modified projected replicated dynamics, i.e., R-PRD, which ensures that
the computed meta-strategy of the adversary falls within the threshold, and iii)
we propose a protocol for few-shot learning of the multiple strategies during
testing. Extensive experiments on MuJoCo environments demonstrate that our
proposed methods outperform existing baselines, e.g., MAML.
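The key constraint in GiRL is that the adversary's meta-strategy over tasks must stay within a threshold of a nominal task distribution. The sketch below illustrates one way such a constraint could be enforced: a Euclidean projection of an updated task distribution onto the probability simplex intersected with a per-task deviation bound. This is a minimal illustration under assumed conventions (an L-infinity threshold around a uniform nominal distribution, hypothetical per-task regret estimates), not the paper's exact R-PRD formulation.

```python
# Minimal sketch of a threshold-constrained adversary update over tasks.
# Assumptions (not from the paper): the threshold is a per-task L-infinity
# bound `eps` around a nominal distribution `p_nominal`, and the adversary
# takes an unconstrained ascent step on hypothetical per-task regrets before
# projecting back into the allowed set.
import numpy as np

def project_to_constrained_simplex(y, p_nominal, eps, iters=100):
    """Euclidean projection of y onto {x : sum(x) = 1, |x - p_nominal| <= eps, x >= 0}."""
    lo = np.maximum(p_nominal - eps, 0.0)
    hi = np.minimum(p_nominal + eps, 1.0)
    assert lo.sum() <= 1.0 <= hi.sum(), "constraint set is empty"
    # Bisection on the shift tau: x(tau) = clip(y - tau, lo, hi); find tau with sum(x) = 1.
    tau_lo, tau_hi = (y - hi).min(), (y - lo).max()
    for _ in range(iters):
        tau = 0.5 * (tau_lo + tau_hi)
        if np.clip(y - tau, lo, hi).sum() > 1.0:
            tau_lo = tau   # total mass too large: shift further down
        else:
            tau_hi = tau   # total mass too small: shift less
    return np.clip(y - 0.5 * (tau_lo + tau_hi), lo, hi)

# Toy usage: nudge weight toward the task with the highest (hypothetical)
# agent regret, then project back into the threshold-constrained set.
p_nominal = np.full(4, 0.25)              # uniform nominal distribution over 4 tasks
regret = np.array([0.1, 0.5, 0.2, 0.3])   # hypothetical per-task regret estimates
step = p_nominal + 0.5 * regret           # unconstrained ascent step
p_adv = project_to_constrained_simplex(step, p_nominal, eps=0.1)
print(p_adv, p_adv.sum())                 # each weight stays within 0.25 +/- 0.1, sums to 1
```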
Related papers
- Towards an Information Theoretic Framework of Context-Based Offline
Meta-Reinforcement Learning [50.976910714839065]
Context-based OMRL (COMRL), a popular paradigm, aims to learn a universal policy conditioned on effective task representations.
We show that COMRL algorithms are essentially optimizing the same mutual information objective between the task variable $\boldsymbol{M}$ and its latent representation $\boldsymbol{Z}$ by implementing various approximate bounds.
Based on the theoretical insight and the information bottleneck principle, we arrive at a novel algorithm dubbed UNICORN, which exhibits remarkable generalization across a broad spectrum of RL benchmarks.
arXiv Detail & Related papers (2024-02-04T09:58:42Z) - The Role of Diverse Replay for Generalisation in Reinforcement Learning [7.399291598113285]
We investigate the impact of the exploration strategy and replay buffer on generalisation in reinforcement learning.
We show that collecting and training on more diverse data from the training environments will improve zero-shot generalisation to new tasks.
arXiv Detail & Related papers (2023-06-09T07:48:36Z) - A Survey of Meta-Reinforcement Learning [69.76165430793571]
We cast the development of better RL algorithms as a machine learning problem itself in a process called meta-RL.
We discuss how, at a high level, meta-RL research can be clustered based on the presence of a task distribution and the learning budget available for each individual task.
We conclude by presenting the open problems on the path to making meta-RL part of the standard toolbox for a deep RL practitioner.
arXiv Detail & Related papers (2023-01-19T12:01:41Z) - Mastering the Unsupervised Reinforcement Learning Benchmark from Pixels [112.63440666617494]
Reinforcement learning algorithms can succeed but require large numbers of interactions between the agent and the environment.
We propose a new method that uses unsupervised model-based RL to pre-train the agent.
We show robust performance on the Real-World RL benchmark, hinting at resiliency to environment perturbations during adaptation.
arXiv Detail & Related papers (2022-09-24T14:22:29Z) - Meta Reinforcement Learning with Successor Feature Based Context [51.35452583759734]
We propose a novel meta-RL approach that achieves competitive performance compared to existing meta-RL algorithms.
Our method not only learns high-quality policies for multiple tasks simultaneously but also adapts quickly to new tasks with a small amount of training.
arXiv Detail & Related papers (2022-07-29T14:52:47Z) - Jump-Start Reinforcement Learning [68.82380421479675]
We present a meta algorithm that can use offline data, demonstrations, or a pre-existing policy to initialize an RL policy.
In particular, we propose Jump-Start Reinforcement Learning (JSRL), an algorithm that employs two policies to solve tasks.
We show via experiments that JSRL is able to significantly outperform existing imitation and reinforcement learning algorithms.
arXiv Detail & Related papers (2022-04-05T17:25:22Z) - Learning Meta Representations for Agents in Multi-Agent Reinforcement
Learning [12.170248966278281]
In multi-agent reinforcement learning, behaviors that agents learn in a single Markov Game (MG) are typically confined to the given number of agents.
In this work, our focus is on creating agents that can generalize across population-varying MGs.
Instead of learning a unimodal policy, each agent learns a policy set comprising effective strategies across a variety of games.
arXiv Detail & Related papers (2021-08-30T04:30:53Z) - Variational Empowerment as Representation Learning for Goal-Based
Reinforcement Learning [114.07623388322048]
We discuss how the standard goal-conditioned RL (GCRL) objective is encapsulated by the objective of variational empowerment.
Our work lays a novel foundation from which to evaluate, analyze, and develop representation learning techniques in goal-based RL.
arXiv Detail & Related papers (2021-06-02T18:12:26Z) - FOCAL: Efficient Fully-Offline Meta-Reinforcement Learning via Distance
Metric Learning and Behavior Regularization [10.243908145832394]
We study the offline meta-reinforcement learning (OMRL) problem, a paradigm which enables reinforcement learning (RL) algorithms to quickly adapt to unseen tasks.
This problem is still not fully understood; two major challenges need to be addressed.
We provide analysis and insight showing that some simple design choices can yield substantial improvements over recent approaches.
arXiv Detail & Related papers (2020-10-02T17:13:39Z)
This list is automatically generated from the titles and abstracts of the papers on this site.