Recurrent Hypernetworks are Surprisingly Strong in Meta-RL
- URL: http://arxiv.org/abs/2309.14970v4
- Date: Tue, 26 Dec 2023 11:19:10 GMT
- Title: Recurrent Hypernetworks are Surprisingly Strong in Meta-RL
- Authors: Jacob Beck, Risto Vuorio, Zheng Xiong, Shimon Whiteson
- Abstract summary: Deep reinforcement learning (RL) is notoriously impractical to deploy due to sample inefficiency.
Meta-RL directly addresses this sample inefficiency by learning to perform few-shot learning when a distribution of related tasks is available for meta-training.
Recent work suggests end-to-end learning in conjunction with an off-the-shelf sequential model, such as a recurrent network, is a surprisingly strong baseline.
- Score: 37.80510757630612
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Deep reinforcement learning (RL) is notoriously impractical to deploy due to
sample inefficiency. Meta-RL directly addresses this sample inefficiency by
learning to perform few-shot learning when a distribution of related tasks is
available for meta-training. While many specialized meta-RL methods have been
proposed, recent work suggests that end-to-end learning in conjunction with an
off-the-shelf sequential model, such as a recurrent network, is a surprisingly
strong baseline. However, such claims have been controversial due to limited
supporting evidence, particularly in the face of prior work establishing
precisely the opposite. In this paper, we conduct an empirical investigation.
While we likewise find that a recurrent network can achieve strong performance,
we demonstrate that the use of hypernetworks is crucial to maximizing their
potential. Surprisingly, when combined with hypernetworks, the recurrent
baselines that are far simpler than existing specialized methods actually
achieve the strongest performance of all methods evaluated. We provide code at
https://github.com/jacooba/hyper.
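For intuition, the core idea can be sketched roughly as follows. This is a minimal illustration assuming a PyTorch-style setup, not the authors' released implementation (see the repository above for that): a GRU summarizes the observation-action-reward history into a task embedding, and a hypernetwork maps that embedding to the weights of a small policy head applied to the current observation. The class name `RecurrentHyperPolicy` and all layer sizes are illustrative.

```python
import torch
import torch.nn as nn


class RecurrentHyperPolicy(nn.Module):  # hypothetical name, illustrative sizes
    def __init__(self, obs_dim, act_dim, hidden_dim=128, head_dim=64):
        super().__init__()
        self.head_dim = head_dim
        # Recurrent encoder over (obs, prev action, prev reward) transitions.
        self.rnn = nn.GRU(obs_dim + act_dim + 1, hidden_dim, batch_first=True)
        # Hypernetwork: task embedding -> per-task weights/bias of a policy head.
        self.w_gen = nn.Linear(hidden_dim, obs_dim * head_dim)
        self.b_gen = nn.Linear(hidden_dim, head_dim)
        # Shared output layer producing action logits (or means).
        self.out = nn.Linear(head_dim, act_dim)

    def forward(self, obs_seq, act_seq, rew_seq, obs_now):
        # obs_seq: [B, T, obs_dim], act_seq: [B, T, act_dim], rew_seq: [B, T, 1]
        history = torch.cat([obs_seq, act_seq, rew_seq], dim=-1)
        _, h = self.rnn(history)                  # h: [1, B, hidden_dim]
        z = h.squeeze(0)                          # task embedding: [B, hidden_dim]
        W = self.w_gen(z).view(-1, obs_seq.size(-1), self.head_dim)
        b = self.b_gen(z)
        # Apply the generated per-task layer to the current observation.
        hidden = torch.tanh(torch.einsum("bi,bio->bo", obs_now, W) + b)
        return self.out(hidden)
```

Generating only the policy-head weights keeps the hypernetwork small; the architectures, initializations, and weight-generation schemes actually studied in the paper may differ.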
Related papers
- RL$^3$: Boosting Meta Reinforcement Learning via RL inside RL$^2$ [12.111848705677142]
We propose RL$^3$, a hybrid approach that incorporates action-values, learned per task through traditional RL, in the inputs to meta-RL.
We show that RL$^3$ earns greater cumulative reward in the long term, compared to RL$^2$, while maintaining data efficiency in the short term, and generalizes better to out-of-distribution tasks (a minimal input-augmentation sketch appears after this list).
arXiv Detail & Related papers (2023-06-28T04:16:16Z)
- A Survey of Meta-Reinforcement Learning [69.76165430793571]
We cast the development of better RL algorithms as a machine learning problem itself in a process called meta-RL.
We discuss how, at a high level, meta-RL research can be clustered based on the presence of a task distribution and the learning budget available for each individual task.
We conclude by presenting the open problems on the path to making meta-RL part of the standard toolbox for a deep RL practitioner.
arXiv Detail & Related papers (2023-01-19T12:01:41Z)
- Hypernetworks in Meta-Reinforcement Learning [47.25270748922176]
Multi-task reinforcement learning (RL) and meta-RL aim to improve sample efficiency by generalizing over a distribution of related tasks.
State-of-the-art methods often fail to outperform a degenerate solution that simply learns each task separately.
Hypernetworks are a promising path forward since they replicate the separate policies of the degenerate solution and are applicable to meta-RL.
arXiv Detail & Related papers (2022-10-20T15:34:52Z)
- Meta Reinforcement Learning with Successor Feature Based Context [51.35452583759734]
We propose a novel meta-RL approach that achieves competitive performance compared to existing meta-RL algorithms.
Our method not only learns high-quality policies for multiple tasks simultaneously but also adapts quickly to new tasks with a small amount of training.
arXiv Detail & Related papers (2022-07-29T14:52:47Z)
- An Empirical Study of Implicit Regularization in Deep Offline RL [44.62587507925864]
We study the relation between effective rank and performance on three offline RL datasets.
We identify three phases of learning that explain the impact of implicit regularization on the learning dynamics.
arXiv Detail & Related papers (2022-07-05T15:07:31Z)
- Single-Shot Pruning for Offline Reinforcement Learning [47.886329599997474]
Deep Reinforcement Learning (RL) is a powerful framework for solving complex real-world problems.
One way to reduce the size and cost of the networks used in RL is to prune them, leaving only the necessary parameters.
We close the gap between RL and single-shot pruning techniques and present a general pruning approach for offline RL.
arXiv Detail & Related papers (2021-12-31T18:10:02Z)
- Instabilities of Offline RL with Pre-Trained Neural Representation [127.89397629569808]
In offline reinforcement learning (RL), we seek to utilize offline data to evaluate (or learn) policies in scenarios where the data are collected from a distribution that substantially differs from that of the target policy to be evaluated.
Recent theoretical advances have shown that such sample-efficient offline RL is indeed possible provided certain strong representational conditions hold.
This work studies these issues from an empirical perspective to gauge how stable offline RL methods are.
arXiv Detail & Related papers (2021-03-08T18:06:44Z)
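Relating to the RL$^3$ entry above, the input augmentation it describes can be sketched roughly as follows, assuming discrete actions and a simple tabular per-task Q-learner; the names `TaskQEstimator` and `meta_input` are hypothetical and not from the paper.

```python
import numpy as np


class TaskQEstimator:  # hypothetical helper: per-task tabular Q-learning
    def __init__(self, n_states, n_actions, lr=0.1, gamma=0.99):
        self.q = np.zeros((n_states, n_actions))
        self.lr, self.gamma = lr, gamma

    def update(self, s, a, r, s_next):
        # Standard one-step Q-learning update on this task's own experience.
        target = r + self.gamma * self.q[s_next].max()
        self.q[s, a] += self.lr * (target - self.q[s, a])

    def features(self, s):
        # Action-value estimates for the current state.
        return self.q[s]


def meta_input(obs_onehot, prev_action_onehot, prev_reward, q_estimator, s):
    # RL^2-style input (obs, previous action, previous reward) augmented with
    # the task-specific action-values, in the spirit of RL^3.
    return np.concatenate([obs_onehot, prev_action_onehot, [prev_reward],
                           q_estimator.features(s)])
```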