URLB: Unsupervised Reinforcement Learning Benchmark
- URL: http://arxiv.org/abs/2110.15191v1
- Date: Thu, 28 Oct 2021 15:07:01 GMT
- Title: URLB: Unsupervised Reinforcement Learning Benchmark
- Authors: Michael Laskin, Denis Yarats, Hao Liu, Kimin Lee, Albert Zhan, Kevin
Lu, Catherine Cang, Lerrel Pinto, Pieter Abbeel
- Abstract summary: We introduce the Unsupervised Reinforcement Learning Benchmark (URLB)
URLB consists of two phases: reward-free pre-training and downstream task adaptation with extrinsic rewards.
We provide twelve continuous control tasks from three domains for evaluation and open-source code for eight leading unsupervised RL methods.
- Score: 82.36060735454647
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Deep Reinforcement Learning (RL) has emerged as a powerful paradigm to solve
a range of complex yet specific control tasks. Yet training generalist agents
that can quickly adapt to new tasks remains an outstanding challenge. Recent
advances in unsupervised RL have shown that pre-training RL agents with
self-supervised intrinsic rewards can result in efficient adaptation. However,
these algorithms have been hard to compare and develop due to the lack of a
unified benchmark. To this end, we introduce the Unsupervised Reinforcement
Learning Benchmark (URLB). URLB consists of two phases: reward-free
pre-training and downstream task adaptation with extrinsic rewards. Building on
the DeepMind Control Suite, we provide twelve continuous control tasks from
three domains for evaluation and open-source code for eight leading
unsupervised RL methods. We find that the implemented baselines make progress
but are not able to solve URLB and propose directions for future research.
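A minimal Python sketch of the two-phase protocol described above, using hypothetical Agent and ToyEnv placeholders rather than the released URLB codebase or the DeepMind Control Suite tasks:

```python
# Sketch of the URLB protocol: phase 1 pre-trains reward-free on an intrinsic
# signal, phase 2 fine-tunes on the extrinsic task reward. All interfaces here
# (Agent, ToyEnv, intrinsic_reward) are illustrative placeholders.
import random


class Agent:
    def act(self, obs):
        return random.uniform(-1.0, 1.0)          # placeholder policy

    def update(self, obs, action, reward, next_obs):
        pass                                      # an RL update would go here


class ToyEnv:
    def reset(self):
        self.state = 0.0
        return self.state

    def step(self, action):
        self.state += action
        return self.state, -abs(self.state)       # (next_obs, extrinsic reward)


def intrinsic_reward(obs, next_obs):
    return abs(next_obs - obs)                    # stand-in for a curiosity/entropy bonus


def run_urlb(pretrain_steps=1000, finetune_steps=100):
    agent, env = Agent(), ToyEnv()

    # Phase 1: reward-free pre-training (the extrinsic reward is ignored).
    obs = env.reset()
    for _ in range(pretrain_steps):
        action = agent.act(obs)
        next_obs, _ = env.step(action)
        agent.update(obs, action, intrinsic_reward(obs, next_obs), next_obs)
        obs = next_obs

    # Phase 2: downstream adaptation with the extrinsic task reward.
    obs, total = env.reset(), 0.0
    for _ in range(finetune_steps):
        action = agent.act(obs)
        next_obs, reward = env.step(action)
        agent.update(obs, action, reward, next_obs)
        obs, total = next_obs, total + reward
    return total


if __name__ == "__main__":
    print("fine-tuning return:", run_urlb())
```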
Related papers
- Continuous Control with Coarse-to-fine Reinforcement Learning [15.585706638252441]
We present a framework that trains RL agents to zoom into a continuous action space in a coarse-to-fine manner.
We introduce a concrete, value-based algorithm within the framework called Coarse-to-fine Q-Network (CQN).
CQN robustly learns to solve real-world manipulation tasks within a few minutes of online training.
arXiv Detail & Related papers (2024-07-10T16:04:08Z)
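A rough Python sketch of the coarse-to-fine action selection idea: discretize the action interval into a few bins, pick the bin with the highest Q-value, then zoom into it and repeat. The toy critic and bin counts below are assumptions, not the paper's architecture:

```python
# Coarse-to-fine action selection for a 1-D action in [-1, 1]; q_value is a
# hypothetical stand-in for a learned per-level Q-network.
import numpy as np


def q_value(obs, action, level):
    return -(action - 0.3) ** 2                   # toy critic that prefers a ~= 0.3


def coarse_to_fine_action(obs, levels=3, bins=5, low=-1.0, high=1.0):
    for level in range(levels):
        centers = np.linspace(low, high, bins)
        scores = [q_value(obs, a, level) for a in centers]
        best = centers[int(np.argmax(scores))]
        width = (high - low) / bins
        low, high = best - width / 2, best + width / 2   # zoom into the chosen bin
    return float(best)


print(coarse_to_fine_action(obs=None))            # converges toward 0.3 with the toy critic
```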
- Mastering the Unsupervised Reinforcement Learning Benchmark from Pixels [112.63440666617494]
Reinforcement learning algorithms can succeed but require large amounts of interaction between the agent and the environment.
We propose a new method to solve it, using unsupervised model-based RL, for pre-training the agent.
We show robust performance on the Real-World RL benchmark, hinting at resilience to environment perturbations during adaptation.
arXiv Detail & Related papers (2022-09-24T14:22:29Z)
- Learning Progress Driven Multi-Agent Curriculum [18.239527837186216]
Curriculum reinforcement learning aims to speed up learning by gradually increasing the difficulty of a task.
We propose self-paced MARL (SPMARL) to prioritize tasks based on learning progress instead of the episode return.
arXiv Detail & Related papers (2022-05-20T08:16:30Z)
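A small sketch of learning-progress-based task prioritization as summarized above: tasks are sampled in proportion to how much their recent returns are changing rather than how high those returns are. The window size and smoothing constant are arbitrary choices for illustration:

```python
# Sample tasks by (absolute) learning progress rather than raw episode return.
import random
from collections import defaultdict, deque

recent_returns = defaultdict(lambda: deque(maxlen=20))   # per-task return history


def learning_progress(task):
    hist = list(recent_returns[task])
    if len(hist) < 2:
        return 1.0                                        # try unseen tasks first
    half = len(hist) // 2
    old = sum(hist[:half]) / half
    new = sum(hist[half:]) / (len(hist) - half)
    return abs(new - old)                                 # progress, not return


def sample_task(tasks):
    weights = [learning_progress(t) + 1e-3 for t in tasks]
    return random.choices(tasks, weights=weights, k=1)[0]


def record(task, episode_return):
    recent_returns[task].append(episode_return)


for _ in range(10):                                       # toy usage
    task = sample_task(["easy", "medium", "hard"])
    record(task, random.random())
```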
- Jump-Start Reinforcement Learning [68.82380421479675]
We present a meta algorithm that can use offline data, demonstrations, or a pre-existing policy to initialize an RL policy.
In particular, we propose Jump-Start Reinforcement Learning (JSRL), an algorithm that employs two policies to solve tasks.
We show via experiments that JSRL is able to significantly outperform existing imitation and reinforcement learning algorithms.
arXiv Detail & Related papers (2022-04-05T17:25:22Z)
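A compact sketch of the jump-start roll-in: a guide policy (from offline data, demonstrations, or a prior policy) acts for the first h steps of each episode, the learning policy takes over afterwards, and h is shrunk as performance improves. The policies, environment, and threshold below are placeholders:

```python
# Jump-start roll-in with a curriculum over the guide horizon h.
import random


def guide_policy(obs):
    return 1.0                                    # e.g. distilled from demonstrations


def exploration_policy(obs):
    return random.uniform(-1.0, 1.0)              # the policy being trained


def toy_env_step(obs, action):
    obs = obs + action
    return obs, -abs(obs - 5.0)                   # reward peaks at obs == 5


def jump_start_episode(h, horizon=10):
    obs, total = 0.0, 0.0
    for t in range(horizon):
        policy = guide_policy if t < h else exploration_policy
        obs, reward = toy_env_step(obs, policy(obs))
        total += reward                           # real JSRL would also update the learner
    return total


h = 8
for episode in range(20):
    if jump_start_episode(h) > -20 and h > 0:     # arbitrary success threshold
        h -= 1                                    # hand control over earlier next time
```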
- Text Generation with Efficient (Soft) Q-Learning [91.47743595382758]
Reinforcement learning (RL) offers a more flexible solution by allowing users to plug in arbitrary task metrics as rewards.
We introduce a new RL formulation for text generation from the soft Q-learning perspective.
We apply the approach to a wide range of tasks, including learning from noisy/negative examples, adversarial attacks, and prompt generation.
arXiv Detail & Related papers (2021-06-14T18:48:40Z)
- Robust Deep Reinforcement Learning through Adversarial Loss [74.20501663956604]
Recent studies have shown that deep reinforcement learning agents are vulnerable to small adversarial perturbations on the agent's inputs.
We propose RADIAL-RL, a principled framework to train reinforcement learning agents with improved robustness against adversarial attacks.
arXiv Detail & Related papers (2020-08-05T07:49:42Z)
- SUNRISE: A Simple Unified Framework for Ensemble Learning in Deep Reinforcement Learning [102.78958681141577]
We present SUNRISE, a simple unified ensemble method, which is compatible with various off-policy deep reinforcement learning algorithms.
SUNRISE integrates two key ingredients: (a) ensemble-based weighted Bellman backups, which re-weight target Q-values based on uncertainty estimates from a Q-ensemble, and (b) an inference method that selects actions using the highest upper-confidence bounds for efficient exploration.
arXiv Detail & Related papers (2020-07-09T17:08:44Z)
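The two ingredients translate into a few lines on top of a toy Q-ensemble; the sigmoid weighting and UCB rule below follow the description in the summary, while the constants and the random ensemble are stand-ins:

```python
# SUNRISE-style uncertainty weighting and UCB action selection over a Q-ensemble.
import numpy as np

rng = np.random.default_rng(0)
N_ENSEMBLE, N_ACTIONS = 5, 4
q_ensemble = rng.normal(size=(N_ENSEMBLE, N_ACTIONS))    # stand-in for N Q-networks


def ucb_action(q_values, lam=1.0):
    """(b) Choose the action with the highest mean + lam * std across the ensemble."""
    mean, std = q_values.mean(axis=0), q_values.std(axis=0)
    return int(np.argmax(mean + lam * std))


def bellman_target_weight(q_values, action, temperature=10.0):
    """(a) Down-weight targets whose ensemble disagreement (std) is high; value in (0.5, 1.0]."""
    std = q_values[:, action].std()
    return float(1.0 / (1.0 + np.exp(std * temperature)) + 0.5)


a = ucb_action(q_ensemble)
w = bellman_target_weight(q_ensemble, a)
print(f"action={a}, target weight={w:.3f}")       # w scales this transition's Bellman error
```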
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of this information and is not responsible for any consequences.