Hierarchical Skills for Efficient Exploration
- URL: http://arxiv.org/abs/2110.10809v1
- Date: Wed, 20 Oct 2021 22:29:32 GMT
- Title: Hierarchical Skills for Efficient Exploration
- Authors: Jonas Gehring, Gabriel Synnaeve, Andreas Krause, Nicolas Usunier
- Abstract summary: In reinforcement learning, pre-trained low-level skills have the potential to greatly facilitate exploration.
Prior knowledge of the downstream task is required to strike the right balance between generality (fine-grained control) and specificity (faster learning) in skill design.
We propose a hierarchical skill learning framework that acquires skills of varying complexity in an unsupervised manner.
- Score: 70.62309286348057
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: In reinforcement learning, pre-trained low-level skills have the potential to
greatly facilitate exploration. However, prior knowledge of the downstream task
is required to strike the right balance between generality (fine-grained
control) and specificity (faster learning) in skill design. In previous work on
continuous control, the sensitivity of methods to this trade-off has not been
addressed explicitly, as locomotion provides a suitable prior for navigation
tasks, which have been of foremost interest. In this work, we analyze this
trade-off for low-level policy pre-training with a new benchmark suite of
diverse, sparse-reward tasks for bipedal robots. We alleviate the need for
prior knowledge by proposing a hierarchical skill learning framework that
acquires skills of varying complexity in an unsupervised manner. For
utilization on downstream tasks, we present a three-layered hierarchical
learning algorithm to automatically trade off between general and specific
skills as required by the respective task. In our experiments, we show that our
approach performs this trade-off effectively and achieves better results than
current state-of-the-art methods for end-to-end hierarchical reinforcement
learning and unsupervised skill discovery. Code and videos are available at
https://facebookresearch.github.io/hsd3.
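As a rough illustration of the three-layered control flow described in the abstract (this is not the authors' implementation; the goal spaces, names, and random stand-in policies below are all hypothetical), the loop might look like this: a high-level policy periodically selects a goal space of some specificity, a mid-level policy picks a concrete target within it, and a pre-trained low-level skill produces motor actions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Candidate goal spaces of increasing specificity; the dimensions here are
# made up (e.g. controlling only torso position vs. many body features).
SKILL_SETS = {"torso_xy": 2, "torso_xyz": 3, "full_body": 9}

def select_skill_set(obs):
    """High level: choose which goal space to control (random stand-in)."""
    return rng.choice(list(SKILL_SETS))

def select_goal(obs, skill_set):
    """Mid level: choose a concrete target within the chosen goal space."""
    return rng.standard_normal(SKILL_SETS[skill_set])

def low_level_action(obs, skill_set, goal, action_dim=6):
    """Pre-trained goal-reaching skill: maps (obs, goal) to motor commands."""
    return np.tanh(rng.standard_normal(action_dim))

def rollout(steps=30, interval=10, obs_dim=20):
    obs = np.zeros(obs_dim)
    skill_set = goal = None
    for t in range(steps):
        if t % interval == 0:  # upper layers act at a coarser timescale
            skill_set = select_skill_set(obs)
            goal = select_goal(obs, skill_set)
        action = low_level_action(obs, skill_set, goal)
        obs = obs + 0.01 * rng.standard_normal(obs_dim)  # dummy dynamics
    return skill_set, goal.shape, action.shape

print(rollout())
```

The point of the top layer is exactly the trade-off named in the abstract: the agent can commit to a specific goal space when the task rewards it, or fall back to more general control otherwise.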
Related papers
- Learning Options via Compression [62.55893046218824]
We propose a new objective that combines the maximum likelihood objective with a penalty on the description length of the skills.
Our objective learns skills that solve downstream tasks in fewer samples compared to skills learned from only maximizing likelihood.
arXiv Detail & Related papers (2022-12-08T22:34:59Z)
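One way to read the objective stated in this summary (the notation below is ours, not taken from the paper): maximize the likelihood of trajectories under a skill decomposition while penalizing the skills' description length, with a trade-off weight.

```latex
% Hypothetical notation: tau = trajectory, z = skill decomposition,
% ell(z) = description length of the skills, beta = trade-off weight.
\max_{\theta}\; \mathbb{E}_{\tau}\!\left[\log p_{\theta}(\tau \mid z)\right] \;-\; \beta\,\ell(z)
```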
- Residual Skill Policies: Learning an Adaptable Skill-based Action Space for Reinforcement Learning for Robotics [18.546688182454236]
Skill-based reinforcement learning (RL) has emerged as a promising strategy to leverage prior knowledge for accelerated robot learning.
We propose accelerating exploration in the skill space using state-conditioned generative models.
We validate our approach across four challenging manipulation tasks, demonstrating our ability to learn across task variations.
arXiv Detail & Related papers (2022-11-04T02:42:17Z)
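A minimal sketch of the exploration idea in the summary above, assuming a state-conditioned generative model over skill latents (the tiny linear "prior network" below is a hypothetical stand-in, not the paper's architecture): instead of sampling skills uniformly, sample near skills the model considers plausible in the current state.

```python
import numpy as np

rng = np.random.default_rng(0)
W = 0.1 * rng.standard_normal((8, 20))  # hypothetical prior parameters

def skill_prior(state):
    """Map the current state to (mean, std) over skill latents z."""
    mu = np.tanh(W @ state)
    std = np.full_like(mu, 0.3)
    return mu, std

def sample_skill(state):
    # Bias exploration toward skills the generative model deems
    # plausible in this state, rather than searching the whole space.
    mu, std = skill_prior(state)
    return mu + std * rng.standard_normal(mu.shape)

state = rng.standard_normal(20)
print(sample_skill(state))
```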
- Learning and Retrieval from Prior Data for Skill-based Imitation Learning [47.59794569496233]
We develop a skill-based imitation learning framework that extracts temporally extended sensorimotor skills from prior data.
We identify several key design choices that significantly improve performance on novel tasks.
arXiv Detail & Related papers (2022-10-20T17:34:59Z)
- Unsupervised Reinforcement Learning for Transferable Manipulation Skill Discovery [22.32327908453603]
Current reinforcement learning (RL) in robotics often struggles to generalize to new downstream tasks.
We propose a framework that pre-trains the agent in a task-agnostic manner without access to the task-specific reward.
We show that our approach achieves the most diverse interacting behavior and significantly improves sample efficiency in downstream tasks.
arXiv Detail & Related papers (2022-04-29T06:57:46Z)
- Divide & Conquer Imitation Learning [75.31752559017978]
Imitation Learning can be a powerful approach to bootstrap the learning process.
We present a novel algorithm designed to imitate complex robotic tasks from the states of an expert trajectory.
We show that our method imitates a non-holonomic navigation task and scales to a complex simulated robotic manipulation task with very high sample efficiency.
arXiv Detail & Related papers (2022-04-15T09:56:50Z)
- Possibility Before Utility: Learning And Using Hierarchical Affordances [21.556661319375255]
Reinforcement learning algorithms struggle on tasks with complex hierarchical dependency structures.
We present Hierarchical Affordance Learning (HAL), a method that learns a model of hierarchical affordances in order to prune impossible subtasks for more effective learning.
arXiv Detail & Related papers (2022-03-23T19:17:22Z)
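A toy sketch of the affordance-based pruning that the HAL summary above describes (the subtask names, threshold, and random affordance scores are all illustrative, not HAL's API): a learned affordance model estimates which subtasks are currently achievable, and the subtask selector only considers those.

```python
import numpy as np

rng = np.random.default_rng(0)
SUBTASKS = ["get_wood", "make_plank", "make_stick", "build_table"]

def affordance_model(state):
    """Hypothetical stand-in: probability each subtask is achievable now."""
    return rng.uniform(size=len(SUBTASKS))

def select_subtask(state, q_values, threshold=0.5):
    possible = affordance_model(state) > threshold
    if not possible.any():  # fall back to an unconstrained choice
        return int(np.argmax(q_values))
    # Prune subtasks the model deems impossible before maximizing.
    masked = np.where(possible, q_values, -np.inf)
    return int(np.argmax(masked))

state = rng.standard_normal(4)
q = rng.standard_normal(len(SUBTASKS))
print(SUBTASKS[select_subtask(state, q)])
```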
- Hierarchical Few-Shot Imitation with Skill Transition Models [66.81252581083199]
Few-shot Imitation with Skill Transition Models (FIST) is an algorithm that extracts skills from offline data and utilizes them to generalize to unseen tasks.
We show that FIST is capable of generalizing to new tasks and substantially outperforms prior baselines in navigation experiments.
arXiv Detail & Related papers (2021-07-19T15:56:01Z)
- Parrot: Data-Driven Behavioral Priors for Reinforcement Learning [79.32403825036792]
We propose a method for pre-training behavioral priors that can capture complex input-output relationships observed in successful trials.
We show how this learned prior can be used for rapidly learning new tasks without impeding the RL agent's ability to try out novel behaviors.
arXiv Detail & Related papers (2020-11-19T18:47:40Z)
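A sketch of the behavioral-prior idea in the Parrot summary above (the linear map below is a hypothetical stand-in for the learned generative model): the RL agent outputs a latent z, and a pre-trained mapping f(z; obs) decodes it into an environment action, so random latents already produce plausible behavior while the mapping stays flexible enough to express novel actions.

```python
import numpy as np

rng = np.random.default_rng(0)
A = np.eye(4) + 0.1 * rng.standard_normal((4, 4))  # invertible stand-in
B = 0.1 * rng.standard_normal((4, 16))

def behavioral_prior(z, obs):
    """Map the policy's latent z to an action, conditioned on the observation."""
    return np.tanh(A @ z + B @ obs)

# The downstream agent explores in z-space: z ~ N(0, I) decodes to behavior
# resembling the successful trials the prior was trained on, yet the mapping
# does not rule out any action outright.
obs = rng.standard_normal(16)
z = rng.standard_normal(4)
print(behavioral_prior(z, obs))
```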
- Pre-trained Word Embeddings for Goal-conditional Transfer Learning in Reinforcement Learning [0.0]
We show how a pre-trained task-independent language model can make a goal-conditional RL agent more sample efficient.
We do this by facilitating transfer learning between different related tasks.
arXiv Detail & Related papers (2020-07-10T06:42:00Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.