Learning Transferable Motor Skills with Hierarchical Latent Mixture Policies
- URL: http://arxiv.org/abs/2112.05062v1
- Date: Thu, 9 Dec 2021 17:37:14 GMT
- Title: Learning Transferable Motor Skills with Hierarchical Latent Mixture Policies
- Authors: Dushyant Rao, Fereshteh Sadeghi, Leonard Hasenclever, Markus
Wulfmeier, Martina Zambelli, Giulia Vezzani, Dhruva Tirumala, Yusuf Aytar,
Josh Merel, Nicolas Heess, and Raia Hadsell
- Abstract summary: We propose an approach to learn abstract motor skills from data using a hierarchical mixture latent variable model.
We demonstrate in manipulation domains that the method can effectively cluster offline data into distinct, executable behaviours.
- Score: 37.09286945259353
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: For robots operating in the real world, it is desirable to learn reusable
behaviours that can effectively be transferred and adapted to numerous tasks
and scenarios. We propose an approach to learn abstract motor skills from data
using a hierarchical mixture latent variable model. In contrast to existing
work, our method exploits a three-level hierarchy of both discrete and
continuous latent variables, to capture a set of high-level behaviours while
allowing for variance in how they are executed. We demonstrate in manipulation
domains that the method can effectively cluster offline data into distinct,
executable behaviours, while retaining the flexibility of a continuous latent
variable model. The resulting skills can be transferred and fine-tuned on new
tasks, unseen objects, and from state-based to vision-based policies, yielding better
sample efficiency and asymptotic performance compared to existing skill- and
imitation-based methods. We further analyse how and when the skills are most
beneficial: they encourage directed exploration to cover large regions of the
state space relevant to the task, making them most effective in challenging
sparse-reward settings.
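To make the three-level hierarchy concrete, here is a minimal sketch (in PyTorch, with illustrative module names, sizes, and a straight-through Gumbel-softmax for the discrete level; not the authors' implementation) of a policy that samples a discrete skill, then a continuous execution latent conditioned on that skill, and finally decodes an action:

```python
import torch
import torch.nn as nn

class HierarchicalLatentMixturePolicy(nn.Module):
    def __init__(self, state_dim, action_dim, n_skills=8, latent_dim=16):
        super().__init__()
        # Level 1 (discrete): categorical distribution over skills, given the state.
        self.skill_logits = nn.Linear(state_dim, n_skills)
        # Level 2 (continuous): per-skill Gaussian over an execution latent.
        self.latent_head = nn.Linear(state_dim + n_skills, 2 * latent_dim)
        # Level 3: decoder from (state, continuous latent) to an action.
        self.decoder = nn.Sequential(
            nn.Linear(state_dim + latent_dim, 128), nn.ReLU(),
            nn.Linear(128, action_dim),
        )

    def forward(self, state):
        # Sample a discrete skill; straight-through Gumbel-softmax keeps it
        # differentiable (an illustrative choice, not necessarily the paper's).
        skill = nn.functional.gumbel_softmax(self.skill_logits(state), hard=True)
        # Sample the continuous latent conditioned on the chosen skill.
        mu, log_std = self.latent_head(torch.cat([state, skill], dim=-1)).chunk(2, dim=-1)
        z = mu + log_std.exp() * torch.randn_like(mu)
        # Decode state and latent into an action.
        return self.decoder(torch.cat([state, z], dim=-1))

policy = HierarchicalLatentMixturePolicy(state_dim=32, action_dim=7)
action = policy(torch.randn(1, 32))  # one sampled action for a random state
```

The discrete level gives distinct, clusterable behaviours; the continuous level retains within-skill variation, which is what allows fine-tuning on new tasks.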
Related papers
- Latent-Predictive Empowerment: Measuring Empowerment without a Simulator [56.53777237504011]
We present Latent-Predictive Empowerment (LPE), an algorithm that computes empowerment without requiring access to a simulator.
LPE learns large skillsets by maximizing an objective that is a principled replacement for the mutual information between skills and states.
arXiv Detail & Related papers (2024-10-15T00:41:18Z)
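For context, the standard variational lower bound on the skills-states mutual information that LPE's objective is designed to replace can be sketched as follows; the classifier q_zs and all sizes are illustrative, and this is not LPE itself:

```python
import torch
import torch.nn as nn

n_skills, state_dim = 8, 32
# q_zs approximates p(z | s): which skill likely produced this state.
q_zs = nn.Sequential(nn.Linear(state_dim, 64), nn.ReLU(), nn.Linear(64, n_skills))

def mi_skill_reward(state, skill_idx):
    # Variational bound: I(Z; S) >= E[log q(z|s)] - log p(z), uniform prior p(z) = 1/K.
    log_q = torch.log_softmax(q_zs(state), dim=-1)
    return log_q.gather(-1, skill_idx.unsqueeze(-1)).squeeze(-1) \
        + torch.log(torch.tensor(float(n_skills)))

r = mi_skill_reward(torch.randn(4, state_dim), torch.randint(0, n_skills, (4,)))
```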
- Conditional Neural Expert Processes for Learning Movement Primitives from Demonstration [1.9336815376402723]
Conditional Neural Expert Processes (CNEP) learns to assign demonstrations from different modes to distinct expert networks.
CNEP does not require supervision on which mode the trajectories belong to.
Our system is capable of on-the-fly adaptation to environmental changes via an online conditioning mechanism.
arXiv Detail & Related papers (2024-02-13T12:52:02Z)
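A minimal sketch of the underlying idea, assuming a gating network that softly routes each input to one of several expert networks and is trained end-to-end on reconstruction alone, so no mode labels are required; the architecture is an illustrative assumption, not the paper's exact model:

```python
import torch
import torch.nn as nn

class GatedExperts(nn.Module):
    def __init__(self, in_dim, out_dim, n_experts=4):
        super().__init__()
        self.gate = nn.Linear(in_dim, n_experts)  # which expert handles this input
        self.experts = nn.ModuleList([
            nn.Sequential(nn.Linear(in_dim, 64), nn.ReLU(), nn.Linear(64, out_dim))
            for _ in range(n_experts)
        ])

    def forward(self, x):
        w = torch.softmax(self.gate(x), dim=-1)                   # soft assignment, learned unsupervised
        outs = torch.stack([e(x) for e in self.experts], dim=-1)  # (batch, out_dim, n_experts)
        return (outs * w.unsqueeze(1)).sum(-1)                    # gate-weighted mixture output

model = GatedExperts(in_dim=16, out_dim=8)
loss = nn.functional.mse_loss(model(torch.randn(32, 16)), torch.randn(32, 8))
```

Minimizing the reconstruction loss pushes the gate to specialize experts on distinct demonstration modes.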
- RLIF: Interactive Imitation Learning as Reinforcement Learning [56.997263135104504]
We show how off-policy reinforcement learning can enable improved performance under assumptions that are similar but potentially even more practical than those of interactive imitation learning.
Our proposed method uses reinforcement learning with user intervention signals themselves as rewards.
This relaxes the assumption that intervening experts in interactive imitation learning should be near-optimal and enables the algorithm to learn behaviors that improve over a potentially suboptimal human expert.
arXiv Detail & Related papers (2023-11-21T21:05:21Z)
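The core reward-relabeling idea can be sketched in a few lines; the Transition container and its field names are hypothetical, and any off-policy RL algorithm could consume the resulting rewards:

```python
from dataclasses import dataclass

@dataclass
class Transition:
    state: list
    action: list
    next_state: list
    intervened: bool  # did the human take over on this step?

def rlif_reward(t: Transition) -> float:
    # The intervention itself is the (negative) reward signal; no task reward needed.
    return -1.0 if t.intervened else 0.0

buffer = [Transition([0.0], [0.1], [0.1], False),
          Transition([0.1], [0.9], [0.2], True)]
rewards = [rlif_reward(t) for t in buffer]  # [0.0, -1.0] -> feed to off-policy RL
```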
- Versatile Skill Control via Self-supervised Adversarial Imitation of Unlabeled Mixed Motions [19.626042478612572]
We propose a cooperative adversarial method for obtaining versatile policies with controllable skill sets from unlabeled datasets.
We show that by utilizing unsupervised skill discovery in the generative imitation learning framework, novel and useful skills emerge with successful task fulfillment.
Finally, the obtained versatile policies are tested on an agile quadruped robot called Solo 8, faithfully reproducing the diverse skills encoded in the demonstrations.
arXiv Detail & Related papers (2022-09-16T12:49:04Z)
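One common way to realize skill-conditioned adversarial imitation is sketched below, assuming a GAIL-style discriminator additionally conditioned on a skill code so that different codes come to explain different modes of the unlabeled data; names and sizes are illustrative, not the paper's exact architecture:

```python
import torch
import torch.nn as nn

state_dim, action_dim, skill_dim = 16, 4, 6
disc = nn.Sequential(
    nn.Linear(state_dim + action_dim + skill_dim, 64), nn.ReLU(),
    nn.Linear(64, 1),  # logit: demonstration vs. policy sample
)

def imitation_reward(state, action, skill_onehot):
    # The policy is rewarded for state-action pairs the discriminator
    # mistakes for demonstrations, per skill code: log D(s, a, z).
    logit = disc(torch.cat([state, action, skill_onehot], dim=-1))
    return nn.functional.logsigmoid(logit)

r = imitation_reward(torch.randn(2, state_dim), torch.randn(2, action_dim),
                     nn.functional.one_hot(torch.tensor([0, 3]), skill_dim).float())
```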
- Value Function Spaces: Skill-Centric State Abstractions for Long-Horizon Reasoning [120.38381203153159]
Reinforcement learning can train policies that effectively perform complex tasks.
For long-horizon tasks, the performance of these methods degrades with horizon, often necessitating reasoning over and composing lower-level skills.
We propose Value Function Spaces: a simple approach that produces such a representation by using the value functions corresponding to each lower-level skill.
arXiv Detail & Related papers (2021-11-04T22:46:16Z)
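The construction is simple enough to sketch directly: the abstract state is the vector of value estimates of each lower-level skill, giving a skill-centric representation for a high-level policy. The value networks below are illustrative stand-ins for trained skill critics:

```python
import torch
import torch.nn as nn

state_dim, n_skills = 32, 5
skill_values = nn.ModuleList([
    nn.Sequential(nn.Linear(state_dim, 64), nn.ReLU(), nn.Linear(64, 1))
    for _ in range(n_skills)
])  # one value function per low-level skill

def value_function_space(state):
    # The abstract state is [V_1(s), ..., V_K(s)]: what each skill can achieve from s.
    return torch.cat([v(state) for v in skill_values], dim=-1)

z = value_function_space(torch.randn(1, state_dim))  # shape (1, n_skills)
```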
- TRAIL: Near-Optimal Imitation Learning with Suboptimal Data [100.83688818427915]
We present training objectives that use offline datasets to learn a factored transition model.
Our theoretical analysis shows that the learned latent action space can boost the sample-efficiency of downstream imitation learning.
To learn the latent action space in practice, we propose TRAIL (Transition-Reparametrized Actions for Imitation Learning), an algorithm that learns an energy-based transition model.
arXiv Detail & Related papers (2021-10-27T21:05:00Z)
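A hedged sketch of learning a latent action space with a factored transition model, using an InfoNCE-style contrastive loss as a stand-in for the paper's energy-based objective; the network names and negative-sampling scheme are assumptions:

```python
import torch
import torch.nn as nn

state_dim, action_dim, latent_dim = 16, 6, 4
action_enc = nn.Linear(action_dim, latent_dim)  # raw action -> latent action
# Factored transition model: scores how compatible the triple (s, z, s') is.
score_net = nn.Sequential(nn.Linear(2 * state_dim + latent_dim, 64), nn.ReLU(),
                          nn.Linear(64, 1))

def transition_loss(s, a, s_next, s_negative):
    z = action_enc(a)
    pos = score_net(torch.cat([s, z, s_next], dim=-1))
    neg = score_net(torch.cat([s, z, s_negative], dim=-1))
    # Contrastive objective: the true next state should outscore the negative sample.
    return -torch.log_softmax(torch.cat([pos, neg], dim=-1), dim=-1)[:, 0].mean()

loss = transition_loss(torch.randn(8, state_dim), torch.randn(8, action_dim),
                       torch.randn(8, state_dim), torch.randn(8, state_dim))
```

Downstream imitation then operates in the latent action space produced by action_enc.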
- Behavior Priors for Efficient Reinforcement Learning [97.81587970962232]
We consider how information and architectural constraints can be combined with ideas from the probabilistic modeling literature to learn behavior priors.
We discuss how such latent variable formulations connect to related work on hierarchical reinforcement learning (HRL) and mutual information and curiosity based objectives.
We demonstrate the effectiveness of our framework by applying it to a range of simulated continuous control domains.
arXiv Detail & Related papers (2020-10-27T13:17:18Z)
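The KL-regularized objective at the heart of such behavior priors can be written compactly: the task policy is rewarded for returns but penalized for deviating from a prior policy learned on related tasks. The Gaussian parameterization and the coefficient alpha below are illustrative choices:

```python
import torch
from torch.distributions import Normal, kl_divergence

def kl_regularized_objective(returns, pi_mu, pi_std, prior_mu, prior_std, alpha=0.1):
    # E[R] - alpha * KL(pi(.|s) || pi_0(.|s)), summed over action dimensions.
    pi, prior = Normal(pi_mu, pi_std), Normal(prior_mu, prior_std)
    kl = kl_divergence(pi, prior).sum(-1)
    return (returns - alpha * kl).mean()

obj = kl_regularized_objective(torch.randn(8), torch.zeros(8, 3), torch.ones(8, 3),
                               torch.zeros(8, 3), torch.ones(8, 3))
```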
- Probabilistic Active Meta-Learning [15.432006404678981]
We introduce task selection based on prior experience into a meta-learning algorithm.
We provide empirical evidence that our approach improves data-efficiency when compared to strong baselines on simulated robotic experiments.
arXiv Detail & Related papers (2020-07-17T12:51:42Z)
- Task-Agnostic Online Reinforcement Learning with an Infinite Mixture of Gaussian Processes [25.513074215377696]
This paper proposes a continual online model-based reinforcement learning approach.
It does not require pre-training to solve task-agnostic problems with unknown task boundaries.
In experiments, our approach outperforms alternative methods in non-stationary tasks.
arXiv Detail & Related papers (2020-06-19T23:52:45Z)
- Meta-Reinforcement Learning Robust to Distributional Shift via Model Identification and Experience Relabeling [126.69933134648541]
We present a meta-reinforcement learning algorithm that is both efficient and extrapolates well when faced with out-of-distribution tasks at test time.
Our method is based on a simple insight: we recognize that dynamics models can be adapted efficiently and consistently with off-policy data.
arXiv Detail & Related papers (2020-06-12T13:34:46Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences arising from its use.