Latent Plans for Task-Agnostic Offline Reinforcement Learning
- URL: http://arxiv.org/abs/2209.08959v1
- Date: Mon, 19 Sep 2022 12:27:15 GMT
- Title: Latent Plans for Task-Agnostic Offline Reinforcement Learning
- Authors: Erick Rosete-Beas, Oier Mees, Gabriel Kalweit, Joschka Boedecker,
Wolfram Burgard
- Abstract summary: We propose a novel hierarchical approach to learn task-agnostic long-horizon policies from high-dimensional camera observations.
We show that our formulation enables producing previously unseen combinations of skills to reach temporally extended goals by "stitching" together latent skills.
We even learn one multi-task visuomotor policy for 25 distinct manipulation tasks in the real world which outperforms both imitation learning and offline reinforcement learning techniques.
- Score: 32.938030244921755
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Everyday long-horizon tasks comprising a sequence of multiple implicit
subtasks still pose a major challenge in offline robot control. While a
number of prior methods aimed to address this setting with variants of
imitation and offline reinforcement learning, the learned behavior is typically
narrow and often struggles to reach configurable long-horizon goals. As both
paradigms have complementary strengths and weaknesses, we propose a novel
hierarchical approach that combines the strengths of both methods to learn
task-agnostic long-horizon policies from high-dimensional camera observations.
Concretely, we combine a low-level policy that learns latent skills via
imitation learning and a high-level policy learned from offline reinforcement
learning for skill-chaining the latent behavior priors. Experiments in various
simulated and real robot control tasks show that our formulation enables
producing previously unseen combinations of skills to reach temporally extended
goals by "stitching" together latent skills through goal chaining, with an
order-of-magnitude improvement in performance over state-of-the-art baselines.
We even learn one multi-task visuomotor policy for 25 distinct manipulation
tasks in the real world which outperforms both imitation learning and offline
reinforcement learning techniques.
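The two-level structure described in the abstract can be illustrated with a toy control loop. This is a minimal sketch under strong assumptions, not the authors' implementation: the 1D state, the four discrete latent skills, and the names `low_level_policy`, `high_level_policy`, and `rollout` are all hypothetical, and the hand-written `q_value` only stands in for a value function that would be learned with offline reinforcement learning.

```python
from typing import List, Tuple

# Hypothetical toy skills: each latent index maps to a fixed 1D step size.
STEP_SIZES = (-1.0, -0.1, 0.1, 1.0)


def low_level_policy(state: float, z: int) -> float:
    """Stand-in for an imitation-learned skill decoder: (state, latent) -> action."""
    return STEP_SIZES[z]


def high_level_policy(state: float, goal: float, horizon: int) -> int:
    """Stand-in for the offline-RL high-level policy: pick the best latent skill.

    The hand-written score below (negative distance to the goal after running
    the skill for `horizon` steps) replaces a learned Q-function.
    """
    def q_value(z: int) -> float:
        s = state
        for _ in range(horizon):
            s += low_level_policy(s, z)
        return -abs(goal - s)

    return max(range(len(STEP_SIZES)), key=q_value)


def rollout(start: float, goal: float, horizon: int = 5,
            max_plans: int = 10, tol: float = 0.05) -> Tuple[float, List[int]]:
    """Chain latent skills: re-plan every `horizon` low-level steps."""
    state, plans = start, []
    for _ in range(max_plans):
        if abs(goal - state) <= tol:
            break
        z = high_level_policy(state, goal, horizon)
        plans.append(z)
        for _ in range(horizon):
            state += low_level_policy(state, z)
    return state, plans


if __name__ == "__main__":
    final_state, plans = rollout(start=0.0, goal=6.0)
    print(final_state, plans)
```

In this toy setting the goal is reached by chaining one coarse skill with two fine skills, which loosely mirrors the "stitching" of latent skills that the abstract refers to.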
Related papers
- Robust Policy Learning via Offline Skill Diffusion [6.876580618014666]
We present a novel offline skill learning framework, DuSkill.
DuSkill employs a guided Diffusion model to generate versatile skills extended from the limited skills in datasets.
We show that DuSkill outperforms other skill-based imitation learning and RL algorithms for several long-horizon tasks.
arXiv Detail & Related papers (2024-03-01T02:00:44Z)
- Efficient Learning of High Level Plans from Play [57.29562823883257]
We present Efficient Learning of High-Level Plans from Play (ELF-P), a framework for robotic learning that bridges motion planning and deep RL.
We demonstrate that ELF-P has significantly better sample efficiency than relevant baselines over multiple realistic manipulation tasks.
arXiv Detail & Related papers (2023-03-16T20:09:47Z)
- Learning Goal-Conditioned Policies Offline with Self-Supervised Reward Shaping [94.89128390954572]
We propose a novel self-supervised learning phase on the pre-collected dataset to understand the structure and the dynamics of the model.
We evaluate our method on three continuous control tasks, and show that our model significantly outperforms existing approaches.
arXiv Detail & Related papers (2023-01-05T15:07:10Z)
- Learning Options via Compression [62.55893046218824]
We propose a new objective that combines the maximum likelihood objective with a penalty on the description length of the skills.
Our objective learns skills that solve downstream tasks in fewer samples compared to skills learned from only maximizing likelihood.
arXiv Detail & Related papers (2022-12-08T22:34:59Z)
- Versatile Skill Control via Self-supervised Adversarial Imitation of Unlabeled Mixed Motions [19.626042478612572]
We propose a cooperative adversarial method for obtaining versatile policies with controllable skill sets from unlabeled datasets.
We show that by utilizing unsupervised skill discovery in the generative imitation learning framework, novel and useful skills emerge with successful task fulfillment.
Finally, the obtained versatile policies are tested on an agile quadruped robot called Solo 8 and present faithful replications of diverse skills encoded in the demonstrations.
arXiv Detail & Related papers (2022-09-16T12:49:04Z)
- Learning Temporally Extended Skills in Continuous Domains as Symbolic Actions for Planning [2.642698101441705]
Problems which require both long-horizon planning and continuous control capabilities pose significant challenges to existing reinforcement learning agents.
We introduce a novel hierarchical reinforcement learning agent which links temporally extended skills for continuous control with a forward model in a symbolic abstraction of the environment's state for planning.
arXiv Detail & Related papers (2022-07-11T17:13:10Z)
- Planning to Practice: Efficient Online Fine-Tuning by Composing Goals in Latent Space [76.46113138484947]
General-purpose robots require diverse repertoires of behaviors to complete challenging tasks in real-world unstructured environments.
To address this issue, goal-conditioned reinforcement learning aims to acquire policies that can reach goals for a wide range of tasks on command.
We propose Planning to Practice, a method that makes it practical to train goal-conditioned policies for long-horizon tasks.
arXiv Detail & Related papers (2022-05-17T06:58:17Z)
- Skill-based Meta-Reinforcement Learning [65.31995608339962]
We devise a method that enables meta-learning on long-horizon, sparse-reward tasks.
Our core idea is to leverage prior experience extracted from offline datasets during meta-learning.
arXiv Detail & Related papers (2022-04-25T17:58:19Z)
- Hierarchical Skills for Efficient Exploration [70.62309286348057]
In reinforcement learning, pre-trained low-level skills have the potential to greatly facilitate exploration.
Prior knowledge of the downstream task is required to strike the right balance between generality (fine-grained control) and specificity (faster learning) in skill design.
We propose a hierarchical skill learning framework that acquires skills of varying complexity in an unsupervised manner.
arXiv Detail & Related papers (2021-10-20T22:29:32Z)
This list is automatically generated from the titles and abstracts of the papers on this site.