Example-Driven Model-Based Reinforcement Learning for Solving
Long-Horizon Visuomotor Tasks
- URL: http://arxiv.org/abs/2109.10312v1
- Date: Tue, 21 Sep 2021 16:48:07 GMT
- Title: Example-Driven Model-Based Reinforcement Learning for Solving
Long-Horizon Visuomotor Tasks
- Authors: Bohan Wu, Suraj Nair, Li Fei-Fei, Chelsea Finn
- Abstract summary: We introduce EMBR, a model-based RL method for learning primitive skills that are suitable for completing long-horizon visuomotor tasks.
On a Franka Emika robot arm, we find that EMBR enables the robot to complete three long-horizon visuomotor tasks at 85% success rate.
- Score: 85.56153200251713
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: In this paper, we study the problem of learning a repertoire of low-level
skills from raw images that can be sequenced to complete long-horizon
visuomotor tasks. Reinforcement learning (RL) is a promising approach for
acquiring short-horizon skills autonomously. However, the focus of RL
algorithms has largely been on the success of those individual skills, more so
than learning and grounding a large repertoire of skills that can be sequenced
to complete extended multi-stage tasks. The latter demands robustness and
persistence, as errors in skills can compound over time, and may require the
robot to have a number of primitive skills in its repertoire, rather than just
one. To this end, we introduce EMBR, a model-based RL method for learning
primitive skills that are suitable for completing long-horizon visuomotor
tasks. EMBR learns and plans using a learned model, critic, and success
classifier, where the success classifier serves both as a reward function for
RL and as a grounding mechanism to continuously detect if the robot should
retry a skill when unsuccessful or under perturbations. Further, the learned
model is task-agnostic and trained using data from all skills, enabling the
robot to efficiently learn a number of distinct primitives. These visuomotor
primitive skills and their associated pre- and post-conditions can then be
directly combined with off-the-shelf symbolic planners to complete long-horizon
tasks. On a Franka Emika robot arm, we find that EMBR enables the robot to
complete three long-horizon visuomotor tasks at 85% success rate, such as
organizing an office desk, a file cabinet, and drawers, which require
sequencing up to 12 skills, involve 14 unique learned primitives, and demand
generalization to novel objects.
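The abstract above describes EMBR's core loop: planning with a learned model and critic, while a per-skill success classifier serves both as the reward for RL and as a grounding signal that triggers retries under failures or perturbations. The Python sketch below illustrates that control flow only, under stated assumptions; the callables (`model`, `critic`, `classifier`, `env`) are hypothetical stand-ins, and the random-shooting planner is a generic substitute, not EMBR's actual planner.

```python
import numpy as np

# A minimal sketch of the skill-execution loop described in the abstract, under
# assumptions: `model` predicts the next (latent) state, `critic` estimates value,
# and `classifier` returns the probability that the current skill has succeeded.
# These callables and `env` are hypothetical stand-ins, not EMBR's actual API.

def rollout_score(model, critic, classifier, state, actions):
    """Score an imagined action sequence; classifier output is used as the reward."""
    s = state
    total = 0.0
    for a in actions:
        s = model(s, a)           # imagined next state from the learned model
        total += classifier(s)    # success classifier doubles as the reward signal
    return total + critic(s)      # bootstrap the tail with the learned critic

def plan_action(model, critic, classifier, state, horizon=5, samples=64, act_dim=4):
    """Random-shooting planner: a generic stand-in for a sampling-based planner."""
    best_first, best_score = None, -np.inf
    for _ in range(samples):
        seq = np.random.uniform(-1.0, 1.0, size=(horizon, act_dim))
        score = rollout_score(model, critic, classifier, state, seq)
        if score > best_score:
            best_first, best_score = seq[0], score
    return best_first             # MPC-style: execute only the first action

def run_skill(env, model, critic, classifier, success_thresh=0.9, max_steps=100):
    """Run one primitive, retrying until the classifier grounds it as successful."""
    state = env.observe()         # hypothetical observation call
    for _ in range(max_steps):
        if classifier(state) > success_thresh:
            return True           # the same classifier acts as a post-condition check
        action = plan_action(model, critic, classifier, state)
        state = env.step(action)  # failures or perturbations simply trigger retries
    return False
```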
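The abstract also notes that each learned primitive carries pre- and post-conditions so that off-the-shelf symbolic planners can chain primitives into long-horizon tasks. The toy sketch below shows what such chaining could look like; the `Primitive` structure, the facts in the example desk domain, and the breadth-first planner are illustrative assumptions, not the paper's planner or symbolic vocabulary.

```python
from collections import namedtuple

# Hypothetical primitive description: a learned visuomotor skill paired with
# symbolic pre- and post-conditions. The planner below is a tiny breadth-first
# search standing in for an off-the-shelf symbolic planner; it is not the
# authors' pipeline.
Primitive = namedtuple("Primitive", ["name", "pre", "post"])

def plan_skills(primitives, start, goal):
    """Return a list of primitive names transforming `start` facts into `goal` facts."""
    frontier = [(frozenset(start), [])]
    visited = {frozenset(start)}
    while frontier:
        facts, plan = frontier.pop(0)
        if goal <= facts:
            return plan
        for p in primitives:
            if p.pre <= facts:
                # pre-conditions are treated as consumed for simplicity
                # (an illustrative assumption, not full STRIPS semantics)
                nxt = frozenset((facts - p.pre) | p.post)
                if nxt not in visited:
                    visited.add(nxt)
                    frontier.append((nxt, plan + [p.name]))
    return None

# Example: a toy desk-organization domain with invented facts and skills.
skills = [
    Primitive("open_drawer", pre={"drawer_closed"}, post={"drawer_open"}),
    Primitive("place_marker", pre={"drawer_open", "marker_on_desk"},
              post={"drawer_open", "marker_in_drawer"}),
    Primitive("close_drawer", pre={"drawer_open"}, post={"drawer_closed"}),
]
print(plan_skills(skills, {"drawer_closed", "marker_on_desk"},
                  {"marker_in_drawer", "drawer_closed"}))
# -> ['open_drawer', 'place_marker', 'close_drawer']
```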
Related papers
- Agentic Skill Discovery [19.5703917813767]
Language-conditioned robotic skills make it possible to apply the high-level reasoning of Large Language Models (LLMs) to low-level robotic control.
A remaining challenge is to acquire a diverse set of fundamental skills.
We introduce a novel framework for skill discovery that is entirely driven by LLMs.
arXiv Detail & Related papers (2024-05-23T19:44:03Z)
- Stabilizing Contrastive RL: Techniques for Robotic Goal Reaching from Offline Data [101.43350024175157]
Self-supervised learning has the potential to decrease the amount of human annotation and engineering effort required to learn control strategies.
Our work builds on prior work showing that the reinforcement learning (RL) itself can be cast as a self-supervised problem.
We demonstrate that a self-supervised RL algorithm based on contrastive learning can solve real-world, image-based robotic manipulation tasks (a minimal contrastive-critic sketch follows this entry).
arXiv Detail & Related papers (2023-06-06T01:36:56Z)
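The entry above builds on casting RL itself as a self-supervised, contrastive learning problem. As a rough illustration of that general idea (not the paper's architecture or training pipeline), the sketch below computes an InfoNCE-style objective in which a (state, action) embedding is scored against the goal it actually reached versus goals from other trajectories in the batch; the shapes, names, and use of random NumPy features are assumptions.

```python
import numpy as np

# Illustrative InfoNCE-style objective for a contrastive critic: embeddings of
# (state, action) pairs should score high against the goal/future state they
# actually reached, and low against goals from other rows in the batch.

def infonce_loss(sa_embed, goal_embed):
    """sa_embed, goal_embed: (batch, dim) arrays; row i is a positive pair."""
    logits = sa_embed @ goal_embed.T                 # pairwise similarity scores
    logits -= logits.max(axis=1, keepdims=True)      # numerical stability
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return -np.mean(np.diag(log_probs))              # positives sit on the diagonal

# Toy usage with random features standing in for learned encoders.
rng = np.random.default_rng(0)
sa = rng.normal(size=(8, 16))
goal = sa + 0.1 * rng.normal(size=(8, 16))           # roughly aligned positives
print(infonce_loss(sa, goal))
```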
- LEAGUE: Guided Skill Learning and Abstraction for Long-Horizon Manipulation [16.05029027561921]
Task and Motion Planning approaches excel at solving and generalizing across long-horizon tasks.
However, they assume predefined skill sets, which limits their real-world applications.
We propose an integrated task planning and skill learning framework named LEAGUE.
We show that the learned skills can be reused to accelerate learning in new task domains and transfer to a physical robot platform.
arXiv Detail & Related papers (2022-10-23T06:57:05Z)
- Lifelong Robotic Reinforcement Learning by Retaining Experiences [61.79346922421323]
Many multi-task reinforcement learning efforts assume the robot can collect data from all tasks at all times.
In this work, we study a sequential multi-task RL problem motivated by the practical constraints of physical robotic systems.
We derive an approach that effectively leverages the data and policies learned for previous tasks to cumulatively grow the robot's skill-set.
arXiv Detail & Related papers (2021-09-19T18:00:51Z)
- Hierarchical Few-Shot Imitation with Skill Transition Models [66.81252581083199]
Few-shot Imitation with Skill Transition Models (FIST) is an algorithm that extracts skills from offline data and utilizes them to generalize to unseen tasks.
We show that FIST is capable of generalizing to new tasks and substantially outperforms prior baselines in navigation experiments.
arXiv Detail & Related papers (2021-07-19T15:56:01Z)
- Discovering Generalizable Skills via Automated Generation of Diverse Tasks [82.16392072211337]
We propose a method to discover generalizable skills via automated generation of a diverse set of tasks.
As opposed to prior work on unsupervised discovery of skills, our method pairs each skill with a unique task produced by a trainable task generator.
A task discriminator defined on the robot behaviors in the generated tasks is jointly trained to estimate the evidence lower bound of the diversity objective.
The learned skills can then be composed in a hierarchical reinforcement learning algorithm to solve unseen target tasks (a minimal diversity-reward sketch follows this entry).
arXiv Detail & Related papers (2021-06-26T03:41:51Z)
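The entry above pairs each skill with a generated task and trains a discriminator to estimate a lower bound on a diversity objective. As a loose, generic illustration of discriminator-based diversity rewards (a DIAYN-style variational lower bound, not the paper's trainable task generator/discriminator pipeline), the sketch below computes r = log q(z | s) - log p(z) from hypothetical discriminator logits.

```python
import numpy as np

# Generic diversity reward: a discriminator q(z | s) tries to infer which skill
# z produced a state, and each skill is rewarded for being identifiable, i.e.
# r = log q(z | s) - log p(z). The fixed softmax "discriminator" here is an
# assumption for illustration only.

def diversity_reward(disc_logits, skill_id, num_skills):
    """disc_logits: discriminator scores over skills for the current state."""
    logits = disc_logits - disc_logits.max()
    log_q = logits - np.log(np.exp(logits).sum())    # log q(z | s)
    log_p = -np.log(num_skills)                      # uniform prior over skills
    return log_q[skill_id] - log_p

# Toy usage: a state the discriminator confidently attributes to skill 2.
print(diversity_reward(np.array([0.1, -0.3, 2.5, 0.0]), skill_id=2, num_skills=4))
```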
- MT-Opt: Continuous Multi-Task Robotic Reinforcement Learning at Scale [103.7609761511652]
We show how a large-scale collective robotic learning system can acquire a repertoire of behaviors simultaneously.
New tasks can be continuously instantiated from previously learned tasks.
We train and evaluate our system on a set of 12 real-world tasks with data collected from 7 robots.
arXiv Detail & Related papers (2021-04-16T16:38:02Z)
- SKID RAW: Skill Discovery from Raw Trajectories [23.871402375721285]
It is desirable to demonstrate only full task executions rather than every individual skill.
We propose a novel approach that learns to segment trajectories into recurring patterns.
The approach learns a skill conditioning that can be used to understand possible sequences of skills.
arXiv Detail & Related papers (2021-03-26T17:27:13Z)