Scalable Multi-Task Imitation Learning with Autonomous Improvement
- URL: http://arxiv.org/abs/2003.02636v1
- Date: Tue, 25 Feb 2020 18:56:42 GMT
- Title: Scalable Multi-Task Imitation Learning with Autonomous Improvement
- Authors: Avi Singh, Eric Jang, Alexander Irpan, Daniel Kappler, Murtaza Dalal,
Sergey Levine, Mohi Khansari, Chelsea Finn
- Abstract summary: We build an imitation learning system that can continuously improve through autonomous data collection.
We leverage the robot's own trials as demonstrations for tasks other than the one that the robot actually attempted.
In contrast to prior imitation learning approaches, our method can autonomously collect data with sparse supervision for continuous improvement.
- Score: 159.9406205002599
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: While robot learning has demonstrated promising results for enabling robots
to automatically acquire new skills, a critical challenge in deploying
learning-based systems is scale: acquiring enough data for the robot to
effectively generalize broadly. Imitation learning, in particular, has remained
a stable and powerful approach for robot learning, but critically relies on
expert operators for data collection. In this work, we target this challenge,
aiming to build an imitation learning system that can continuously improve
through autonomous data collection, while simultaneously avoiding the explicit
use of reinforcement learning, to maintain the stability, simplicity, and
scalability of supervised imitation. To accomplish this, we cast the problem of
imitation with autonomous improvement into a multi-task setting. We utilize the
insight that, in a multi-task setting, a failed attempt at one task might
represent a successful attempt at another task. This allows us to leverage the
robot's own trials as demonstrations for tasks other than the one that the
robot actually attempted. Using an initial dataset of multi-task demonstration
data, the robot autonomously collects trials which are only sparsely labeled
with a binary indication of whether the trial accomplished any useful task or
not. We then embed the trials into a learned latent space of tasks, trained
using only the initial demonstration dataset, to draw similarities between
various trials, enabling the robot to achieve one-shot generalization to new
tasks. In contrast to prior imitation learning approaches, our method can
autonomously collect data with sparse supervision for continuous improvement,
and in contrast to reinforcement learning algorithms, our method can
effectively improve from sparse, task-agnostic reward signals.
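The relabeling mechanism can be made concrete with a short sketch. Below is a minimal Python illustration, assuming a learned encoder that maps a trial into the latent task space; every name here (embed_trial, TASK_LIBRARY, MATCH_THRESHOLD, relabel) is hypothetical, and the toy encoder stands in for the model the paper trains on the initial demonstration dataset.

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-in for the learned latent task space: one embedding per known task.
# In the paper this space is trained only on the initial demonstrations.
TASK_LIBRARY = {f"task_{i}": rng.normal(size=8) for i in range(5)}
MATCH_THRESHOLD = 0.8  # assumed cosine-similarity cutoff, not from the paper

def embed_trial(trajectory):
    """Toy placeholder for the learned trial encoder (not the paper's model)."""
    return trajectory.mean(axis=0)

def cosine(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-8))

def relabel(trajectory, accomplished_something):
    """Turn an autonomous trial into a demonstration for whichever known task
    it best matches in latent space. The binary success flag is the only
    supervision the abstract assumes."""
    if not accomplished_something:
        return None  # the trial accomplished no useful task; discard it
    z = embed_trial(trajectory)
    best_task, best_sim = max(
        ((name, cosine(z, z_task)) for name, z_task in TASK_LIBRARY.items()),
        key=lambda kv: kv[1],
    )
    if best_sim < MATCH_THRESHOLD:
        return None  # not similar enough to any known task
    return best_task, trajectory  # append to that task's imitation dataset

# Example: a fake 20-step trial with 8-dim features and the sparse flag.
trial = rng.normal(size=(20, 8))
print(relabel(trial, accomplished_something=True))
```

Relabeled trials grow each task's demonstration set, so the policy can keep being trained with plain supervised imitation; the nearest-neighbor match above is just one plausible way to "draw similarities between various trials" in the learned space.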
Related papers
- Autonomous Improvement of Instruction Following Skills via Foundation Models [44.63552778566584]
Intelligent instruction-following robots capable of improving from autonomously collected experience have the potential to transform robot learning.
We propose a novel approach that allows instruction-following policies to improve from autonomously collected data without human supervision.
We carry out extensive experiments in the real world to demonstrate the effectiveness of our approach, and find that in a suite of unseen environments, the robot policy can be improved 2x with autonomously collected data.
arXiv Detail & Related papers (2024-07-30T08:26:44Z)
- Robot Fine-Tuning Made Easy: Pre-Training Rewards and Policies for Autonomous Real-World Reinforcement Learning [58.3994826169858]
We introduce RoboFuME, a reset-free fine-tuning system for robotic reinforcement learning.
Our key insight is to utilize offline reinforcement learning techniques to enable efficient online fine-tuning of a pre-trained policy.
Our method can incorporate data from an existing robot dataset and improve on a target task within as little as 3 hours of autonomous real-world experience.
arXiv Detail & Related papers (2023-10-23T17:50:08Z)
- Continual Robot Learning using Self-Supervised Task Inference [19.635428830237842]
We propose a self-supervised task inference approach to continually learn new tasks.
We use a behavior-matching self-supervised learning objective to train a novel Task Inference Network (TINet).
A multi-task policy is built on top of the TINet and trained with reinforcement learning to optimize performance over tasks (a rough sketch of this architecture follows the list below).
arXiv Detail & Related papers (2023-09-10T09:32:35Z)
- Leveraging Sequentiality in Reinforcement Learning from a Single Demonstration [68.94506047556412]
We propose to leverage a sequential bias to learn control policies for complex robotic tasks using a single demonstration.
We show that DCIL-II can solve challenging simulated tasks, such as humanoid locomotion and stand-up, with unprecedented sample efficiency.
arXiv Detail & Related papers (2022-11-09T10:28:40Z)
- Don't Start From Scratch: Leveraging Prior Data to Automate Robotic Reinforcement Learning [70.70104870417784]
Reinforcement learning (RL) algorithms hold the promise of enabling autonomous skill acquisition for robotic systems.
In practice, real-world robotic RL typically requires time-consuming data collection and frequent human intervention to reset the environment.
In this work, we study how these challenges can be tackled by effective utilization of diverse offline datasets collected from previously seen tasks.
arXiv Detail & Related papers (2022-07-11T08:31:22Z)
- BC-Z: Zero-Shot Task Generalization with Robotic Imitation Learning [108.41464483878683]
We study the problem of enabling a vision-based robotic manipulation system to generalize to novel tasks.
We develop an interactive and flexible imitation learning system that can learn from both demonstrations and interventions.
When scaling data collection on a real robot to more than 100 distinct tasks, we find that this system can perform 24 unseen manipulation tasks with an average success rate of 44%.
arXiv Detail & Related papers (2022-02-04T07:30:48Z)
- Lifelong Robotic Reinforcement Learning by Retaining Experiences [61.79346922421323]
Many multi-task reinforcement learning efforts assume the robot can collect data from all tasks at all times.
In this work, we study a sequential multi-task RL problem motivated by the practical constraints of physical robotic systems.
We derive an approach that effectively leverages the data and policies learned for previous tasks to cumulatively grow the robot's skill-set.
arXiv Detail & Related papers (2021-09-19T18:00:51Z)
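As a rough sketch of the TINet-style architecture referenced above (from "Continual Robot Learning using Self-Supervised Task Inference"): the task inference network encodes a trajectory into a task embedding, and a multi-task policy conditions on that embedding. All dimensions, module choices, and the stand-in pairwise loss below are illustrative guesses, since the summary does not specify them.

```python
import torch
import torch.nn as nn

class TINet(nn.Module):
    """Encodes a trajectory of shape (B, T, feat) into a task embedding."""
    def __init__(self, feat_dim=16, emb_dim=8):
        super().__init__()
        self.gru = nn.GRU(feat_dim, 32, batch_first=True)
        self.head = nn.Linear(32, emb_dim)

    def forward(self, traj):
        _, h = self.gru(traj)           # h: (1, B, 32), last hidden state
        return self.head(h.squeeze(0))  # (B, emb_dim)

class MultiTaskPolicy(nn.Module):
    """Maps (observation, inferred task embedding) to an action."""
    def __init__(self, obs_dim=12, emb_dim=8, act_dim=4):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(obs_dim + emb_dim, 64), nn.ReLU(),
            nn.Linear(64, act_dim),
        )

    def forward(self, obs, task_emb):
        return self.net(torch.cat([obs, task_emb], dim=-1))

tinet, policy = TINet(), MultiTaskPolicy()
traj_a, traj_b = torch.randn(2, 10, 16), torch.randn(2, 10, 16)

# "Behavior matching" stand-in: pull together embeddings of trajectories that
# exhibit the same behavior (the summary does not spell out the actual loss).
match_loss = ((tinet(traj_a) - tinet(traj_b)) ** 2).mean()

# The policy consumes the current observation plus the inferred task embedding;
# in the paper it is then trained with reinforcement learning over tasks.
action = policy(torch.randn(2, 12), tinet(traj_a))
```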