Learning to Play by Imitating Humans
- URL: http://arxiv.org/abs/2006.06874v1
- Date: Thu, 11 Jun 2020 23:28:54 GMT
- Title: Learning to Play by Imitating Humans
- Authors: Rostam Dinyari and Pierre Sermanet and Corey Lynch
- Abstract summary: Recent work has shown that it is possible to acquire a diverse set of skills by self-supervising control on top of human teleoperated play data.
By training a behavioral cloning policy on a relatively small quantity of human play, we autonomously generate a large quantity of cloned play data.
We demonstrate that a general purpose goal-conditioned policy trained on this augmented dataset substantially outperforms one trained only with the original human data.
- Score: 8.209859328381269
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Acquiring multiple skills has commonly involved collecting a large number of
expert demonstrations per task or engineering custom reward functions. Recently
it has been shown that it is possible to acquire a diverse set of skills by
self-supervising control on top of human teleoperated play data. Play is rich
in state space coverage and a policy trained on this data can generalize to
specific tasks at test time, outperforming policies trained on individual expert
task demonstrations. In this work, we explore the question of whether robots
can learn to play to autonomously generate play data that can ultimately
enhance performance. By training a behavioral cloning policy on a relatively
small quantity of human play, we autonomously generate a large quantity of
cloned play data that can be used as additional training data. We demonstrate that a
general purpose goal-conditioned policy trained on this augmented dataset
substantially outperforms one trained only with the original human data on 18
difficult user-specified manipulation tasks in a simulated robotic tabletop
environment. A video example of a robot imitating human play can be seen here:
https://learning-to-play.github.io/videos/undirected_play1.mp4
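
The pipeline described in the abstract (clone a small amount of human play, roll the clone out to produce more play, then train a goal-conditioned policy on the union) can be sketched roughly as follows. This is a toy illustration, not the authors' implementation: the 1-D environment, the nearest-neighbour stand-in for behavioral cloning, the hindsight goal relabeling step, and every function name and parameter here are assumptions made for the example. The final policy training step is omitted; the sketch stops at the augmented, goal-labeled dataset.

```python
import random


def collect_human_play(num_episodes, horizon, rng):
    """Hypothetical stand-in for teleoperated play: random walks on a line."""
    episodes = []
    for _ in range(num_episodes):
        state = rng.uniform(-1.0, 1.0)
        episode = []
        for _ in range(horizon):
            action = rng.choice([-0.1, 0.1])
            episode.append((state, action))
            state += action
        episodes.append(episode)
    return episodes


def fit_behavioral_cloning(episodes):
    """Toy cloned policy: copy the action of the nearest recorded state."""
    pairs = [pair for episode in episodes for pair in episode]

    def policy(state):
        nearest = min(pairs, key=lambda pair: abs(pair[0] - state))
        return nearest[1]

    return policy


def generate_cloned_play(policy, num_episodes, horizon, rng):
    """Roll the cloned policy out from random starts to produce more play."""
    episodes = []
    for _ in range(num_episodes):
        state = rng.uniform(-1.0, 1.0)
        episode = []
        for _ in range(horizon):
            action = policy(state)
            episode.append((state, action))
            state += action
        episodes.append(episode)
    return episodes


def relabel_goals(episodes, window):
    """Assumed self-supervision step: treat a state reached up to `window`
    steps later in the same episode as the goal for the current step."""
    examples = []
    for episode in episodes:
        for t, (state, action) in enumerate(episode):
            last = min(t + window, len(episode) - 1)
            for k in range(t + 1, last + 1):
                goal = episode[k][0]
                examples.append((state, goal, action))
    return examples


def build_augmented_dataset(num_human, num_cloned, horizon, window, seed=0):
    """Small human dataset -> cloned policy -> large cloned dataset -> union."""
    rng = random.Random(seed)
    human = collect_human_play(num_human, horizon, rng)
    policy = fit_behavioral_cloning(human)
    cloned = generate_cloned_play(policy, num_cloned, horizon, rng)
    return relabel_goals(human, window), relabel_goals(human + cloned, window)


if __name__ == "__main__":
    human_only, augmented = build_augmented_dataset(5, 50, horizon=20, window=5)
    print(len(human_only), len(augmented))
```

In this sketch the goal-conditioned learner would consume `(state, goal, action)` triples; comparing a policy trained on `human_only` against one trained on `augmented` mirrors the comparison the abstract reports, though nothing here reproduces the paper's results.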