Visual Imitation Made Easy
- URL: http://arxiv.org/abs/2008.04899v1
- Date: Tue, 11 Aug 2020 17:58:50 GMT
- Title: Visual Imitation Made Easy
- Authors: Sarah Young, Dhiraj Gandhi, Shubham Tulsiani, Abhinav Gupta, Pieter
Abbeel, Lerrel Pinto
- Abstract summary: We present an alternate interface for imitation that simplifies the data collection process while allowing for easy transfer to robots.
We use commercially available reacher-grabber assistive tools both as a data collection device and as the robot's end-effector.
We experimentally evaluate on two challenging tasks: non-prehensile pushing and prehensile stacking, with 1000 diverse demonstrations for each task.
- Score: 102.36509665008732
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Visual imitation learning provides a framework for learning complex
manipulation behaviors by leveraging human demonstrations. However, current
interfaces for imitation such as kinesthetic teaching or teleoperation
prohibitively restrict our ability to efficiently collect large-scale data in
the wild. Obtaining such diverse demonstration data is paramount for the
generalization of learned skills to novel scenarios. In this work, we present
an alternate interface for imitation that simplifies the data collection
process while allowing for easy transfer to robots. We use commercially
available reacher-grabber assistive tools both as a data collection device and
as the robot's end-effector. To extract action information from these visual
demonstrations, we use off-the-shelf Structure from Motion (SfM) techniques in
addition to training a finger detection network. We experimentally evaluate on
two challenging tasks: non-prehensile pushing and prehensile stacking, with
1000 diverse demonstrations for each task. For both tasks, we use standard
behavior cloning to learn executable policies from the previously collected
offline demonstrations. To improve learning performance, we employ a variety of
data augmentations and provide an extensive analysis of their effects. Finally,
we demonstrate the utility of our interface by evaluating on real robotic
scenarios with previously unseen objects and achieve an 87% success rate on
pushing and a 62% success rate on stacking. Robot videos are available at
https://dhiraj100892.github.io/Visual-Imitation-Made-Easy.
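The abstract describes a two-stage pipeline: actions are first recovered offline from the demonstration videos (SfM camera poses plus a finger detection network), and a policy is then trained with standard behavior cloning over augmented images. The sketch below illustrates only the behavior-cloning-with-augmentation step; it is a minimal PyTorch sketch under stated assumptions, and the network architecture, the crop/color-jitter augmentations, and the 4-D action vector are illustrative choices, not the authors' implementation.

```python
# Minimal behavior-cloning sketch (illustrative only, not the paper's code).
# Assumes demonstrations are stored as (image, action) pairs, where the action
# (e.g. gripper translation plus grasp width) was recovered offline from the
# reacher-grabber videos, as the abstract describes.
import torch
import torch.nn as nn
from torch.utils.data import DataLoader, TensorDataset
from torchvision import transforms

# Hypothetical augmentation stack; the paper analyzes several augmentations,
# and this crop/jitter combination is just one plausible choice.
augment = transforms.Compose([
    transforms.ToPILImage(),
    transforms.RandomResizedCrop(224, scale=(0.8, 1.0)),
    transforms.ColorJitter(brightness=0.2, contrast=0.2, saturation=0.2),
    transforms.ToTensor(),
])

class Policy(nn.Module):
    """Maps an RGB observation to a low-dimensional action vector."""
    def __init__(self, action_dim: int = 4):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv2d(3, 32, 5, stride=2), nn.ReLU(),
            nn.Conv2d(32, 64, 5, stride=2), nn.ReLU(),
            nn.Conv2d(64, 64, 3, stride=2), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
        )
        self.head = nn.Linear(64, action_dim)

    def forward(self, x):
        return self.head(self.encoder(x))

def train_bc(images, actions, epochs: int = 10):
    """Standard behavior cloning: regress demonstrated actions with MSE.

    images: uint8 tensor (N, 3, H, W); actions: float tensor (N, action_dim).
    """
    policy = Policy(action_dim=actions.shape[1])
    opt = torch.optim.Adam(policy.parameters(), lr=1e-4)
    loader = DataLoader(TensorDataset(images, actions), batch_size=64, shuffle=True)
    for _ in range(epochs):
        for obs, act in loader:
            obs = torch.stack([augment(o) for o in obs])  # augment each frame
            loss = nn.functional.mse_loss(policy(obs), act)
            opt.zero_grad()
            loss.backward()
            opt.step()
    return policy
```

With roughly 1000 demonstrations per task, a loop like this is what "standard behavior cloning" in the abstract amounts to; the paper's contribution lies in how the (image, action) pairs are collected and in the analysis of augmentations, not in the learner itself.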
Related papers
- VITAL: Visual Teleoperation to Enhance Robot Learning through Human-in-the-Loop Corrections [10.49712834719005]
We propose a low-cost visual teleoperation system for bimanual manipulation tasks, called VITAL.
Our approach leverages affordable hardware and visual processing techniques to collect demonstrations.
We enhance the generalizability and robustness of the learned policies by utilizing both real and simulated environments.
arXiv Detail & Related papers (2024-07-30T23:29:47Z)
- Any-point Trajectory Modeling for Policy Learning [64.23861308947852]
We introduce Any-point Trajectory Modeling (ATM) to predict future trajectories of arbitrary points within a video frame.
ATM outperforms strong video pre-training baselines by 80% on average.
We show effective transfer learning of manipulation skills from human videos and videos from a different robot morphology.
arXiv Detail & Related papers (2023-12-28T23:34:43Z)
- Multi-dataset Training of Transformers for Robust Action Recognition [75.5695991766902]
We study the task of learning robust feature representations that generalize well across multiple datasets for action recognition.
Here, we propose a novel multi-dataset training paradigm, MultiTrain, with the design of two new loss terms, namely informative loss and projection loss.
We verify the effectiveness of our method on five challenging datasets, Kinetics-400, Kinetics-700, Moments-in-Time, Activitynet and Something-something-v2.
arXiv Detail & Related papers (2022-09-26T01:30:43Z)
- Continual Learning from Demonstration of Robotics Skills [5.573543601558405]
Existing methods for teaching motion skills to robots focus on training a single skill at a time.
We propose an approach for continual learning from demonstration using hypernetworks and neural ordinary differential equation solvers.
arXiv Detail & Related papers (2022-02-14T16:26:52Z)
- Playful Interactions for Representation Learning [82.59215739257104]
We propose to use playful interactions in a self-supervised manner to learn visual representations for downstream tasks.
We collect 2 hours of playful data in 19 diverse environments and use self-predictive learning to extract visual representations.
Our representations generalize better than standard behavior cloning and can achieve similar performance with only half the number of required demonstrations.
arXiv Detail & Related papers (2021-07-19T17:54:48Z)
- A Framework for Efficient Robotic Manipulation [79.10407063260473]
We show that, given only 10 demonstrations, a single robotic arm can learn sparse-reward manipulation policies from pixels.
arXiv Detail & Related papers (2020-12-14T22:18:39Z)
- Learning Object Manipulation Skills via Approximate State Estimation from Real Videos [47.958512470724926]
Humans are adept at learning new tasks by watching a few instructional videos.
On the other hand, robots that learn new actions either require a lot of effort through trial and error, or use expert demonstrations that are challenging to obtain.
In this paper, we explore a method that facilitates learning object manipulation skills directly from videos.
arXiv Detail & Related papers (2020-11-13T08:53:47Z)