Playful Interactions for Representation Learning
- URL: http://arxiv.org/abs/2107.09046v1
- Date: Mon, 19 Jul 2021 17:54:48 GMT
- Title: Playful Interactions for Representation Learning
- Authors: Sarah Young, Jyothish Pari, Pieter Abbeel, Lerrel Pinto
- Abstract summary: We propose to use playful interactions in a self-supervised manner to learn visual representations for downstream tasks.
We collect 2 hours of playful data in 19 diverse environments and use self-predictive learning to extract visual representations.
Our representations generalize better than standard behavior cloning and can achieve similar performance with only half the number of required demonstrations.
- Score: 82.59215739257104
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: One of the key challenges in visual imitation learning is collecting large
numbers of expert demonstrations for a given task. While collecting human
demonstrations has become easier with teleoperation and low-cost assistive
tools, we often still require 100-1000 demonstrations per task to learn a
visual representation and policy. To address this, we turn to an alternate form
of data that does not require task-specific demonstrations -- play. Play is a
fundamental way in which children acquire skills, behaviors, and visual
representations during early learning.
Importantly, play data is diverse, task-agnostic, and relatively cheap to
obtain. In this work, we propose to use playful interactions in a
self-supervised manner to learn visual representations for downstream tasks. We
collect 2 hours of playful data in 19 diverse environments and use
self-predictive learning to extract visual representations. Given these
representations, we train policies using imitation learning for two downstream
tasks: Pushing and Stacking. We demonstrate that our visual representations
generalize better than standard behavior cloning and can achieve similar
performance with only half the number of required demonstrations. Our
representations, which are trained from scratch, compare favorably against
ImageNet pretrained representations. Finally, we provide an experimental
analysis on the effects of different pretraining modes on downstream task
learning.
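As a rough illustration of the pipeline described in the abstract, the sketch below (not the authors' released code) pairs a BYOL-style self-predictive objective on augmented play frames with a behavior-cloning head trained on top of the frozen encoder. The encoder architecture, projection sizes, EMA rate, action dimension, and the random tensors standing in for play frames and demonstration pairs are all illustrative assumptions.

```python
import copy

import torch
import torch.nn as nn
import torch.nn.functional as F


def make_encoder(feat_dim: int = 256) -> nn.Module:
    """Small conv encoder mapping an RGB frame to a feature vector."""
    return nn.Sequential(
        nn.Conv2d(3, 32, 5, stride=2), nn.ReLU(),
        nn.Conv2d(32, 64, 3, stride=2), nn.ReLU(),
        nn.Conv2d(64, 128, 3, stride=2), nn.ReLU(),
        nn.AdaptiveAvgPool2d(1), nn.Flatten(),
        nn.Linear(128, feat_dim),
    )


class SelfPredictiveLearner(nn.Module):
    """BYOL-style self-prediction: an online branch predicts the target
    branch's embedding of a second augmented view of the same play frame."""

    def __init__(self, feat_dim: int = 256, proj_dim: int = 128, ema: float = 0.99):
        super().__init__()
        self.encoder = make_encoder(feat_dim)
        self.projector = nn.Sequential(
            nn.Linear(feat_dim, proj_dim), nn.ReLU(), nn.Linear(proj_dim, proj_dim))
        self.predictor = nn.Sequential(
            nn.Linear(proj_dim, proj_dim), nn.ReLU(), nn.Linear(proj_dim, proj_dim))
        # The target branch is a gradient-free EMA copy of the online branch.
        self.target_encoder = copy.deepcopy(self.encoder)
        self.target_projector = copy.deepcopy(self.projector)
        for p in list(self.target_encoder.parameters()) + list(self.target_projector.parameters()):
            p.requires_grad = False
        self.ema = ema

    def loss(self, view1: torch.Tensor, view2: torch.Tensor) -> torch.Tensor:
        """view1/view2: two augmentations of the same play frame, shape (B, 3, H, W)."""
        pred = self.predictor(self.projector(self.encoder(view1)))
        with torch.no_grad():
            target = self.target_projector(self.target_encoder(view2))
        # Negative cosine similarity between prediction and target embedding.
        return 2 - 2 * F.cosine_similarity(pred, target, dim=-1).mean()

    @torch.no_grad()
    def update_target(self) -> None:
        for online, target in [(self.encoder, self.target_encoder),
                               (self.projector, self.target_projector)]:
            for p, tp in zip(online.parameters(), target.parameters()):
                tp.data.lerp_(p.data, 1.0 - self.ema)  # tp <- ema*tp + (1-ema)*p


class BCPolicy(nn.Module):
    """Behavior-cloning head on top of the frozen, play-pretrained encoder."""

    def __init__(self, encoder: nn.Module, feat_dim: int = 256, action_dim: int = 4):
        super().__init__()
        self.encoder = encoder
        for p in self.encoder.parameters():  # keep pretrained features fixed
            p.requires_grad = False
        self.head = nn.Sequential(
            nn.Linear(feat_dim, 128), nn.ReLU(), nn.Linear(128, action_dim))

    def forward(self, obs: torch.Tensor) -> torch.Tensor:
        with torch.no_grad():
            feats = self.encoder(obs)
        return self.head(feats)


# Stage 1: self-predictive pretraining on (augmented) play frames.
learner = SelfPredictiveLearner()
opt = torch.optim.Adam([p for p in learner.parameters() if p.requires_grad], lr=3e-4)
view1, view2 = torch.rand(8, 3, 84, 84), torch.rand(8, 3, 84, 84)  # stand-in batch
loss = learner.loss(view1, view2)
opt.zero_grad(); loss.backward(); opt.step(); learner.update_target()

# Stage 2: behavior cloning for a downstream task (e.g. pushing or stacking).
policy = BCPolicy(learner.encoder)
bc_opt = torch.optim.Adam(policy.head.parameters(), lr=1e-3)
obs, act = torch.rand(8, 3, 84, 84), torch.rand(8, 4)  # stand-in demonstration pairs
bc_loss = F.mse_loss(policy(obs), act)
bc_opt.zero_grad(); bc_loss.backward(); bc_opt.step()
```

Keeping the encoder frozen during behavior cloning mirrors the decoupling of representation learning from policy learning described above; in practice the downstream head could also be fine-tuned end-to-end.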
Related papers
- What Makes Pre-Trained Visual Representations Successful for Robust Manipulation? [57.92924256181857]
We find that visual representations designed for manipulation and control tasks do not necessarily generalize under subtle changes in lighting and scene texture.
We find that emergent segmentation ability is a strong predictor of out-of-distribution generalization among ViT models.
arXiv Detail & Related papers (2023-11-03T18:09:08Z)
- Self-Supervised Learning of Multi-Object Keypoints for Robotic Manipulation [8.939008609565368]
In this paper, we demonstrate the efficacy of learning image keypoints via the Dense Correspondence pretext task for downstream policy learning.
We evaluate our approach on diverse robot manipulation tasks, compare it to other visual representation learning approaches, and demonstrate its flexibility and effectiveness for sample-efficient policy learning.
arXiv Detail & Related papers (2022-05-17T13:15:07Z)
- The Surprising Effectiveness of Representation Learning for Visual Imitation [12.60653315718265]
We propose to decouple representation learning from behavior learning for visual imitation.
First, we learn a visual representation encoder from offline data using standard supervised and self-supervised learning methods.
We experimentally show that this simple decoupling improves the performance of visual imitation models on both offline demonstration datasets and real-robot door opening compared to prior work in visual imitation.
arXiv Detail & Related papers (2021-12-02T18:58:09Z)
- Learning Multi-Stage Tasks with One Demonstration via Self-Replay [9.34061107793983]
We introduce a novel method to learn everyday-like multi-stage tasks from a single human demonstration.
Inspired by the recent Coarse-to-Fine Imitation Learning method, we model imitation learning as a learned object reaching phase followed by an open-loop replay of the demonstrator's actions.
We evaluate with real-world experiments on a set of everyday-like multi-stage tasks and show that our method can solve them from a single demonstration.
arXiv Detail & Related papers (2021-11-14T20:57:52Z)
- Visual Adversarial Imitation Learning using Variational Models [60.69745540036375]
Reward function specification remains a major impediment for learning behaviors through deep reinforcement learning.
Visual demonstrations of desired behaviors often present an easier and more natural way to teach agents.
We develop a variational model-based adversarial imitation learning algorithm.
arXiv Detail & Related papers (2021-07-16T00:15:18Z)
- Learning Object Manipulation Skills via Approximate State Estimation from Real Videos [47.958512470724926]
Humans are adept at learning new tasks by watching a few instructional videos.
On the other hand, robots that learn new actions either require a lot of effort through trial and error, or use expert demonstrations that are challenging to obtain.
In this paper, we explore a method that facilitates learning object manipulation skills directly from videos.
arXiv Detail & Related papers (2020-11-13T08:53:47Z)
- What Can You Learn from Your Muscles? Learning Visual Representation from Human Interactions [50.435861435121915]
We use human interaction and attention cues to investigate whether we can learn better representations than visual-only ones.
Our experiments show that our "muscly-supervised" representation outperforms a visual-only state-of-the-art method MoCo.
arXiv Detail & Related papers (2020-10-16T17:46:53Z)
- Visual Imitation Made Easy [102.36509665008732]
We present an alternate interface for imitation that simplifies the data collection process while allowing for easy transfer to robots.
We use commercially available reacher-grabber assistive tools both as a data collection device and as the robot's end-effector.
We experimentally evaluate on two challenging tasks: non-prehensile pushing and prehensile stacking, with 1000 diverse demonstrations for each task.
arXiv Detail & Related papers (2020-08-11T17:58:50Z)