RoboTAP: Tracking Arbitrary Points for Few-Shot Visual Imitation
- URL: http://arxiv.org/abs/2308.15975v2
- Date: Thu, 31 Aug 2023 15:29:44 GMT
- Title: RoboTAP: Tracking Arbitrary Points for Few-Shot Visual Imitation
- Authors: Mel Vecerik and Carl Doersch and Yi Yang and Todor Davchev and Yusuf Aytar and Guangyao Zhou and Raia Hadsell and Lourdes Agapito and Jon Scholz
- Abstract summary: Track-Any-Point (TAP) models isolate the relevant motion in a demonstration, and parameterize a low-level controller to reproduce this motion across changes in the scene configuration.
We show this results in robust robot policies that can solve complex object-arrangement tasks such as shape-matching, stacking, and even full path-following tasks such as applying glue and sticking objects together.
- Score: 36.43143326197769
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: For robots to be useful outside labs and specialized factories we need a way
to teach them new useful behaviors quickly. Current approaches lack either the
generality to onboard new tasks without task-specific engineering, or else lack
the data-efficiency to do so in an amount of time that enables practical use.
In this work we explore dense tracking as a representational vehicle to allow
faster and more general learning from demonstration. Our approach utilizes
Track-Any-Point (TAP) models to isolate the relevant motion in a demonstration,
and parameterize a low-level controller to reproduce this motion across changes
in the scene configuration. We show this results in robust robot policies that
can solve complex object-arrangement tasks such as shape-matching, stacking,
and even full path-following tasks such as applying glue and sticking objects
together, all from demonstrations that can be collected in minutes.
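The controller described above can be pictured with a minimal point-based visual-servoing sketch: a Track-Any-Point model reports where the selected points currently are, and the robot is driven to reduce the gap to where those points were in the demonstration. This is only an illustration, not the paper's implementation; `track_points`, the gain, and the pixel-to-world scaling are placeholder assumptions.

```python
import numpy as np

def point_servo_step(track_points, frame, query_points, goal_points,
                     gain=0.5, image_to_world=0.001):
    """One step of a point-based servo toward the goal configuration.

    track_points: callable(frame, query_points) -> (N, 2) current pixel
        locations of the queried points (stand-in for any TAP model).
    goal_points: (N, 2) pixel locations of the same points in the
        demonstration frame we want to reproduce.
    Returns a 2D end-effector translation command and the mean pixel error.
    """
    current = track_points(frame, query_points)            # (N, 2)
    error = goal_points - current                          # pixel-space residual
    command = gain * image_to_world * error.mean(axis=0)   # crude image-to-world scaling
    return command, np.linalg.norm(error, axis=1).mean()

# Illustrative usage with a dummy tracker that adds noise around the goals.
if __name__ == "__main__":
    rng = np.random.default_rng(0)
    goal = rng.uniform(0, 480, size=(8, 2))
    dummy_tracker = lambda frame, q: goal + rng.normal(0, 3.0, size=goal.shape)
    cmd, err = point_servo_step(dummy_tracker, frame=None,
                                query_points=None, goal_points=goal)
    print(cmd, err)
```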
Related papers
- Keypoint Abstraction using Large Models for Object-Relative Imitation Learning [78.92043196054071]
Generalization to novel object configurations and instances across diverse tasks and environments is a critical challenge in robotics.
Keypoint-based representations have proven effective as a succinct way of capturing essential object features.
We propose KALM, a framework that leverages large pre-trained vision-language models to automatically generate task-relevant and cross-instance consistent keypoints.
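One way to read "cross-instance consistent keypoints" is as a filtering step: keep only candidate points whose local descriptors can be re-found in other images of the task. The sketch below is a hypothetical illustration of that idea; in KALM the candidates come from a large pre-trained vision-language model, which is abstracted away here as a plain feature array.

```python
import numpy as np

def consistent_keypoints(candidate_feats, other_image_feats, sim_thresh=0.8):
    """Filter candidate keypoints by cross-instance feature consistency.

    candidate_feats: (K, D) unit-norm descriptors of keypoints proposed in a
        reference image (the proposal model is abstracted away here).
    other_image_feats: list of (Ni, D) unit-norm descriptors from other
        instances/images of the task.
    Keeps a candidate if every other image contains a sufficiently similar
    descriptor, i.e. the keypoint is re-identifiable across instances.
    """
    keep = []
    for k, f in enumerate(candidate_feats):
        if all((feats @ f).max() >= sim_thresh for feats in other_image_feats):
            keep.append(k)
    return keep

# Illustrative usage with toy unit vectors.
cands = np.eye(3)
others = [np.eye(3)[:2], np.eye(3)]         # second image lacks the third keypoint
print(consistent_keypoints(cands, others))  # -> [0, 1]
```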
arXiv Detail & Related papers (2024-10-30T17:37:31Z)
- DITTO: Demonstration Imitation by Trajectory Transformation [31.930923345163087]
In this work, we address the problem of one-shot imitation from a single human demonstration, given by an RGB-D video recording.
We propose a two-stage process. In the first stage, we extract the demonstration trajectory offline; this entails segmenting the manipulated objects and determining their motion relative to secondary objects such as containers.
In the online trajectory generation stage, we first re-detect all objects, then warp the demonstration trajectory to the current scene and execute it on the robot.
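A bare-bones version of the trajectory-warping step can be written as a rigid re-anchoring: estimate how the re-detected object's pose differs from its pose in the demonstration, and apply that transform to every recorded end-effector pose. This is a generic SE(3) warp sketch under assumed 4x4 pose matrices, not DITTO's actual pipeline.

```python
import numpy as np

def warp_trajectory(demo_traj, obj_pose_demo, obj_pose_now):
    """Warp a demonstrated end-effector trajectory to the current scene.

    demo_traj: list of 4x4 end-effector poses recorded in the demonstration.
    obj_pose_demo / obj_pose_now: 4x4 poses of the manipulated object at
        demonstration time and after re-detection in the current scene.
    The demonstration is expressed relative to the object, then re-anchored.
    """
    delta = obj_pose_now @ np.linalg.inv(obj_pose_demo)  # object pose change
    return [delta @ T for T in demo_traj]

# Illustrative usage: the object has moved 10 cm along x since the demo.
demo = [np.eye(4)]
pose_demo = np.eye(4)
pose_now = np.eye(4); pose_now[0, 3] = 0.10
print(warp_trajectory(demo, pose_demo, pose_now)[0][:3, 3])  # -> [0.1, 0, 0]
```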
arXiv Detail & Related papers (2024-03-22T13:46:51Z)
- Interactive Planning Using Large Language Models for Partially Observable Robotics Tasks [54.60571399091711]
Large Language Models (LLMs) have achieved impressive results in creating robotic agents for performing open vocabulary tasks.
We present an interactive planning technique for partially observable tasks using LLMs.
arXiv Detail & Related papers (2023-12-11T22:54:44Z)
- Few-Shot In-Context Imitation Learning via Implicit Graph Alignment [15.215659641228655]
We formulate imitation learning as a conditional alignment problem between graph representations of objects.
We show that this conditioning allows for in-context learning, where a robot can perform a task on a set of new objects immediately after the demonstrations.
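To make the alignment idea concrete, the sketch below matches objects in a new scene to objects in a demonstration by comparing per-object feature vectors. This explicit greedy matching is only a stand-in; the referenced paper learns the alignment implicitly.

```python
import numpy as np

def align_objects(demo_feats, test_feats):
    """Greedy nearest-neighbour assignment of test objects to demo objects.

    demo_feats, test_feats: (N, D) and (M, D) per-object feature vectors
    (e.g. from any visual encoder). Returns a dict test_idx -> demo_idx.
    """
    sims = test_feats @ demo_feats.T          # (M, N) similarity scores
    assignment, used = {}, set()
    for i in np.argsort(-sims.max(axis=1)):   # most confident test object first
        for j in np.argsort(-sims[i]):
            if j not in used:
                assignment[int(i)] = int(j)
                used.add(j)
                break
    return assignment

# Illustrative usage: three "objects" with orthogonal features, permuted and noisy.
demo = np.eye(3)
test = demo[[2, 0, 1]] + 0.01
print(align_objects(demo, test))  # -> {0: 2, 1: 0, 2: 1}
```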
arXiv Detail & Related papers (2023-10-18T18:26:01Z)
- Visuomotor Control in Multi-Object Scenes Using Object-Aware Representations [25.33452947179541]
We show the effectiveness of object-aware representation learning techniques for robotic tasks.
Our model learns control policies in a sample-efficient manner and outperforms state-of-the-art object-agnostic techniques.
arXiv Detail & Related papers (2022-05-12T19:48:11Z)
- Bottom-Up Skill Discovery from Unsegmented Demonstrations for Long-Horizon Robot Manipulation [55.31301153979621]
We tackle real-world long-horizon robot manipulation tasks through skill discovery.
We present a bottom-up approach to learning a library of reusable skills from unsegmented demonstrations.
Our method has shown superior performance over state-of-the-art imitation learning methods in multi-stage manipulation tasks.
arXiv Detail & Related papers (2021-09-28T16:18:54Z)
- Learning to Shift Attention for Motion Generation [55.61994201686024]
One challenge of motion generation with learning-from-demonstration techniques is that human demonstrations for a single task query follow a distribution with multiple modes.
Previous approaches fail to capture all modes or tend to average modes of the demonstrations and thus generate invalid trajectories.
We propose a motion generation model with extrapolation ability to overcome this problem.
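The averaging failure is easy to reproduce in a toy setting: if half the demonstrations detour left around an obstacle and half detour right, their mean passes straight through it. The numbers below are purely illustrative.

```python
import numpy as np

# Two valid demonstration modes: detour left (y=+1) or right (y=-1)
# around an obstacle sitting on the straight line y=0 near x=0.5.
xs = np.linspace(0.0, 1.0, 5)
left = np.stack([xs, np.array([0.0, 1.0, 1.0, 1.0, 0.0])], axis=1)
right = np.stack([xs, -left[:, 1]], axis=1)

mean_traj = 0.5 * (left + right)   # naive averaging of the two modes
print(mean_traj[:, 1])             # all zeros: the "average" path runs
# straight through the obstacle, i.e. it is not a valid trajectory.
```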
arXiv Detail & Related papers (2021-02-24T09:07:52Z)
- Visual Imitation Made Easy [102.36509665008732]
We present an alternate interface for imitation that simplifies the data collection process while allowing for easy transfer to robots.
We use commercially available reacher-grabber assistive tools both as a data collection device and as the robot's end-effector.
We experimentally evaluate on two challenging tasks: non-prehensile pushing and prehensile stacking, with 1000 diverse demonstrations for each task.
arXiv Detail & Related papers (2020-08-11T17:58:50Z)
- Meta Adaptation using Importance Weighted Demonstrations [19.37671674146514]
In some cases, the distribution shifts so much that it is difficult for an agent to infer the new task.
We propose a novel algorithm that generalizes to any related task by leveraging prior knowledge of a set of specific tasks.
We show experiments where the robot is trained from a diversity of environmental tasks and is also able to adapt to an unseen environment.
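The title's "importance weighted demonstrations" suggests re-weighting prior-task data by how relevant each sample is to the new task. Below is a generic importance-weighted behavior-cloning loss with hypothetical density estimates, offered as a sketch rather than the paper's algorithm.

```python
import numpy as np

def weighted_bc_loss(pred_actions, demo_actions, p_target, p_source, eps=1e-8):
    """Importance-weighted behavior-cloning loss over prior-task demos.

    p_target / p_source: per-sample likelihood estimates of each demonstration
    state under the new task and under the task it came from. These would come
    from learned models in practice; here they are just arrays.
    """
    w = p_target / (p_source + eps)    # importance weights
    w = w / (w.sum() + eps)            # normalize for a stable scale
    sq_err = ((pred_actions - demo_actions) ** 2).sum(axis=1)
    return float((w * sq_err).sum())

# Illustrative call with random placeholder data.
rng = np.random.default_rng(1)
print(weighted_bc_loss(rng.normal(size=(16, 2)), rng.normal(size=(16, 2)),
                       p_target=rng.uniform(0.1, 1.0, 16),
                       p_source=rng.uniform(0.1, 1.0, 16)))
```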
arXiv Detail & Related papers (2019-11-23T07:22:32Z)
This list is automatically generated from the titles and abstracts of the papers on this site.