RoboTAP: Tracking Arbitrary Points for Few-Shot Visual Imitation
- URL: http://arxiv.org/abs/2308.15975v2
- Date: Thu, 31 Aug 2023 15:29:44 GMT
- Title: RoboTAP: Tracking Arbitrary Points for Few-Shot Visual Imitation
- Authors: Mel Vecerik and Carl Doersch and Yi Yang and Todor Davchev and Yusuf Aytar and Guangyao Zhou and Raia Hadsell and Lourdes Agapito and Jon Scholz
- Abstract summary: Track-Any-Point (TAP) models isolate the relevant motion in a demonstration, and parameterize a low-level controller to reproduce this motion across changes in the scene configuration.
We show this results in robust robot policies that can solve complex object-arrangement tasks such as shape-matching, stacking, and even full path-following tasks such as applying glue and sticking objects together.
- Score: 36.43143326197769
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: For robots to be useful outside labs and specialized factories we need a way
to teach them new useful behaviors quickly. Current approaches lack either the
generality to onboard new tasks without task-specific engineering, or else lack
the data-efficiency to do so in an amount of time that enables practical use.
In this work we explore dense tracking as a representational vehicle to allow
faster and more general learning from demonstration. Our approach utilizes
Track-Any-Point (TAP) models to isolate the relevant motion in a demonstration,
and parameterize a low-level controller to reproduce this motion across changes
in the scene configuration. We show this results in robust robot policies that
can solve complex object-arrangement tasks such as shape-matching, stacking,
and even full path-following tasks such as applying glue and sticking objects
together, all from demonstrations that can be collected in minutes.
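The controller described above can be pictured with a minimal point-based visual-servoing sketch: a Track-Any-Point model reports where the selected points currently are, and the robot is driven to reduce the gap to where those points were in the demonstration. This is only an illustration, not the paper's implementation; `track_points`, the gain, and the pixel-to-world scaling are placeholder assumptions.

```python
import numpy as np

def point_servo_step(track_points, frame, query_points, goal_points,
                     gain=0.5, image_to_world=0.001):
    """One step of a point-based servo toward the goal configuration.

    track_points: callable(frame, query_points) -> (N, 2) current pixel
        locations of the queried points (stand-in for any TAP model).
    goal_points: (N, 2) pixel locations of the same points in the
        demonstration frame we want to reproduce.
    Returns a 2D end-effector translation command and the mean pixel error.
    """
    current = track_points(frame, query_points)            # (N, 2)
    error = goal_points - current                          # pixel-space residual
    command = gain * image_to_world * error.mean(axis=0)   # crude image-to-world scaling
    return command, np.linalg.norm(error, axis=1).mean()

# Illustrative usage with a dummy tracker that adds noise around the goals.
if __name__ == "__main__":
    rng = np.random.default_rng(0)
    goal = rng.uniform(0, 480, size=(8, 2))
    dummy_tracker = lambda frame, q: goal + rng.normal(0, 3.0, size=goal.shape)
    cmd, err = point_servo_step(dummy_tracker, frame=None,
                                query_points=None, goal_points=goal)
    print(cmd, err)
```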
Related papers
- Keypoint Abstraction using Large Models for Object-Relative Imitation Learning [78.92043196054071]
Generalization to novel object configurations and instances across diverse tasks and environments is a critical challenge in robotics.
Keypoint-based representations have proven effective as a succinct way of capturing essential object features.
We propose KALM, a framework that leverages large pre-trained vision-language models to automatically generate task-relevant and cross-instance consistent keypoints.
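One way to read "cross-instance consistent keypoints" is as a filtering step: keep only candidate points whose local descriptors can be re-found in other images of the task. The sketch below is a hypothetical illustration of that idea; in KALM the candidates come from a large pre-trained vision-language model, which is abstracted away here as a plain feature array.

```python
import numpy as np

def consistent_keypoints(candidate_feats, other_image_feats, sim_thresh=0.8):
    """Filter candidate keypoints by cross-instance feature consistency.

    candidate_feats: (K, D) unit-norm descriptors of keypoints proposed in a
        reference image (the proposal model is abstracted away here).
    other_image_feats: list of (Ni, D) unit-norm descriptors from other
        instances/images of the task.
    Keeps a candidate if every other image contains a sufficiently similar
    descriptor, i.e. the keypoint is re-identifiable across instances.
    """
    keep = []
    for k, f in enumerate(candidate_feats):
        if all((feats @ f).max() >= sim_thresh for feats in other_image_feats):
            keep.append(k)
    return keep

# Illustrative usage with toy unit vectors.
cands = np.eye(3)
others = [np.eye(3)[:2], np.eye(3)]         # second image lacks the third keypoint
print(consistent_keypoints(cands, others))  # -> [0, 1]
```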
arXiv Detail & Related papers (2024-10-30T17:37:31Z)
- DITTO: Demonstration Imitation by Trajectory Transformation [31.930923345163087]
In this work, we address the problem of one-shot imitation from a single human demonstration, given by an RGB-D video recording.
We propose a two-stage process. In the first stage, we extract the demonstration trajectory offline; this entails segmenting the manipulated objects and determining their motion relative to secondary objects such as containers.
In the online trajectory generation stage, we first re-detect all objects, then warp the demonstration trajectory to the current scene and execute it on the robot.
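A bare-bones version of the trajectory-warping step can be written as a rigid re-anchoring: estimate how the re-detected object's pose differs from its pose in the demonstration, and apply that transform to every recorded end-effector pose. This is a generic SE(3) warp sketch under assumed 4x4 pose matrices, not DITTO's actual pipeline.

```python
import numpy as np

def warp_trajectory(demo_traj, obj_pose_demo, obj_pose_now):
    """Warp a demonstrated end-effector trajectory to the current scene.

    demo_traj: list of 4x4 end-effector poses recorded in the demonstration.
    obj_pose_demo / obj_pose_now: 4x4 poses of the manipulated object at
        demonstration time and after re-detection in the current scene.
    The demonstration is expressed relative to the object, then re-anchored.
    """
    delta = obj_pose_now @ np.linalg.inv(obj_pose_demo)  # object pose change
    return [delta @ T for T in demo_traj]

# Illustrative usage: the object has moved 10 cm along x since the demo.
demo = [np.eye(4)]
pose_demo = np.eye(4)
pose_now = np.eye(4); pose_now[0, 3] = 0.10
print(warp_trajectory(demo, pose_demo, pose_now)[0][:3, 3])  # -> [0.1, 0, 0]
```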
arXiv Detail & Related papers (2024-03-22T13:46:51Z)
- Interactive Planning Using Large Language Models for Partially Observable Robotics Tasks [54.60571399091711]
Large Language Models (LLMs) have achieved impressive results in creating robotic agents for performing open vocabulary tasks.
We present an interactive planning technique for partially observable tasks using LLMs.
arXiv Detail & Related papers (2023-12-11T22:54:44Z)
- Few-Shot In-Context Imitation Learning via Implicit Graph Alignment [15.215659641228655]
We formulate imitation learning as a conditional alignment problem between graph representations of objects.
We show that this conditioning allows for in-context learning, where a robot can perform a task on a set of new objects immediately after the demonstrations.
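To make the alignment idea concrete, the sketch below matches objects in a new scene to objects in a demonstration by comparing per-object feature vectors. This explicit greedy matching is only a stand-in; the referenced paper learns the alignment implicitly.

```python
import numpy as np

def align_objects(demo_feats, test_feats):
    """Greedy nearest-neighbour assignment of test objects to demo objects.

    demo_feats, test_feats: (N, D) and (M, D) per-object feature vectors
    (e.g. from any visual encoder). Returns a dict test_idx -> demo_idx.
    """
    sims = test_feats @ demo_feats.T          # (M, N) similarity scores
    assignment, used = {}, set()
    for i in np.argsort(-sims.max(axis=1)):   # most confident test object first
        for j in np.argsort(-sims[i]):
            if j not in used:
                assignment[int(i)] = int(j)
                used.add(j)
                break
    return assignment

# Illustrative usage: three "objects" with orthogonal features, permuted and noisy.
demo = np.eye(3)
test = demo[[2, 0, 1]] + 0.01
print(align_objects(demo, test))  # -> {0: 2, 1: 0, 2: 1}
```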
arXiv Detail & Related papers (2023-10-18T18:26:01Z)
- Visuomotor Control in Multi-Object Scenes Using Object-Aware Representations [25.33452947179541]
We show the effectiveness of object-aware representation learning techniques for robotic tasks.
Our model learns control policies in a sample-efficient manner and outperforms state-of-the-art object-agnostic techniques.
arXiv Detail & Related papers (2022-05-12T19:48:11Z)
- Bottom-Up Skill Discovery from Unsegmented Demonstrations for Long-Horizon Robot Manipulation [55.31301153979621]
We tackle real-world long-horizon robot manipulation tasks through skill discovery.
We present a bottom-up approach to learning a library of reusable skills from unsegmented demonstrations.
Our method has shown superior performance over state-of-the-art imitation learning methods in multi-stage manipulation tasks.
arXiv Detail & Related papers (2021-09-28T16:18:54Z)
- Learning to Shift Attention for Motion Generation [55.61994201686024]
One challenge of motion generation with learning-from-demonstration techniques is that human demonstrations for a single task query follow a distribution with multiple modes.
Previous approaches fail to capture all modes or tend to average modes of the demonstrations and thus generate invalid trajectories.
We propose a motion generation model with extrapolation ability to overcome this problem.
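The averaging failure is easy to reproduce in a toy setting: if half the demonstrations detour left around an obstacle and half detour right, their mean passes straight through it. The numbers below are purely illustrative.

```python
import numpy as np

# Two valid demonstration modes: detour left (y=+1) or right (y=-1)
# around an obstacle sitting on the straight line y=0 near x=0.5.
xs = np.linspace(0.0, 1.0, 5)
left = np.stack([xs, np.array([0.0, 1.0, 1.0, 1.0, 0.0])], axis=1)
right = np.stack([xs, -left[:, 1]], axis=1)

mean_traj = 0.5 * (left + right)   # naive averaging of the two modes
print(mean_traj[:, 1])             # all zeros: the "average" path runs
# straight through the obstacle, i.e. it is not a valid trajectory.
```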
arXiv Detail & Related papers (2021-02-24T09:07:52Z)
- Visual Imitation Made Easy [102.36509665008732]
We present an alternate interface for imitation that simplifies the data collection process while allowing for easy transfer to robots.
We use commercially available reacher-grabber assistive tools both as a data collection device and as the robot's end-effector.
We experimentally evaluate on two challenging tasks: non-prehensile pushing and prehensile stacking, with 1000 diverse demonstrations for each task.
arXiv Detail & Related papers (2020-08-11T17:58:50Z)
- Meta Adaptation using Importance Weighted Demonstrations [19.37671674146514]
In some cases, the distribution shifts so much that it is difficult for an agent to infer the new task.
We propose a novel algorithm that generalizes to any related task by leveraging prior knowledge of a set of specific tasks.
We show experiments where the robot is trained from a diversity of environmental tasks and is also able to adapt to an unseen environment.
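The title's "importance weighted demonstrations" suggests re-weighting prior-task data by how relevant each sample is to the new task. Below is a generic importance-weighted behavior-cloning loss with hypothetical density estimates, offered as a sketch rather than the paper's algorithm.

```python
import numpy as np

def weighted_bc_loss(pred_actions, demo_actions, p_target, p_source, eps=1e-8):
    """Importance-weighted behavior-cloning loss over prior-task demos.

    p_target / p_source: per-sample likelihood estimates of each demonstration
    state under the new task and under the task it came from. These would come
    from learned models in practice; here they are just arrays.
    """
    w = p_target / (p_source + eps)    # importance weights
    w = w / (w.sum() + eps)            # normalize for a stable scale
    sq_err = ((pred_actions - demo_actions) ** 2).sum(axis=1)
    return float((w * sq_err).sum())

# Illustrative call with random placeholder data.
rng = np.random.default_rng(1)
print(weighted_bc_loss(rng.normal(size=(16, 2)), rng.normal(size=(16, 2)),
                       p_target=rng.uniform(0.1, 1.0, 16),
                       p_source=rng.uniform(0.1, 1.0, 16)))
```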
arXiv Detail & Related papers (2019-11-23T07:22:32Z)
This list is automatically generated from the titles and abstracts of the papers on this site.