RT-Trajectory: Robotic Task Generalization via Hindsight Trajectory
Sketches
- URL: http://arxiv.org/abs/2311.01977v2
- Date: Mon, 6 Nov 2023 05:53:08 GMT
- Title: RT-Trajectory: Robotic Task Generalization via Hindsight Trajectory
Sketches
- Authors: Jiayuan Gu, Sean Kirmani, Paul Wohlhart, Yao Lu, Montserrat Gonzalez
Arenas, Kanishka Rao, Wenhao Yu, Chuyuan Fu, Keerthana Gopalakrishnan, Zhuo
Xu, Priya Sundaresan, Peng Xu, Hao Su, Karol Hausman, Chelsea Finn, Quan
Vuong, Ted Xiao
- Abstract summary: Generalization remains one of the most important desiderata for robust robot learning systems.
We propose a policy conditioning method using rough trajectory sketches.
We show that RT-Trajectory is able to perform a wider range of tasks compared to language-conditioned and goal-conditioned policies.
- Score: 74.300116260004
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Generalization remains one of the most important desiderata for robust robot
learning systems. While recently proposed approaches show promise in
generalization to novel objects, semantic concepts, or visual distribution
shifts, generalization to new tasks remains challenging. For example, a
language-conditioned policy trained on pick-and-place tasks will not be able to
generalize to a folding task, even if the arm trajectory of folding is similar
to pick-and-place. Our key insight is that this kind of generalization becomes
feasible if we represent the task through rough trajectory sketches. We propose
a policy conditioning method using such rough trajectory sketches, which we
call RT-Trajectory, that is practical, easy to specify, and allows the policy
to effectively perform new tasks that would otherwise be challenging to
perform. We find that trajectory sketches strike a balance between being
detailed enough to express low-level motion-centric guidance and coarse enough
to allow the learned policy to interpret the trajectory sketch in the context
of situational visual observations. In addition, we show how trajectory
sketches can provide a useful interface to communicate with robotic policies:
they can be specified through simple human inputs like drawings or videos, or
through automated methods such as modern image-generating or
waypoint-generating methods. We evaluate RT-Trajectory at scale on a variety of
real-world robotic tasks, and find that RT-Trajectory is able to perform a
wider range of tasks compared to language-conditioned and goal-conditioned
policies, when provided the same training data.
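To make the conditioning idea above concrete, here is a minimal, hypothetical sketch (not the authors' RT-Trajectory implementation): a rough 2D end-effector path is rasterized into an extra image channel, concatenated with the camera observation, and fed to a small convolutional policy. The rasterization scheme, network shapes, 128x128 resolution, and 7-DoF action head are all illustrative assumptions.
```python
# Minimal sketch of trajectory-sketch conditioning (illustrative only; not the
# authors' RT-Trajectory architecture). A coarse 2D path is drawn into a single
# image channel and stacked with the RGB observation as policy input.
import numpy as np
import torch
import torch.nn as nn

def rasterize_sketch(waypoints_px, height=128, width=128):
    """Draw a coarse polyline through image-plane waypoints into one channel."""
    canvas = np.zeros((height, width), dtype=np.float32)
    pts = np.asarray(waypoints_px, dtype=np.float32)
    for (x0, y0), (x1, y1) in zip(pts[:-1], pts[1:]):
        n = int(max(abs(x1 - x0), abs(y1 - y0))) + 1
        xs = np.linspace(x0, x1, n).round().astype(int).clip(0, width - 1)
        ys = np.linspace(y0, y1, n).round().astype(int).clip(0, height - 1)
        canvas[ys, xs] = 1.0
    return canvas

class SketchConditionedPolicy(nn.Module):
    """Tiny conv policy over RGB + sketch channels; the 7-DoF action head is a
    placeholder for an arm command and is an assumption, not from the paper."""
    def __init__(self, action_dim=7):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv2d(4, 32, 5, stride=2, padding=2), nn.ReLU(),
            nn.Conv2d(32, 64, 5, stride=2, padding=2), nn.ReLU(),
            nn.Conv2d(64, 64, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
        )
        self.head = nn.Linear(64, action_dim)

    def forward(self, rgb, sketch):
        # rgb: (B, 3, H, W), sketch: (B, 1, H, W)
        x = torch.cat([rgb, sketch], dim=1)
        return self.head(self.encoder(x))

if __name__ == "__main__":
    sketch = rasterize_sketch([(20, 100), (60, 60), (110, 30)])
    rgb = torch.rand(1, 3, 128, 128)
    sketch_t = torch.from_numpy(sketch)[None, None]  # (1, 1, 128, 128)
    action = SketchConditionedPolicy()(rgb, sketch_t)
    print(action.shape)  # torch.Size([1, 7])
```
The design choice this illustrates is the one the abstract emphasizes: the sketch lives in the same image space as the observation, so the policy can relate the coarse path to what it currently sees rather than following it blindly.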
Related papers
- RT-Affordance: Affordances are Versatile Intermediate Representations for Robot Manipulation [52.14638923430338]
We propose conditioning policies on affordances, which capture the pose of the robot at key stages of the task.
Our method, RT-Affordance, is a hierarchical model that first proposes an affordance plan given the task language.
We show on a diverse set of novel tasks how RT-Affordance exceeds the performance of existing methods by over 50%.
arXiv Detail & Related papers (2024-11-05T01:02:51Z) - Flex: End-to-End Text-Instructed Visual Navigation with Foundation Models [59.892436892964376]
We investigate the minimal data requirements and architectural adaptations necessary to achieve robust closed-loop performance with vision-based control policies.
Our findings are synthesized in Flex (Fly-lexically), a framework that uses pre-trained Vision Language Models (VLMs) as frozen patch-wise feature extractors.
We demonstrate the effectiveness of this approach on quadrotor fly-to-target tasks, where agents trained via behavior cloning successfully generalize to real-world scenes.
arXiv Detail & Related papers (2024-10-16T19:59:31Z) - Policy Adaptation via Language Optimization: Decomposing Tasks for Few-Shot Imitation [49.43094200366251]
We propose a novel approach for few-shot adaptation to unseen tasks that exploits the semantic understanding of task decomposition.
Our method, Policy Adaptation via Language Optimization (PALO), combines a handful of demonstrations of a task with proposed language decompositions.
We find that PALO is able to consistently complete long-horizon, multi-tier tasks in the real world, outperforming state-of-the-art pre-trained generalist policies.
arXiv Detail & Related papers (2024-08-29T03:03:35Z) - RoboTAP: Tracking Arbitrary Points for Few-Shot Visual Imitation [36.43143326197769]
Track-Any-Point (TAP) models isolate the relevant motion in a demonstration, and parameterize a low-level controller to reproduce this motion across changes in the scene configuration.
We show this results in robust robot policies that can solve complex object-arrangement tasks such as shape-matching, stacking, and even full path-following tasks such as applying glue and sticking objects together.
arXiv Detail & Related papers (2023-08-30T11:57:04Z) - Planning Immediate Landmarks of Targets for Model-Free Skill Transfer
across Agents [34.56191646231944]
We propose PILoT, i.e., Planning Immediate Landmarks of Targets.
PILoT learns a goal-conditioned state planner and distills a goal-planner to plan immediate landmarks in a model-free style.
We show the power of PILoT on various transferring challenges, including few-shot transferring across action spaces and dynamics.
arXiv Detail & Related papers (2022-12-18T08:03:21Z) - Abstract-to-Executable Trajectory Translation for One-Shot Task
Generalization [21.709054087028946]
We propose to achieve one-shot task generalization by decoupling plan generation and plan execution.
Our method solves complex long-horizon tasks in three steps: build a paired abstract environment, generate abstract trajectories, and solve the original task by an abstract-to-executable trajectory translator.
arXiv Detail & Related papers (2022-10-14T09:17:34Z) - Generalization with Lossy Affordances: Leveraging Broad Offline Data for
Learning Visuomotor Tasks [65.23947618404046]
We introduce a framework that acquires goal-conditioned policies for unseen temporally extended tasks via offline reinforcement learning on broad data.
When faced with a novel task goal, the framework uses an affordance model to plan a sequence of lossy representations as subgoals that decomposes the original task into easier problems.
We show that our framework can be pre-trained on large-scale datasets of robot experiences from prior work and efficiently fine-tuned for novel tasks, entirely from visual inputs without any manual reward engineering.
arXiv Detail & Related papers (2022-10-12T21:46:38Z) - Learning to Shift Attention for Motion Generation [55.61994201686024]
One challenge of motion generation using robot learning from demonstration techniques is that human demonstrations follow a distribution with multiple modes for one task query.
Previous approaches fail to capture all modes or tend to average modes of the demonstrations and thus generate invalid trajectories.
We propose a motion generation model with extrapolation ability to overcome this problem.
arXiv Detail & Related papers (2021-02-24T09:07:52Z) - Transferable Task Execution from Pixels through Deep Planning Domain
Learning [46.88867228115775]
We propose Deep Planning Domain Learning (DPDL) to learn a hierarchical model.
DPDL learns a high-level model which predicts values for a set of logical predicates that together describe the current symbolic world state.
This allows us to perform complex, multi-step tasks even when the robot has not been explicitly trained on them.
arXiv Detail & Related papers (2020-03-08T05:51:04Z)
This list is automatically generated from the titles and abstracts of the papers in this site.