RVT-2: Learning Precise Manipulation from Few Demonstrations
- URL: http://arxiv.org/abs/2406.08545v1
- Date: Wed, 12 Jun 2024 18:00:01 GMT
- Title: RVT-2: Learning Precise Manipulation from Few Demonstrations
- Authors: Ankit Goyal, Valts Blukis, Jie Xu, Yijie Guo, Yu-Wei Chao, Dieter Fox
- Abstract summary: RVT-2 is a multitask 3D manipulation model that is 6X faster in training and 2X faster in inference than its predecessor RVT.
It achieves a new state-of-the-art on RLBench, improving the success rate from 65% to 82%.
RVT-2 is also effective in the real world, where it can learn tasks requiring high precision, like picking up and inserting plugs, with just 10 demonstrations.
- Score: 43.48649783097065
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In this work, we study how to build a robotic system that can solve multiple 3D manipulation tasks given language instructions. To be useful in industrial and household domains, such a system should be capable of learning new tasks with few demonstrations and solving them precisely. Prior works, like PerAct and RVT, have studied this problem; however, they often struggle with tasks requiring high precision. We study how to make them more effective, precise, and fast. Using a combination of architectural and system-level improvements, we propose RVT-2, a multitask 3D manipulation model that is 6X faster in training and 2X faster in inference than its predecessor RVT. RVT-2 achieves a new state-of-the-art on RLBench, improving the success rate from 65% to 82%. RVT-2 is also effective in the real world, where it can learn tasks requiring high precision, like picking up and inserting plugs, with just 10 demonstrations. Visual results, code, and trained model are provided at: https://robotic-view-transformer-2.github.io/.
Related papers
- Offline Imitation Learning Through Graph Search and Retrieval [57.57306578140857]
Imitation learning is a powerful machine learning algorithm for a robot to acquire manipulation skills.
We propose GSR, a simple yet effective algorithm that learns from suboptimal demonstrations through Graph Search and Retrieval.
GSR can achieve a 10% to 30% higher success rate and over 30% higher proficiency compared to baselines.
arXiv Detail & Related papers (2024-07-22T06:12:21Z)
- LLARVA: Vision-Action Instruction Tuning Enhances Robot Learning [50.99807031490589]
We introduce LLARVA, a model trained with a novel instruction tuning method to unify a range of robotic learning tasks, scenarios, and environments.
We generate 8.5M image-visual trace pairs from the Open X-Embodiment dataset in order to pre-train our model.
Experiments yield strong performance, demonstrating that LLARVA performs well compared to several contemporary baselines.
arXiv Detail & Related papers (2024-06-17T17:55:29Z)
- Leveraging Locality to Boost Sample Efficiency in Robotic Manipulation [14.990771038350106]
SGRv2 is an imitation learning framework that enhances sample efficiency through improved visual and action representations.
SGRv2 excels in RLBench tasks with keyframe control using merely 5 demonstrations and surpasses the RVT baseline in 23 of 26 tasks.
arXiv Detail & Related papers (2024-06-15T12:27:35Z)
- RVT: Robotic View Transformer for 3D Object Manipulation [46.25268237442356]
We propose RVT, a multi-view transformer for 3D manipulation that is both scalable and accurate.
A single RVT model works well across 18 RLBench tasks with 249 task variations, achieving 26% higher relative success than the existing state-of-the-art method (PerAct).
arXiv Detail & Related papers (2023-06-26T17:59:31Z)
- VIMA: General Robot Manipulation with Multimodal Prompts [82.01214865117637]
We show that a wide spectrum of robot manipulation tasks can be expressed with multimodal prompts.
We develop a new simulation benchmark that consists of thousands of procedurally-generated tabletop tasks.
We design a transformer-based robot agent, VIMA, that processes these prompts and outputs motor actions autoregressively.
arXiv Detail & Related papers (2022-10-06T17:50:11Z)
- Accelerating Robot Learning of Contact-Rich Manipulations: A Curriculum Learning Study [4.045850174820418]
This paper presents a study for accelerating robot learning of contact-rich manipulation tasks based on Curriculum Learning combined with Domain Randomization (DR).
We tackle complex industrial assembly tasks with position-controlled robots, such as insertion tasks.
Results also show that even when training only in simulation with toy tasks, our method can learn policies that can be transferred to the real-world robot.
arXiv Detail & Related papers (2022-04-27T11:08:39Z)
- R3M: A Universal Visual Representation for Robot Manipulation [91.55543664116209]
We study how visual representations pre-trained on diverse human video data can enable data-efficient learning of robotic manipulation tasks.
We find that R3M improves task success by over 20% compared to training from scratch and by over 10% compared to state-of-the-art visual representations like CLIP and MoCo.
arXiv Detail & Related papers (2022-03-23T17:55:09Z)
- Transporters with Visual Foresight for Solving Unseen Rearrangement Tasks [12.604533231243543]
Transporters with Visual Foresight (TVF) is able to achieve multi-task learning and zero-shot generalization to unseen tasks.
TVF is able to improve the performance of a state-of-the-art imitation learning method on both training and unseen tasks in simulation and real robot experiments.
arXiv Detail & Related papers (2022-02-22T09:35:09Z)
- Assembly robots with optimized control stiffness through reinforcement learning [3.4410212782758047]
We propose a methodology that uses reinforcement learning to achieve high performance in robots.
The proposed method ensures the online generation of stiffness matrices that help improve the performance of local trajectory optimization.
The effectiveness of the method was verified via experiments involving two contact-rich tasks.
arXiv Detail & Related papers (2020-02-27T15:54:43Z)
This list is automatically generated from the titles and abstracts of the papers on this site.