RH20T: A Comprehensive Robotic Dataset for Learning Diverse Skills in
One-Shot
- URL: http://arxiv.org/abs/2307.00595v2
- Date: Tue, 26 Sep 2023 10:47:35 GMT
- Title: RH20T: A Comprehensive Robotic Dataset for Learning Diverse Skills in
One-Shot
- Authors: Hao-Shu Fang, Hongjie Fang, Zhenyu Tang, Jirong Liu, Chenxi Wang,
Junbo Wang, Haoyi Zhu, Cewu Lu
- Abstract summary: A key challenge in robotic manipulation in open domains is how to acquire diverse and generalizable skills for robots.
Recent research in one-shot imitation learning has shown promise in transferring trained policies to new tasks based on demonstrations.
This paper aims to unlock the potential for an agent to generalize to hundreds of real-world skills with multi-modal perception.
- Score: 56.130215236125224
- License: http://creativecommons.org/licenses/by-sa/4.0/
- Abstract: A key challenge in robotic manipulation in open domains is how to acquire
diverse and generalizable skills for robots. Recent research in one-shot
imitation learning has shown promise in transferring trained policies to new
tasks based on demonstrations. This feature is attractive for enabling robots
to acquire new skills and improving task and motion planning. However, due to
limitations in the training dataset, the current focus of the community has
mainly been on simple cases, such as push or pick-place tasks, relying solely
on visual guidance. In reality, there are many complex skills, some of which
may even require both visual and tactile perception to solve. This paper aims
to unlock the potential for an agent to generalize to hundreds of real-world
skills with multi-modal perception. To achieve this, we have collected a
dataset comprising over 110,000 contact-rich robot manipulation sequences
across diverse skills, contexts, robots, and camera viewpoints, all captured
in the real world. Each sequence in the dataset includes visual, force, audio,
and action information. We also provide a corresponding human demonstration
video and a language description for each robot sequence. We have invested
significant effort in calibrating all the sensors and ensuring a high-quality
dataset. The dataset is made publicly available at rh20t.github.io.
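
To make the per-sequence contents concrete, the sketch below shows one way such a multi-modal record could be represented in Python. All class names, field names, array shapes, and paths are illustrative assumptions made here; the actual file layout and keys are defined by the release at rh20t.github.io.

```python
from dataclasses import dataclass, field
import numpy as np


@dataclass
class RH20TSequence:
    """Hypothetical container for one RH20T manipulation sequence.

    Field names and shapes are illustrative only; consult the dataset
    release at rh20t.github.io for the actual format.
    """
    task_description: str        # natural-language description of the task
    rgb_frames: np.ndarray       # (T, H, W, 3) images from one camera viewpoint
    force_torque: np.ndarray     # (T, 6) wrist force/torque readings
    audio: np.ndarray            # (N,) audio samples recorded during execution
    actions: np.ndarray          # (T, 7) end-effector pose + gripper command
    human_demo_video: str = ""   # path to the paired human demonstration video
    camera_extrinsics: np.ndarray = field(
        default_factory=lambda: np.eye(4)  # calibrated camera-to-base transform
    )


def make_dummy_sequence(T: int = 100) -> RH20TSequence:
    """Build a random stand-in sequence, e.g. to test a training pipeline."""
    return RH20TSequence(
        task_description="wipe the table with a sponge",
        rgb_frames=np.zeros((T, 480, 640, 3), dtype=np.uint8),
        force_torque=np.zeros((T, 6), dtype=np.float32),
        audio=np.zeros(16000 * 5, dtype=np.float32),
        actions=np.zeros((T, 7), dtype=np.float32),
        human_demo_video="demos/human/wipe_table.mp4",
    )


if __name__ == "__main__":
    seq = make_dummy_sequence()
    print(seq.task_description, seq.rgb_frames.shape, seq.force_torque.shape)
```

A real loader would read these fields from the released files; the stand-in above only illustrates the modalities listed in the abstract.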
Related papers
- VITAL: Visual Teleoperation to Enhance Robot Learning through Human-in-the-Loop Corrections [10.49712834719005]
We propose a low-cost visual teleoperation system for bimanual manipulation tasks, called VITAL.
Our approach leverages affordable hardware and visual processing techniques to collect demonstrations.
We enhance the generalizability and robustness of the learned policies by utilizing both real and simulated environments.
arXiv Detail & Related papers (2024-07-30T23:29:47Z)
- Towards Generalizable Zero-Shot Manipulation via Translating Human Interaction Plans [58.27029676638521]
We show how passive human videos can serve as a rich source of data for learning such generalist robots.
We learn a human plan predictor that, given a current image of a scene and a goal image, predicts the future hand and object configurations.
We show that our learned system can perform over 16 manipulation skills that generalize to 40 objects.
arXiv Detail & Related papers (2023-12-01T18:54:12Z)
- Scaling Robot Learning with Semantically Imagined Experience [21.361979238427722]
Recent advances in robot learning have shown promise in enabling robots to perform manipulation tasks.
One of the key contributing factors to this progress is the scale of robot data used to train the models.
We propose an alternative route and leverage text-to-image foundation models widely used in computer vision and natural language processing.
arXiv Detail & Related papers (2023-02-22T18:47:51Z)
- RT-1: Robotics Transformer for Real-World Control at Scale [98.09428483862165]
We present a model class, dubbed Robotics Transformer, that exhibits promising scalable model properties.
We verify our conclusions in a study of different model classes and their ability to generalize as a function of the data size, model size, and data diversity based on a large-scale data collection on real robots performing real-world tasks.
arXiv Detail & Related papers (2022-12-13T18:55:15Z)
- Learning Reward Functions for Robotic Manipulation by Observing Humans [92.30657414416527]
We use unlabeled videos of humans solving a wide range of manipulation tasks to learn a task-agnostic reward function for robotic manipulation policies.
The learned rewards are based on distances to a goal in an embedding space learned using a time-contrastive objective (a minimal sketch of this distance-to-goal reward appears after this list).
arXiv Detail & Related papers (2022-11-16T16:26:48Z)
- Actionable Models: Unsupervised Offline Reinforcement Learning of Robotic Skills [93.12417203541948]
We propose the objective of learning a functional understanding of the environment by learning to reach any goal state in a given dataset.
We find that our method can operate on high-dimensional camera images and learn a variety of skills on real robots that generalize to previously unseen scenes and objects.
arXiv Detail & Related papers (2021-04-15T20:10:11Z)
- Learning Generalizable Robotic Reward Functions from "In-The-Wild" Human Videos [59.58105314783289]
Domain-agnostic Video Discriminator (DVD) learns multitask reward functions by training a discriminator to classify whether two videos are performing the same task.
DVD can generalize by virtue of learning from a small amount of robot data with a broad dataset of human videos.
DVD can be combined with visual model predictive control to solve robotic manipulation tasks on a real WidowX200 robot in an unseen environment from a single human demo.
arXiv Detail & Related papers (2021-03-31T05:25:05Z)
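
For the "Learning Reward Functions for Robotic Manipulation by Observing Humans" entry above, the snippet below is a minimal sketch of the stated idea: the reward is the negative distance between the current observation and the goal in a learned embedding space. The encoder trained with the time-contrastive objective is not shown, and the function name and embedding size are assumptions for illustration, not that paper's implementation.

```python
import numpy as np


def distance_to_goal_reward(phi_obs: np.ndarray, phi_goal: np.ndarray) -> float:
    """Reward as negative Euclidean distance between observation and goal embeddings.

    `phi_obs` and `phi_goal` stand for outputs of an image encoder trained with
    a time-contrastive objective (frames close in time map to nearby points).
    Only the distance-based reward itself is illustrated here.
    """
    return -float(np.linalg.norm(phi_obs - phi_goal))


# Example with random stand-in embeddings of dimension 128.
phi_obs = np.random.randn(128)
phi_goal = np.random.randn(128)
print(distance_to_goal_reward(phi_obs, phi_goal))
```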
This list is automatically generated from the titles and abstracts of the papers on this site.