Robo360: A 3D Omnispective Multi-Material Robotic Manipulation Dataset
- URL: http://arxiv.org/abs/2312.06686v1
- Date: Sat, 9 Dec 2023 09:12:03 GMT
- Title: Robo360: A 3D Omnispective Multi-Material Robotic Manipulation Dataset
- Authors: Litian Liang, Liuyu Bian, Caiwei Xiao, Jialin Zhang, Linghao Chen, Isabella Liu, Fanbo Xiang, Zhiao Huang, Hao Su
- Abstract summary: Recent interest in leveraging 3D algorithms has led to advancements in robot perception and physical understanding.
We present Robo360, a dataset that features robotic manipulation with dense view coverage.
We hope that Robo360 can open new research directions yet to be explored at the intersection of understanding the physical world in 3D and robot control.
- Score: 26.845899347446807
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Building robots that can automate labor-intensive tasks has long been a
core motivation behind advances in the computer vision and robotics
communities. Recent interest in leveraging 3D algorithms, particularly neural
fields, has led to advancements in robot perception and physical understanding
in manipulation scenarios. However, the real world's complexity poses
significant challenges. To tackle these challenges, we present Robo360, a
dataset that features robotic manipulation with dense view coverage, which
enables high-quality 3D neural representation learning, and a diverse set of
objects with various physical and optical properties, which facilitates research
in various object manipulation and physical world modeling tasks. We confirm
the effectiveness of our dataset using existing dynamic NeRF methods and evaluate its
potential in learning multi-view policies. We hope that Robo360 can open new
research directions yet to be explored at the intersection of understanding the
physical world in 3D and robot control.
Related papers
- $π_0$: A Vision-Language-Action Flow Model for General Robot Control [77.32743739202543]
We propose a novel flow matching architecture built on top of a pre-trained vision-language model (VLM) to inherit Internet-scale semantic knowledge.
We evaluate our model on its ability to perform tasks zero-shot after pre-training, to follow language instructions from people, and to acquire new skills via fine-tuning.
arXiv Detail & Related papers (2024-10-31T17:22:30Z)
- RoboCasa: Large-Scale Simulation of Everyday Tasks for Generalist Robots [25.650235551519952]
We present RoboCasa, a large-scale simulation framework for training generalist robots in everyday environments.
We provide thousands of 3D assets across over 150 object categories and dozens of interactable furniture and appliances.
Our experiments show a clear scaling trend in using synthetically generated robot data for large-scale imitation learning.
arXiv Detail & Related papers (2024-06-04T17:41:31Z)
- ManiFoundation Model for General-Purpose Robotic Manipulation of Contact Synthesis with Arbitrary Objects and Robots [24.035706461949715]
There is a pressing need to develop a model that enables general-purpose robots to undertake a broad spectrum of manipulation tasks.
Our work introduces a comprehensive framework to develop a foundation model for general robotic manipulation.
Our model achieves average success rates of around 90%.
arXiv Detail & Related papers (2024-05-11T09:18:37Z)
- Teaching Unknown Objects by Leveraging Human Gaze and Augmented Reality in Human-Robot Interaction [3.1473798197405953]
This dissertation aims to teach a robot unknown objects in the context of Human-Robot Interaction (HRI).
The combination of eye tracking and Augmented Reality created a powerful synergy that empowered the human teacher to communicate with the robot.
The robot's object detection capabilities exhibited comparable performance to state-of-the-art object detectors trained on extensive datasets.
arXiv Detail & Related papers (2023-12-12T11:34:43Z)
- RH20T: A Comprehensive Robotic Dataset for Learning Diverse Skills in One-Shot [56.130215236125224]
A key challenge in robotic manipulation in open domains is how to acquire diverse and generalizable skills for robots.
Recent research in one-shot imitation learning has shown promise in transferring trained policies to new tasks based on demonstrations.
This paper aims to unlock the potential for an agent to generalize to hundreds of real-world skills with multi-modal perception.
arXiv Detail & Related papers (2023-07-02T15:33:31Z)
- DexArt: Benchmarking Generalizable Dexterous Manipulation with Articulated Objects [8.195608430584073]
We propose a new benchmark called DexArt, which involves Dexterous manipulation with Articulated objects in a physical simulator.
Our main focus is to evaluate the generalizability of the learned policy on unseen articulated objects.
We use Reinforcement Learning with 3D representation learning to achieve generalization.
arXiv Detail & Related papers (2023-05-09T18:30:58Z)
- RT-1: Robotics Transformer for Real-World Control at Scale [98.09428483862165]
We present a model class, dubbed Robotics Transformer, that exhibits promising scalable model properties.
We verify our conclusions in a study of different model classes and their ability to generalize as a function of the data size, model size, and data diversity based on a large-scale data collection on real robots performing real-world tasks.
arXiv Detail & Related papers (2022-12-13T18:55:15Z)
- See, Hear, and Feel: Smart Sensory Fusion for Robotic Manipulation [49.925499720323806]
We study how visual, auditory, and tactile perception can jointly help robots to solve complex manipulation tasks.
We build a robot system that can see with a camera, hear with a contact microphone, and feel with a vision-based tactile sensor.
arXiv Detail & Related papers (2022-12-07T18:55:53Z)
- 3D Neural Scene Representations for Visuomotor Control [78.79583457239836]
We learn models for dynamic 3D scenes purely from 2D visual observations.
A dynamics model, constructed over the learned representation space, enables visuomotor control for challenging manipulation tasks.
arXiv Detail & Related papers (2021-07-08T17:49:37Z)
- Learning Generalizable Robotic Reward Functions from "In-The-Wild" Human Videos [59.58105314783289]
Domain-agnostic Video Discriminator (DVD) learns multitask reward functions by training a discriminator to classify whether two videos are performing the same task.
DVD can generalize by learning from a small amount of robot data combined with a broad dataset of human videos.
DVD can be combined with visual model predictive control to solve robotic manipulation tasks on a real WidowX200 robot in an unseen environment from a single human demo.
arXiv Detail & Related papers (2021-03-31T05:25:05Z)