Open-TeleVision: Teleoperation with Immersive Active Visual Feedback
- URL: http://arxiv.org/abs/2407.01512v2
- Date: Mon, 8 Jul 2024 16:59:38 GMT
- Title: Open-TeleVision: Teleoperation with Immersive Active Visual Feedback
- Authors: Xuxin Cheng, Jialong Li, Shiqi Yang, Ge Yang, Xiaolong Wang
- Abstract summary: Open-TeleVision allows operators to actively perceive the robot's surroundings in a stereoscopic manner.
The system mirrors the operator's arm and hand movements on the robot, creating an immersive experience.
We validate the effectiveness of our system by collecting data and training imitation learning policies on four long-horizon, precise tasks.
- Score: 17.505318269362512
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Teleoperation serves as a powerful method for collecting on-robot data essential for robot learning from demonstrations. The intuitiveness and ease of use of the teleoperation system are crucial for ensuring high-quality, diverse, and scalable data. To achieve this, we propose an immersive teleoperation system Open-TeleVision that allows operators to actively perceive the robot's surroundings in a stereoscopic manner. Additionally, the system mirrors the operator's arm and hand movements on the robot, creating an immersive experience as if the operator's mind is transmitted to a robot embodiment. We validate the effectiveness of our system by collecting data and training imitation learning policies on four long-horizon, precise tasks (Can Sorting, Can Insertion, Folding, and Unloading) for 2 different humanoid robots and deploy them in the real world. The system is open-sourced at: https://robot-tv.github.io/
Related papers
- Human-Agent Joint Learning for Efficient Robot Manipulation Skill Acquisition [48.65867987106428]
We introduce a novel system for joint learning between human operators and robots.
It enables human operators to share control of a robot end-effector with a learned assistive agent.
It reduces the need for human adaptation while ensuring the collected data is of sufficient quality for downstream tasks.
arXiv Detail & Related papers (2024-06-29T03:37:29Z)
- ManiWAV: Learning Robot Manipulation from In-the-Wild Audio-Visual Data [28.36623343236893]
We introduce ManiWAV: an 'ear-in-hand' data collection device to collect in-the-wild human demonstrations with synchronous audio and visual feedback.
We show that our system can generalize to unseen in-the-wild environments by learning from diverse in-the-wild human demonstrations.
arXiv Detail & Related papers (2024-06-27T18:06:38Z)
- Giving Robots a Hand: Learning Generalizable Manipulation with Eye-in-Hand Human Video Demonstrations [66.47064743686953]
Eye-in-hand cameras have shown promise in enabling greater sample efficiency and generalization in vision-based robotic manipulation.
Videos of humans performing tasks, on the other hand, are much cheaper to collect since they eliminate the need for expertise in robotic teleoperation.
In this work, we augment narrow robotic imitation datasets with broad unlabeled human video demonstrations to greatly enhance the generalization of eye-in-hand visuomotor policies.
arXiv Detail & Related papers (2023-07-12T07:04:53Z)
- AnyTeleop: A General Vision-Based Dexterous Robot Arm-Hand Teleoperation System [51.48191418148764]
Vision-based teleoperation can endow robots with human-level intelligence to interact with the environment.
Current vision-based teleoperation systems are designed and engineered towards a particular robot model and deployment environment.
We propose AnyTeleop, a unified and general teleoperation system to support multiple different arms, hands, realities, and camera configurations within a single system.
arXiv Detail & Related papers (2023-07-10T14:11:07Z)
- Self-Improving Robots: End-to-End Autonomous Visuomotor Reinforcement Learning [54.636562516974884]
In imitation and reinforcement learning, the cost of human supervision limits the amount of data that robots can be trained on.
In this work, we propose MEDAL++, a novel design for self-improving robotic systems.
The robot autonomously practices the task by learning to both do and undo the task, simultaneously inferring the reward function from the demonstrations.
arXiv Detail & Related papers (2023-03-02T18:51:38Z)
- See, Hear, and Feel: Smart Sensory Fusion for Robotic Manipulation [49.925499720323806]
We study how visual, auditory, and tactile perception can jointly help robots to solve complex manipulation tasks.
We build a robot system that can see with a camera, hear with a contact microphone, and feel with a vision-based tactile sensor.
arXiv Detail & Related papers (2022-12-07T18:55:53Z)
- From One Hand to Multiple Hands: Imitation Learning for Dexterous Manipulation from Single-Camera Teleoperation [26.738893736520364]
We introduce a novel single-camera teleoperation system to collect the 3D demonstrations efficiently with only an iPad and a computer.
We construct a customized robot hand for each user in the physical simulator, which is a manipulator resembling the same kinematics structure and shape of the operator's hand.
With imitation learning using our data, we show large improvement over baselines with multiple complex manipulation tasks.
arXiv Detail & Related papers (2022-04-26T17:59:51Z)
- OpenBot: Turning Smartphones into Robots [95.94432031144716]
Current robots are either expensive or make significant compromises on sensory richness, computational power, and communication capabilities.
We propose to leverage smartphones to equip robots with extensive sensor suites, powerful computational abilities, state-of-the-art communication channels, and access to a thriving software ecosystem.
We design a small electric vehicle that costs $50 and serves as a robot body for standard Android smartphones.
arXiv Detail & Related papers (2020-08-24T18:04:50Z)
This list is automatically generated from the titles and abstracts of the papers on this site.