Related papers: From One Hand to Multiple Hands: Imitation Learning for Dexterous Manipulation from Single-Camera Teleoperation

From One Hand to Multiple Hands: Imitation Learning for Dexterous Manipulation from Single-Camera Teleoperation

URL: http://arxiv.org/abs/2204.12490v1
Date: Tue, 26 Apr 2022 17:59:51 GMT
Title: From One Hand to Multiple Hands: Imitation Learning for Dexterous Manipulation from Single-Camera Teleoperation
Authors: Yuzhe Qin, Hao Su, Xiaolong Wang
Abstract summary: We introduce a novel single-camera teleoperation system to collect the 3D demonstrations efficiently with only an iPad and a computer. We construct a customized robot hand for each user in the physical simulator, which is a manipulator resembling the same kinematics structure and shape of the operator's hand. With imitation learning using our data, we show large improvement over baselines with multiple complex manipulation tasks.
Score: 26.738893736520364
License: http://creativecommons.org/licenses/by/4.0/
Abstract: We propose to perform imitation learning for dexterous manipulation with multi-finger robot hand from human demonstrations, and transfer the policy to the real robot hand. We introduce a novel single-camera teleoperation system to collect the 3D demonstrations efficiently with only an iPad and a computer. One key contribution of our system is that we construct a customized robot hand for each user in the physical simulator, which is a manipulator resembling the same kinematics structure and shape of the operator's hand. This provides an intuitive interface and avoid unstable human-robot hand retargeting for data collection, leading to large-scale and high quality data. Once the data is collected, the customized robot hand trajectories can be converted to different specified robot hands (models that are manufactured) to generate training demonstrations. With imitation learning using our data, we show large improvement over baselines with multiple complex manipulation tasks. Importantly, we show our learned policy is significantly more robust when transferring to the real robot. More videos can be found in the https://yzqin.github.io/dex-teleop-imitation .

Related papers

VidBot: Learning Generalizable 3D Actions from In-the-Wild 2D Human Videos for Zero-Shot Robotic Manipulation [53.63540587160549]
VidBot is a framework enabling zero-shot robotic manipulation using learned 3D affordance from in-the-wild monocular RGB-only human videos. VidBot paves the way for leveraging everyday human videos to make robot learning more scalable.
arXiv Detail & Related papers (2025-03-10T10:04:58Z)
Learning to Transfer Human Hand Skills for Robot Manipulations [12.797862020095856]
We present a method for teaching dexterous manipulation tasks to robots from human hand motion demonstrations. Our approach learns a joint motion manifold that maps human hand movements, robot hand actions, and object movements in 3D, enabling us to infer one motion from others.
arXiv Detail & Related papers (2025-01-07T22:33:47Z)
Zero-Cost Whole-Body Teleoperation for Mobile Manipulation [8.71539730969424]
MoMa-Teleop is a novel teleoperation method that delegates the base motions to a reinforcement learning agent. We demonstrate that our approach results in a significant reduction in task completion time across a variety of robots and tasks.
arXiv Detail & Related papers (2024-09-23T15:09:45Z)
Open-TeleVision: Teleoperation with Immersive Active Visual Feedback [17.505318269362512]
Open-TeleVision allows operators to actively perceive the robot's surroundings in a stereoscopic manner. The system mirrors the operator's arm and hand movements on the robot, creating an immersive experience. We validate the effectiveness of our system by collecting data and training imitation learning policies on four long-horizon, precise tasks.
arXiv Detail & Related papers (2024-07-01T17:55:35Z)
Giving Robots a Hand: Learning Generalizable Manipulation with Eye-in-Hand Human Video Demonstrations [66.47064743686953]
Eye-in-hand cameras have shown promise in enabling greater sample efficiency and generalization in vision-based robotic manipulation. Videos of humans performing tasks, on the other hand, are much cheaper to collect since they eliminate the need for expertise in robotic teleoperation. In this work, we augment narrow robotic imitation datasets with broad unlabeled human video demonstrations to greatly enhance the generalization of eye-in-hand visuomotor policies.
arXiv Detail & Related papers (2023-07-12T07:04:53Z)
Zero-Shot Robot Manipulation from Passive Human Videos [59.193076151832145]
We develop a framework for extracting agent-agnostic action representations from human videos. Our framework is based on predicting plausible human hand trajectories. We deploy the trained model zero-shot for physical robot manipulation tasks.
arXiv Detail & Related papers (2023-02-03T21:39:52Z)
HERD: Continuous Human-to-Robot Evolution for Learning from Human Demonstration [57.045140028275036]
We show that manipulation skills can be transferred from a human to a robot through the use of micro-evolutionary reinforcement learning. We propose an algorithm for multi-dimensional evolution path searching that allows joint optimization of both the robot evolution path and the policy.
arXiv Detail & Related papers (2022-12-08T15:56:13Z)
Learning Reward Functions for Robotic Manipulation by Observing Humans [92.30657414416527]
We use unlabeled videos of humans solving a wide range of manipulation tasks to learn a task-agnostic reward function for robotic manipulation policies. The learned rewards are based on distances to a goal in an embedding space learned using a time-contrastive objective.
arXiv Detail & Related papers (2022-11-16T16:26:48Z)
Robotic Telekinesis: Learning a Robotic Hand Imitator by Watching Humans on Youtube [24.530131506065164]
We build a system that enables any human to control a robot hand and arm, simply by demonstrating motions with their own hand. The robot observes the human operator via a single RGB camera and imitates their actions in real-time. We leverage this data to train a system that understands human hands and retargets a human video stream into a robot hand-arm trajectory that is smooth, swift, safe, and semantically similar to the guiding demonstration.
arXiv Detail & Related papers (2022-02-21T18:59:59Z)
DexMV: Imitation Learning for Dexterous Manipulation from Human Videos [11.470141313103465]
We propose a new platform and pipeline, DexMV, for imitation learning to bridge the gap between computer vision and robot learning. We design a platform with: (i) a simulation system for complex dexterous manipulation tasks with a multi-finger robot hand and (ii) a computer vision system to record large-scale demonstrations of a human hand conducting the same tasks. We show that the demonstrations can indeed improve robot learning by a large margin and solve the complex tasks which reinforcement learning alone cannot solve.
arXiv Detail & Related papers (2021-08-12T17:51:18Z)
Where is my hand? Deep hand segmentation for visual self-recognition in humanoid robots [129.46920552019247]
We propose the use of a Convolution Neural Network (CNN) to segment the robot hand from an image in an egocentric view. We fine-tuned the Mask-RCNN network for the specific task of segmenting the hand of the humanoid robot Vizzy.
arXiv Detail & Related papers (2021-02-09T10:34:32Z)
Learning Predictive Models From Observation and Interaction [137.77887825854768]
Learning predictive models from interaction with the world allows an agent, such as a robot, to learn about how the world works. However, learning a model that captures the dynamics of complex skills represents a major challenge. We propose a method to augment the training set with observational data of other agents, such as humans.
arXiv Detail & Related papers (2019-12-30T01:10:41Z)

This list is automatically generated from the titles and abstracts of the papers in this site.