TWIST: Teleoperated Whole-Body Imitation System
- URL: http://arxiv.org/abs/2505.02833v1
- Date: Mon, 05 May 2025 17:59:03 GMT
- Title: TWIST: Teleoperated Whole-Body Imitation System
- Authors: Yanjie Ze, Zixuan Chen, João Pedro Araújo, Zi-ang Cao, Xue Bin Peng, Jiajun Wu, C. Karen Liu
- Abstract summary: We present the Teleoperated Whole-Body Imitation System (TWIST), a system for humanoid teleoperation through whole-body motion imitation. We develop a robust, adaptive, and responsive whole-body controller using a combination of reinforcement learning and behavior cloning. TWIST enables real-world humanoid robots to achieve unprecedented, versatile, and coordinated whole-body motor skills.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Teleoperating humanoid robots in a whole-body manner marks a fundamental step toward developing general-purpose robotic intelligence, with human motion providing an ideal interface for controlling all degrees of freedom. Yet, most current humanoid teleoperation systems fall short of enabling coordinated whole-body behavior, typically limiting themselves to isolated locomotion or manipulation tasks. We present the Teleoperated Whole-Body Imitation System (TWIST), a system for humanoid teleoperation through whole-body motion imitation. We first generate reference motion clips by retargeting human motion capture data to the humanoid robot. We then develop a robust, adaptive, and responsive whole-body controller using a combination of reinforcement learning and behavior cloning (RL+BC). Through systematic analysis, we demonstrate how incorporating privileged future motion frames and real-world motion capture (MoCap) data improves tracking accuracy. TWIST enables real-world humanoid robots to achieve unprecedented, versatile, and coordinated whole-body motor skills--spanning whole-body manipulation, legged manipulation, locomotion, and expressive movement--using a single unified neural network controller. Our project website: https://humanoid-teleop.github.io
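The abstract outlines a two-stage recipe: retarget human MoCap to the robot, then train a whole-body tracking controller with RL plus behavior cloning (RL+BC), where privileged future motion frames are available during training. The following is a minimal, hypothetical sketch of that teacher-student distillation idea, not the authors' implementation; module names, dimensions, and the random data are all assumptions.

```python
# Minimal sketch (assumptions throughout): distill a teacher policy that sees
# privileged future reference-motion frames into a deployable student that
# sees only the current target frame. Dimensions and names are illustrative.
import torch
import torch.nn as nn

PROPRIO_DIM, FRAME_DIM, NUM_JOINTS, K_FUTURE = 48, 24, 23, 10

def make_mlp(in_dim, out_dim):
    return nn.Sequential(nn.Linear(in_dim, 256), nn.ELU(),
                         nn.Linear(256, 256), nn.ELU(),
                         nn.Linear(256, out_dim))

# Teacher: proprioception + K future reference frames (privileged in sim).
teacher = make_mlp(PROPRIO_DIM + K_FUTURE * FRAME_DIM, NUM_JOINTS)
# Student: proprioception + only the current reference frame (deployable).
student = make_mlp(PROPRIO_DIM + FRAME_DIM, NUM_JOINTS)
opt = torch.optim.Adam(student.parameters(), lr=3e-4)

for step in range(100):                          # toy distillation loop
    proprio = torch.randn(256, PROPRIO_DIM)      # stand-in for simulator states
    ref = torch.randn(256, K_FUTURE, FRAME_DIM)  # retargeted MoCap reference frames
    with torch.no_grad():                        # teacher assumed RL-pretrained
        target_action = teacher(torch.cat([proprio, ref.flatten(1)], dim=-1))
    pred_action = student(torch.cat([proprio, ref[:, 0]], dim=-1))
    loss = (pred_action - target_action).pow(2).mean()  # behavior-cloning loss
    opt.zero_grad(); loss.backward(); opt.step()
```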
Related papers
- Feel the Force: Contact-Driven Learning from Humans [52.36160086934298]
Controlling fine-grained forces during manipulation remains a core challenge in robotics. We present FeelTheForce, a robot learning system that models human tactile behavior to learn force-sensitive manipulation. Our approach grounds robust low-level force control in scalable human supervision, achieving a 77% success rate across 5 force-sensitive manipulation tasks.
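As a loose illustration of what low-level force-sensitive control entails, the sketch below closes a loop on a commanded contact force with a simple PI controller against a toy spring-like contact; the gains, contact model, and constants are assumptions, not FeelTheForce's implementation.

```python
# Toy closed-loop force tracking (illustrative assumptions only): a PI
# controller adjusts gripper closure so a spring-like contact reaches a
# commanded force, mimicking the idea of force-sensitive manipulation.
STIFFNESS = 200.0      # N per unit closure beyond contact (assumed)
KP, KI, DT = 0.002, 0.004, 0.01

def simulated_contact_force(closure):
    return max(0.0, STIFFNESS * (closure - 0.3))  # contact starts at closure 0.3

def track_force(target_force, steps=500):
    closure, integral = 0.0, 0.0
    for _ in range(steps):
        force = simulated_contact_force(closure)
        error = target_force - force
        integral += error * DT
        closure += KP * error + KI * integral     # PI update on gripper command
        closure = min(max(closure, 0.0), 1.0)
    return simulated_contact_force(closure)

print(track_force(5.0))   # converges near the 5 N command in this toy model
```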
arXiv Detail & Related papers (2025-06-02T17:57:52Z)
- HOMIE: Humanoid Loco-Manipulation with Isomorphic Exoskeleton Cockpit [52.12750762494588]
This paper introduces HOMIE, a semi-autonomous teleoperation system. It combines a reinforcement learning policy for body control mapped to a pedal, an isomorphic exoskeleton arm for arm control, and motion-sensing gloves for hand control. The system is fully open-source; demos and code are available at https://homietele.org/.
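To make the cockpit idea concrete, here is a hypothetical mapping from cockpit inputs to robot commands (pedal to base velocity, exoskeleton joints to arm targets, glove flex to finger closure); the device names, ranges, and scaling are assumptions rather than HOMIE's actual interface.

```python
# Hypothetical teleoperation-cockpit mapping (names/ranges assumed): pedal ->
# base velocity command, exoskeleton joints -> arm joint targets, glove flex
# -> finger closure. A real system would add calibration, limits, and safety.
from dataclasses import dataclass
from typing import List, Dict

MAX_FORWARD_SPEED = 1.0   # m/s, assumed limit

@dataclass
class CockpitInput:
    pedal: float                 # 0..1 pedal deflection
    exo_arm_joints: List[float]  # radians, isomorphic to robot arm joints
    glove_flex: List[float]      # 0..1 per finger

def cockpit_to_command(inp: CockpitInput) -> Dict[str, object]:
    return {
        "base_forward_velocity": MAX_FORWARD_SPEED * min(max(inp.pedal, 0.0), 1.0),
        "arm_joint_targets": list(inp.exo_arm_joints),   # 1:1 isomorphic mapping
        "finger_closure": [min(max(f, 0.0), 1.0) for f in inp.glove_flex],
    }

print(cockpit_to_command(CockpitInput(0.4, [0.1, -0.3, 0.7], [0.2, 0.9])))
```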
arXiv Detail & Related papers (2025-02-18T16:33:38Z)
- Learning from Massive Human Videos for Universal Humanoid Pose Control [46.417054298537195]
This paper introduces Humanoid-X, a large-scale dataset of over 20 million humanoid robot poses with corresponding text-based motion descriptions. We train a large humanoid model, UH-1, which takes text instructions as input and outputs corresponding actions to control a humanoid robot. Our scalable training approach leads to superior generalization in text-based humanoid control, marking a significant step toward adaptable, real-world-ready humanoid robots.
arXiv Detail & Related papers (2024-12-18T18:59:56Z)
- Learning Multi-Modal Whole-Body Control for Real-World Humanoid Robots [13.229028132036321]
The Masked Humanoid Controller (MHC) supports standing, walking, and mimicry of whole- and partial-body motions. MHC imitates partially masked motions from a library of behaviors spanning standing, walking, optimized reference trajectories, re-targeted video clips, and human motion capture data. We demonstrate sim-to-real transfer on the real-world Digit V3 humanoid robot.
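A minimal sketch of the masking idea described here: imitation targets are paired with a binary mask so the policy is only conditioned on, and only penalized for, the body parts that are actually specified. Dimensions, joint indices, and names are illustrative assumptions.

```python
# Illustrative masking of partial-body imitation targets (dimensions assumed):
# the observation carries target joint values plus a binary mask, and the
# imitation loss is computed only where the mask is on.
import torch

NUM_JOINTS = 30
UPPER_BODY = list(range(12, 30))   # assumed index range for the upper body

def masked_target(full_target: torch.Tensor, active_joints):
    mask = torch.zeros(NUM_JOINTS)
    mask[active_joints] = 1.0
    return torch.cat([full_target * mask, mask])   # policy input: targets + mask

def masked_imitation_loss(achieved, full_target, active_joints):
    mask = torch.zeros(NUM_JOINTS)
    mask[active_joints] = 1.0
    return ((achieved - full_target).pow(2) * mask).sum() / mask.sum()

target = torch.randn(NUM_JOINTS)               # e.g., a re-targeted video-clip frame
obs_part = masked_target(target, UPPER_BODY)   # lower body left unspecified
print(obs_part.shape, masked_imitation_loss(torch.randn(NUM_JOINTS), target, UPPER_BODY))
```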
arXiv Detail & Related papers (2024-07-30T09:10:24Z)
- HumanPlus: Humanoid Shadowing and Imitation from Humans [82.47551890765202]
We introduce a full-stack system for humanoids to learn motion and autonomous skills from human data.
We first train a low-level policy in simulation via reinforcement learning using existing 40-hour human motion datasets.
We then perform supervised behavior cloning to train skill policies using egocentric vision, allowing humanoids to complete different tasks autonomously.
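A toy sketch of the second stage as summarized above: supervised behavior cloning of a visuomotor skill policy from egocentric images to whole-body actions. The tiny CNN, dimensions, and random data are assumptions, not the HumanPlus architecture.

```python
# Toy behavior-cloning step for an egocentric visuomotor skill policy
# (architecture, dimensions, and random data are illustrative assumptions).
import torch
import torch.nn as nn

ACTION_DIM = 19   # assumed whole-body action dimension

policy = nn.Sequential(
    nn.Conv2d(3, 16, 5, stride=2), nn.ReLU(),
    nn.Conv2d(16, 32, 5, stride=2), nn.ReLU(),
    nn.AdaptiveAvgPool2d(1), nn.Flatten(),
    nn.Linear(32, 128), nn.ReLU(),
    nn.Linear(128, ACTION_DIM),
)
opt = torch.optim.Adam(policy.parameters(), lr=1e-4)

for step in range(10):                           # toy training loop
    images = torch.randn(8, 3, 96, 96)           # egocentric camera frames
    expert_actions = torch.randn(8, ACTION_DIM)  # from teleoperated demonstrations
    loss = (policy(images) - expert_actions).pow(2).mean()
    opt.zero_grad(); loss.backward(); opt.step()
```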
arXiv Detail & Related papers (2024-06-15T00:41:34Z)
- Learning Human-to-Humanoid Real-Time Whole-Body Teleoperation [34.65637397405485]
We present Human to Humanoid (H2O), a framework that enables real-time whole-body teleoperation of a humanoid robot with only an RGB camera.
We train a robust real-time humanoid motion imitator in simulation using these retargeted and refined motions and transfer it to the real humanoid robot in a zero-shot manner.
To the best of our knowledge, this is the first demonstration to achieve learning-based real-time whole-body humanoid teleoperation.
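A minimal sketch of the kind of pipeline this implies: per frame, estimate human keypoints from RGB, retarget them to robot proportions, and hand the reference to a motion-imitation policy. Everything here is stubbed and assumed; it is not H2O's retargeting or policy.

```python
# Illustrative real-time retargeting loop (everything stubbed/assumed): an RGB
# pose estimate is scaled from human to robot proportions each frame, then
# handed to a (stub) motion-imitation policy.
import numpy as np

HUMAN_HEIGHT, ROBOT_HEIGHT = 1.75, 1.30   # meters, assumed

def estimate_human_keypoints(frame):
    # Stand-in for an RGB human pose estimator; returns pelvis-relative 3D keypoints.
    return np.random.randn(15, 3) * 0.3

def retarget_to_robot(human_keypoints):
    scale = ROBOT_HEIGHT / HUMAN_HEIGHT   # uniform limb scaling (simplification)
    return human_keypoints * scale

def imitation_policy(robot_keypoints, proprio):
    # Stand-in for the learned whole-body tracking policy.
    return np.zeros(19)

for frame_id in range(5):                 # toy teleoperation loop
    keypoints = estimate_human_keypoints(frame=None)
    reference = retarget_to_robot(keypoints)
    action = imitation_policy(reference, proprio=np.zeros(48))
```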
arXiv Detail & Related papers (2024-03-07T12:10:41Z)
- Expressive Whole-Body Control for Humanoid Robots [20.132927075816742]
We learn a whole-body control policy on a human-sized robot to mimic human motions as realistically as possible.
With training in simulation and Sim2Real transfer, our policy can control a humanoid robot to walk in different styles, shake hands with humans, and even dance with a human in the real world.
arXiv Detail & Related papers (2024-02-26T18:09:24Z)
- Giving Robots a Hand: Learning Generalizable Manipulation with Eye-in-Hand Human Video Demonstrations [66.47064743686953]
Eye-in-hand cameras have shown promise in enabling greater sample efficiency and generalization in vision-based robotic manipulation.
Videos of humans performing tasks, on the other hand, are much cheaper to collect since they eliminate the need for expertise in robotic teleoperation.
In this work, we augment narrow robotic imitation datasets with broad unlabeled human video demonstrations to greatly enhance the generalization of eye-in-hand visuomotor policies.
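A toy sketch of co-training on mixed data as described: each batch samples from the narrow robot-demonstration set and the broad human-video set at a fixed ratio. The ratio, data shapes, and file names are assumptions.

```python
# Toy co-training sampler (ratio and shapes assumed): each batch mixes narrow
# robot demonstrations with broad human video demonstrations.
import random

robot_demos = [{"image": f"robot_{i}.png", "action": [0.0] * 7} for i in range(100)]
human_demos = [{"image": f"human_{i}.png", "action": [0.0] * 7} for i in range(5000)]
HUMAN_FRACTION = 0.5   # assumed mixing ratio

def sample_batch(batch_size=32):
    batch = []
    for _ in range(batch_size):
        source = human_demos if random.random() < HUMAN_FRACTION else robot_demos
        batch.append(random.choice(source))
    return batch

print(len(sample_batch()))
```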
arXiv Detail & Related papers (2023-07-12T07:04:53Z)
- HERD: Continuous Human-to-Robot Evolution for Learning from Human Demonstration [57.045140028275036]
We show that manipulation skills can be transferred from a human to a robot through the use of micro-evolutionary reinforcement learning.
We propose an algorithm for multi-dimensional evolution path searching that allows joint optimization of both the robot evolution path and the policy.
arXiv Detail & Related papers (2022-12-08T15:56:13Z)
- Model Predictive Control for Fluid Human-to-Robot Handovers [50.72520769938633]
Planning motions that take human comfort into account has typically not been part of the human-to-robot handover process.
We propose to generate smooth motions via an efficient model-predictive control framework.
We conduct human-to-robot handover experiments on a diverse set of objects with several users.
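A minimal receding-horizon sketch in the spirit of smooth handover motion generation: at each control step a short velocity sequence is optimized to approach the handover point while penalizing large velocities, and only the first command is executed. The point-mass dynamics, cost weights, and horizon are assumptions, not the paper's MPC formulation.

```python
# Toy receding-horizon controller for a point-mass end-effector approaching a
# handover point (dynamics, costs, and gains are assumptions): optimize a short
# velocity sequence by gradient descent, execute only the first step.
import numpy as np

DT, HORIZON, SMOOTH_WEIGHT = 0.05, 10, 0.2

def plan_velocities(position, goal, iters=100, lr=0.1):
    v = np.zeros((HORIZON, 3))
    for _ in range(iters):                            # gradient descent on the cost
        p = position + DT * np.cumsum(v, axis=0)      # predicted positions p_1..p_H
        err = p - goal                                # tracking error at each step
        # d p_k / d v_j = DT for k >= j+1, so accumulate suffix sums of errors.
        suffix = np.cumsum(err[::-1], axis=0)[::-1]
        grad = 2 * SMOOTH_WEIGHT * v + 2 * DT * suffix
        v -= lr * grad
    return v

position, goal = np.zeros(3), np.array([0.5, 0.2, 0.3])
for _ in range(40):                          # receding-horizon execution loop
    v = plan_velocities(position, goal)
    position = position + DT * v[0]          # apply only the first command
print(np.round(position, 3))                 # approaches the handover point
```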
arXiv Detail & Related papers (2022-03-31T23:08:20Z)
- Synthesis and Execution of Communicative Robotic Movements with Generative Adversarial Networks [59.098560311521034]
We focus on how to transfer to two different robotic platforms the same kinematics modulation that humans adopt when manipulating delicate objects.
We choose to modulate the velocity profile adopted by the robots' end-effector, inspired by what humans do when transporting objects with different characteristics.
We exploit a novel Generative Adversarial Network architecture, trained with human kinematics examples, to generalize over them and generate new and meaningful velocity profiles.
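A minimal 1D GAN sketch of the stated idea: a generator maps noise to an end-effector velocity profile and a discriminator is trained against example profiles. Here the "human" examples are synthetic bell-shaped curves, and the architecture and training details are assumptions, not the paper's network.

```python
# Minimal 1D GAN over velocity profiles (synthetic data, assumed architecture):
# a generator maps noise to a T-step velocity profile; a discriminator is
# trained to separate generated profiles from example bell-shaped ones.
import torch
import torch.nn as nn

T, NOISE_DIM = 50, 16

def human_like_profiles(n):
    # Synthetic stand-in for human demonstration profiles: bell-shaped velocity curves.
    t = torch.linspace(0, 1, T)
    peak = 0.5 + 0.2 * torch.rand(n, 1)
    width = 0.1 + 0.05 * torch.rand(n, 1)
    return torch.exp(-((t - peak) ** 2) / (2 * width ** 2))

G = nn.Sequential(nn.Linear(NOISE_DIM, 64), nn.ReLU(), nn.Linear(64, T))
D = nn.Sequential(nn.Linear(T, 64), nn.LeakyReLU(0.2), nn.Linear(64, 1))
opt_g = torch.optim.Adam(G.parameters(), lr=2e-4)
opt_d = torch.optim.Adam(D.parameters(), lr=2e-4)
bce = nn.BCEWithLogitsLoss()

for step in range(200):                                    # toy training loop
    real = human_like_profiles(64)
    fake = G(torch.randn(64, NOISE_DIM))
    # Discriminator update: real -> 1, fake -> 0.
    d_loss = bce(D(real), torch.ones(64, 1)) + bce(D(fake.detach()), torch.zeros(64, 1))
    opt_d.zero_grad(); d_loss.backward(); opt_d.step()
    # Generator update: fool the discriminator.
    g_loss = bce(D(fake), torch.ones(64, 1))
    opt_g.zero_grad(); g_loss.backward(); opt_g.step()
```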
arXiv Detail & Related papers (2022-03-29T15:03:05Z)
- Robotic Telekinesis: Learning a Robotic Hand Imitator by Watching Humans on Youtube [24.530131506065164]
We build a system that enables any human to control a robot hand and arm, simply by demonstrating motions with their own hand.
The robot observes the human operator via a single RGB camera and imitates their actions in real-time.
We leverage large-scale human video data to train a system that understands human hands and retargets a human video stream into a robot hand-arm trajectory that is smooth, swift, safe, and semantically similar to the guiding demonstration.
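A minimal sketch of one ingredient implied here: turning a noisy stream of estimated human wrist positions into a smooth, rate-limited robot trajectory via workspace scaling and exponential smoothing. The scale factor, smoothing constant, and step limit are assumptions, not the paper's retargeting pipeline.

```python
# Toy retargeting of a noisy human wrist-position stream to a smooth, rate-
# limited robot end-effector trajectory (scale, smoothing, and limits assumed).
import numpy as np

WORKSPACE_SCALE = 0.7     # human workspace -> robot workspace
SMOOTHING = 0.2           # exponential-moving-average factor
MAX_STEP = 0.02           # meters per control step, crude safety cap

def retarget_stream(human_wrist_positions):
    trajectory = [WORKSPACE_SCALE * human_wrist_positions[0]]
    for p in human_wrist_positions[1:]:
        desired = WORKSPACE_SCALE * p
        smoothed = (1 - SMOOTHING) * trajectory[-1] + SMOOTHING * desired
        step = smoothed - trajectory[-1]
        norm = np.linalg.norm(step)
        if norm > MAX_STEP:                    # rate-limit the commanded motion
            step = step * (MAX_STEP / norm)
        trajectory.append(trajectory[-1] + step)
    return np.array(trajectory)

noisy_stream = np.cumsum(np.random.randn(100, 3) * 0.01, axis=0)  # stand-in estimates
print(retarget_stream(noisy_stream).shape)
```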
arXiv Detail & Related papers (2022-02-21T18:59:59Z)