CLOT: Closed-Loop Global Motion Tracking for Whole-Body Humanoid Teleoperation
- URL: http://arxiv.org/abs/2602.15060v2
- Date: Fri, 20 Feb 2026 09:51:55 GMT
- Title: CLOT: Closed-Loop Global Motion Tracking for Whole-Body Humanoid Teleoperation
- Authors: Tengjie Zhu, Guanyu Cai, Yang Zhaohui, Guanzhu Ren, Haohui Xie, ZiRui Wang, Junsong Wu, Jingbo Wang, Xiaokang Yang, Yao Mu, Yichao Yan
- Abstract summary: We present CLOT, a real-time whole-body humanoid teleoperation system that achieves closed-loop global motion tracking. CLOT synchronizes operator and robot poses in a closed loop, enabling drift-free human-to-humanoid mimicry over long time horizons. We propose a data-driven randomization strategy that decouples observation trajectories from reward evaluation, enabling smooth and stable global corrections.
- Score: 54.7399209456857
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Long-horizon whole-body humanoid teleoperation remains challenging due to accumulated global pose drift, particularly on full-sized humanoids. Although recent learning-based tracking methods enable agile and coordinated motions, they typically operate in the robot's local frame and neglect global pose feedback, leading to drift and instability during extended execution. In this work, we present CLOT, a real-time whole-body humanoid teleoperation system that achieves closed-loop global motion tracking via high-frequency localization feedback. CLOT synchronizes operator and robot poses in a closed loop, enabling drift-free human-to-humanoid mimicry over long time horizons. However, directly imposing global tracking rewards in reinforcement learning often results in aggressive and brittle corrections. To address this, we propose a data-driven randomization strategy that decouples observation trajectories from reward evaluation, enabling smooth and stable global corrections. We further regularize the policy with an adversarial motion prior to suppress unnatural behaviors. To support CLOT, we collect 20 hours of carefully curated human motion data for training the humanoid teleoperation policy. We design a transformer-based policy and train it for over 1300 GPU hours. The policy is deployed on a full-sized humanoid with 31 DoF (excluding hands). Both simulation and real-world experiments verify high-dynamic motion, high-precision tracking, and strong robustness in sim-to-real humanoid teleoperation. Motion data, demos and code can be found on our website.
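The closed-loop correction idea in the abstract — folding high-frequency localization feedback into the local-frame command so global drift cannot accumulate — can be sketched in a few lines. This is a minimal illustration, not the paper's actual controller: the function name, the proportional gain, and the clipping rule are all assumptions made for the example.

```python
import numpy as np

def global_correction(target_xy, estimated_xy, local_target_vel,
                      gain=0.5, max_corr=0.3):
    """Blend a drift-correction velocity, computed from localization
    feedback, into the local-frame velocity command.

    Hypothetical sketch: gain and clipping threshold are illustrative.
    """
    # Global drift between where the operator's motion says the robot
    # should be and where localization says it actually is.
    drift = np.asarray(target_xy, float) - np.asarray(estimated_xy, float)
    correction = gain * drift
    # Clip the correction magnitude so closing the loop stays smooth
    # rather than producing aggressive, brittle jumps.
    norm = np.linalg.norm(correction)
    if norm > max_corr:
        correction *= max_corr / norm
    return np.asarray(local_target_vel, float) + correction
```

A usage example: with a 10 cm drift in x and -10 cm in y, `global_correction([1.0, 0.0], [0.9, 0.1], [0.2, 0.0])` nudges the commanded velocity toward the global target while leaving the local command dominant.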
Related papers
- TextOp: Real-time Interactive Text-Driven Humanoid Robot Motion Generation and Control [62.93681680333618]
TextOp is a real-time text-driven humanoid motion generation and control framework. It supports streaming language commands and on-the-fly instruction modification during execution. By bridging interactive motion generation with robust whole-body control, TextOp unlocks free-form intent expression.
arXiv Detail & Related papers (2026-02-07T08:42:11Z) - From Language to Locomotion: Retargeting-free Humanoid Control via Motion Latent Guidance [55.31807046722006]
Existing language-guided humanoid pipelines are cumbersome and untrustworthy. We present RoboGhost, a retargeting-free framework that conditions humanoid policies on language-grounded motion latents. We show that RoboGhost substantially reduces deployment latency, improves success rates and tracking precision, and produces smooth, semantically aligned humanoid motions.
arXiv Detail & Related papers (2025-10-16T17:57:47Z) - ResMimic: From General Motion Tracking to Humanoid Whole-body Loco-Manipulation via Residual Learning [59.64325421657381]
Humanoid whole-body loco-manipulation promises transformative capabilities for daily service and warehouse tasks. We introduce ResMimic, a two-stage residual learning framework for precise and expressive humanoid control from human motion data. Results show substantial gains in task success, training efficiency, and robustness over strong baselines.
arXiv Detail & Related papers (2025-10-06T17:47:02Z) - TWIST: Teleoperated Whole-Body Imitation System [28.597388162969057]
We present the Teleoperated Whole-Body Imitation System (TWIST), a system for humanoid teleoperation through whole-body motion imitation. We develop a robust, adaptive, and responsive whole-body controller using a combination of reinforcement learning and behavior cloning. TWIST enables real-world humanoid robots to achieve unprecedented, versatile, and coordinated whole-body motor skills.
arXiv Detail & Related papers (2025-05-05T17:59:03Z) - HOMIE: Humanoid Loco-Manipulation with Isomorphic Exoskeleton Cockpit [52.12750762494588]
This paper introduces HOMIE, a semi-autonomous teleoperation system. It combines a reinforcement learning policy for body control mapped to a pedal, an isomorphic exoskeleton arm for arm control, and motion-sensing gloves for hand control. The system is fully open-source; demos and code can be found at https://homietele.org/.
arXiv Detail & Related papers (2025-02-18T16:33:38Z) - ExBody2: Advanced Expressive Humanoid Whole-Body Control [16.69009772546575]
We propose a method for producing whole-body tracking controllers that are trained on both human motion capture and simulated data. We use a teacher policy to produce intermediate data that better conforms to the robot's kinematics. We observed significant improvement of tracking performance after fine-tuning on a small amount of data.
arXiv Detail & Related papers (2024-12-17T18:59:51Z) - Universal Humanoid Motion Representations for Physics-Based Control [71.46142106079292]
We present a universal motion representation that encompasses a comprehensive range of motor skills for physics-based humanoid control.
We first learn a motion imitator that can imitate all of human motion from a large, unstructured motion dataset.
We then create our motion representation by distilling skills directly from the imitator.
arXiv Detail & Related papers (2023-10-06T20:48:43Z) - GLAMR: Global Occlusion-Aware Human Mesh Recovery with Dynamic Cameras [99.07219478953982]
We present an approach for 3D global human mesh recovery from monocular videos recorded with dynamic cameras.
We first propose a deep generative motion infiller, which autoregressively infills the body motions of occluded humans based on visible motions.
In contrast to prior work, our approach reconstructs human meshes in consistent global coordinates even with dynamic cameras.
arXiv Detail & Related papers (2021-12-02T18:59:54Z) - Residual Force Control for Agile Human Behavior Imitation and Extended Motion Synthesis [32.22704734791378]
Reinforcement learning has shown great promise for realistic human behaviors by learning humanoid control policies from motion capture data.
It is still very challenging to reproduce sophisticated human skills like ballet dance, or to stably imitate long-term human behaviors with complex transitions.
We propose a novel approach, residual force control (RFC), that augments a humanoid control policy by adding external residual forces into the action space.
arXiv Detail & Related papers (2020-06-12T17:56:16Z)
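The residual force control (RFC) idea from the last entry above — augmenting the action space with external residual forces on top of joint torques — can be sketched with a toy dynamics step. This is a hedged illustration under simplified assumptions: the function name, the single-mass root model, and the crude joint response are inventions for the example, not RFC's actual formulation.

```python
import numpy as np

def rfc_step(qvel, action, n_joints, mass=50.0, dt=0.02):
    """One toy dynamics step with a residual root force.

    The policy's action is split: the first n_joints entries are joint
    torques, the last 3 are a residual external force applied directly
    to the root. The residual force lets the policy compensate for
    dynamics mismatch that joint torques alone cannot explain.
    Hypothetical sketch; mass, dt, and the joint response are toy values.
    """
    action = np.asarray(action, float)
    torques = action[:n_joints]
    residual_force = action[n_joints:n_joints + 3]
    qvel = np.asarray(qvel, float).copy()
    # Residual force accelerates the root linear velocity (first 3 dims).
    qvel[:3] += (residual_force / mass) * dt
    # Crude placeholder for the articulated joint response to torques.
    qvel[3:] += 0.1 * torques * dt
    return qvel
```

In a real RFC setup the residual force enters the full rigid-body dynamics and is penalized in the reward so the policy relies on it only when physically consistent torques fall short.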
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.