Related papers: Dexplore: Scalable Neural Control for Dexterous Manipulation from Reference-Scoped Exploration

URL: http://arxiv.org/abs/2509.09671v1
Date: Thu, 11 Sep 2025 17:59:07 GMT
Title: Dexplore: Scalable Neural Control for Dexterous Manipulation from Reference-Scoped Exploration
Authors: Sirui Xu, Yu-Wei Chao, Liuyu Bian, Arsalan Mousavian, Yu-Xiong Wang, Liang-Yan Gui, Wei Yang
Abstract summary: Hand-object motion-capture (MoCap) repositories offer large-scale, contact-rich demonstrations and hold promise for scaling dexterous robotic manipulation. We introduce Dexplore, a unified single-loop optimization that jointly performs retargeting and tracking to learn robot control policies directly from MoCap at scale.
Score: 58.4036440289082
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Hand-object motion-capture (MoCap) repositories offer large-scale, contact-rich demonstrations and hold promise for scaling dexterous robotic manipulation. Yet demonstration inaccuracies and embodiment gaps between human and robot hands limit the straightforward use of these data. Existing methods adopt a three-stage workflow, including retargeting, tracking, and residual correction, which often leaves demonstrations underused and compounds errors across stages. We introduce Dexplore, a unified single-loop optimization that jointly performs retargeting and tracking to learn robot control policies directly from MoCap at scale. Rather than treating demonstrations as ground truth, we use them as soft guidance. From raw trajectories, we derive adaptive spatial scopes, and train with reinforcement learning to keep the policy in-scope while minimizing control effort and accomplishing the task. This unified formulation preserves demonstration intent, enables robot-specific strategies to emerge, improves robustness to noise, and scales to large demonstration corpora. We distill the scaled tracking policy into a vision-based, skill-conditioned generative controller that encodes diverse manipulation skills in a rich latent representation, supporting generalization across objects and real-world deployment. Taken together, these contributions position Dexplore as a principled bridge that transforms imperfect demonstrations into effective training signals for dexterous manipulation.
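Illustration: the abstract describes using demonstrations as soft guidance by keeping the policy inside a spatial scope while minimizing control effort. The sketch below is one hypothetical way such a per-step reward could be written; it is not taken from the paper. The function name `scoped_reward`, the spherical scope of fixed `radius`, and `effort_weight` are all assumptions made for illustration, and the task-completion term is omitted.

```python
import math

def scoped_reward(pos, ref, radius, action, effort_weight=0.01):
    """Hypothetical per-step reward (not the paper's formulation).

    pos, ref: current and reference positions (same-length sequences).
    radius:   assumed scope radius around the reference point.
    action:   control command; its squared norm stands in for effort.
    """
    dist = math.dist(pos, ref)
    # Penalize only the distance by which the state leaves the scope,
    # so any motion that stays in-scope is treated as equally acceptable.
    violation = max(0.0, dist - radius)
    effort = sum(a * a for a in action)
    return -violation - effort_weight * effort

# In-scope state with zero action incurs no penalty; out-of-scope does.
inside = scoped_reward((0.0, 0.0, 0.05), (0.0, 0.0, 0.0), 0.1, (0.0, 0.0))
outside = scoped_reward((0.0, 0.0, 0.30), (0.0, 0.0, 0.0), 0.1, (0.0, 0.0))
```

Under these assumptions the reference acts as a tolerance region rather than a target to match exactly, which is the general sense of "soft guidance" in the abstract.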

Related papers

ULTRA: Unified Multimodal Control for Autonomous Humanoid Whole-Body Loco-Manipulation [55.467742403416175]
We introduce a physics-driven neural algorithm that translates large-scale motion capture to humanoid embodiments. We learn a unified multimodal controller that supports both dense references and sparse task specifications. Results show that ULTRA generalizes to autonomous, goal-conditioned whole-body loco-manipulation from egocentric perception.
arXiv Detail & Related papers (2026-03-03T18:59:29Z)
Imitating What Works: Simulation-Filtered Modular Policy Learning from Human Videos [56.510263910611684]
We tackle prehensile manipulation, in which tasks involve grasping an object before performing various post-grasp motions. Human videos offer strong signals for learning the post-grasp motions, but they are less useful for learning the prerequisite grasping behaviors. We present Perceive-Simulate-Imitate (PSI), a framework for training a modular manipulation policy using human video motion data.
arXiv Detail & Related papers (2026-02-13T18:59:10Z)
METIS: Multi-Source Egocentric Training for Integrated Dexterous Vision-Language-Action Model [36.82365894983052]
A major bottleneck lies in the scarcity of large-scale, action-annotated data for dexterous skills. We propose METIS, a vision-language-action model for dexterous manipulation pretrained on egocentric datasets. Our method demonstrates exceptional dexterous manipulation capabilities, achieving the highest average success rate in six real-world tasks.
arXiv Detail & Related papers (2025-11-21T16:32:36Z)
Self-Augmented Robot Trajectory: Efficient Imitation Learning via Safe Self-augmentation with Demonstrator-annotated Precision [2.3548641190233264]
Self-Augmented Robot Trajectory (SART) is a framework that enables policy learning from a single human demonstration. SART achieves substantially higher success rates than policies trained solely on human-collected demonstrations.
arXiv Detail & Related papers (2025-09-11T23:10:56Z)
CLAM: Continuous Latent Action Models for Robot Learning from Unlabeled Demonstrations [11.604546089466734]
Learning robot policies using imitation learning requires collecting large amounts of costly action-labeled expert demonstrations. A promising approach is to harness the abundance of unlabeled observations (e.g., from video demonstrations) to learn latent action labels in an unsupervised way. We design continuous latent action models (CLAM) which incorporate two key ingredients we find necessary for learning to solve complex continuous control tasks from unlabeled observation data.
arXiv Detail & Related papers (2025-05-08T07:07:58Z)
Imitation Learning with Limited Actions via Diffusion Planners and Deep Koopman Controllers [23.292429025366417]
We propose a plan-then-control framework aimed at improving the action-data efficiency of inverse dynamics controllers.<n>Specifically, we adopt a Deep Koopman Operator framework to model the dynamical system and utilize observation-only trajectories to learn a latent action representation.<n>This latent representation can then be effectively mapped to real high-dimensional continuous actions using a linear action decoder.
arXiv Detail & Related papers (2024-10-10T03:33:57Z)
Any-point Trajectory Modeling for Policy Learning [64.23861308947852]
We introduce Any-point Trajectory Modeling (ATM) to predict future trajectories of arbitrary points within a video frame. ATM outperforms strong video pre-training baselines by 80% on average. We show effective transfer learning of manipulation skills from human videos and videos from a different robot morphology.
arXiv Detail & Related papers (2023-12-28T23:34:43Z)
Nonprehensile Planar Manipulation through Reinforcement Learning with Multimodal Categorical Exploration [8.343657309038285]
Reinforcement Learning is a powerful framework for developing such robot controllers. We propose a multimodal exploration approach through categorical distributions, which enables us to train planar pushing RL policies. We show that the learned policies are robust to external disturbances and observation noise, and scale to tasks with multiple pushers.
arXiv Detail & Related papers (2023-08-04T16:55:00Z)
Learning to Shift Attention for Motion Generation [55.61994201686024]
One challenge of motion generation using robot learning from demonstration techniques is that human demonstrations follow a distribution with multiple modes for one task query. Previous approaches fail to capture all modes or tend to average modes of the demonstrations and thus generate invalid trajectories. We propose a motion generation model with extrapolation ability to overcome this problem.
arXiv Detail & Related papers (2021-02-24T09:07:52Z)
Human-in-the-Loop Imitation Learning using Remote Teleoperation [72.2847988686463]
We build a data collection system tailored to 6-DoF manipulation settings. We develop an algorithm to train the policy iteratively on new data collected by the system. We demonstrate that agents trained on data collected by our intervention-based system and algorithm outperform agents trained on an equivalent number of samples collected by non-interventional demonstrators.
arXiv Detail & Related papers (2020-12-12T05:30:35Z)
Visual Imitation Made Easy [102.36509665008732]
We present an alternate interface for imitation that simplifies the data collection process while allowing for easy transfer to robots. We use commercially available reacher-grabber assistive tools both as a data collection device and as the robot's end-effector. We experimentally evaluate on two challenging tasks: non-prehensile pushing and prehensile stacking, with 1000 diverse demonstrations for each task.
arXiv Detail & Related papers (2020-08-11T17:58:50Z)

This list is automatically generated from the titles and abstracts of the papers on this site.