MILE: A Mechanically Isomorphic Exoskeleton Data Collection System with Fingertip Visuotactile Sensing for Dexterous Manipulation
- URL: http://arxiv.org/abs/2512.00324v1
- Date: Sat, 29 Nov 2025 05:34:39 GMT
- Title: MILE: A Mechanically Isomorphic Exoskeleton Data Collection System with Fingertip Visuotactile Sensing for Dexterous Manipulation
- Authors: Jinda Du, Jieji Ren, Qiaojun Yu, Ningbin Zhang, Yu Deng, Xingyu Wei, Yufei Liu, Guoying Gu, Xiangyang Zhu,
- Abstract summary: Existing data-collection pipelines suffer from inaccurate motion retargeting, low data-collection efficiency, and missing high-resolution tactile sensing. We address this gap with MILE, a mechanically isomorphic teleoperation and data-collection system co-designed from human hand to exoskeleton to robotic hand.
- Score: 17.138615434309575
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Imitation learning provides a promising approach to dexterous hand manipulation, but its effectiveness is limited by the lack of large-scale, high-fidelity data. Existing data-collection pipelines suffer from inaccurate motion retargeting, low data-collection efficiency, and missing high-resolution fingertip tactile sensing. We address this gap with MILE, a mechanically isomorphic teleoperation and data-collection system co-designed from human hand to exoskeleton to robotic hand. The exoskeleton is anthropometrically derived from the human hand, and the robotic hand preserves one-to-one joint-position isomorphism, eliminating nonlinear retargeting and enabling precise, natural control. The exoskeleton achieves a multi-joint mean absolute angular error below one degree, while the robotic hand integrates compact fingertip visuotactile modules that provide high-resolution tactile observations. Built on this retargeting-free interface, we teleoperate complex, contact-rich in-hand manipulation and efficiently collect a multimodal dataset comprising high-resolution fingertip visuotactile signals, RGB-D images, and joint positions. The teleoperation pipeline achieves a mean success rate improvement of 64%. Incorporating fingertip tactile observations further increases the success rate by an average of 25% over the vision-only baseline, validating the fidelity and utility of the dataset. Further details are available at: https://sites.google.com/view/mile-system.
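The abstract's central claim is that one-to-one joint-position isomorphism eliminates nonlinear retargeting: each exoskeleton joint reading maps directly to the matching robot joint. A minimal sketch of what such a retargeting-free interface looks like, with illustrative joint names and limits that are assumptions (not from the paper):

```python
import numpy as np

# Illustrative per-joint limits in radians (hypothetical values).
JOINT_LIMITS = {
    "thumb_mcp": (0.0, 1.2),
    "index_mcp": (0.0, 1.5),
    "index_pip": (0.0, 1.8),
}

def retarget_isomorphic(exo_angles: dict) -> dict:
    """Map exoskeleton joint angles directly to robot joint commands.

    With mechanically isomorphic kinematics, "retargeting" reduces to an
    identity mapping plus safety clipping -- no nonlinear optimization or
    fingertip-pose matching is required.
    """
    return {
        name: float(np.clip(angle, *JOINT_LIMITS[name]))
        for name, angle in exo_angles.items()
    }

cmd = retarget_isomorphic({"thumb_mcp": 0.4, "index_mcp": 1.7, "index_pip": -0.1})
print(cmd)  # out-of-range inputs are clipped to the joint limits
```

By contrast, a non-isomorphic pipeline would have to solve an inverse-kinematics or optimization problem per frame to match fingertip poses across differing hand kinematics, which is where retargeting error typically enters.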
Related papers
- End-to-End Dexterous Arm-Hand VLA Policies via Shared Autonomy: VR Teleoperation Augmented by Autonomous Hand VLA Policy for Efficient Data Collection [10.217810309422232]
We propose a framework that divides control between macro and micro motions. A human operator guides the robot's arm pose through intuitive VR teleoperation, while an autonomous DexGrasp-VLA policy handles fine-grained hand control using real-time tactile and visual feedback.
arXiv Detail & Related papers (2025-10-31T16:12:02Z) - Cross-Embodiment Dexterous Hand Articulation Generation via Morphology-Aware Learning [82.63833405368159]
Existing end-to-end methods require training on large-scale datasets for specific hands. We propose an eigengrasp-based, end-to-end framework for cross-embodiment grasp generation.
arXiv Detail & Related papers (2025-10-07T15:57:00Z) - Dexplore: Scalable Neural Control for Dexterous Manipulation from Reference-Scoped Exploration [58.4036440289082]
Hand-object motion-capture (MoCap) repositories offer large-scale, contact-rich demonstrations and hold promise for dexterous robotic manipulation. We introduce Dexplore, a unified single-loop optimization that jointly performs retargeting and tracking to learn robot control policies directly from MoCap at scale.
arXiv Detail & Related papers (2025-09-11T17:59:07Z) - Grasp Like Humans: Learning Generalizable Multi-Fingered Grasping from Human Proprioceptive Sensorimotor Integration [26.351720551267846]
Tactile and kinesthetic perceptions are crucial for human dexterous manipulation, enabling reliable grasping of objects via sensorimotor integration. We propose a novel glove-mediated tactile-kinematic perception-prediction framework for grasp skill transfer from intuitive, natural human operation to robotic execution based on imitation learning.
arXiv Detail & Related papers (2025-09-10T07:44:12Z) - emg2tendon: From sEMG Signals to Tendon Control in Musculoskeletal Hands [5.613626927694011]
Tendon-driven robotic hands offer unparalleled dexterity for manipulation tasks. However, learning control policies for such systems presents unique challenges. We introduce the first large-scale EMG-to-Tendon Control dataset for robotic hands.
arXiv Detail & Related papers (2025-07-29T12:49:57Z) - Being-H0: Vision-Language-Action Pretraining from Large-Scale Human Videos [66.62109400603394]
We introduce Being-H0, a dexterous Vision-Language-Action model trained on large-scale human videos. Our approach centers on physical instruction tuning, a novel training paradigm that combines large-scale VLA pretraining from human videos, physical space alignment for 3D reasoning, and post-training adaptation for robotic tasks. We empirically show the excellence of Being-H0 in hand motion generation and instruction following, and it also scales well with model and data sizes.
arXiv Detail & Related papers (2025-07-21T13:19:09Z) - Body-Hand Modality Expertized Networks with Cross-attention for Fine-grained Skeleton Action Recognition [28.174638880324014]
BHaRNet is a novel framework that augments a typical body-expert model with a hand-expert model. Our model jointly trains both streams with an ensemble loss that fosters cooperative specialization. Inspired by MMNet, we also demonstrate the applicability of our approach to multi-modal tasks by leveraging RGB information.
arXiv Detail & Related papers (2025-03-19T07:54:52Z) - DexHandDiff: Interaction-aware Diffusion Planning for Adaptive Dexterous Manipulation [78.60543357822957]
Dexterous manipulation with contact-rich interactions is crucial for advanced robotics. We introduce DexHandDiff, an interaction-aware diffusion planning framework for adaptive dexterous manipulation. Our framework achieves an average 70.7% success rate on goal-adaptive dexterous tasks, highlighting its robustness and flexibility in contact-rich manipulation.
arXiv Detail & Related papers (2024-11-27T18:03:26Z) - Depth Restoration of Hand-Held Transparent Objects for Human-to-Robot Handover [5.329513275750882]
This paper presents a Hand-Aware Depth Restoration (HADR) method based on creating an implicit neural representation function from a single RGB-D image.
The proposed method utilizes hand posture as an important guidance to leverage semantic and geometric information of hand-object interaction.
We further develop a real-world human-to-robot handover system based on HADR, demonstrating its potential in human-robot interaction applications.
arXiv Detail & Related papers (2024-08-27T12:25:12Z) - Learning Visuotactile Skills with Two Multifingered Hands [80.99370364907278]
We explore learning from human demonstrations using a bimanual system with multifingered hands and visuotactile data.
Our results mark a promising step forward in bimanual multifingered manipulation from visuotactile data.
arXiv Detail & Related papers (2024-04-25T17:59:41Z) - Joint-bone Fusion Graph Convolutional Network for Semi-supervised Skeleton Action Recognition [65.78703941973183]
We propose a novel correlation-driven joint-bone fusion graph convolutional network (CD-JBF-GCN) as an encoder and use a pose prediction head as a decoder.
Specifically, the CD-JBF-GC can explore the motion transmission between the joint stream and the bone stream.
The pose prediction based auto-encoder in the self-supervised training stage allows the network to learn motion representation from unlabeled data.
arXiv Detail & Related papers (2022-02-08T16:03:15Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed papers (including all information) and is not responsible for any consequences.