MILE: A Mechanically Isomorphic Exoskeleton Data Collection System with Fingertip Visuotactile Sensing for Dexterous Manipulation
- URL: http://arxiv.org/abs/2512.00324v1
- Date: Sat, 29 Nov 2025 05:34:39 GMT
- Title: MILE: A Mechanically Isomorphic Exoskeleton Data Collection System with Fingertip Visuotactile Sensing for Dexterous Manipulation
- Authors: Jinda Du, Jieji Ren, Qiaojun Yu, Ningbin Zhang, Yu Deng, Xingyu Wei, Yufei Liu, Guoying Gu, Xiangyang Zhu,
- Abstract summary: Existing data-collection pipelines suffer from inaccurate motion retargeting, low data-collection efficiency, and missing high-resolution tactile sensing. We address this gap with MILE, a mechanically isomorphic teleoperation and data-collection system co-designed from human hand to exoskeleton to robotic hand.
- Score: 17.138615434309575
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Imitation learning provides a promising approach to dexterous hand manipulation, but its effectiveness is limited by the lack of large-scale, high-fidelity data. Existing data-collection pipelines suffer from inaccurate motion retargeting, low data-collection efficiency, and missing high-resolution fingertip tactile sensing. We address this gap with MILE, a mechanically isomorphic teleoperation and data-collection system co-designed from human hand to exoskeleton to robotic hand. The exoskeleton is anthropometrically derived from the human hand, and the robotic hand preserves one-to-one joint-position isomorphism, eliminating nonlinear retargeting and enabling precise, natural control. The exoskeleton achieves a multi-joint mean absolute angular error below one degree, while the robotic hand integrates compact fingertip visuotactile modules that provide high-resolution tactile observations. Built on this retargeting-free interface, we teleoperate complex, contact-rich in-hand manipulation and efficiently collect a multimodal dataset comprising high-resolution fingertip visuotactile signals, RGB-D images, and joint positions. The teleoperation pipeline achieves a mean success rate improvement of 64%. Incorporating fingertip tactile observations further increases the success rate by an average of 25% over the vision-only baseline, validating the fidelity and utility of the dataset. Further details are available at: https://sites.google.com/view/mile-system.
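The abstract's central claim is that one-to-one joint-position isomorphism eliminates nonlinear retargeting: each exoskeleton joint reading maps directly to the matching robot joint. A minimal sketch of what such a retargeting-free interface looks like, with illustrative joint names and limits that are assumptions (not from the paper):

```python
import numpy as np

# Illustrative per-joint limits in radians (hypothetical values).
JOINT_LIMITS = {
    "thumb_mcp": (0.0, 1.2),
    "index_mcp": (0.0, 1.5),
    "index_pip": (0.0, 1.8),
}

def retarget_isomorphic(exo_angles: dict) -> dict:
    """Map exoskeleton joint angles directly to robot joint commands.

    With mechanically isomorphic kinematics, "retargeting" reduces to an
    identity mapping plus safety clipping -- no nonlinear optimization or
    fingertip-pose matching is required.
    """
    return {
        name: float(np.clip(angle, *JOINT_LIMITS[name]))
        for name, angle in exo_angles.items()
    }

cmd = retarget_isomorphic({"thumb_mcp": 0.4, "index_mcp": 1.7, "index_pip": -0.1})
print(cmd)  # out-of-range inputs are clipped to the joint limits
```

By contrast, a non-isomorphic pipeline would have to solve an inverse-kinematics or optimization problem per frame to match fingertip poses across differing hand kinematics, which is where retargeting error typically enters.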
Related papers
- End-to-End Dexterous Arm-Hand VLA Policies via Shared Autonomy: VR Teleoperation Augmented by Autonomous Hand VLA Policy for Efficient Data Collection [10.217810309422232]
We propose a framework that divides control between macro and micro motions. A human operator guides the robot's arm pose through intuitive VR teleoperation, while an autonomous DexGrasp-VLA policy handles fine-grained hand control using real-time tactile and visual feedback.
arXiv Detail & Related papers (2025-10-31T16:12:02Z) - Cross-Embodiment Dexterous Hand Articulation Generation via Morphology-Aware Learning [82.63833405368159]
Existing end-to-end methods require training on large-scale datasets for specific hands. We propose an eigengrasp-based, end-to-end framework for cross-embodiment grasp generation.
arXiv Detail & Related papers (2025-10-07T15:57:00Z) - Dexplore: Scalable Neural Control for Dexterous Manipulation from Reference-Scoped Exploration [58.4036440289082]
Hand-object motion-capture (MoCap) repositories offer large-scale, contact-rich demonstrations and hold promise for dexterous robotic manipulation. We introduce Dexplore, a unified single-loop optimization that jointly performs retargeting and tracking to learn robot control policies directly from MoCap at scale.
arXiv Detail & Related papers (2025-09-11T17:59:07Z) - Grasp Like Humans: Learning Generalizable Multi-Fingered Grasping from Human Proprioceptive Sensorimotor Integration [26.351720551267846]
Tactile and kinesthetic perceptions are crucial for human dexterous manipulation, enabling reliable grasping of objects via sensorimotor integration. We propose a novel glove-mediated tactile-kinematic perception-prediction framework for grasp skill transfer from intuitive, natural human operation to robotic execution based on imitation learning.
arXiv Detail & Related papers (2025-09-10T07:44:12Z) - emg2tendon: From sEMG Signals to Tendon Control in Musculoskeletal Hands [5.613626927694011]
Tendon-driven robotic hands offer unparalleled dexterity for manipulation tasks. However, learning control policies for such systems presents unique challenges. We introduce the first large-scale EMG-to-Tendon Control dataset for robotic hands.
arXiv Detail & Related papers (2025-07-29T12:49:57Z) - Being-H0: Vision-Language-Action Pretraining from Large-Scale Human Videos [66.62109400603394]
We introduce Being-H0, a dexterous Vision-Language-Action model trained on large-scale human videos. Our approach centers on physical instruction tuning, a novel training paradigm that combines large-scale VLA pretraining from human videos, physical space alignment for 3D reasoning, and post-training adaptation for robotic tasks. We empirically show the excellence of Being-H0 in hand motion generation and instruction following, and it also scales well with model and data sizes.
arXiv Detail & Related papers (2025-07-21T13:19:09Z) - Body-Hand Modality Expertized Networks with Cross-attention for Fine-grained Skeleton Action Recognition [28.174638880324014]
BHaRNet is a novel framework that augments a typical body-expert model with a hand-expert model. Our model jointly trains both streams with an ensemble loss that fosters cooperative specialization. Inspired by MMNet, we also demonstrate the applicability of our approach to multi-modal tasks by leveraging RGB information.
arXiv Detail & Related papers (2025-03-19T07:54:52Z) - DexHandDiff: Interaction-aware Diffusion Planning for Adaptive Dexterous Manipulation [78.60543357822957]
Dexterous manipulation with contact-rich interactions is crucial for advanced robotics. We introduce DexHandDiff, an interaction-aware diffusion planning framework for adaptive dexterous manipulation. Our framework achieves an average 70.7% success rate on goal-adaptive dexterous tasks, highlighting its robustness and flexibility in contact-rich manipulation.
arXiv Detail & Related papers (2024-11-27T18:03:26Z) - Depth Restoration of Hand-Held Transparent Objects for Human-to-Robot Handover [5.329513275750882]
This paper presents a Hand-Aware Depth Restoration (HADR) method based on creating an implicit neural representation function from a single RGB-D image.
The proposed method utilizes hand posture as an important guidance to leverage semantic and geometric information of hand-object interaction.
We further develop a real-world human-to-robot handover system based on HADR, demonstrating its potential in human-robot interaction applications.
arXiv Detail & Related papers (2024-08-27T12:25:12Z) - Learning Visuotactile Skills with Two Multifingered Hands [80.99370364907278]
We explore learning from human demonstrations using a bimanual system with multifingered hands and visuotactile data.
Our results mark a promising step forward in bimanual multifingered manipulation from visuotactile data.
arXiv Detail & Related papers (2024-04-25T17:59:41Z) - Joint-bone Fusion Graph Convolutional Network for Semi-supervised Skeleton Action Recognition [65.78703941973183]
We propose a novel correlation-driven joint-bone fusion graph convolutional network (CD-JBF-GCN) as an encoder and use a pose prediction head as a decoder.
Specifically, the CD-JBF-GC can explore the motion transmission between the joint stream and the bone stream.
The pose prediction based auto-encoder in the self-supervised training stage allows the network to learn motion representation from unlabeled data.
arXiv Detail & Related papers (2022-02-08T16:03:15Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed papers (including all information) and is not responsible for any consequences.