DexNDM: Closing the Reality Gap for Dexterous In-Hand Rotation via Joint-Wise Neural Dynamics Model
- URL: http://arxiv.org/abs/2510.08556v1
- Date: Thu, 09 Oct 2025 17:59:11 GMT
- Title: DexNDM: Closing the Reality Gap for Dexterous In-Hand Rotation via Joint-Wise Neural Dynamics Model
- Authors: Xueyi Liu, He Wang, Li Yi
- Abstract summary: We develop a novel framework that enables a single policy, trained in simulation, to generalize to a wide variety of objects and conditions in the real world. We show that a single policy successfully rotates challenging objects with complex shapes (e.g., animals), high aspect ratios (up to 5.33), and small sizes.
- Score: 22.46947045094797
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Achieving generalized in-hand object rotation remains a significant challenge in robotics, largely due to the difficulty of transferring policies from simulation to the real world. The complex, contact-rich dynamics of dexterous manipulation create a "reality gap" that has limited prior work to constrained scenarios involving simple geometries, limited object sizes and aspect ratios, constrained wrist poses, or customized hands. We address this sim-to-real challenge with a novel framework that enables a single policy, trained in simulation, to generalize to a wide variety of objects and conditions in the real world. The core of our method is a joint-wise dynamics model that learns to bridge the reality gap by effectively fitting a limited amount of real-world collected data and then adapting the sim policy's actions accordingly. The model is highly data-efficient and generalizable across different whole-hand interaction distributions by factorizing dynamics across joints, compressing system-wide influences into low-dimensional variables, and learning each joint's evolution from its own dynamic profile, implicitly capturing these net effects. We pair this with a fully autonomous data collection strategy that gathers diverse, real-world interaction data with minimal human intervention. Our complete pipeline demonstrates unprecedented generality: a single policy successfully rotates challenging objects with complex shapes (e.g., animals), high aspect ratios (up to 5.33), and small sizes, all while handling diverse wrist orientations and rotation axes. Comprehensive real-world evaluations and a teleoperation application for complex tasks validate the effectiveness and robustness of our approach. Website: https://meowuu7.github.io/DexNDM/
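The abstract describes the model as factorizing dynamics across joints while compressing system-wide influences into low-dimensional variables. A minimal sketch of one plausible reading of that structure is shown below; all dimensions, layer sizes, and the choice of a shared per-joint network are illustrative assumptions, not the authors' actual architecture.

```python
import numpy as np

rng = np.random.default_rng(0)

N_JOINTS = 16      # e.g., a 16-DoF dexterous hand (assumption)
STATE_DIM = 2      # per-joint state: (position, velocity) (assumption)
LATENT_DIM = 4     # low-dimensional summary of whole-hand influences
HIDDEN = 32

def init_mlp(in_dim, out_dim):
    """Parameters for a tiny two-layer MLP (illustrative only)."""
    return {
        "W1": rng.normal(0.0, 0.1, (in_dim, HIDDEN)),
        "b1": np.zeros(HIDDEN),
        "W2": rng.normal(0.0, 0.1, (HIDDEN, out_dim)),
        "b2": np.zeros(out_dim),
    }

def mlp(params, x):
    h = np.tanh(x @ params["W1"] + params["b1"])
    return h @ params["W2"] + params["b2"]

# One encoder compresses the whole-hand state and action into a low-dim
# latent; one network, shared across joints, predicts each joint's next
# state from its own state, its own action, and that latent. Parameter
# sharing across joints is one way to realize "factorizing dynamics
# across joints" with high data efficiency.
encoder = init_mlp(N_JOINTS * (STATE_DIM + 1), LATENT_DIM)
joint_model = init_mlp(STATE_DIM + 1 + LATENT_DIM, STATE_DIM)

def predict_next_state(joint_states, joint_actions):
    """joint_states: (N_JOINTS, STATE_DIM); joint_actions: (N_JOINTS,)."""
    full = np.concatenate([joint_states.ravel(), joint_actions])
    z = mlp(encoder, full)  # compressed system-wide influence
    next_states = np.empty_like(joint_states)
    for j in range(N_JOINTS):
        inp = np.concatenate([joint_states[j], [joint_actions[j]], z])
        # Residual update: each joint evolves from its own dynamic
        # profile plus the low-dimensional net effect of the rest.
        next_states[j] = joint_states[j] + mlp(joint_model, inp)
    return next_states

states = rng.normal(size=(N_JOINTS, STATE_DIM))
actions = rng.normal(size=N_JOINTS)
pred = predict_next_state(states, actions)
print(pred.shape)  # (16, 2)
```

Such a model would be fit on the autonomously collected real-world trajectories and then used to adapt the simulation policy's actions; the training loop itself is omitted here.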
Related papers
- DreamDojo: A Generalist Robot World Model from Large-Scale Human Videos [110.98100817695307]
We introduce DreamDojo, a foundation world model that learns diverse interactions and dexterous controls from 44k hours of egocentric human videos. Our work enables several important applications based on generative world models, including live teleoperation, policy evaluation, and model-based planning.
arXiv Detail & Related papers (2026-02-06T18:49:43Z)
- Coupled Local and Global World Models for Efficient First Order RL [10.305209288475817]
This paper introduces a method that bypasses simulators entirely, training RL policies inside world models learned from robots' interactions with real environments. At its core, our approach enables policy training with large-scale diffusion models via a novel decoupled first-order gradient (FoG) method. We demonstrate the efficacy of our method on the Push-T manipulation task, where it significantly outperforms PPO in sample efficiency.
arXiv Detail & Related papers (2026-02-05T21:57:41Z)
- R2RGEN: Real-to-Real 3D Data Generation for Spatially Generalized Manipulation [74.41728218960465]
We propose a real-to-real 3D data generation framework (R2RGen) that directly augments pointcloud observation-action pairs to generate real-world data. R2RGen substantially enhances data efficiency in extensive experiments and demonstrates strong potential for scaling and application to mobile manipulation.
arXiv Detail & Related papers (2025-10-09T17:55:44Z)
- ExoPredicator: Learning Abstract Models of Dynamic Worlds for Robot Planning [77.49815848173613]
We propose a framework for abstract world models that jointly learns symbolic state representations and causal processes for both endogenous actions and mechanisms. Across five simulated tabletop robotics environments, the learned models enable fast planning that generalizes to held-out tasks with more objects and more complex goals, outperforming a range of baselines.
arXiv Detail & Related papers (2025-09-30T13:44:34Z)
- Multi-Modal Manipulation via Multi-Modal Policy Consensus [62.49978559936122]
We propose a new approach to integrate diverse sensory modalities for robotic manipulation. Our method factorizes the policy into a set of diffusion models, each specialized for a single representation. We evaluate our approach on simulated manipulation tasks in RLBench, as well as real-world tasks such as occluded object picking, in-hand spoon reorientation, and puzzle insertion.
arXiv Detail & Related papers (2025-09-27T19:43:04Z)
- Generalizable Domain Adaptation for Sim-and-Real Policy Co-Training [21.855770200309674]
We propose a unified sim-and-real co-training framework for learning generalizable manipulation policies. We show it can leverage abundant simulation data to achieve up to a 30% improvement in the real-world success rate.
arXiv Detail & Related papers (2025-09-23T04:32:53Z)
- SimGenHOI: Physically Realistic Whole-Body Humanoid-Object Interaction via Generative Modeling and Reinforcement Learning [6.255814224573073]
SimGenHOI is a unified framework that combines the strengths of generative modeling and reinforcement learning to produce controllable and physically plausible HOI. Our HOI generative model, based on Diffusion Transformers (DiT), predicts a set of key actions conditioned on text prompts, object geometry, sparse object waypoints, and the initial humanoid pose. To ensure physical realism, we design a contact-aware whole-body control policy trained with reinforcement learning, which tracks the generated motions while correcting artifacts such as penetration and foot sliding.
arXiv Detail & Related papers (2025-08-18T15:20:46Z)
- DyWA: Dynamics-adaptive World Action Model for Generalizable Non-prehensile Manipulation [16.863534382288705]
We propose a novel framework that enhances action learning by jointly predicting future states and adapting to dynamics variations based on historical trajectories. DyWA achieves an average success rate of 68% in real-world experiments.
arXiv Detail & Related papers (2025-03-21T02:29:52Z)
- Sim-to-Real Reinforcement Learning for Vision-Based Dexterous Manipulation on Humanoids [56.892520712892804]
We introduce a practical sim-to-real RL recipe that trains a humanoid robot to perform three dexterous manipulation tasks. We demonstrate high success rates on unseen objects and robust, adaptive policy behaviors.
arXiv Detail & Related papers (2025-02-27T18:59:52Z)
- InterMimic: Towards Universal Whole-Body Control for Physics-Based Human-Object Interactions [27.225777494300775]
We introduce InterMimic, a framework that enables a single policy to robustly learn from hours of imperfect MoCap data. Our experiments demonstrate that InterMimic produces realistic and diverse interactions across multiple HOI datasets.
arXiv Detail & Related papers (2025-02-27T18:59:12Z)
- Transferring Foundation Models for Generalizable Robotic Manipulation [82.12754319808197]
We propose a novel paradigm that effectively leverages language-reasoning segmentation masks generated by internet-scale foundation models. Our approach can effectively and robustly perceive object pose and enable sample-efficient generalization learning. Demos can be found in our submitted video, and more comprehensive ones can be found in link1 or link2.
arXiv Detail & Related papers (2023-06-09T07:22:12Z)
This list is automatically generated from the titles and abstracts of the papers in this site.