Interaction-Aware 4D Gaussian Splatting for Dynamic Hand-Object Interaction Reconstruction
- URL: http://arxiv.org/abs/2511.14540v1
- Date: Tue, 18 Nov 2025 14:44:04 GMT
- Title: Interaction-Aware 4D Gaussian Splatting for Dynamic Hand-Object Interaction Reconstruction
- Authors: Hao Tian, Chenyangguang Zhang, Rui Liu, Wen Shen, Xiaolin Qin
- Abstract summary: This paper focuses on a challenging setting of simultaneously modeling geometry and appearance of hand-object interaction scenes without any object priors. We present interaction-aware hand-object Gaussians with newly introduced optimizable parameters aiming to adopt a piecewise linear hypothesis for clearer structural representation. Experiments show that our approach surpasses existing dynamic 3D-GS-based methods and achieves state-of-the-art performance in reconstructing dynamic hand-object interaction.
- Score: 19.178735596766476
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: This paper focuses on a challenging setting of simultaneously modeling geometry and appearance of hand-object interaction scenes without any object priors. We follow the trend of dynamic 3D Gaussian Splatting based methods, and address several significant challenges. To model complex hand-object interaction with mutual occlusion and edge blur, we present interaction-aware hand-object Gaussians with newly introduced optimizable parameters aiming to adopt piecewise linear hypothesis for clearer structural representation. Moreover, considering the complementarity and tightness of hand shape and object shape during interaction dynamics, we incorporate hand information into object deformation field, constructing interaction-aware dynamic fields to model flexible motions. To further address difficulties in the optimization process, we propose a progressive strategy that handles dynamic regions and static background step by step. Correspondingly, explicit regularizations are designed to stabilize the hand-object representations for smooth motion transition, physical interaction reality, and coherent lighting. Experiments show that our approach surpasses existing dynamic 3D-GS-based methods and achieves state-of-the-art performance in reconstructing dynamic hand-object interaction.
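The abstract describes incorporating hand information into the object deformation field so that object motion is conditioned on the interacting hand. As a minimal sketch of what such a hand-conditioned deformation field could look like, the numpy example below feeds Gaussian positions, a time value, and a hand feature vector through a small MLP to predict per-Gaussian position offsets. The network sizes, feature dimensions, and function names here are illustrative assumptions, not the paper's actual implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

def mlp_init(sizes, rng):
    """He-initialized weights and zero biases for a simple MLP."""
    return [(rng.standard_normal((a, b)) * np.sqrt(2.0 / a), np.zeros(b))
            for a, b in zip(sizes[:-1], sizes[1:])]

def mlp_apply(params, x):
    """Forward pass with ReLU on all but the last layer."""
    for i, (W, b) in enumerate(params):
        x = x @ W + b
        if i < len(params) - 1:
            x = np.maximum(x, 0.0)
    return x

def deform_object_gaussians(params, xyz, t, hand_feat):
    """Hypothetical hand-conditioned deformation field (sketch).

    xyz:       (N, 3) object Gaussian centers at canonical time
    t:         scalar timestamp
    hand_feat: (H,) feature summarizing hand pose/shape at time t
    Returns deformed (N, 3) centers = canonical centers + predicted offsets.
    """
    n = xyz.shape[0]
    t_col = np.full((n, 1), t)
    h = np.broadcast_to(hand_feat, (n, hand_feat.shape[-1]))
    inp = np.concatenate([xyz, t_col, h], axis=1)
    return xyz + mlp_apply(params, inp)

# Toy usage: 5 Gaussians, an 8-dim hand feature, one hidden layer of 32 units.
xyz = rng.standard_normal((5, 3))
hand_feat = rng.standard_normal(8)
params = mlp_init([3 + 1 + 8, 32, 3], rng)
new_xyz = deform_object_gaussians(params, xyz, 0.5, hand_feat)
```

The key design point this sketch illustrates is that the object field receives the hand feature as an extra input, so the predicted object motion can exploit the tight hand-object coupling during interaction rather than being a function of position and time alone.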
Related papers
- Follow My Hold: Hand-Object Interaction Reconstruction through Geometric Guidance [61.41904916189093]
We propose a novel diffusion-based framework for reconstructing 3D geometry of hand-held objects from monocular RGB images. We use hand-object interaction as geometric guidance to ensure plausible hand-object interactions.
arXiv Detail & Related papers (2025-08-25T17:11:53Z) - DynaSplat: Dynamic-Static Gaussian Splatting with Hierarchical Motion Decomposition for Scene Reconstruction [9.391616497099422]
We present DynaSplat, an approach that extends Gaussian Splatting to dynamic scenes. We classify scene elements as static or dynamic through a novel fusion of deformation offset statistics and 2D motion flow consistency. We then introduce a hierarchical motion modeling strategy that captures both coarse global transformations and fine-grained local movements.
arXiv Detail & Related papers (2025-06-11T15:13:35Z) - Learning Video Generation for Robotic Manipulation with Collaborative Trajectory Control [72.00655365269]
We present RoboMaster, a novel framework that models inter-object dynamics through a collaborative trajectory formulation. Unlike prior methods that decompose objects, our core idea is to decompose the interaction process into three sub-stages: pre-interaction, interaction, and post-interaction. Our method outperforms existing approaches, establishing new state-of-the-art performance in trajectory-controlled video generation for robotic manipulation.
arXiv Detail & Related papers (2025-06-02T17:57:06Z) - Guiding Human-Object Interactions with Rich Geometry and Relations [21.528466852204627]
Existing methods often rely on simplified object representations, such as the object's centroid or the nearest point to a human, to achieve physically plausible motions. We introduce ROG, a novel framework that addresses relationships inherent in HOIs with rich geometric detail. We show that ROG significantly outperforms state-of-the-art methods in the realism evaluations and semantic accuracy of synthesized HOIs.
arXiv Detail & Related papers (2025-03-26T02:57:18Z) - Event-boosted Deformable 3D Gaussians for Dynamic Scene Reconstruction [50.873820265165975]
We introduce the first approach combining event cameras, which capture high-temporal-resolution, continuous motion data, with deformable 3D-GS for dynamic scene reconstruction. We propose a GS-Threshold Joint Modeling strategy, creating a mutually reinforcing process that greatly improves both 3D reconstruction and threshold modeling. We contribute the first event-inclusive 4D benchmark with synthetic and real-world dynamic scenes, on which our method achieves state-of-the-art performance.
arXiv Detail & Related papers (2024-11-25T08:23:38Z) - Dynamic Reconstruction of Hand-Object Interaction with Distributed Force-aware Contact Representation [47.940270914254285]
ViTaM-D is a visual-tactile framework for reconstructing dynamic hand-object interaction with distributed tactile sensing. DF-Field is a force-aware contact representation leveraging kinetic and potential energy in hand-object interactions. ViTaM-D outperforms state-of-the-art methods in reconstruction accuracy for both rigid and deformable objects.
arXiv Detail & Related papers (2024-11-14T16:29:45Z) - EgoGaussian: Dynamic Scene Understanding from Egocentric Video with 3D Gaussian Splatting [95.44545809256473]
EgoGaussian is a method capable of simultaneously reconstructing 3D scenes and dynamically tracking 3D object motion from RGB egocentric input alone.
We show significant improvements in terms of both dynamic object and background reconstruction quality compared to the state-of-the-art.
arXiv Detail & Related papers (2024-06-28T10:39:36Z) - THOR: Text to Human-Object Interaction Diffusion via Relation Intervention [51.02435289160616]
We propose a novel Text-guided Human-Object Interaction diffusion model with Relation Intervention (THOR).
In each diffusion step, we initiate text-guided human and object motion and then leverage human-object relations to intervene in object motion.
We construct Text-BEHAVE, a Text2HOI dataset that seamlessly integrates textual descriptions with the currently largest publicly available 3D HOI dataset.
arXiv Detail & Related papers (2024-03-17T13:17:25Z) - Hand-Centric Motion Refinement for 3D Hand-Object Interaction via Hierarchical Spatial-Temporal Modeling [18.128376292350836]
We propose a data-driven method for coarse hand motion refinement.
First, we design a hand-centric representation to describe the dynamic spatial-temporal relation between hands and objects.
Second, to capture the dynamic clues of hand-object interaction, we propose a new architecture.
arXiv Detail & Related papers (2024-01-29T09:17:51Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.