HandDiffuse: Generative Controllers for Two-Hand Interactions via
Diffusion Models
- URL: http://arxiv.org/abs/2312.04867v1
- Date: Fri, 8 Dec 2023 07:07:13 GMT
- Title: HandDiffuse: Generative Controllers for Two-Hand Interactions via
Diffusion Models
- Authors: Pei Lin, Sihang Xu, Hongdi Yang, Yiran Liu, Xin Chen, Jingya Wang,
Jingyi Yu, Lan Xu
- Abstract summary: Existing hands datasets are largely short-range and the interaction is weak due to the self-occlusion and self-similarity of hands.
To rescue the data scarcity, we propose HandDiffuse12.5M, a novel dataset that consists of temporal sequences with strong two-hand interactions.
- Score: 48.56319454887096
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Existing hands datasets are largely short-range and the interaction is weak
due to the self-occlusion and self-similarity of hands, which can not yet fit
the need for interacting hands motion generation. To rescue the data scarcity,
we propose HandDiffuse12.5M, a novel dataset that consists of temporal
sequences with strong two-hand interactions. HandDiffuse12.5M has the largest
scale and richest interactions among the existing two-hand datasets. We further
present a strong baseline method HandDiffuse for the controllable motion
generation of interacting hands using various controllers. Specifically, we
apply the diffusion model as the backbone and design two motion representations
for different controllers. To reduce artifacts, we also propose Interaction
Loss which explicitly quantifies the dynamic interaction process. Our
HandDiffuse enables various applications with vivid two-hand interactions,
i.e., motion in-betweening and trajectory control. Experiments show that our
method outperforms the state-of-the-art techniques in motion generation and can
also contribute to data augmentation for other datasets. Our dataset,
corresponding codes, and pre-trained models will be disseminated to the
community for future research towards two-hand interaction modeling.
Related papers
- HOGSA: Bimanual Hand-Object Interaction Understanding with 3D Gaussian Splatting Based Data Augmentation [29.766317710266765]
We propose a new 3D Gaussian Splatting based data augmentation framework for bimanual hand-object interaction.
We use mesh-based 3DGS to model objects and hands, and to deal with the rendering blur problem due to multi-resolution input images used.
We extend the single hand grasping pose optimization module for the bimanual hand object to generate various poses of bimanual hand-object interaction.
arXiv Detail & Related papers (2025-01-06T08:48:17Z) - InterDance:Reactive 3D Dance Generation with Realistic Duet Interactions [67.37790144477503]
We propose InterDance, a large-scale duet dance dataset that significantly enhances motion quality, data scale, and the variety of dance genres.
We introduce a diffusion-based framework with an interaction refinement guidance strategy to optimize the realism of interactions progressively.
arXiv Detail & Related papers (2024-12-22T11:53:51Z) - Dynamic Reconstruction of Hand-Object Interaction with Distributed Force-aware Contact Representation [52.36691633451968]
ViTaM-D is a visual-tactile framework for dynamic hand-object interaction reconstruction.
DF-Field is a distributed force-aware contact representation model.
Our results highlight the superior performance of ViTaM-D in both rigid and deformable object reconstruction.
arXiv Detail & Related papers (2024-11-14T16:29:45Z) - Gaze-guided Hand-Object Interaction Synthesis: Dataset and Method [61.19028558470065]
We present GazeHOI, the first dataset to capture simultaneous 3D modeling of gaze, hand, and object interactions.
To tackle these issues, we propose a stacked gaze-guided hand-object interaction diffusion model, named GHO-Diffusion.
We also introduce HOI-Manifold Guidance during the sampling stage of GHO-Diffusion, enabling fine-grained control over generated motions.
arXiv Detail & Related papers (2024-03-24T14:24:13Z) - Learning Mutual Excitation for Hand-to-Hand and Human-to-Human Interaction Recognition [21.007782102151282]
We propose a mutual excitation graph convolutional network (me-GCN) by stacking mutual excitation graph convolution layers.
Me-GC learns mutual information in each layer and each stage of graph convolution operations.
Our proposed me-GC outperforms state-of-the-art GCN-based and Transformer-based methods.
arXiv Detail & Related papers (2024-02-04T10:00:00Z) - BOTH2Hands: Inferring 3D Hands from Both Text Prompts and Body Dynamics [50.88842027976421]
We propose BOTH57M, a novel multi-modal dataset for two-hand motion generation.
Our dataset includes accurate motion tracking for the human body and hands.
We also provide a strong baseline method, BOTH2Hands, for the novel task.
arXiv Detail & Related papers (2023-12-13T07:30:19Z) - InterControl: Zero-shot Human Interaction Generation by Controlling Every Joint [67.6297384588837]
We introduce a novel controllable motion generation method, InterControl, to encourage the synthesized motions maintaining the desired distance between joint pairs.
We demonstrate that the distance between joint pairs for human-wise interactions can be generated using an off-the-shelf Large Language Model.
arXiv Detail & Related papers (2023-11-27T14:32:33Z) - Controllable Motion Synthesis and Reconstruction with Autoregressive
Diffusion Models [18.50942770933098]
MoDiff is an autoregressive probabilistic diffusion model over motion sequences conditioned on control contexts of other modalities.
Our model integrates a cross-modal Transformer encoder and a Transformer-based decoder, which are found effective in capturing temporal correlations in motion and control modalities.
arXiv Detail & Related papers (2023-04-03T08:17:08Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.