4DHands: Reconstructing Interactive Hands in 4D with Transformers
- URL: http://arxiv.org/abs/2405.20330v2
- Date: Fri, 31 May 2024 10:52:56 GMT
- Title: 4DHands: Reconstructing Interactive Hands in 4D with Transformers
- Authors: Dixuan Lin, Yuxiang Zhang, Mengcheng Li, Yebin Liu, Wei Jing, Qi Yan, Qianying Wang, Hongwen Zhang
- Abstract summary: We introduce 4DHands, a robust approach to recovering interactive hand meshes and their relative movement from monocular inputs.
We develop a transformer-based architecture with novel tokenization and feature fusion strategies.
The efficacy of our approach is validated on several benchmark datasets.
- Score: 35.983309206845036
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In this paper, we introduce 4DHands, a robust approach to recovering interactive hand meshes and their relative movement from monocular inputs. Our approach addresses two major limitations of previous methods: lacking a unified solution for handling various hand image inputs and neglecting the positional relationship of two hands within images. To overcome these challenges, we develop a transformer-based architecture with novel tokenization and feature fusion strategies. Specifically, we propose a Relation-aware Two-Hand Tokenization (RAT) method to embed positional relation information into the hand tokens. In this way, our network can handle both single-hand and two-hand inputs and explicitly leverage relative hand positions, facilitating the reconstruction of intricate hand interactions in real-world scenarios. As such tokenization indicates the relative relationship of two hands, it also supports more effective feature fusion. To this end, we further develop a Spatio-temporal Interaction Reasoning (SIR) module to fuse hand tokens in 4D with attention and decode them into 3D hand meshes and relative temporal movements. The efficacy of our approach is validated on several benchmark datasets. The results on in-the-wild videos and real-world scenarios demonstrate the superior performances of our approach for interactive hand reconstruction. More video results can be found on the project page: https://4dhands.github.io.
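To make the described pipeline concrete, below is a minimal PyTorch sketch of how relation-aware two-hand tokens could be built and then fused with spatio-temporal attention; the class names, feature dimensions, the 4-D relative-box encoding, and the MANO-style output size are illustrative assumptions, not the authors' implementation.

```python
# Minimal sketch (not the authors' code) of the two ideas in the abstract:
# relation-aware two-hand tokenization (RAT) and spatio-temporal attention
# fusion (SIR). All names and dimensions below are assumptions.
import torch
import torch.nn as nn


class RelationAwareTokenizer(nn.Module):
    """Turns per-hand image features into tokens that also carry the relative
    position of the two hand crops (a stand-in for the paper's RAT idea)."""

    def __init__(self, feat_dim=256, token_dim=256):
        super().__init__()
        self.proj = nn.Linear(feat_dim, token_dim)
        # Embed a 4-D relative crop descriptor (e.g. offset + scale of each hand box).
        self.rel_embed = nn.Linear(4, token_dim)

    def forward(self, hand_feats, rel_boxes):
        # hand_feats: (B, T, H, N, feat_dim) with H hands and N tokens per hand
        # rel_boxes:  (B, T, H, 4) relative position/scale of each hand crop
        return self.proj(hand_feats) + self.rel_embed(rel_boxes).unsqueeze(-2)


class SpatioTemporalFusion(nn.Module):
    """Fuses hand tokens across hands and frames with self-attention and decodes
    per-hand pose parameters (a stand-in for the paper's SIR idea)."""

    def __init__(self, token_dim=256, n_heads=8, n_layers=4, out_dim=61):
        super().__init__()
        layer = nn.TransformerEncoderLayer(d_model=token_dim, nhead=n_heads,
                                           batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=n_layers)
        # out_dim is a guess at MANO pose + shape + relative translation.
        self.head = nn.Linear(token_dim, out_dim)

    def forward(self, tokens):
        B, T, H, N, D = tokens.shape
        x = tokens.reshape(B, T * H * N, D)   # joint attention over time, hands, tokens
        x = self.encoder(x).reshape(B, T, H, N, D)
        return self.head(x.mean(dim=3))       # one parameter vector per hand per frame


if __name__ == "__main__":
    B, T, H, N, F = 2, 8, 2, 16, 256          # batch, frames, hands, tokens/hand, feat dim
    tokens = RelationAwareTokenizer()(torch.randn(B, T, H, N, F),
                                      torch.randn(B, T, H, 4))
    print(SpatioTemporalFusion()(tokens).shape)   # torch.Size([2, 8, 2, 61])
```

In this toy setup, a single-hand input could be handled by masking the tokens of the absent hand, which mirrors the abstract's claim that one tokenization scheme serves both single-hand and two-hand inputs.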
Related papers
- HandDiffuse: Generative Controllers for Two-Hand Interactions via Diffusion Models [48.56319454887096]
Existing hand datasets are largely short-range, and the interactions are weak due to the self-occlusion and self-similarity of hands.
To address this data scarcity, we propose HandDiffuse12.5M, a novel dataset consisting of temporal sequences with strong two-hand interactions.
arXiv Detail & Related papers (2023-12-08T07:07:13Z)
- Novel-view Synthesis and Pose Estimation for Hand-Object Interaction from Sparse Views [41.50710846018882]
We propose a neural rendering and pose estimation system for hand-object interaction from sparse views.
We first learn shape and appearance priors for hands and objects separately with neural representations.
During the online stage, we design a rendering-based joint model fitting framework to understand the dynamic hand-object interaction.
arXiv Detail & Related papers (2023-08-22T05:17:41Z)
- ACR: Attention Collaboration-based Regressor for Arbitrary Two-Hand Reconstruction [30.073586754012645]
We present ACR (Attention Collaboration-based Regressor), which makes the first attempt to reconstruct hands in arbitrary scenarios.
We evaluate our method on various types of hand reconstruction datasets.
arXiv Detail & Related papers (2023-03-10T14:19:02Z)
- Im2Hands: Learning Attentive Implicit Representation of Interacting Two-Hand Shapes [58.551154822792284]
Implicit Two Hands (Im2Hands) is the first neural implicit representation of two interacting hands.
Im2Hands can produce fine-grained geometry of two hands with high hand-to-hand and hand-to-image coherency.
We experimentally demonstrate the effectiveness of Im2Hands on two-hand reconstruction in comparison to related methods.
arXiv Detail & Related papers (2023-02-28T06:38:25Z)
- Decoupled Iterative Refinement Framework for Interacting Hands Reconstruction from a Single RGB Image [30.24438569170251]
We propose a decoupled iterative refinement framework to achieve pixel-aligned hand reconstruction.
Our method outperforms all existing two-hand reconstruction methods by a large margin on the InterHand2.6M dataset.
arXiv Detail & Related papers (2023-02-05T15:46:57Z)
- Hand-Object Interaction Reasoning [33.612083150296364]
We show that modelling two-handed interactions is critical for action recognition in egocentric video.
We propose an interaction reasoning network for modelling spatio-temporal relationships between hands and objects in video.
arXiv Detail & Related papers (2022-01-13T11:53:12Z)
- Monocular 3D Reconstruction of Interacting Hands via Collision-Aware Factorized Refinements [96.40125818594952]
We make the first attempt to reconstruct 3D interacting hands from monocular single RGB images.
Our method can generate 3D hand meshes with both precise 3D poses and minimal collisions.
arXiv Detail & Related papers (2021-11-01T08:24:10Z)
- Real-time Pose and Shape Reconstruction of Two Interacting Hands With a Single Depth Camera [79.41374930171469]
We present a novel method for real-time pose and shape reconstruction of two strongly interacting hands.
Our approach combines an extensive list of favorable properties; notably, it is marker-less.
We show state-of-the-art results in scenes that exceed the complexity level demonstrated by previous work.
arXiv Detail & Related papers (2021-06-15T11:39:49Z)
- Joint Hand-object 3D Reconstruction from a Single Image with Cross-branch Feature Fusion [78.98074380040838]
We propose to consider hand and object jointly in feature space and explore the reciprocity of the two branches.
We employ an auxiliary depth estimation module to augment the input RGB image with the estimated depth map.
Our approach significantly outperforms existing approaches in terms of the reconstruction accuracy of objects.
arXiv Detail & Related papers (2020-06-28T09:50:25Z)