Diverse 3D Hand Gesture Prediction from Body Dynamics by Bilateral Hand Disentanglement
- URL: http://arxiv.org/abs/2303.01765v2
- Date: Tue, 21 Mar 2023 02:50:41 GMT
- Title: Diverse 3D Hand Gesture Prediction from Body Dynamics by Bilateral Hand Disentanglement
- Authors: Xingqun Qi, Chen Liu, Muyi Sun, Lincheng Li, Changjie Fan, Xin Yu
- Abstract summary: We introduce a novel bilateral hand disentanglement-based two-stage 3D hand generation method.
In the first stage, we intend to generate natural hand gestures by two hand-disentanglement branches.
The second stage is built upon the insight that 3D hand predictions should be non-deterministic.
- Score: 42.98335775548796
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Predicting natural and diverse 3D hand gestures from the upper body dynamics
is a practical yet challenging task in virtual avatar creation. Previous works
usually overlook the asymmetric motions between two hands and generate two
hands in a holistic manner, leading to unnatural results. In this work, we
introduce a novel bilateral hand disentanglement-based two-stage 3D hand
generation method to achieve natural and diverse 3D hand prediction from body
dynamics. In the first stage, we intend to generate natural hand gestures by
two hand-disentanglement branches. Considering the asymmetric gestures and
motions of two hands, we introduce a Spatial-Residual Memory (SRM) module to
model spatial interaction between the body and each hand by residual learning.
To enhance the coordination of two hand motions w.r.t. body dynamics
holistically, we then present a Temporal-Motion Memory (TMM) module. TMM can
effectively model the temporal association between body dynamics and two hand
motions. The second stage is built upon the insight that 3D hand predictions
should be non-deterministic given the sequential body postures. Thus, we
further diversify our 3D hand predictions based on the initial output of the
first stage. Concretely, we propose a Prototypical-Memory Sampling Strategy (PSS)
to generate the non-deterministic hand gestures by gradient-based Markov Chain
Monte Carlo (MCMC) sampling. Extensive experiments demonstrate that our method
outperforms the state-of-the-art models on the B2H dataset and our newly
collected TED Hands dataset. The dataset and code are available at
https://github.com/XingqunQi-lab/Diverse-3D-Hand-Gesture-Prediction.
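The gradient-based MCMC sampling that PSS performs to diversify the stage-one output can be illustrated with a minimal Langevin-dynamics sketch in Python/PyTorch. Note that the energy function, latent shapes, and step size below are hypothetical placeholders for the general technique, not the paper's actual PSS; see the linked repository for the real implementation.

# Minimal Langevin-style sketch of gradient-based MCMC sampling: injected
# noise makes repeated refinements of one initial prediction diverge,
# which is the source of sample diversity. All names are hypothetical.
import torch

def langevin_sample(z_init, energy_fn, n_steps=30, step_size=1e-2):
    """Refine a latent code by noisy gradient steps on an energy."""
    z = z_init.clone().requires_grad_(True)
    for _ in range(n_steps):
        energy = energy_fn(z).sum()             # scalar energy of the batch
        grad, = torch.autograd.grad(energy, z)  # d(energy)/dz
        with torch.no_grad():                   # Langevin update with noise
            z = z - 0.5 * step_size * grad + (step_size ** 0.5) * torch.randn_like(z)
        z.requires_grad_(True)
    return z.detach()

# Toy usage: three runs from the same start yield three different samples.
energy_fn = lambda z: (z ** 2).sum(dim=-1)      # stand-in quadratic energy
z0 = torch.zeros(4, 16)                         # 4 hypothetical latent codes
samples = [langevin_sample(z0, energy_fn) for _ in range(3)]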
Related papers
- HMP: Hand Motion Priors for Pose and Shape Estimation from Video [52.39020275278984]
We develop a generative motion prior specific to hands, trained on the AMASS dataset, which features diverse and high-quality hand motions.
Our integration of a robust motion prior significantly enhances performance, especially in occluded scenarios.
We demonstrate our method's efficacy via qualitative and quantitative evaluations on the HO3D and DexYCB datasets.
arXiv Detail & Related papers (2023-12-27T22:35:33Z)
- BOTH2Hands: Inferring 3D Hands from Both Text Prompts and Body Dynamics [50.88842027976421]
We propose BOTH57M, a novel multi-modal dataset for two-hand motion generation.
Our dataset includes accurate motion tracking for the human body and hands.
We also provide a strong baseline method, BOTH2Hands, for the novel task.
arXiv Detail & Related papers (2023-12-13T07:30:19Z)
- Generating Holistic 3D Human Motion from Speech [97.11392166257791]
We build a high-quality dataset of 3D holistic body meshes with synchronous speech.
We then define a novel speech-to-motion generation framework in which the face, body, and hands are modeled separately.
arXiv Detail & Related papers (2022-12-08T17:25:19Z)
- A Non-Anatomical Graph Structure for isolated hand gesture separation in continuous gesture sequences [42.20687552354674]
We propose a GCN model and combine it with stacked Bi-LSTM and Attention modules to propagate temporal information through the video stream.
Considering the breakthroughs of GCN models for skeleton modality, we propose a two-layer GCN model to empower the 3D hand skeleton features.
arXiv Detail & Related papers (2022-07-15T17:28:52Z)
- Monocular 3D Reconstruction of Interacting Hands via Collision-Aware Factorized Refinements [96.40125818594952]
We make the first attempt to reconstruct 3D interacting hands from monocular single RGB images.
Our method can generate 3D hand meshes with both precise 3D poses and minimal collisions.
arXiv Detail & Related papers (2021-11-01T08:24:10Z)
- BiHand: Recovering Hand Mesh with Multi-stage Bisected Hourglass Networks [37.65510556305611]
We introduce an end-to-end learnable model, BiHand, which consists of three cascaded stages, namely a 2D seeding stage, a 3D lifting stage, and a mesh generation stage.
At the output of BiHand, the full hand mesh is recovered using the joint rotations and shape parameters predicted by the network.
Our model achieves superior accuracy in comparison with state-of-the-art methods, and can produce appealing 3D hand meshes under several severe conditions.
arXiv Detail & Related papers (2020-08-12T03:13:17Z)
- Body2Hands: Learning to Infer 3D Hands from Conversational Gesture Body Dynamics [87.17505994436308]
We build upon the insight that body motion and hand gestures are strongly correlated in non-verbal communication settings.
We formulate the learning of this prior as a prediction task of 3D hand shape over time given body motion input alone.
Our hand prediction model produces convincing 3D hand gestures given only the 3D motion of the speaker's arms as input (a minimal sketch of this body-to-hand formulation appears after this list).
arXiv Detail & Related papers (2020-07-23T22:58:15Z)
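To make the shared task formulation concrete (regressing hand pose sequences from body dynamics, as in Body2Hands and the main paper above), here is a minimal sketch; the network layout, dimensions, and names are illustrative assumptions, not taken from either paper.

# Minimal body-motion -> 3D hand pose sketch. Dimensions and names are
# hypothetical illustrations of the task formulation, not the papers' values.
import torch
import torch.nn as nn

class BodyToHands(nn.Module):
    def __init__(self, body_dim=36, hand_dim=90, hidden=256):
        super().__init__()
        self.encoder = nn.LSTM(body_dim, hidden, batch_first=True)  # temporal encoder over body motion
        self.head = nn.Linear(hidden, hand_dim)                     # per-frame regressor to both hands

    def forward(self, body_seq):            # (batch, frames, body_dim)
        feats, _ = self.encoder(body_seq)   # (batch, frames, hidden)
        return self.head(feats)             # (batch, frames, hand_dim)

model = BodyToHands()
body = torch.randn(2, 64, 36)               # 2 clips of 64 frames of body motion
hands = model(body)                          # -> (2, 64, 90) per-frame hand pose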