Semantics2Hands: Transferring Hand Motion Semantics between Avatars
- URL: http://arxiv.org/abs/2308.05920v1
- Date: Fri, 11 Aug 2023 03:07:31 GMT
- Title: Semantics2Hands: Transferring Hand Motion Semantics between Avatars
- Authors: Zijie Ye, Jia Jia and Junliang Xing
- Abstract summary: Even minor errors in hand motions can significantly impact the user experience.
This paper introduces a novel anatomy-based semantic matrix (ASM) that encodes the semantics of hand motions.
We obtain target hand joint rotations from the ASM with an anatomy-based semantics reconstruction network (ASRN), trained using a semi-supervised learning strategy on the Mixamo and InterHand2.6M datasets.
- Score: 34.39785320233128
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Human hands, the primary means of non-verbal communication, convey intricate
semantics in various scenarios. Due to the high sensitivity of individuals to
hand motions, even minor errors in hand motions can significantly impact the
user experience. Real applications often involve multiple avatars with varying
hand shapes, highlighting the importance of maintaining the intricate semantics
of hand motions across the avatars. Therefore, this paper aims to transfer the
hand motion semantics between diverse avatars based on their respective hand
models. To address this problem, we introduce a novel anatomy-based semantic
matrix (ASM) that encodes the semantics of hand motions. The ASM quantifies the
positions of the palm and other joints relative to the local frame of the
corresponding joint, enabling precise retargeting of hand motions.
Subsequently, we obtain a mapping function from the source ASM to the target
hand joint rotations by employing an anatomy-based semantics reconstruction
network (ASRN). We train the ASRN using a semi-supervised learning strategy on
the Mixamo and InterHand2.6M datasets. We evaluate our method in intra-domain
and cross-domain hand motion retargeting tasks. The qualitative and
quantitative results demonstrate that our ASRN significantly outperforms
state-of-the-art methods.
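As a concrete illustration, the abstract's description of the ASM can be read as collecting, for every hand joint, the positions of the palm and all other joints expressed in that joint's local coordinate frame. The NumPy sketch below is a minimal, hypothetical rendering of that idea, not the authors' released code; the array names, shapes, and the assumption that each joint carries an orthonormal local frame are illustrative only.

```python
# Minimal sketch (not the authors' implementation) of an anatomy-based
# semantic-matrix-style feature: every joint's position expressed in the
# local frame of every other joint. Assumes each joint has a world-space
# position (J, 3) and an orthonormal local frame (J, 3, 3) whose columns
# are the frame axes; both inputs are hypothetical.
import numpy as np

def semantic_matrix(joint_positions: np.ndarray,
                    local_frames: np.ndarray) -> np.ndarray:
    """Return a (J, J, 3) array whose entry [j, k] is the position of
    joint k expressed in the local frame of joint j."""
    J = joint_positions.shape[0]
    asm = np.zeros((J, J, 3))
    for j in range(J):
        # Offset of every joint from joint j, in world coordinates.
        offsets = joint_positions - joint_positions[j]   # (J, 3)
        # Project the world-space offsets onto joint j's frame axes.
        asm[j] = offsets @ local_frames[j]                # (J, 3)
    return asm

# Toy usage: a 21-joint hand (wrist/palm as joint 0) with identity frames.
if __name__ == "__main__":
    rng = np.random.default_rng(0)
    positions = rng.normal(size=(21, 3))
    frames = np.tile(np.eye(3), (21, 1, 1))
    print(semantic_matrix(positions, frames).shape)  # (21, 21, 3)
```

In the paper's pipeline, a learned network (the ASRN) would then map such a source-hand matrix to joint rotations of the target hand model; that mapping is not sketched here.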
Related papers
- GRIP: Generating Interaction Poses Using Spatial Cues and Latent Consistency [57.9920824261925]
Hands are dexterous and highly versatile manipulators that are central to how humans interact with objects and their environment.
Modeling realistic hand-object interactions is critical for applications in computer graphics, computer vision, and mixed reality.
GRIP is a learning-based method that takes as input the 3D motion of the body and the object, and synthesizes realistic motion for both hands before, during, and after object interaction.
arXiv Detail & Related papers (2023-08-22T17:59:51Z)
- Local Spherical Harmonics Improve Skeleton-Based Hand Action Recognition [17.62840662799232]
We propose a method specifically designed for hand action recognition which uses relative angular embeddings and local Spherical Harmonics to create novel hand representations.
The use of Spherical Harmonics creates rotation-invariant representations, making hand action recognition more robust to inter-subject differences and viewpoint changes (a minimal sketch of this rotation-invariance idea appears after the list below).
arXiv Detail & Related papers (2023-08-21T08:17:42Z)
- HandNeRF: Neural Radiance Fields for Animatable Interacting Hands [122.32855646927013]
We propose a novel framework to reconstruct accurate appearance and geometry with neural radiance fields (NeRF) for interacting hands.
We conduct extensive experiments to verify the merits of our proposed HandNeRF and report a series of state-of-the-art results.
arXiv Detail & Related papers (2023-03-24T06:19:19Z)
- Recognizing Hand Use and Hand Role at Home After Stroke from Egocentric Video [0.0]
Egocentric video can capture hand-object interactions in context, as well as show how more-affected hands are used.
The aim is to use artificial intelligence-based computer vision to classify hand use and hand role from egocentric videos recorded at home after stroke.
arXiv Detail & Related papers (2022-07-18T20:15:29Z)
- Snapture -- A Novel Neural Architecture for Combined Static and Dynamic Hand Gesture Recognition [19.320551882950706]
We propose a novel hybrid hand gesture recognition system.
Our architecture enables learning both static and dynamic gestures.
Our work contributes both to gesture recognition research and machine learning applications for non-verbal communication with robots.
arXiv Detail & Related papers (2022-05-28T11:12:38Z)
- Domain Adaptive Robotic Gesture Recognition with Unsupervised Kinematic-Visual Data Alignment [60.31418655784291]
We propose a novel unsupervised domain adaptation framework which can simultaneously transfer multi-modality knowledge, i.e., both kinematic and visual data, from simulator to real robot.
It remedies the domain gap with enhanced transferable features by exploiting temporal cues in videos and the inherent correlations between the kinematic and visual modalities for gesture recognition.
Results show that our approach recovers performance with substantial gains, up to 12.91% in accuracy and 20.16% in F1-score, without using any annotations from the real robot.
arXiv Detail & Related papers (2021-03-06T09:10:03Z)
- Generalization Through Hand-Eye Coordination: An Action Space for Learning Spatially-Invariant Visuomotor Control [67.23580984118479]
Imitation Learning (IL) is an effective framework to learn visuomotor skills from offline demonstration data.
Hand-eye Action Networks (HAN) can approximate human hand-eye coordination behaviors by learning from human-teleoperated demonstrations.
arXiv Detail & Related papers (2021-02-28T01:49:13Z)
- Joint Hand-object 3D Reconstruction from a Single Image with Cross-branch Feature Fusion [78.98074380040838]
We propose to consider hand and object jointly in feature space and explore the reciprocity of the two branches.
We employ an auxiliary depth estimation module to augment the input RGB image with the estimated depth map.
Our approach significantly outperforms existing approaches in terms of the reconstruction accuracy of objects.
arXiv Detail & Related papers (2020-06-28T09:50:25Z)
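Picking up the forward reference from the Local Spherical Harmonics entry above: the rotation invariance mentioned there is commonly obtained by expanding a set of directions in the spherical-harmonic basis and keeping only the per-degree power. The sketch below shows that generic construction for a set of unit bone directions; it is not the cited paper's exact method, and the function names are illustrative assumptions.

```python
# Illustrative sketch (not the cited paper's method) of a rotation-invariant
# descriptor built from spherical harmonics: project a set of unit bone
# directions onto the SH basis and keep the per-degree power spectrum,
# which is unchanged by any global rotation of the directions.
import numpy as np
from scipy.special import sph_harm

def sh_power_spectrum(directions: np.ndarray, max_degree: int = 4) -> np.ndarray:
    """directions: (N, 3) unit vectors; returns (max_degree + 1,) powers."""
    x, y, z = directions[:, 0], directions[:, 1], directions[:, 2]
    theta = np.arctan2(y, x) % (2 * np.pi)   # azimuth in [0, 2*pi)
    phi = np.arccos(np.clip(z, -1.0, 1.0))   # polar angle in [0, pi]
    powers = []
    for l in range(max_degree + 1):
        # Degree-l coefficients of a sum of point masses on the sphere.
        coeffs = [np.sum(np.conj(sph_harm(m, l, theta, phi)))
                  for m in range(-l, l + 1)]
        # Per-degree power is invariant under global rotations.
        powers.append(float(np.sum(np.abs(coeffs) ** 2)))
    return np.asarray(powers)

# The spectrum is (numerically) identical before and after a random rotation.
if __name__ == "__main__":
    from scipy.spatial.transform import Rotation
    rng = np.random.default_rng(1)
    dirs = rng.normal(size=(20, 3))
    dirs /= np.linalg.norm(dirs, axis=1, keepdims=True)
    rotated = dirs @ Rotation.random(random_state=2).as_matrix().T
    print(np.allclose(sh_power_spectrum(dirs), sh_power_spectrum(rotated)))  # True
```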