DeepHandMesh: A Weakly-supervised Deep Encoder-Decoder Framework for
High-fidelity Hand Mesh Modeling
- URL: http://arxiv.org/abs/2008.08213v1
- Date: Wed, 19 Aug 2020 00:59:51 GMT
- Title: DeepHandMesh: A Weakly-supervised Deep Encoder-Decoder Framework for
High-fidelity Hand Mesh Modeling
- Authors: Gyeongsik Moon, Takaaki Shiratori, Kyoung Mu Lee
- Abstract summary: DeepHandMesh is a weakly-supervised deep encoder-decoder framework for high-fidelity hand mesh modeling.
We show that our system can also be applied successfully to the 3D hand mesh estimation from general images.
- Score: 75.69585456580505
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Human hands play a central role in interacting with other people and objects.
For realistic replication of such hand motions, high-fidelity hand meshes have
to be reconstructed. In this study, we propose DeepHandMesh, a
weakly-supervised deep encoder-decoder framework for high-fidelity hand mesh
modeling. We design our system to be trained in an end-to-end and
weakly-supervised manner; therefore, it does not require groundtruth meshes.
Instead, it relies on weaker supervision, such as 3D joint coordinates and
multi-view depth maps, which are easier to obtain than groundtruth meshes and
do not depend on the mesh topology. Although the proposed DeepHandMesh is
trained in a weakly-supervised way, it produces significantly more realistic
hand meshes than previous fully-supervised hand models. Our newly introduced
penetration avoidance loss further improves results by replicating physical
interaction between hand parts. Finally, we demonstrate that our system can
also be applied successfully to 3D hand mesh estimation from general images.
Our hand model, dataset, and code are publicly available at
https://mks0601.github.io/DeepHandMesh/.
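The penetration avoidance loss is only described at a high level above. As a
minimal sketch of one common way to realize such a term, the snippet below
approximates hand parts with collision spheres and penalizes overlap between
them; the sphere proxies, part pairing, and tensor shapes are illustrative
assumptions, not the authors' exact formulation.

```python
# Minimal sketch of a penetration avoidance loss between two hand parts,
# each approximated by a set of collision spheres. Sphere fitting and part
# pairing are assumptions, not the paper's exact formulation.
import torch

def penetration_loss(centers_a, radii_a, centers_b, radii_b):
    """centers_*: (N, 3)/(M, 3) sphere centers; radii_*: (N,)/(M,) radii."""
    dist = torch.cdist(centers_a, centers_b)        # pairwise distances (N, M)
    min_dist = radii_a[:, None] + radii_b[None, :]  # contact distances (N, M)
    # Positive only where spheres interpenetrate; zero once parts separate.
    return torch.relu(min_dist - dist).mean()
```

In a weakly-supervised setup like the one above, such a term would be added to
the 3D joint-coordinate and multi-view depth losses rather than to any
mesh-level supervision.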
Related papers
- XHand: Real-time Expressive Hand Avatar [9.876680405587745]
We introduce an expressive hand avatar, named XHand, that is designed to generate hand shape, appearance, and deformations in real time.
XHand is able to recover high-fidelity geometry and texture for hand animations across diverse poses in real time.
arXiv Detail & Related papers (2024-07-30T17:49:21Z)
- Reconstructing Hands in 3D with Transformers [64.15390309553892]
We present an approach that can reconstruct hands in 3D from monocular input.
Our approach for Hand Mesh Recovery, HaMeR, follows a fully transformer-based architecture and can analyze hands with significantly increased accuracy and robustness compared to previous work.
arXiv Detail & Related papers (2023-12-08T18:59:07Z)
- End-to-end Weakly-supervised Single-stage Multiple 3D Hand Mesh Reconstruction from a Single RGB Image [9.238322841389994]
We propose a single-stage pipeline for multi-hand reconstruction.
Specifically, we design a multi-head auto-encoder structure, where each head network shares the same feature map and outputs the hand center, pose, and texture.
Our method outperforms state-of-the-art model-based methods in both weakly-supervised and fully-supervised settings.
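As a rough illustration of the multi-head auto-encoder idea, the sketch below
shows several heads reading one shared feature map; the backbone, head
designs, and output dimensions are assumptions for illustration only.

```python
# Rough sketch of a multi-head structure over one shared feature map, as
# described above. Layer sizes and output parameterizations are assumed.
import torch.nn as nn

class MultiHeadHandNet(nn.Module):
    def __init__(self, feat_dim=256, pose_dim=48, tex_dim=10):
        super().__init__()
        # Shared encoder producing the single feature map used by every head
        self.backbone = nn.Sequential(
            nn.Conv2d(3, feat_dim, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(feat_dim, feat_dim, 3, stride=2, padding=1), nn.ReLU(),
        )
        self.center_head = nn.Conv2d(feat_dim, 1, 1)    # hand-center heatmap
        self.pool = nn.AdaptiveAvgPool2d(1)
        self.pose_head = nn.Linear(feat_dim, pose_dim)  # pose parameters
        self.tex_head = nn.Linear(feat_dim, tex_dim)    # texture parameters

    def forward(self, img):
        feat = self.backbone(img)        # one shared feature map
        center = self.center_head(feat)  # per-pixel center map (B, 1, H', W')
        g = self.pool(feat).flatten(1)   # global vector (B, feat_dim)
        return center, self.pose_head(g), self.tex_head(g)
```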
arXiv Detail & Related papers (2022-04-18T03:57:14Z)
- HandOccNet: Occlusion-Robust 3D Hand Mesh Estimation Network [57.206129938611454]
We propose HandOccNet, a novel 3D hand mesh estimation network.
By injecting hand information into the occluded region, HandOccNet achieves state-of-the-art performance on 3D hand mesh benchmarks.
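One plausible reading of "injecting hand information into the occluded
region" is cross-attention from visible-hand features into all spatial
locations; the sketch below shows only that idea and is not HandOccNet's
actual architecture.

```python
# Assumption-level sketch: propagate features from visible hand pixels into
# occluded locations via attention. Not HandOccNet's exact modules.
import torch.nn as nn

class FeatureInjection(nn.Module):
    def __init__(self, dim=256, heads=4):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)

    def forward(self, feat, hand_mask):
        # feat: (B, N, C) flattened image features; hand_mask: (B, N) in [0, 1]
        kv = feat * hand_mask.unsqueeze(-1)  # keep only visible-hand features
        out, _ = self.attn(feat, kv, kv)     # every location queries hand info
        return feat + out                    # residual injection
```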
arXiv Detail & Related papers (2022-03-28T08:12:16Z)
- HandVoxNet++: 3D Hand Shape and Pose Estimation using Voxel-Based Neural Networks [71.09275975580009]
HandVoxNet++ is a voxel-based deep network with 3D and graph convolutions trained in a fully supervised manner.
HandVoxNet++ relies on two hand shape representations. The first is the 3D voxelized grid of the hand shape, which does not preserve the mesh topology; the second is the hand surface, which does.
We combine the advantages of both representations by aligning the hand surface to the voxelized hand shape, either with a new neural Graph-Convolutions-based Mesh Registration (GCN-MeshReg) or with a classical segment-wise Non-Rigid Gravitational Approach (NRGA++).
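For intuition, a voxelized grid of the kind mentioned above can be produced
by rasterizing mesh vertices into an occupancy volume; the grid resolution and
normalization below are illustrative assumptions. Note that the mesh topology
is discarded, which is exactly the limitation the registration step
(GCN-MeshReg or NRGA++) compensates for.

```python
# Illustrative sketch: rasterize mesh vertices into a binary occupancy grid.
import numpy as np

def voxelize(vertices, grid_size=64):
    """vertices: (N, 3) float array -> (grid_size,) * 3 occupancy grid."""
    vmin, vmax = vertices.min(axis=0), vertices.max(axis=0)
    # Map each vertex into a voxel index in [0, grid_size - 1]
    idx = (vertices - vmin) / (vmax - vmin + 1e-8) * (grid_size - 1)
    idx = idx.round().astype(int)
    grid = np.zeros((grid_size,) * 3, dtype=np.float32)
    grid[idx[:, 0], idx[:, 1], idx[:, 2]] = 1.0  # mark occupied voxels
    return grid
```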
arXiv Detail & Related papers (2021-07-02T17:59:54Z)
- MVHM: A Large-Scale Multi-View Hand Mesh Benchmark for Accurate 3D Hand Pose Estimation [32.12879364117658]
Estimating 3D hand poses from a single RGB image is challenging because depth ambiguity makes the problem ill-posed.
We design a spin match algorithm that enables rigid matching of a mesh model to any target ground-truth mesh.
We present a multi-view hand pose estimation approach to verify that training a hand pose estimator with our generated dataset greatly enhances the performance.
arXiv Detail & Related papers (2020-12-06T07:55:08Z)
- Weakly-Supervised Mesh-Convolutional Hand Reconstruction in the Wild [59.158592526006814]
We train our network by gathering a large-scale dataset of hand actions in YouTube videos.
Our weakly-supervised mesh-convolutional system largely outperforms state-of-the-art methods, even halving the errors on the in-the-wild benchmark.
arXiv Detail & Related papers (2020-04-04T14:35:37Z)
- HandVoxNet: Deep Voxel-Based Network for 3D Hand Shape and Pose Estimation from a Single Depth Map [72.93634777578336]
We propose a novel architecture with 3D convolutions trained in a weakly-supervised manner.
The proposed approach improves over the state of the art by 47.8% on the SynHand5M dataset.
Our method produces visually more reasonable and realistic hand shapes on NYU and BigHand2.2M datasets.
arXiv Detail & Related papers (2020-04-03T14:27:16Z)