BiHand: Recovering Hand Mesh with Multi-stage Bisected Hourglass Networks
- URL: http://arxiv.org/abs/2008.05079v1
- Date: Wed, 12 Aug 2020 03:13:17 GMT
- Title: BiHand: Recovering Hand Mesh with Multi-stage Bisected Hourglass Networks
- Authors: Lixin Yang, Jiasen Li, Wenqiang Xu, Yiqun Diao, Cewu Lu
- Abstract summary: We introduce an end-to-end learnable model, BiHand, which consists of three cascaded stages, namely 2D seeding stage, 3D lifting stage, and mesh generation stage.
At the output of BiHand, the full hand mesh is recovered using the joint rotations and shape parameters predicted by the network.
Our model can achieve superior accuracy in comparison with state-of-the-art methods, and can produce appealing 3D hand meshes in several severe conditions.
- Score: 37.65510556305611
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: 3D hand estimation has been a long-standing research topic in computer
vision. A recent trend aims not only to estimate the 3D hand joint locations
but also to recover the mesh model. However, achieving those goals from a
single RGB image remains challenging. In this paper, we introduce an end-to-end
learnable model, BiHand, which consists of three cascaded stages, namely 2D
seeding stage, 3D lifting stage, and mesh generation stage. At the output of
BiHand, the full hand mesh will be recovered using the joint rotations and
shape parameters predicted by the network. Inside each stage, BiHand adopts a
novel bisecting design which allows the networks to encapsulate two closely
related pieces of information (e.g., 2D keypoints and silhouette in the 2D
seeding stage, 3D joints and depth map in the 3D lifting stage, and joint
rotations and shape parameters in the mesh generation stage) in a single
forward pass. As the information
represents different geometry or structure details, bisecting the data flow can
facilitate optimization and increase robustness. For quantitative evaluation,
we conduct experiments on two public benchmarks, namely the Rendered Hand
Dataset (RHD) and the Stereo Hand Pose Tracking Benchmark (STB). Extensive
experiments show that our model can achieve superior accuracy in comparison
with state-of-the-art methods, and can produce appealing 3D hand meshes in
several severe conditions.
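
The abstract describes a cascade of bisected (two-branch) stages. Below is a minimal PyTorch sketch of how such a cascade could be wired together; the plain convolutional encoder, channel counts, and output parameterizations are illustrative assumptions, not the authors' actual hourglass implementation.

```python
# Minimal sketch of cascaded "bisected" stages: each stage has a shared
# encoder whose features feed two sibling branches, so two related quantities
# are predicted in a single forward pass. Shapes and the plain conv encoder
# are illustrative assumptions, not the paper's hourglass architecture.
import torch
import torch.nn as nn


class BisectedStage(nn.Module):
    """Shared encoder followed by two sibling prediction branches."""

    def __init__(self, in_ch, feat_ch, out_a, out_b):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv2d(in_ch, feat_ch, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(feat_ch, feat_ch, 3, padding=1), nn.ReLU(inplace=True),
        )
        # Branch A and branch B decode two closely related quantities.
        self.branch_a = nn.Conv2d(feat_ch, out_a, 1)
        self.branch_b = nn.Conv2d(feat_ch, out_b, 1)

    def forward(self, x):
        feat = self.encoder(x)
        return self.branch_a(feat), self.branch_b(feat), feat


class BiHandSketch(nn.Module):
    """Three cascaded bisected stages: 2D seeding -> 3D lifting -> mesh params."""

    def __init__(self, n_joints=21):
        super().__init__()
        # Stage 1: 2D keypoint heatmaps + hand silhouette.
        self.seed2d = BisectedStage(3, 64, out_a=n_joints, out_b=1)
        # Stage 2: per-joint 3D location maps + depth map, fed with stage-1 outputs.
        self.lift3d = BisectedStage(64 + n_joints + 1, 64, out_a=3 * n_joints, out_b=1)
        # Stage 3: regress joint rotations (axis-angle) + MANO-style shape params.
        self.mesh_head = nn.Sequential(nn.AdaptiveAvgPool2d(1), nn.Flatten(),
                                       nn.Linear(64, 128), nn.ReLU(inplace=True))
        self.rot_branch = nn.Linear(128, 3 * n_joints)   # joint rotations
        self.shape_branch = nn.Linear(128, 10)           # shape parameters

    def forward(self, img):
        heatmaps, silhouette, feat1 = self.seed2d(img)
        x2 = torch.cat([feat1, heatmaps, silhouette], dim=1)
        joints3d, depth, feat2 = self.lift3d(x2)
        z = self.mesh_head(feat2)
        rotations, shape = self.rot_branch(z), self.shape_branch(z)
        # A differentiable hand model (e.g. MANO) would turn (rotations, shape)
        # into the full hand mesh; omitted here.
        return heatmaps, silhouette, joints3d, depth, rotations, shape


out = BiHandSketch()(torch.randn(1, 3, 64, 64))  # smoke test on a dummy image
```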
Related papers
- WiLoR: End-to-end 3D Hand Localization and Reconstruction in-the-wild [53.288327629960364]
We present a data-driven pipeline for efficient multi-hand reconstruction in the wild.
The proposed pipeline is composed of two components: a real-time fully convolutional hand localization network and a high-fidelity transformer-based 3D hand reconstruction model.
Our approach outperforms previous methods in both efficiency and accuracy on popular 2D and 3D benchmarks.
arXiv Detail & Related papers (2024-09-18T18:46:51Z)
- Reconstructing Hands in 3D with Transformers [64.15390309553892]
We present an approach that can reconstruct hands in 3D from monocular input.
Our approach for Hand Mesh Recovery, HaMeR, follows a fully transformer-based architecture and can analyze hands with significantly increased accuracy and robustness compared to previous work.
arXiv Detail & Related papers (2023-12-08T18:59:07Z)
- Leveraging Large-Scale Pretrained Vision Foundation Models for Label-Efficient 3D Point Cloud Segmentation [67.07112533415116]
We present a novel framework that adapts various foundational models for the 3D point cloud segmentation task.
Our approach involves making initial predictions of 2D semantic masks using different large vision models.
To generate robust 3D semantic pseudo labels, we introduce a semantic label fusion strategy that effectively combines all the results via voting.
arXiv Detail & Related papers (2023-11-03T15:41:15Z)
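
As a rough illustration of the voting-based label fusion described in the entry above, the sketch below takes per-model label predictions that have already been projected onto the 3D points and selects the majority label per point; the array shapes, ignore-label convention, and helper name `fuse_labels_by_voting` are assumptions, not the paper's implementation.

```python
# Voting-based fusion of per-model semantic predictions into 3D pseudo labels.
# Each 2D vision model contributes one candidate label per 3D point (the
# 2D-to-3D projection step is omitted here); the pseudo label is the majority vote.
import numpy as np


def fuse_labels_by_voting(candidate_labels: np.ndarray, num_classes: int,
                          ignore_label: int = -1) -> np.ndarray:
    """candidate_labels: (num_models, num_points) int array of per-model predictions.

    Returns (num_points,) pseudo labels; points with no valid votes get ignore_label.
    """
    num_models, num_points = candidate_labels.shape
    votes = np.zeros((num_points, num_classes), dtype=np.int64)
    for m in range(num_models):
        valid = candidate_labels[m] != ignore_label
        votes[np.arange(num_points)[valid], candidate_labels[m][valid]] += 1
    pseudo = votes.argmax(axis=1)
    pseudo[votes.sum(axis=1) == 0] = ignore_label  # no model labelled this point
    return pseudo


# Example: three models vote on four points over five classes.
preds = np.array([[0, 2, 1, -1],
                  [0, 2, 2, -1],
                  [1, 2, 2, -1]])
print(fuse_labels_by_voting(preds, num_classes=5))  # -> [0 2 2 -1]
```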
- End-to-end Weakly-supervised Single-stage Multiple 3D Hand Mesh Reconstruction from a Single RGB Image [9.238322841389994]
We propose a single-stage pipeline for multi-hand reconstruction.
Specifically, we design a multi-head auto-encoder structure, where each head network shares the same feature map and outputs the hand center, pose and texture.
Our method outperforms the state-of-the-art model-based methods in both weakly-supervised and fully-supervised manners.
arXiv Detail & Related papers (2022-04-18T03:57:14Z)
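
The multi-head auto-encoder idea from the entry above can be sketched as one backbone whose single feature map feeds several lightweight heads; the backbone, channel counts, and output parameterizations below are illustrative assumptions rather than the paper's architecture.

```python
# Sketch of a multi-head design: one backbone produces a shared feature map,
# and separate heads decode the hand center, pose, and texture from it.
import torch
import torch.nn as nn


class MultiHeadHandNet(nn.Module):
    def __init__(self, feat_ch=64, n_joints=21, tex_dim=10):
        super().__init__()
        self.backbone = nn.Sequential(
            nn.Conv2d(3, feat_ch, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(feat_ch, feat_ch, 3, padding=1), nn.ReLU(inplace=True),
        )
        # Each head reads the SAME feature map.
        self.center_head = nn.Conv2d(feat_ch, 1, 1)            # hand-center heatmap
        self.pose_head = nn.Conv2d(feat_ch, 3 * n_joints, 1)   # per-pixel pose params
        self.texture_head = nn.Conv2d(feat_ch, tex_dim, 1)     # per-pixel texture code

    def forward(self, img):
        feat = self.backbone(img)
        return self.center_head(feat), self.pose_head(feat), self.texture_head(feat)


center, pose, texture = MultiHeadHandNet()(torch.randn(1, 3, 128, 128))
```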
- Consistent 3D Hand Reconstruction in Video via Self-supervised Learning [67.55449194046996]
We present a method for reconstructing accurate and consistent 3D hands from a monocular video.
Detected 2D hand keypoints and the image texture provide important cues about the geometry and texture of the 3D hand.
We propose $\rm S^{2}HAND$, a self-supervised 3D hand reconstruction model.
arXiv Detail & Related papers (2022-01-24T09:44:11Z)
- Towards Accurate Alignment in Real-time 3D Hand-Mesh Reconstruction [57.3636347704271]
3D hand-mesh reconstruction from RGB images facilitates many applications, including augmented reality (AR).
This paper presents a novel pipeline by decoupling the hand-mesh reconstruction task into three stages.
We can promote high-quality finger-level mesh-image alignment and drive the models together to deliver real-time predictions.
arXiv Detail & Related papers (2021-09-03T20:42:01Z)
- Synthetic Training for Monocular Human Mesh Recovery [100.38109761268639]
This paper aims to estimate the 3D meshes of multiple body parts, which differ greatly in scale, from a single RGB image.
The main challenge is the lack of training data with complete 3D annotations of all body parts in 2D images.
We propose a depth-to-scale (D2S) projection to incorporate the depth difference into the projection function to derive per-joint scale variants.
arXiv Detail & Related papers (2020-10-27T03:31:35Z)
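
A hedged reading of the depth-to-scale (D2S) idea in the entry above is that each joint receives its own projection scale derived from its depth offset relative to a root depth; the sketch below illustrates that reading with a simple per-joint perspective scale, and its exact form may differ from the paper's projection function.

```python
# Illustrative depth-to-scale style projection: instead of one global
# weak-perspective scale, each joint gets a scale derived from its depth
# offset relative to the root, so deeper joints project smaller. The exact
# formulation in the cited paper may differ; this only shows folding a depth
# difference into the projection function.
import numpy as np


def d2s_project(joints_3d: np.ndarray, focal: float, root_depth: float) -> np.ndarray:
    """joints_3d: (J, 3) camera-space joints with z given relative to the root.

    Returns (J, 2) image-plane coordinates with a per-joint scale
    s_j = focal / (root_depth + dz_j) applied to (x_j, y_j).
    """
    dz = joints_3d[:, 2]                  # per-joint depth offset from the root
    scale = focal / (root_depth + dz)     # per-joint scale variant
    return joints_3d[:, :2] * scale[:, None]


joints = np.array([[0.00,  0.00,  0.00],   # root
                   [0.03,  0.01,  0.02],   # a joint slightly farther away
                   [0.05, -0.02, -0.01]])  # a joint slightly closer
print(d2s_project(joints, focal=500.0, root_depth=0.6))
```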
- HOPE-Net: A Graph-based Model for Hand-Object Pose Estimation [7.559220068352681]
We propose a lightweight model called HOPE-Net which jointly estimates hand and object pose in 2D and 3D in real time.
Our network uses a cascade of two adaptive graph convolutional neural networks, one to estimate 2D coordinates of the hand joints and object corners, followed by another to convert 2D coordinates to 3D.
arXiv Detail & Related papers (2020-03-31T19:01:42Z)
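
The cascade in the entry above can be sketched as two small graph networks, one regressing 2D coordinates for hand joints and object corners and one lifting them to 3D; the fixed (non-adaptive) adjacency, node count, and feature sizes below are simplifying assumptions, not HOPE-Net's actual adaptive graph convolutions.

```python
# Two cascaded graph networks: per-node image features -> 2D coordinates -> 3D.
import torch
import torch.nn as nn


class GraphConv(nn.Module):
    """Simple graph convolution: mix node features through a normalized adjacency."""

    def __init__(self, in_dim, out_dim, adj):
        super().__init__()
        self.register_buffer("adj", adj / adj.sum(dim=1, keepdim=True))
        self.linear = nn.Linear(in_dim, out_dim)

    def forward(self, x):                 # x: (batch, nodes, in_dim)
        return self.linear(self.adj @ x)


class CascadedGraphLifter(nn.Module):
    def __init__(self, n_nodes=29, feat_dim=128):   # e.g. 21 hand joints + 8 object corners
        super().__init__()
        adj = torch.eye(n_nodes) + torch.ones(n_nodes, n_nodes) * 0.1  # placeholder graph
        self.to2d = nn.Sequential(GraphConv(feat_dim, 64, adj), nn.ReLU(), GraphConv(64, 2, adj))
        self.to3d = nn.Sequential(GraphConv(2, 64, adj), nn.ReLU(), GraphConv(64, 3, adj))

    def forward(self, node_feats):        # per-node image features from a CNN backbone
        coords_2d = self.to2d(node_feats)
        coords_3d = self.to3d(coords_2d)  # second network lifts 2D to 3D
        return coords_2d, coords_3d


xy, xyz = CascadedGraphLifter()(torch.randn(4, 29, 128))
```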
This list is automatically generated from the titles and abstracts of the papers on this site.