Towards Accurate Alignment in Real-time 3D Hand-Mesh Reconstruction
- URL: http://arxiv.org/abs/2109.01723v1
- Date: Fri, 3 Sep 2021 20:42:01 GMT
- Title: Towards Accurate Alignment in Real-time 3D Hand-Mesh Reconstruction
- Authors: Xiao Tang, Tianyu Wang, Chi-Wing Fu
- Abstract summary: 3D hand-mesh reconstruction from RGB images facilitates many applications, including augmented reality (AR).
This paper presents a novel pipeline by decoupling the hand-mesh reconstruction task into three stages.
We can promote high-quality finger-level mesh-image alignment and drive the models together to deliver real-time predictions.
- Score: 57.3636347704271
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: 3D hand-mesh reconstruction from RGB images facilitates many applications,
including augmented reality (AR). However, this requires not only real-time
speed and accurate hand pose and shape but also plausible mesh-image alignment.
While existing works already achieve promising results, meeting all three
requirements is very challenging. This paper presents a novel pipeline by
decoupling the hand-mesh reconstruction task into three stages: a joint stage
to predict hand joints and segmentation; a mesh stage to predict a rough hand
mesh; and a refine stage to fine-tune it with an offset mesh for mesh-image
alignment. With careful design in the network structure and in the loss
functions, we can promote high-quality finger-level mesh-image alignment and
drive the models together to deliver real-time predictions. Extensive
quantitative and qualitative results on benchmark datasets demonstrate that the
quality of our results outperforms the state-of-the-art methods on
hand-mesh/pose precision and hand-image alignment. In the end, we also showcase
several real-time AR scenarios.
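The three-stage decoupling described in the abstract can be sketched in outline. The function bodies below are placeholders (the paper's stages are neural networks whose internals are not given here); the 21-joint convention and the 778-vertex MANO mesh are assumed, and all names are illustrative rather than the authors' actual API.

```python
import numpy as np

# Hypothetical stand-ins for the three learned stages; in the paper each
# stage is a neural network, here each is a placeholder function.

def joint_stage(image):
    """Predict 2D hand joints (21 x 2) and a hand segmentation mask."""
    h, w = image.shape[:2]
    joints = np.zeros((21, 2))             # placeholder joint predictions
    segmentation = np.zeros((h, w), bool)  # placeholder hand mask
    return joints, segmentation

def mesh_stage(joints, segmentation):
    """Predict a rough hand mesh (778 MANO-style vertices)."""
    return np.zeros((778, 3))              # placeholder rough mesh

def refine_stage(rough_mesh, image):
    """Predict a per-vertex offset mesh and add it to the rough mesh
    to fine-tune mesh-image alignment."""
    offsets = np.zeros_like(rough_mesh)    # placeholder offset mesh
    return rough_mesh + offsets

def reconstruct(image):
    joints, seg = joint_stage(image)
    rough = mesh_stage(joints, seg)
    return refine_stage(rough, image)

mesh = reconstruct(np.zeros((128, 128, 3)))
print(mesh.shape)  # (778, 3)
```

The point of the decoupling is that each stage can be supervised with its own loss (joints/segmentation, rough mesh, offset mesh), which is what lets the paper target finger-level alignment without sacrificing real-time speed.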
Related papers
- Fine-Grained Multi-View Hand Reconstruction Using Inverse Rendering [11.228453237603834]
We present a novel fine-grained multi-view hand mesh reconstruction method that leverages inverse rendering to restore hand poses and intricate details.
We also introduce a novel Hand Albedo and Mesh (HAM) optimization module to refine both the hand mesh and textures.
Our proposed approach outperforms the state-of-the-art methods on both reconstruction accuracy and rendering quality.
arXiv Detail & Related papers (2024-07-08T07:28:24Z)
- HandS3C: 3D Hand Mesh Reconstruction with State Space Spatial Channel Attention from RGB images [4.252549987351642]
We propose a simple but effective 3D hand-mesh reconstruction network, HandS3C.
In the network, we design a novel state-space spatial-channel attention module that extends the effective receptive field.
Our proposed HandS3C achieves state-of-the-art performance while maintaining a minimal parameter count.
arXiv Detail & Related papers (2024-05-02T07:47:49Z)
- SiMA-Hand: Boosting 3D Hand-Mesh Reconstruction by Single-to-Multi-View Adaptation [90.59734612754222]
Estimating a 3D hand mesh from RGB images is one of the most challenging problems.
Existing attempts towards this task often fail when the occlusion dominates the image space.
We propose SiMA-Hand, aiming to boost the mesh reconstruction performance by Single-to-Multi-view Adaptation.
arXiv Detail & Related papers (2024-02-02T13:14:20Z)
- Mesh Represented Recycle Learning for 3D Hand Pose and Mesh Estimation [3.126179109712709]
We propose a mesh represented recycle learning strategy for 3D hand pose and mesh estimation.
To be specific, a hand pose and mesh estimation model first predicts parametric 3D hand annotations.
Second, synthetic hand images are generated with self-estimated hand mesh representations.
Third, the synthetic hand images are fed into the same model again.
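The three-step recycle loop above reads naturally as code. Below is a minimal runnable sketch; `estimate` and `render` are hypothetical placeholders for the paper's estimator and renderer, and the 61-dimensional parameter vector is an arbitrary stand-in, not the paper's actual hand parameterization.

```python
import numpy as np

def estimate(image):
    """Hypothetical pose/mesh estimator returning parametric hand
    annotations (placeholder dimensionality)."""
    return np.tanh(image.mean()) * np.ones(61)

def render(params, size=128):
    """Hypothetical renderer synthesizing a hand image from the
    self-estimated mesh parameters."""
    return np.full((size, size, 3), params.mean())

def recycle_step(real_image):
    # 1) predict parametric 3D hand annotations from the real image
    params = estimate(real_image)
    # 2) generate a synthetic hand image from the self-estimated mesh
    synthetic = render(params)
    # 3) feed the synthetic image into the same model again
    recycled_params = estimate(synthetic)
    # a consistency loss ties the two predictions together
    return float(np.mean((params - recycled_params) ** 2))

loss = recycle_step(np.zeros((128, 128, 3)))
```

The appeal of the strategy is that the second pass needs no extra annotation: the synthetic image comes with its own "ground truth" (the parameters that rendered it).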
arXiv Detail & Related papers (2023-10-18T09:50:09Z)
- Overcoming the Trade-off Between Accuracy and Plausibility in 3D Hand Shape Reconstruction [62.96478903239799]
Direct mesh fitting for 3D hand shape reconstruction is highly accurate.
However, the reconstructed meshes are prone to artifacts and do not appear as plausible hand shapes.
We introduce a novel weakly-supervised hand shape estimation framework that integrates non-parametric mesh fitting with the MANO model in an end-to-end fashion.
arXiv Detail & Related papers (2023-05-01T03:38:01Z)
- Im2Mesh GAN: Accurate 3D Hand Mesh Recovery from a Single RGB Image [31.371190180801452]
We show that the hand mesh can be learned directly from the input image.
We propose a new type of GAN called Im2Mesh GAN to learn the mesh through end-to-end adversarial training.
arXiv Detail & Related papers (2021-01-27T07:38:01Z)
- SCFusion: Real-time Incremental Scene Reconstruction with Semantic Completion [86.77318031029404]
We propose a framework that performs scene reconstruction and semantic scene completion jointly in an incremental and real-time manner.
Our framework relies on a novel neural architecture designed to process occupancy maps and leverages voxel states to accurately and efficiently fuse semantic completion with the 3D global model.
arXiv Detail & Related papers (2020-10-26T15:31:52Z)
- BiHand: Recovering Hand Mesh with Multi-stage Bisected Hourglass Networks [37.65510556305611]
We introduce an end-to-end learnable model, BiHand, which consists of three cascaded stages: a 2D seeding stage, a 3D lifting stage, and a mesh generation stage.
At the output of BiHand, the full hand mesh is recovered using the joint rotations and shape parameters predicted by the network.
Our model achieves superior accuracy compared with state-of-the-art methods and produces appealing 3D hand meshes even under severe conditions.
arXiv Detail & Related papers (2020-08-12T03:13:17Z)
- Leveraging Photometric Consistency over Time for Sparsely Supervised Hand-Object Reconstruction [118.21363599332493]
We present a method to leverage photometric consistency across time when annotations are only available for a sparse subset of frames in a video.
Our model is trained end-to-end on color images to jointly reconstruct hands and objects in 3D by inferring their poses.
We achieve state-of-the-art results on 3D hand-object reconstruction benchmarks and demonstrate that our approach improves pose-estimation accuracy.
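As an illustration of the photometric-consistency idea (not the paper's actual formulation, which warps via estimated 3D poses and camera projection rather than a 2D shift), here is a toy sketch where the warp is an integer pixel shift:

```python
import numpy as np

def warp(frame, flow):
    """Toy warp: shift the frame by an integer (dy, dx) flow, a stand-in
    for the pose-induced reprojection used for real frames."""
    dy, dx = flow
    return np.roll(np.roll(frame, dy, axis=0), dx, axis=1)

def photometric_loss(frame_t, frame_tp1, flow):
    """L1 photometric consistency: frame t should match frame t+1
    warped back into frame t's view."""
    return float(np.abs(frame_t - warp(frame_tp1, flow)).mean())

a = np.zeros((8, 8)); a[2, 2] = 1.0
b = np.roll(np.roll(a, 1, axis=0), 1, axis=1)    # a shifted by (1, 1)
print(photometric_loss(a, b, (-1, -1)))  # 0.0 — correct flow aligns the frames
```

The key property exploited by the paper is that this loss needs no annotations on the unlabeled frames: a correct pose estimate makes consecutive frames photometrically consistent, so the loss itself supervises the poses in between the sparse labels.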
arXiv Detail & Related papers (2020-04-28T12:03:14Z)
- Weakly-Supervised Mesh-Convolutional Hand Reconstruction in the Wild [59.158592526006814]
We train our network on a large-scale dataset of hand actions gathered from YouTube videos.
Our weakly-supervised mesh-convolution-based system largely outperforms state-of-the-art methods, even halving the errors on the in-the-wild benchmark.
arXiv Detail & Related papers (2020-04-04T14:35:37Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences arising from its use.