Monocular 3D Hand Mesh Recovery via Dual Noise Estimation
- URL: http://arxiv.org/abs/2312.15916v1
- Date: Tue, 26 Dec 2023 07:21:01 GMT
- Title: Monocular 3D Hand Mesh Recovery via Dual Noise Estimation
- Authors: Hanhui Li, Xiaojian Lin, Xuan Huang, Zejun Yang, Zhisheng Wang,
Xiaodan Liang
- Abstract summary: We introduce a dual noise estimation method to generate meshes that are well aligned with the input image.
Our method achieves state-of-the-art performance on the large-scale InterHand2.6M dataset.
- Score: 47.82179706128616
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Current parametric models have made notable progress in 3D hand pose and
shape estimation. However, due to the fixed hand topology and complex hand
poses, it is difficult for current models to generate meshes that align well
with the image. To tackle this issue, we introduce a dual noise estimation method
in this paper. Given a single-view image as input, we first adopt a baseline
parametric regressor to obtain the coarse hand meshes. We assume the mesh
vertices and their image-plane projections are noisy, and can be associated in
a unified probabilistic model. We then learn the distributions of noise to
refine mesh vertices and their projections. The refined vertices are further
utilized to refine camera parameters in a closed-form manner. Consequently, our
method obtains well-aligned and high-quality 3D hand meshes. Extensive
experiments on the large-scale InterHand2.6M dataset demonstrate that the
proposed method not only improves the performance of its baseline by more than
10$\%$ but also achieves state-of-the-art performance. Project page:
\url{https://github.com/hanhuili/DNE4Hand}.
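The abstract's final refinement step, fitting camera parameters to the refined vertices in closed form, can be illustrated with a small sketch. The paper does not spell out its camera model here, so the following assumes a weak-perspective camera (scale plus 2D translation) and a hypothetical helper name; the closed-form solution is then ordinary least squares between the vertices' x-y coordinates and their 2D projections.

```python
import numpy as np

def refine_weak_perspective_camera(vertices_3d, projections_2d):
    """Closed-form least-squares fit of weak-perspective camera
    parameters (scale s, translation t) aligning the x-y coordinates
    of refined 3D vertices with their refined 2D projections,
    i.e. minimizing ||s * V_xy + t - P||^2 over all vertices."""
    X = vertices_3d[:, :2]                    # depth is dropped under weak perspective
    P = projections_2d
    X_mean, P_mean = X.mean(axis=0), P.mean(axis=0)
    Xc, Pc = X - X_mean, P - P_mean           # center both point sets
    s = (Xc * Pc).sum() / (Xc ** 2).sum()     # optimal scale (normal equations)
    t = P_mean - s * X_mean                   # optimal translation
    return s, t
```

Because the objective is quadratic in (s, t), this single solve replaces any iterative camera optimization, which is presumably why the paper can refine the camera "in a closed-form manner" after vertex refinement.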
Related papers
- No Pose, No Problem: Surprisingly Simple 3D Gaussian Splats from Sparse Unposed Images [100.80376573969045]
NoPoSplat is a feed-forward model capable of reconstructing 3D scenes parameterized by 3D Gaussians from multi-view images.
Our model achieves real-time 3D Gaussian reconstruction during inference.
This work makes significant advances in pose-free generalizable 3D reconstruction and demonstrates its applicability to real-world scenarios.
arXiv Detail & Related papers (2024-10-31T17:58:22Z) - GSD: View-Guided Gaussian Splatting Diffusion for 3D Reconstruction [52.04103235260539]
We present a diffusion model approach based on Gaussian Splatting representation for 3D object reconstruction from a single view.
The model learns to generate 3D objects represented by sets of GS ellipsoids.
The final reconstructed objects explicitly come with high-quality 3D structure and texture, and can be efficiently rendered in arbitrary views.
arXiv Detail & Related papers (2024-07-05T03:43:08Z) - HandDiff: 3D Hand Pose Estimation with Diffusion on Image-Point Cloud [60.47544798202017]
Hand pose estimation is a critical task in various human-computer interaction applications.
This paper proposes HandDiff, a diffusion-based hand pose estimation model that iteratively denoises accurate hand pose conditioned on hand-shaped image-point clouds.
Experimental results demonstrate that the proposed HandDiff significantly outperforms the existing approaches on four challenging hand pose benchmark datasets.
arXiv Detail & Related papers (2024-04-04T02:15:16Z) - Sampling is Matter: Point-guided 3D Human Mesh Reconstruction [0.0]
This paper presents a simple yet powerful method for 3D human mesh reconstruction from a single RGB image.
Experimental results on benchmark datasets show that the proposed method efficiently improves the performance of 3D human mesh reconstruction.
arXiv Detail & Related papers (2023-04-19T08:45:26Z) - MeshDiffusion: Score-based Generative 3D Mesh Modeling [68.40770889259143]
We consider the task of generating realistic 3D shapes for automatic scene generation and physical simulation.
We take advantage of the graph structure of meshes and use a simple yet very effective generative modeling method to generate 3D meshes.
Specifically, we represent meshes with deformable tetrahedral grids, and then train a diffusion model on this direct parametrization.
arXiv Detail & Related papers (2023-03-14T17:59:01Z) - Im2Mesh GAN: Accurate 3D Hand Mesh Recovery from a Single RGB Image [31.371190180801452]
We show that the hand mesh can be learned directly from the input image.
We propose a new type of GAN called Im2Mesh GAN to learn the mesh through end-to-end adversarial training.
arXiv Detail & Related papers (2021-01-27T07:38:01Z) - MVHM: A Large-Scale Multi-View Hand Mesh Benchmark for Accurate 3D Hand
Pose Estimation [32.12879364117658]
Estimating 3D hand poses from a single RGB image is challenging because depth ambiguity renders the problem ill-posed.
We design a spin match algorithm that enables matching a rigid mesh model with any target mesh ground truth.
We present a multi-view hand pose estimation approach to verify that training a hand pose estimator with our generated dataset greatly enhances performance.
arXiv Detail & Related papers (2020-12-06T07:55:08Z) - HandVoxNet: Deep Voxel-Based Network for 3D Hand Shape and Pose
Estimation from a Single Depth Map [72.93634777578336]
We propose a novel architecture with 3D convolutions trained in a weakly-supervised manner.
The proposed approach improves over the state of the art by 47.8% on the SynHand5M dataset.
Our method produces visually more reasonable and realistic hand shapes on NYU and BigHand2.2M datasets.
arXiv Detail & Related papers (2020-04-03T14:27:16Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information it contains and is not responsible for any consequences of its use.