Im2Mesh GAN: Accurate 3D Hand Mesh Recovery from a Single RGB Image
- URL: http://arxiv.org/abs/2101.11239v1
- Date: Wed, 27 Jan 2021 07:38:01 GMT
- Title: Im2Mesh GAN: Accurate 3D Hand Mesh Recovery from a Single RGB Image
- Authors: Akila Pemasiri, Kien Nguyen Thanh, Sridha Sridharan, Clinton Fookes
- Abstract summary: We show that the hand mesh can be learned directly from the input image.
We propose a new type of GAN called Im2Mesh GAN to learn the mesh through end-to-end adversarial training.
- Score: 31.371190180801452
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: This work addresses hand mesh recovery from a single RGB image. In contrast
to most existing approaches, where parametric hand models are
employed as the prior, we show that the hand mesh can be learned directly from
the input image. We propose a new type of GAN called Im2Mesh GAN to learn the
mesh through end-to-end adversarial training. By interpreting the mesh as a
graph, our model is able to capture the topological relationship among the mesh
vertices. We also introduce a 3D surface descriptor into the GAN architecture
to further capture the associated 3D features. We experiment with two
approaches: one reaps the benefits of the coupled availability of ground-truth
images and their corresponding meshes, while the other tackles the more
challenging problem of mesh estimation without corresponding ground truth.
Through extensive evaluations, we demonstrate that the proposed method
outperforms the state of the art.
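To make the mesh-as-graph idea above concrete, here is a minimal sketch (not the authors' code) of topology-aware feature aggregation over mesh vertices with a single graph-convolution layer; the function names, shapes, and mean-pooling normalization are illustrative assumptions:

```python
# Minimal sketch: treating a triangle mesh as a graph so per-vertex
# features are aggregated over mesh topology (illustrative, not Im2Mesh GAN).
import torch
import torch.nn as nn

def adjacency_from_faces(faces: torch.Tensor, num_vertices: int) -> torch.Tensor:
    """Build a symmetric (V, V) vertex adjacency matrix from (F, 3) triangle faces."""
    A = torch.zeros(num_vertices, num_vertices)
    for i, j in [(0, 1), (1, 2), (2, 0)]:
        A[faces[:, i], faces[:, j]] = 1.0   # mark each triangle edge in both
        A[faces[:, j], faces[:, i]] = 1.0   # directions to keep A symmetric
    return A

class GraphConv(nn.Module):
    """One graph-convolution layer: X' = relu(((A + I) X) W / deg)."""
    def __init__(self, in_dim: int, out_dim: int):
        super().__init__()
        self.linear = nn.Linear(in_dim, out_dim)

    def forward(self, X: torch.Tensor, A: torch.Tensor) -> torch.Tensor:
        A_hat = A + torch.eye(A.shape[0])     # add self-loops
        deg = A_hat.sum(dim=1, keepdim=True)  # per-vertex degree, for mean pooling
        return torch.relu(self.linear(A_hat @ X) / deg)

# Toy usage: a single triangle (3 vertices, 1 face); features are 3D positions.
faces = torch.tensor([[0, 1, 2]])
X = torch.rand(3, 3)                          # (V, 3) vertex positions
A = adjacency_from_faces(faces, num_vertices=3)
print(GraphConv(3, 16)(X, A).shape)           # torch.Size([3, 16])
```

In the paper, adversarial training would then operate on such graph-structured vertex features; the layer above only illustrates how mesh topology enters the computation.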
Related papers
- Towards Human-Level 3D Relative Pose Estimation: Generalizable, Training-Free, with Single Reference [62.99706119370521]
Humans can easily deduce the relative pose of an unseen object, without labels or training, given only a single query-reference image pair.
We propose a novel 3D generalizable relative pose estimation method by elaborating (i) with a 2.5D shape from an RGB-D reference, (ii) with an off-the-shelf differentiable renderer, and (iii) with semantic cues from a pretrained model like DINOv2.
arXiv Detail & Related papers (2024-06-26T16:01:10Z)
- Direct Learning of Mesh and Appearance via 3D Gaussian Splatting [3.4899193297791054]
We propose a learnable scene model that incorporates 3DGS with an explicit geometry representation, namely a mesh.
Our model learns the mesh and appearance in an end-to-end manner, where we bind 3D Gaussians to the mesh faces and perform differentiable rendering of 3DGS to obtain photometric supervision (a rough sketch of this face-binding idea appears after this list).
arXiv Detail & Related papers (2024-05-11T07:56:19Z)
- Bridging 3D Gaussian and Mesh for Freeview Video Rendering [57.21847030980905]
GauMesh bridges the 3D Gaussian and Mesh for modeling and rendering the dynamic scenes.
We show that our approach adapts the appropriate type of primitives to represent the different parts of the dynamic scene.
arXiv Detail & Related papers (2024-03-18T04:01:26Z)
- Monocular 3D Hand Mesh Recovery via Dual Noise Estimation [47.82179706128616]
We introduce a dual noise estimation method to generate meshes that are well aligned with the image.
Our method achieves state-of-the-art performance on a large-scale Interhand2.6M dataset.
arXiv Detail & Related papers (2023-12-26T07:21:01Z)
- Weakly-Supervised 3D Reconstruction of Clothed Humans via Normal Maps [1.6462601662291156]
We present a novel deep learning-based approach to the 3D reconstruction of clothed humans using weak supervision via 2D normal maps.
Given a single RGB image or multiview images, our network infers a signed distance function (SDF) discretized on a tetrahedral mesh surrounding the body in a rest pose.
We demonstrate the efficacy of our approach for both network inference and 3D reconstruction.
arXiv Detail & Related papers (2023-11-27T18:06:35Z)
- Sampling is Matter: Point-guided 3D Human Mesh Reconstruction [0.0]
This paper presents a simple yet powerful method for 3D human mesh reconstruction from a single RGB image.
Experimental results on benchmark datasets show that the proposed method efficiently improves the performance of 3D human mesh reconstruction.
arXiv Detail & Related papers (2023-04-19T08:45:26Z)
- MPT: Mesh Pre-Training with Transformers for Human Pose and Mesh Reconstruction [56.80384196339199]
Mesh Pre-Training (MPT) is a new pre-training framework that leverages 3D mesh data such as MoCap data for human pose and mesh reconstruction from a single image.
MPT enables transformer models to have zero-shot capability of human mesh reconstruction from real images.
arXiv Detail & Related papers (2022-11-24T00:02:13Z)
- Towards Accurate Alignment in Real-time 3D Hand-Mesh Reconstruction [57.3636347704271]
3D hand-mesh reconstruction from RGB images facilitates many applications, including augmented reality (AR).
This paper presents a novel pipeline by decoupling the hand-mesh reconstruction task into three stages.
We can promote high-quality finger-level mesh-image alignment and drive the models together to deliver real-time predictions.
arXiv Detail & Related papers (2021-09-03T20:42:01Z)
- HandVoxNet: Deep Voxel-Based Network for 3D Hand Shape and Pose Estimation from a Single Depth Map [72.93634777578336]
We propose a novel architecture with 3D convolutions trained in a weakly-supervised manner.
The proposed approach improves over the state of the art by 47.8% on the SynHand5M dataset.
Our method produces visually more reasonable and realistic hand shapes on NYU and BigHand2.2M datasets.
arXiv Detail & Related papers (2020-04-03T14:27:16Z)
- Learning Nonparametric Human Mesh Reconstruction from a Single Image without Ground Truth Meshes [56.27436157101251]
We propose a novel approach to learn human mesh reconstruction without any ground truth meshes.
This is made possible by introducing two new terms into the loss function of a graph convolutional neural network (Graph CNN).
arXiv Detail & Related papers (2020-02-28T20:30:07Z)
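For the "Direct Learning of Mesh and Appearance via 3D Gaussian Splatting" entry above, here is a minimal sketch, assuming one simple way to bind Gaussian centers to triangle faces via fixed barycentric weights; the function name, the per-face count, and the Dirichlet sampling are illustrative assumptions, not that paper's actual method:

```python
# Minimal sketch: bind Gaussian centers to mesh faces with fixed barycentric
# weights so they follow the mesh as it deforms (illustrative only).
import numpy as np

def bind_gaussians_to_faces(vertices: np.ndarray, faces: np.ndarray,
                            per_face: int = 4, seed: int = 0) -> np.ndarray:
    """vertices: (V, 3); faces: (F, 3) indices; returns (F * per_face, 3) centers."""
    rng = np.random.default_rng(seed)
    # One fixed set of barycentric weights per bound Gaussian: shape (F, per_face, 3).
    w = rng.dirichlet(alpha=[1.0, 1.0, 1.0], size=(faces.shape[0], per_face))
    tri = vertices[faces]                        # (F, 3, 3) triangle corner positions
    # Each center is the weighted combination of its face's three corners.
    centers = np.einsum('fgc,fcd->fgd', w, tri)  # (F, per_face, 3)
    return centers.reshape(-1, 3)

# Toy usage: four Gaussians bound to a single triangle.
V = np.array([[0.0, 0.0, 0.0], [1.0, 0.0, 0.0], [0.0, 1.0, 0.0]])
F = np.array([[0, 1, 2]])
print(bind_gaussians_to_faces(V, F).shape)       # (4, 3)
```

Because the barycentric weights stay fixed, re-evaluating this binding after a mesh update moves the Gaussians with the surface, which is what would let photometric gradients from differentiable rendering flow back to the mesh vertices.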
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences arising from its use.