MPT: Mesh Pre-Training with Transformers for Human Pose and Mesh
Reconstruction
- URL: http://arxiv.org/abs/2211.13357v1
- Date: Thu, 24 Nov 2022 00:02:13 GMT
- Title: MPT: Mesh Pre-Training with Transformers for Human Pose and Mesh
Reconstruction
- Authors: Kevin Lin, Chung-Ching Lin, Lin Liang, Zicheng Liu, Lijuan Wang
- Abstract summary: Mesh Pre-Training (MPT) is a new pre-training framework that leverages 3D mesh data such as MoCap data for human pose and mesh reconstruction from a single image.
MPT enables transformer models to perform zero-shot human mesh reconstruction from real images.
- Score: 56.80384196339199
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We present Mesh Pre-Training (MPT), a new pre-training framework that
leverages 3D mesh data such as MoCap data for human pose and mesh
reconstruction from a single image. Existing work in 3D pose and mesh
reconstruction typically requires image-mesh pairs as the training data, but
the acquisition of 2D-to-3D annotations is difficult. In this paper, we explore
how to leverage 3D mesh data such as MoCap data, that does not have RGB images,
for pre-training. The key idea is that even though 3D mesh data cannot be used
for end-to-end training due to a lack of the corresponding RGB images, it can
be used to pre-train the mesh regression transformer subnetwork. We observe
that such pre-training not only improves the accuracy of mesh reconstruction
from a single image, but also enables zero-shot capability. We conduct mesh
pre-training using 2 million meshes. Experimental results show that MPT
advances the state of the art on the Human3.6M and 3DPW datasets. We also
show that MPT gives transformer models zero-shot capability for human
mesh reconstruction from real images. In addition, we demonstrate the
generalizability of MPT to 3D hand reconstruction, achieving state-of-the-art
results on the FreiHAND dataset.
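The key idea, deriving image-free training pairs from mesh data alone, can be illustrated with a minimal sketch. This is not the paper's actual pipeline; the joint-regressor matrix, dimensions, and function names below are assumptions in the style of SMPL-like body models, where 3D joints are a fixed linear combination of mesh vertices:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes (SMPL uses 6890 vertices and 24 joints; kept tiny here).
N_VERTS, N_JOINTS = 100, 8

# Fixed joint-regressor matrix: joints = W @ vertices, with rows summing
# to 1 so each joint is a convex combination of vertices (SMPL-style).
W = rng.random((N_JOINTS, N_VERTS))
W /= W.sum(axis=1, keepdims=True)

def make_pretraining_pair(mesh_vertices):
    """From a MoCap mesh alone, derive an (input, target) pair for the
    mesh regression transformer: 3D joints serve as input queries and
    the full vertex set is the regression target. No RGB image needed."""
    joints = W @ mesh_vertices            # (N_JOINTS, 3)
    return joints, mesh_vertices          # input, target

# One synthetic "MoCap mesh" stands in for the 2 million real meshes.
mesh = rng.standard_normal((N_VERTS, 3))
joints, target = make_pretraining_pair(mesh)
```

Because every pair is generated from the mesh itself, the regression subnetwork can be pre-trained at scale before any image encoder is attached, which is what allows the transfer to real images described above.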
Related papers
- Sampling is Matter: Point-guided 3D Human Mesh Reconstruction [0.0]
This paper presents a simple yet powerful method for 3D human mesh reconstruction from a single RGB image.
Experimental results on benchmark datasets show that the proposed method efficiently improves the performance of 3D human mesh reconstruction.
arXiv Detail & Related papers (2023-04-19T08:45:26Z)
- Beyond 3DMM: Learning to Capture High-fidelity 3D Face Shape [77.95154911528365]
3D Morphable Model (3DMM) fitting has widely benefited face analysis due to its strong 3D prior.
Previous reconstructed 3D faces suffer from degraded visual verisimilitude due to the loss of fine-grained geometry.
This paper proposes a complete solution to capture the personalized shape so that the reconstructed shape looks identical to the corresponding person.
arXiv Detail & Related papers (2022-04-09T03:46:18Z)
- Adversarial Parametric Pose Prior [106.12437086990853]
We learn a prior that restricts the SMPL parameters to values that produce realistic poses via adversarial training.
We show that our learned prior covers the diversity of the real-data distribution, facilitates optimization for 3D reconstruction from 2D keypoints, and yields better pose estimates when used for regression from images.
arXiv Detail & Related papers (2021-12-08T10:05:32Z)
- Using Adaptive Gradient for Texture Learning in Single-View 3D Reconstruction [0.0]
Learning-based approaches to 3D model reconstruction have attracted attention owing to their modern applications.
We present a novel sampling algorithm that optimizes the gradient of predicted coordinates based on the variance of the sampled image.
We also adopt the Frechet Inception Distance (FID) as a loss function during learning, which helps bridge the gap between rendered images and input images.
arXiv Detail & Related papers (2021-04-29T07:52:54Z)
- Model-based 3D Hand Reconstruction via Self-Supervised Learning [72.0817813032385]
Reconstructing a 3D hand from a single-view RGB image is challenging due to various hand configurations and depth ambiguity.
We propose S2HAND, a self-supervised 3D hand reconstruction network that can jointly estimate pose, shape, texture, and the camera viewpoint.
For the first time, we demonstrate the feasibility of training an accurate 3D hand reconstruction network without relying on manual annotations.
arXiv Detail & Related papers (2021-03-22T10:12:43Z)
- PC-HMR: Pose Calibration for 3D Human Mesh Recovery from 2D Images/Videos [47.601288796052714]
We develop two novel pose calibration frameworks, i.e., Serial PC-HMR and Parallel PC-HMR.
Our frameworks are based on generic and complementary integration of data-driven learning and geometrical modeling.
We perform extensive experiments on the popular benchmarks, i.e., Human3.6M, 3DPW, and SURREAL, where our PC-HMR frameworks achieve SOTA results.
arXiv Detail & Related papers (2021-03-16T12:12:45Z)
- An Effective Loss Function for Generating 3D Models from Single 2D Image without Rendering [0.0]
Differentiable rendering is a highly successful technique for single-view 3D reconstruction.
Current methods optimize the parameters of the 3D shape with pixel-wise losses between a rendered image of the reconstructed 3D object and ground-truth images from matched viewpoints.
We propose a novel effective loss function that evaluates how well the projections of reconstructed 3D point clouds cover the ground truth object's silhouette.
arXiv Detail & Related papers (2021-03-05T00:02:18Z)
- Im2Mesh GAN: Accurate 3D Hand Mesh Recovery from a Single RGB Image [31.371190180801452]
We show that the hand mesh can be learned directly from the input image.
We propose a new type of GAN called Im2Mesh GAN to learn the mesh through end-to-end adversarial training.
arXiv Detail & Related papers (2021-01-27T07:38:01Z)
- Exemplar Fine-Tuning for 3D Human Model Fitting Towards In-the-Wild 3D Human Pose Estimation [107.07047303858664]
Large-scale human datasets with 3D ground-truth annotations are difficult to obtain in the wild.
We address this problem by augmenting existing 2D datasets with high-quality 3D pose fits.
The resulting annotations are sufficient to train, from scratch, 3D pose regressor networks that outperform the current state of the art on in-the-wild benchmarks.
arXiv Detail & Related papers (2020-04-07T20:21:18Z)
This list is automatically generated from the titles and abstracts of the papers in this site.