Towards Metrical Reconstruction of Human Faces
- URL: http://arxiv.org/abs/2204.06607v1
- Date: Wed, 13 Apr 2022 18:57:33 GMT
- Title: Towards Metrical Reconstruction of Human Faces
- Authors: Wojciech Zielonka and Timo Bolkart and Justus Thies
- Abstract summary: We argue for a supervised training scheme to learn the shape of a face.
We take advantage of a face recognition network pretrained on a large-scale 2D image dataset.
Our method outperforms the state-of-the-art reconstruction methods by a large margin.
- Score: 20.782425305421505
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Face reconstruction and tracking is a building block of numerous applications
in AR/VR, human-machine interaction, and medicine. Most of these
applications rely on a metrically correct prediction of the shape, especially
when the reconstructed subject is put into a metrical context
(i.e., when there is a reference object of known size). A metrical
reconstruction is also needed for any application that measures distances and
dimensions of the subject (e.g., to virtually fit a glasses frame).
State-of-the-art methods for face reconstruction from a single image are
trained on large 2D image datasets in a self-supervised fashion. However, due
to the nature of perspective projection, they cannot reconstruct the
actual face dimensions; even predicting the average human face outperforms
some of these methods in a metrical sense. To learn the actual shape of a face,
we argue for a supervised training scheme. Since there exists no large-scale 3D
dataset for this task, we annotated and unified small- and medium-scale
databases. The resulting unified dataset is still a medium-scale dataset with
more than 2k identities and training purely on it would lead to overfitting. To
this end, we take advantage of a face recognition network pretrained on a
large-scale 2D image dataset, which provides distinct features for different
faces and is robust to expression, illumination, and camera changes. Using
these features, we train our face shape estimator in a supervised fashion,
inheriting the robustness and generalization of the face recognition network.
Our method, which we call MICA (MetrIC fAce), outperforms the state-of-the-art
reconstruction methods by a large margin, both on current non-metric benchmarks
as well as on our metric benchmarks (15% and 24% lower average error on NoW,
respectively).
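The scale-depth ambiguity of perspective projection that the abstract refers to can be demonstrated in a few lines: a face that is twice as large but placed twice as far away produces exactly the same image, so image-only (self-supervised) training cannot recover metric size. The focal length and landmark coordinates below are illustrative values, not from the paper.

```python
import numpy as np

# Sketch of the perspective scale-depth ambiguity: a face that is twice
# as large but twice as far away yields the *same* 2D projection, so a
# purely image-supervised reconstruction cannot pin down metric size.
f = 1000.0  # illustrative focal length in pixels

def project(points_3d):
    """Pinhole projection: (X, Y, Z) -> (f*X/Z, f*Y/Z)."""
    P = np.asarray(points_3d, dtype=float)
    return f * P[:, :2] / P[:, 2:3]

# A toy "face" of 3D landmarks (metres) at depth 1 m ...
face = np.array([[0.00, 0.05, 1.0],   # nose tip
                 [-0.03, 0.08, 1.0],  # left eye
                 [0.03, 0.08, 1.0]])  # right eye

# ... and the same face uniformly scaled 2x, placed at depth 2 m.
face_big_far = face * 2.0

print(np.allclose(project(face), project(face_big_far)))  # True
```

This is why the authors argue that some form of 3D supervision is required to resolve the actual face dimensions.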
Related papers
- SPARK: Self-supervised Personalized Real-time Monocular Face Capture [6.093606972415841]
Current state-of-the-art approaches can regress parametric 3D face models in real time across a wide range of identities.
We propose a method for high-precision 3D face capture taking advantage of a collection of unconstrained videos of a subject as prior information.
arXiv Detail & Related papers (2024-09-12T12:30:04Z) - DICE: End-to-end Deformation Capture of Hand-Face Interactions from a Single Image [98.29284902879652]
We present DICE, the first end-to-end method for Deformation-aware hand-face Interaction reCovEry from a single image.
It disentangles the regression of local deformation fields and global mesh locations into two network branches.
It achieves state-of-the-art performance on a standard benchmark and in-the-wild data in terms of accuracy and physical plausibility.
arXiv Detail & Related papers (2024-06-26T00:08:29Z) - GS-Pose: Category-Level Object Pose Estimation via Geometric and Semantic Correspondence [5.500735640045456]
Category-level pose estimation is a challenging task with many potential applications in computer vision and robotics.
We propose to utilize both geometric and semantic features obtained from a pre-trained foundation model.
This requires significantly less data to train than prior methods since the semantic features are robust to object texture and appearance.
arXiv Detail & Related papers (2023-11-23T02:35:38Z) - Zolly: Zoom Focal Length Correctly for Perspective-Distorted Human Mesh Reconstruction [66.10717041384625]
Zolly is the first 3DHMR method focusing on perspective-distorted images.
We propose a new camera model and a novel 2D representation, termed distortion image, which describes the 2D dense distortion scale of the human body.
We extend two real-world datasets tailored for this task, both containing perspective-distorted human images.
arXiv Detail & Related papers (2023-03-24T04:22:41Z) - RAFaRe: Learning Robust and Accurate Non-parametric 3D Face Reconstruction from Pseudo 2D&3D Pairs [13.11105614044699]
We propose a robust and accurate non-parametric method for single-view 3D face reconstruction (SVFR).
A large-scale pseudo 2D&3D dataset is created by first rendering the detailed 3D faces, then swapping the face in the wild images with the rendered face.
Our model outperforms previous methods on FaceScape-wild/lab and MICC benchmarks.
arXiv Detail & Related papers (2023-02-10T19:40:26Z) - Facial Geometric Detail Recovery via Implicit Representation [147.07961322377685]
We present a robust texture-guided geometric detail recovery approach using only a single in-the-wild facial image.
Our method combines high-quality texture completion with the powerful expressiveness of implicit surfaces.
Our method not only recovers accurate facial details but also decomposes normals, albedos, and shading parts in a self-supervised way.
arXiv Detail & Related papers (2022-03-18T01:42:59Z) - Weakly-Supervised Multi-Face 3D Reconstruction [45.864415499303405]
We propose an effective end-to-end framework for multi-face 3D reconstruction.
We employ the same global camera model for the reconstructed faces in each image, which makes it possible to recover the relative head positions and orientations in the 3D scene.
arXiv Detail & Related papers (2021-01-06T13:15:21Z) - Survey on 3D face reconstruction from uncalibrated images [3.004265855622696]
Despite providing a more accurate representation of the face, 3D facial images are more complex to acquire than 2D pictures.
The 3D-from-2D face reconstruction problem is ill-posed, so prior knowledge is needed to restrict the solution space.
We review 3D face reconstruction methods proposed in the last decade, focusing on those that only use 2D pictures captured under uncontrolled conditions.
arXiv Detail & Related papers (2020-11-11T12:48:11Z) - Learning 3D Face Reconstruction with a Pose Guidance Network [49.13404714366933]
We present a self-supervised learning approach to monocular 3D face reconstruction with a pose guidance network (PGN).
First, we unveil the bottleneck of pose estimation in prior parametric 3D face learning methods, and propose to utilize 3D face landmarks for estimating pose parameters.
With our specially designed PGN, our model can learn from both faces with fully labeled 3D landmarks and unlimited unlabeled in-the-wild face images.
arXiv Detail & Related papers (2020-10-09T06:11:17Z) - Neural Descent for Visual 3D Human Pose and Shape [67.01050349629053]
We present a deep neural network methodology to reconstruct the 3D pose and shape of people from an input RGB image.
We rely on a recently introduced, expressive full-body statistical 3D human model, GHUM, trained end-to-end.
Central to our methodology is a learning-to-learn-and-optimize approach, referred to as HUman Neural Descent (HUND), which avoids second-order differentiation.
arXiv Detail & Related papers (2020-08-16T13:38:41Z) - Semi-Siamese Training for Shallow Face Learning [78.7386209619276]
We introduce a novel training method named Semi-Siamese Training (SST).
A pair of Semi-Siamese networks constitute the forward propagation structure, and the training loss is computed with an updating gallery queue.
Our method has no extra dependencies and can thus be flexibly integrated with existing loss functions and network architectures.
arXiv Detail & Related papers (2020-07-16T15:20:04Z)
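The updating gallery queue in Semi-Siamese Training can be sketched roughly as follows. This is a minimal illustration under assumed details (toy feature dimension, cosine-similarity scoring), not the paper's exact formulation: one branch produces probe features, the other produces gallery features that are enqueued and used as class prototypes, avoiding a full classification layer on shallow (few-images-per-identity) data.

```python
import numpy as np
from collections import deque

# Toy sizes for illustration; real systems use e.g. 512-D features and
# much larger queues.
FEAT_DIM, QUEUE_SIZE = 8, 4

gallery_queue = deque(maxlen=QUEUE_SIZE)  # holds (identity_id, feature)

def l2_normalize(v):
    return v / (np.linalg.norm(v) + 1e-8)

def enqueue_gallery(identity_id, feature):
    """Push a gallery feature; the oldest entry drops out automatically."""
    gallery_queue.append((identity_id, l2_normalize(feature)))

def probe_scores(probe_feature):
    """Cosine similarity of a probe against every prototype in the queue."""
    p = l2_normalize(probe_feature)
    return {identity: float(p @ g) for identity, g in gallery_queue}
```

A training loss would then be computed over these similarity scores (e.g. softmax cross-entropy against the matching identity), with the queue refreshed every iteration.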
This list is automatically generated from the titles and abstracts of the papers in this site.