Accurate 3D Facial Geometry Prediction by Multi-Task, Multi-Modal, and
Multi-Representation Landmark Refinement Network
- URL: http://arxiv.org/abs/2104.08403v1
- Date: Fri, 16 Apr 2021 23:22:41 GMT
- Title: Accurate 3D Facial Geometry Prediction by Multi-Task, Multi-Modal, and
Multi-Representation Landmark Refinement Network
- Authors: Cho-Ying Wu, Qiangeng Xu, Ulrich Neumann
- Abstract summary: This work focuses on complete 3D facial geometry prediction, including 3D facial alignment via 3D modeling and face orientation estimation.
We focus on an important facial attribute, 3D landmarks, and fully utilize their embedded information to guide 3D facial geometry learning.
Extensive experiments on all tasks of learning 3D facial geometry show that we attain the state of the art.
- Score: 14.966695101335704
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: This work focuses on complete 3D facial geometry prediction, including 3D
facial alignment via 3D face modeling and face orientation estimation using the
proposed multi-task, multi-modal, and multi-representation landmark refinement
network (M$^3$-LRN). We focus on an important facial attribute, 3D
landmarks, and fully utilize their embedded information to guide 3D facial
geometry learning. We first propose a multi-modal and multi-representation
feature aggregation for landmark refinement. Next, we are the first to study
3DMM regression from sparse 3D landmarks and utilize multi-representation
advantage to attain better geometry prediction. Extensive experiments on all
tasks of learning 3D facial geometry show that we attain the state of the art. We
closely validate contributions of each modality and representation. Our results
are robust across cropped faces, underwater scenarios, and extreme poses.
Notably, we adopt only simple and widely used network operations in M$^3$-LRN
and attain nearly a 20\% improvement in face orientation estimation over the
current best performance. See our project page here.
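The abstract highlights regressing 3DMM parameters from sparse 3D landmarks. As background, a 3D Morphable Model represents a face mesh as a mean shape plus linear identity and expression bases, so recovering coefficients from landmark positions is, in the linear case, a least-squares problem. The sketch below is a toy illustration of that idea with random bases and hypothetical dimensions; it is not the authors' M$^3$-LRN code, which uses a learned network rather than a direct solve.

```python
# Toy sketch: 3DMM coefficient regression from sparse 3D landmarks.
# All shapes/bases here are random placeholders, not a real face model.
import numpy as np

rng = np.random.default_rng(0)

# A toy 3DMM: mean shape plus linear identity and expression bases.
N_VERTS, N_ID, N_EXP = 500, 40, 10
mean_shape = rng.normal(size=(3 * N_VERTS,))
id_basis = rng.normal(size=(3 * N_VERTS, N_ID))
exp_basis = rng.normal(size=(3 * N_VERTS, N_EXP))

# Sparse landmarks are a fixed subset of mesh vertices (68 is a common choice).
lm_idx = rng.choice(N_VERTS, size=68, replace=False)
rows = np.stack([3 * lm_idx, 3 * lm_idx + 1, 3 * lm_idx + 2], axis=1).ravel()

def reconstruct(alpha, beta):
    """Full mesh from identity (alpha) and expression (beta) coefficients."""
    return mean_shape + id_basis @ alpha + exp_basis @ beta

# Ground-truth coefficients and the sparse landmarks they produce.
alpha_gt = rng.normal(size=N_ID)
beta_gt = rng.normal(size=N_EXP)
landmarks = reconstruct(alpha_gt, beta_gt)[rows]

# With a linear model, regressing 3DMM coefficients from 68 landmarks
# (204 equations, 50 unknowns) is an overdetermined least-squares solve
# over the landmark rows of the stacked bases.
A = np.concatenate([id_basis[rows], exp_basis[rows]], axis=1)
coeffs, *_ = np.linalg.lstsq(A, landmarks - mean_shape[rows], rcond=None)
alpha_hat, beta_hat = coeffs[:N_ID], coeffs[N_ID:]
```

Because the landmarks were generated by the same linear model, the solve recovers the coefficients exactly; the paper's contribution is doing this robustly from noisy, refined landmarks via a network.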
Related papers
- GeoLRM: Geometry-Aware Large Reconstruction Model for High-Quality 3D Gaussian Generation [65.33726478659304]
We introduce the Geometry-Aware Large Reconstruction Model (GeoLRM), an approach which can predict high-quality assets with 512k Gaussians and 21 input images in only 11 GB GPU memory.
Previous works neglect the inherent sparsity of 3D structure and do not utilize explicit geometric relationships between 3D and 2D images.
GeoLRM tackles these issues by incorporating a novel 3D-aware transformer structure that directly processes 3D points and uses deformable cross-attention mechanisms.
arXiv Detail & Related papers (2024-06-21T17:49:31Z)
- Robust 3D Face Alignment with Multi-Path Neural Architecture Search [23.432737053236096]
3D face alignment is a very challenging and fundamental problem in computer vision.
Existing deep learning-based methods manually design different networks to regress either parameters of a 3D face model or 3D positions of face vertices.
We employ Neural Architecture Search (NAS) to automatically discover the optimal architecture for 3D face alignment.
arXiv Detail & Related papers (2024-06-12T05:02:16Z)
- PointMCD: Boosting Deep Point Cloud Encoders via Multi-view Cross-modal Distillation for 3D Shape Recognition [55.38462937452363]
We propose a unified multi-view cross-modal distillation architecture, including a pretrained deep image encoder as the teacher and a deep point encoder as the student.
By pair-wise aligning multi-view visual and geometric descriptors, we can obtain more powerful deep point encoders without exhaustive and complicated network modifications.
arXiv Detail & Related papers (2022-07-07T07:23:20Z)
- Beyond 3DMM: Learning to Capture High-fidelity 3D Face Shape [77.95154911528365]
3D Morphable Model (3DMM) fitting has widely benefited face analysis due to its strong 3D prior.
Previous reconstructed 3D faces suffer from degraded visual verisimilitude due to the loss of fine-grained geometry.
This paper proposes a complete solution to capture the personalized shape so that the reconstructed shape looks identical to the corresponding person.
arXiv Detail & Related papers (2022-04-09T03:46:18Z)
- Synergy between 3DMM and 3D Landmarks for Accurate 3D Facial Geometry [21.051258644469268]
This work studies learning from a synergy process of 3D Morphable Models (3DMM) and 3D facial landmarks.
We predict complete 3D facial geometry, including 3D alignment, face orientation, and 3D face modeling.
arXiv Detail & Related papers (2021-10-19T07:29:14Z)
- Topologically Consistent Multi-View Face Inference Using Volumetric Sampling [25.001398662643986]
ToFu is a geometry inference framework that can produce topologically consistent meshes across identities and expressions.
A novel progressive mesh generation network embeds the topological structure of the face in a feature volume.
These high-quality assets are readily usable by production studios for avatar creation, animation and physically-based skin rendering.
arXiv Detail & Related papers (2021-10-06T17:55:08Z)
- Weakly-Supervised Multi-Face 3D Reconstruction [45.864415499303405]
We propose an effective end-to-end framework for multi-face 3D reconstruction.
We employ the same global camera model for the reconstructed faces in each image, which makes it possible to recover the relative head positions and orientations in the 3D scene.
arXiv Detail & Related papers (2021-01-06T13:15:21Z)
- Learning 3D Face Reconstruction with a Pose Guidance Network [49.13404714366933]
We present a self-supervised learning approach to monocular 3D face reconstruction with a pose guidance network (PGN).
First, we unveil the bottleneck of pose estimation in prior parametric 3D face learning methods, and propose to utilize 3D face landmarks for estimating pose parameters.
With our specially designed PGN, our model can learn from both faces with fully labeled 3D landmarks and unlimited unlabeled in-the-wild face images.
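This entry proposes estimating pose parameters from 3D face landmarks. One standard way to recover a rotation from landmark correspondences is the Kabsch algorithm; the sketch below is an illustrative baseline for that step (toy random landmarks, not the PGN paper's network).

```python
# Hedged sketch: head-pose rotation from 3D landmark correspondences
# via the Kabsch algorithm (illustrative only; not the PGN method itself).
import numpy as np

def kabsch(P, Q):
    """Rotation R minimizing sum ||q_i - R p_i||^2 after centering both sets."""
    Pc = P - P.mean(axis=0)
    Qc = Q - Q.mean(axis=0)
    H = Pc.T @ Qc                      # 3x3 cross-covariance
    U, _, Vt = np.linalg.svd(H)
    d = np.sign(np.linalg.det(Vt.T @ U.T))  # guard against reflections
    D = np.diag([1.0, 1.0, d])
    return Vt.T @ D @ U.T

rng = np.random.default_rng(1)
template = rng.normal(size=(68, 3))    # canonical 3D landmark set (toy)

# Ground truth: a 30-degree yaw rotation plus a translation.
angle = np.deg2rad(30.0)
R_true = np.array([[np.cos(angle), -np.sin(angle), 0.0],
                   [np.sin(angle),  np.cos(angle), 0.0],
                   [0.0,            0.0,           1.0]])
observed = template @ R_true.T + np.array([0.1, -0.2, 0.3])

R_est = kabsch(template, observed)     # recovers R_true up to numerics
```

In noise-free conditions this recovers the rotation exactly; the paper's point is that learned landmark estimates make such pose recovery more reliable than direct parametric regression.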
arXiv Detail & Related papers (2020-10-09T06:11:17Z)
- Pix2Surf: Learning Parametric 3D Surface Models of Objects from Images [64.53227129573293]
We investigate the problem of learning to generate 3D parametric surface representations for novel object instances, as seen from one or more views.
We design neural networks capable of generating high-quality parametric 3D surfaces which are consistent between views.
Our method is supervised and trained on a public dataset of shapes from common object categories.
arXiv Detail & Related papers (2020-08-18T06:33:40Z)
- Unsupervised Cross-Modal Alignment for Multi-Person 3D Pose Estimation [52.94078950641959]
We present a deployment friendly, fast bottom-up framework for multi-person 3D human pose estimation.
We adopt a novel neural representation of multi-person 3D pose which unifies the position of person instances with their corresponding 3D pose representation.
We propose a practical deployment paradigm where paired 2D or 3D pose annotations are unavailable.
arXiv Detail & Related papers (2020-08-04T07:54:25Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.