A lightweight 3D dense facial landmark estimation model from position map data
- URL: http://arxiv.org/abs/2308.15170v1
- Date: Tue, 29 Aug 2023 09:53:10 GMT
- Title: A lightweight 3D dense facial landmark estimation model from position map data
- Authors: Shubhajit Basak, Sathish Mangapuram, Gabriel Costache, Rachel McDonnell, Michael Schukat
- Abstract summary: We propose a pipeline to create a dense keypoint training dataset containing 520 key points across the whole face.
We train a lightweight MobileNet-based regressor model with the generated data.
Experimental results show that our trained model outperforms many existing methods despite its smaller model size and minimal computational cost.
- Score: 0.8508775813669867
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The incorporation of 3D data in facial analysis tasks has gained popularity
in recent years. Although it provides a more accurate and detailed representation
of the human face, acquiring 3D face data is more complex and expensive than
capturing 2D face images: one must rely either on expensive 3D scanners or on
depth sensors, which are prone to noise. An alternative is to reconstruct 3D
faces from uncalibrated 2D images in an unsupervised way, without any ground-truth
3D data. However, such approaches are computationally expensive, and the
learned models are too large for mobile and other edge device
applications. Predicting dense 3D landmarks over the whole face can overcome
this issue. As no public dataset containing dense landmarks is available,
we propose a pipeline to create a dense keypoint training dataset containing
520 keypoints across the whole face from existing facial position map data.
We train a lightweight MobileNet-based regressor model with the generated data.
As we do not have access to any evaluation dataset with dense landmarks,
we evaluate our model on the 68-keypoint detection task. Experimental
results show that our trained model outperforms many existing methods
despite its smaller model size and minimal computational cost. Qualitative
evaluation further shows that our trained model remains effective under extreme
head pose angles as well as other facial variations and occlusions.
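The dataset-generation step described in the abstract can be sketched as follows. This is a minimal illustrative example, assuming a PRNet-style UV position map where each pixel stores a 3D surface point; the fixed set of 520 UV indices used here is random for illustration, not the authors' actual landmark selection.

```python
import numpy as np

def sample_dense_landmarks(position_map: np.ndarray,
                           uv_indices: np.ndarray) -> np.ndarray:
    """Pick K dense 3D landmarks out of a UV position map.

    position_map: (H, W, 3) array; each pixel holds an (x, y, z) point
    on the face surface.
    uv_indices: (K, 2) integer (row, col) pairs selecting the K fixed
    landmark locations in UV space.
    Returns a (K, 3) array of 3D landmark coordinates.
    """
    rows, cols = uv_indices[:, 0], uv_indices[:, 1]
    return position_map[rows, cols]  # integer indexing -> (K, 3)

# Toy example: a 256x256 position map and 520 fixed UV locations.
rng = np.random.default_rng(0)
pos_map = rng.standard_normal((256, 256, 3)).astype(np.float32)
uv = rng.integers(0, 256, size=(520, 2))  # hypothetical index set
landmarks = sample_dense_landmarks(pos_map, uv)
print(landmarks.shape)  # (520, 3)
```

Applying the same fixed UV index set to every face's position map yields a consistent 520-point target vector per image, which a lightweight regressor (e.g. a MobileNet backbone) can then be trained to predict directly from the RGB input.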
Related papers
- MVSDet: Multi-View Indoor 3D Object Detection via Efficient Plane Sweeps [51.44887282336391]
Key challenge of multi-view indoor 3D object detection is to infer accurate geometry information from images for precise 3D detection.
Previous methods rely on NeRF for geometry reasoning.
We propose MVSDet which utilizes plane sweep for geometry-aware 3D object detection.
arXiv Detail & Related papers (2024-10-28T21:58:41Z)
- FAMOUS: High-Fidelity Monocular 3D Human Digitization Using View Synthesis [51.193297565630886]
The challenge of accurately inferring texture remains, particularly in obscured areas such as the back of a person in frontal-view images.
This limitation in texture prediction largely stems from the scarcity of large-scale and diverse 3D datasets.
We propose leveraging extensive 2D fashion datasets to enhance both texture and shape prediction in 3D human digitization.
arXiv Detail & Related papers (2024-10-13T01:25:05Z)
- RAFaRe: Learning Robust and Accurate Non-parametric 3D Face Reconstruction from Pseudo 2D&3D Pairs [13.11105614044699]
We propose a robust and accurate non-parametric method for single-view 3D face reconstruction (SVFR).
A large-scale pseudo 2D&3D dataset is created by first rendering the detailed 3D faces, then swapping the face in the wild images with the rendered face.
Our model outperforms previous methods on FaceScape-wild/lab and MICC benchmarks.
arXiv Detail & Related papers (2023-02-10T19:40:26Z)
- Gait Recognition in the Wild with Dense 3D Representations and A Benchmark [86.68648536257588]
Existing studies for gait recognition are dominated by 2D representations like the silhouette or skeleton of the human body in constrained scenes.
This paper aims to explore dense 3D representations for gait recognition in the wild.
We build the first large-scale 3D representation-based gait recognition dataset, named Gait3D.
arXiv Detail & Related papers (2022-04-06T03:54:06Z)
- Learning Dense Correspondence from Synthetic Environments [27.841736037738286]
Existing methods map manually labelled human pixels in real 2D images onto the 3D surface, which is prone to human error.
We propose to solve the problem of data scarcity by training 2D-3D human mapping algorithms using automatically generated synthetic data.
arXiv Detail & Related papers (2022-03-24T08:13:26Z)
- FaceScape: 3D Facial Dataset and Benchmark for Single-View 3D Face Reconstruction [29.920622006999732]
We present a large-scale detailed 3D face dataset, FaceScape, and the corresponding benchmark to evaluate single-view facial 3D reconstruction.
By training on FaceScape data, a novel algorithm is proposed to predict elaborate riggable 3D face models from a single image input.
We also use FaceScape data to generate the in-the-wild and in-the-lab benchmark to evaluate recent methods of single-view face reconstruction.
arXiv Detail & Related papers (2021-11-01T16:48:34Z)
- Towards Generalization of 3D Human Pose Estimation In The Wild [73.19542580408971]
3DBodyTex.Pose is a dataset that addresses the task of 3D human pose estimation in-the-wild.
3DBodyTex.Pose offers high quality and rich data containing 405 different real subjects in various clothing and poses, and 81k image samples with ground-truth 2D and 3D pose annotations.
arXiv Detail & Related papers (2020-04-21T13:31:58Z)
- Multi-Person Absolute 3D Human Pose Estimation with Weak Depth Supervision [0.0]
We introduce a network that can be trained with additional RGB-D images in a weakly supervised fashion.
Our algorithm is a monocular, multi-person, absolute pose estimator.
We evaluate the algorithm on several benchmarks, showing a consistent improvement in error rates.
arXiv Detail & Related papers (2020-04-08T13:29:22Z)
- Exemplar Fine-Tuning for 3D Human Model Fitting Towards In-the-Wild 3D Human Pose Estimation [107.07047303858664]
Large-scale human datasets with 3D ground-truth annotations are difficult to obtain in the wild.
We address this problem by augmenting existing 2D datasets with high-quality 3D pose fits.
The resulting annotations are sufficient to train from scratch 3D pose regressor networks that outperform the current state-of-the-art on in-the-wild benchmarks.
arXiv Detail & Related papers (2020-04-07T20:21:18Z)
- HandVoxNet: Deep Voxel-Based Network for 3D Hand Shape and Pose Estimation from a Single Depth Map [72.93634777578336]
We propose a novel architecture with 3D convolutions trained in a weakly-supervised manner.
The proposed approach improves over the state of the art by 47.8% on the SynHand5M dataset.
Our method produces visually more reasonable and realistic hand shapes on NYU and BigHand2.2M datasets.
arXiv Detail & Related papers (2020-04-03T14:27:16Z)
- FaceScape: a Large-scale High Quality 3D Face Dataset and Detailed Riggable 3D Face Prediction [39.95272819738226]
We present a novel algorithm that is able to predict elaborate riggable 3D face models from a single image input.
FaceScape dataset provides 18,760 textured 3D faces, captured from 938 subjects and each with 20 specific expressions.
arXiv Detail & Related papers (2020-03-31T07:11:08Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.