Towards Fast, Accurate and Stable 3D Dense Face Alignment
- URL: http://arxiv.org/abs/2009.09960v2
- Date: Sun, 7 Feb 2021 16:24:15 GMT
- Title: Towards Fast, Accurate and Stable 3D Dense Face Alignment
- Authors: Jianzhu Guo, Xiangyu Zhu, Yang Yang, Fan Yang, Zhen Lei and Stan Z. Li
- Abstract summary: We propose a novel regression framework named 3DDFA-V2 which makes a balance among speed, accuracy and stability.
We present a virtual synthesis method to transform one still image to a short-video which incorporates in-plane and out-of-plane face moving.
- Score: 73.01620081047336
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Existing methods of 3D dense face alignment mainly concentrate on accuracy,
thus limiting the scope of their practical applications. In this paper, we
propose a novel regression framework named 3DDFA-V2 which makes a balance among
speed, accuracy and stability. Firstly, on the basis of a lightweight backbone,
we propose a meta-joint optimization strategy to dynamically regress a small
set of 3DMM parameters, which greatly enhances speed and accuracy
simultaneously. To further improve the stability on videos, we present a
virtual synthesis method to transform one still image to a short-video which
incorporates in-plane and out-of-plane face moving. On the premise of high
accuracy and stability, 3DDFA-V2 runs at over 50fps on a single CPU core and
outperforms other state-of-the-art heavy models simultaneously. Experiments on
several challenging datasets validate the efficiency of our method. Pre-trained
models and code are available at https://github.com/cleardusk/3DDFA_V2.
Related papers
- UPose3D: Uncertainty-Aware 3D Human Pose Estimation with Cross-View and Temporal Cues [55.69339788566899]
UPose3D is a novel approach for multi-view 3D human pose estimation.
It improves robustness and flexibility without requiring direct 3D annotations.
arXiv Detail & Related papers (2024-04-23T00:18:00Z) - Minimum Latency Deep Online Video Stabilization [77.68990069996939]
We present a novel camera path optimization framework for the task of online video stabilization.
In this work, we adopt recent off-the-shelf high-quality deep motion models for motion estimation to recover the camera trajectory.
Our approach significantly outperforms state-of-the-art online methods both qualitatively and quantitatively.
arXiv Detail & Related papers (2022-12-05T07:37:32Z) - Fast-SNARF: A Fast Deformer for Articulated Neural Fields [92.68788512596254]
We propose a new articulation module for neural fields, Fast-SNARF, which finds accurate correspondences between canonical space and posed space.
Fast-SNARF is a drop-in replacement in to our previous work, SNARF, while significantly improving its computational efficiency.
Because learning of deformation maps is a crucial component in many 3D human avatar methods, we believe that this work represents a significant step towards the practical creation of 3D virtual humans.
arXiv Detail & Related papers (2022-11-28T17:55:34Z) - Neural Deformable Voxel Grid for Fast Optimization of Dynamic View
Synthesis [63.25919018001152]
We propose a fast deformable radiance field method to handle dynamic scenes.
Our method achieves comparable performance to D-NeRF using only 20 minutes for training.
arXiv Detail & Related papers (2022-06-15T17:49:08Z) - Learning to Fit Morphable Models [12.469605679847085]
We build upon recent advances in learned optimization and propose an update rule inspired by the classic Levenberg-Marquardt algorithm.
We show the effectiveness of the proposed neural on the problems of 3D body surface estimation from a head-mounted device and face fitting from 2D landmarks.
arXiv Detail & Related papers (2021-11-29T18:59:53Z) - Improved Pillar with Fine-grained Feature for 3D Object Detection [23.348710029787068]
3D object detection with LiDAR point clouds plays an important role in autonomous driving perception module.
Existing point-based methods are challenging to reach the speed requirements because of too many raw points.
The 2D grid-based methods, such as PointPillar, can easily achieve a stable and efficient speed based on simple 2D convolution.
arXiv Detail & Related papers (2021-10-12T14:53:14Z) - A Real-time Action Representation with Temporal Encoding and Deep
Compression [115.3739774920845]
We propose a new real-time convolutional architecture, called Temporal Convolutional 3D Network (T-C3D), for action representation.
T-C3D learns video action representations in a hierarchical multi-granularity manner while obtaining a high process speed.
Our method achieves clear improvements on UCF101 action recognition benchmark against state-of-the-art real-time methods by 5.4% in terms of accuracy and 2 times faster in terms of inference speed with a less than 5MB storage model.
arXiv Detail & Related papers (2020-06-17T06:30:43Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.