A Backbone Replaceable Fine-tuning Framework for Stable Face Alignment
- URL: http://arxiv.org/abs/2010.09501v2
- Date: Fri, 13 Nov 2020 02:33:01 GMT
- Title: A Backbone Replaceable Fine-tuning Framework for Stable Face Alignment
- Authors: Xu Sun, Zhenfeng Fan, Zihao Zhang, Yingjie Guo, Shihong Xia
- Abstract summary: We propose a Jitter loss function that leverages temporal information to suppress inaccurate as well as jittered landmarks.
The proposed framework achieves at least 40% improvement on stability evaluation metrics.
It can swiftly convert a landmark detector for facial images to a better-performing one for videos without retraining the entire model.
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Heatmap regression based face alignment has achieved prominent performance on
static images. However, both stability and accuracy degrade markedly
when existing methods are applied to dynamic videos. We attribute the
degradation to random noise and motion blur, which are common in videos. The
temporal information is critical to address this issue yet not fully considered
in the existing works. In this paper, we revisit the video-oriented face
alignment problem from two perspectives: detection accuracy demands a lower
error on each single frame, while detection consistency requires better stability between
adjacent frames. On this basis, we propose a Jitter loss function that
leverages temporal information to suppress inaccurate as well as jittered
landmarks. The Jitter loss is incorporated into a novel framework with a fine-tuning
ConvLSTM structure over a backbone replaceable network. We further demonstrate
that accurate and stable landmarks are associated with different, overlapping
regions in a canonical coordinate system, based on which the proposed Jitter loss
facilitates the optimization process during training. The proposed framework
achieves at least 40% improvement on stability evaluation metrics while
enhancing detection accuracy versus state-of-the-art methods. Generally, it can
swiftly convert a landmark detector for facial images to a better-performing
one for videos without retraining the entire model.
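The two-term objective described in the abstract (per-frame accuracy plus inter-frame stability) can be sketched as follows. This is an illustrative reconstruction under simple assumptions, not the paper's actual Jitter loss formulation; the function name and the `alpha` trade-off weight are hypothetical:

```python
import numpy as np

def jitter_loss(preds, targets, alpha=0.5):
    """Illustrative two-term loss for video landmark detection.

    preds, targets: (T, N, 2) arrays of N landmark coordinates over T frames.
    """
    # accuracy term: mean per-landmark error against ground truth in each frame
    accuracy = np.mean(np.linalg.norm(preds - targets, axis=-1))
    # jitter term: mean landmark displacement between adjacent frames,
    # penalizing temporally unstable (jittered) predictions
    jitter = np.mean(np.linalg.norm(np.diff(preds, axis=0), axis=-1))
    return accuracy + alpha * jitter
```

A sequence that is accurate but jittered is penalized by the second term even when its per-frame error is moderate, which matches the accuracy-versus-consistency distinction drawn in the abstract.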
Related papers
- Video Dynamics Prior: An Internal Learning Approach for Robust Video Enhancements [83.5820690348833]
We present a framework for low-level vision tasks that does not require any external training data corpus.
Our approach learns the weights of neural modules by optimizing over the corrupted sequence, leveraging its spatio-temporal coherence and internal statistics.
arXiv Detail & Related papers (2023-12-13T01:57:11Z) - RIGID: Recurrent GAN Inversion and Editing of Real Face Videos [73.97520691413006]
GAN inversion is indispensable for applying the powerful editability of GAN to real images.
Existing methods invert video frames individually, often leading to undesired, temporally inconsistent results.
We propose a unified recurrent framework, named Recurrent vIdeo GAN Inversion and eDiting (RIGID).
Our framework learns the inherent coherence between input frames in an end-to-end manner.
arXiv Detail & Related papers (2023-08-11T12:17:24Z) - Fast Full-frame Video Stabilization with Iterative Optimization [21.962533235492625]
We propose an iterative optimization-based learning approach using synthetic datasets for video stabilization.
We develop a two-level (coarse-to-fine) stabilizing algorithm based on the probabilistic flow field.
We take a divide-and-conquer approach and propose a novel multiframe fusion strategy to render full-frame stabilized views.
arXiv Detail & Related papers (2023-07-24T13:24:19Z) - GPU-accelerated SIFT-aided source identification of stabilized videos [63.084540168532065]
We exploit the parallelization capabilities of Graphics Processing Units (GPUs) within a framework for inverting frame stabilization.
We propose to exploit SIFT features to estimate the camera momentum and to identify less-stabilized temporal segments.
Experiments confirm the effectiveness of the proposed approach in reducing the required computational time and improving the source identification accuracy.
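A minimal sketch of the camera-momentum idea above, assuming keypoints (e.g. SIFT matches) have already been matched across two adjacent frames. The function name and the use of a simple mean displacement are illustrative assumptions, not the paper's implementation:

```python
import numpy as np

def camera_momentum(kps_prev, kps_curr):
    """Mean displacement of matched keypoint pairs between adjacent frames.

    kps_prev, kps_curr: (K, 2) arrays of matched keypoint coordinates.
    Returns a scalar proxy for inter-frame camera motion magnitude.
    """
    disp = np.linalg.norm(kps_curr - kps_prev, axis=1)
    return float(disp.mean())
```

Segments with low momentum would correspond to the less-stabilized portions the paper singles out for source identification.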
arXiv Detail & Related papers (2022-07-29T07:01:31Z) - Temporal Feature Alignment and Mutual Information Maximization for Video-Based Human Pose Estimation [38.571715193347366]
We present a novel hierarchical alignment framework for multi-frame human pose estimation.
We rank No.1 in the Multi-frame Person Pose Estimation Challenge on the PoseTrack 2017 benchmark, and obtain state-of-the-art performance on the Sub-JHMDB and PoseTrack 2018 benchmarks.
arXiv Detail & Related papers (2022-03-29T04:29:16Z) - AuxAdapt: Stable and Efficient Test-Time Adaptation for Temporally Consistent Video Semantic Segmentation [81.87943324048756]
In video segmentation, generating temporally consistent results across frames is as important as achieving frame-wise accuracy.
Existing methods rely on optical flow regularization or fine-tuning with test data to attain temporal consistency.
This paper presents an efficient, intuitive, and unsupervised online adaptation method, AuxAdapt, for improving the temporal consistency of most neural network models.
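A toy version of the per-pixel temporal-consistency measure such work targets. This sketch is illustrative only: real evaluations typically warp the previous frame's prediction onto the current one with optical flow before comparing, which is omitted here:

```python
import numpy as np

def temporal_consistency(labels_a, labels_b):
    """Fraction of pixels whose predicted class is identical in two
    adjacent frames' segmentation maps (assumes negligible motion)."""
    return float(np.mean(labels_a == labels_b))
```

A value of 1.0 means the two frames' label maps agree everywhere; flicker between frames pushes the score down even when each frame is individually accurate.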
arXiv Detail & Related papers (2021-10-24T07:07:41Z) - TimeLens: Event-based Video Frame Interpolation [54.28139783383213]
We introduce Time Lens, a novel method that leverages the advantages of both synthesis-based and flow-based approaches.
We show an up to 5.21 dB improvement in terms of PSNR over state-of-the-art frame-based and event-based methods.
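The PSNR figure quoted above is computed from the mean squared error between a reference frame and its reconstruction. A standard sketch of the computation (the dB improvement reported by the paper is the difference between two such values):

```python
import numpy as np

def psnr(reference, estimate, max_val=255.0):
    """Peak signal-to-noise ratio in dB between two images
    with pixel values in [0, max_val]."""
    diff = reference.astype(np.float64) - estimate.astype(np.float64)
    mse = np.mean(diff ** 2)
    if mse == 0:
        return float("inf")  # identical images
    return 10.0 * np.log10(max_val ** 2 / mse)
```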
arXiv Detail & Related papers (2021-06-14T10:33:47Z) - Towards Fast, Accurate and Stable 3D Dense Face Alignment [73.01620081047336]
We propose a novel regression framework named 3DDFA-V2 which makes a balance among speed, accuracy and stability.
We present a virtual synthesis method to transform a single still image into a short video that incorporates in-plane and out-of-plane face motion.
arXiv Detail & Related papers (2020-09-21T15:37:37Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.