Robust Facial Landmark Detection by Multi-order Multi-constraint Deep
Networks
- URL: http://arxiv.org/abs/2012.04927v2
- Date: Thu, 10 Dec 2020 23:39:38 GMT
- Title: Robust Facial Landmark Detection by Multi-order Multi-constraint Deep
Networks
- Authors: Jun Wan, Zhihui Lai, Jing Li, Jie Zhou, Can Gao
- Abstract summary: We propose a Multi-order Multi-constraint Deep Network (MMDN) that learns more powerful feature correlations and shape constraints.
An Implicit Multi-order Correlating Geometry-aware (IMCG) model is proposed to introduce the multi-order spatial correlations and multi-order channel correlations.
An Explicit Probability-based Boundary-adaptive Regression (EPBR) method is developed to enhance the global shape constraints.
- Score: 35.19368350816032
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Recently, heatmap regression has been widely explored in facial landmark
detection and has achieved remarkable performance. However, most existing
heatmap regression-based facial landmark detection methods neglect
high-order feature correlations, which are very important for learning more
representative features and enhancing shape constraints. Moreover, no explicit
global shape constraints have been added to the final predicted landmarks,
which leads to a reduction in accuracy. To address these issues,
we propose a Multi-order Multi-constraint Deep Network (MMDN) that learns more powerful
feature correlations and shape constraints. Specifically, an Implicit
Multi-order Correlating Geometry-aware (IMCG) model is proposed to introduce
the multi-order spatial correlations and multi-order channel correlations for
more discriminative representations. Furthermore, an Explicit Probability-based
Boundary-adaptive Regression (EPBR) method is developed to enhance the global
shape constraints and further search the semantically consistent landmarks in
the predicted boundary for robust facial landmark detection. Notably,
the proposed MMDN can generate more accurate boundary-adaptive
landmark heatmaps and effectively enhance the shape constraints on the predicted
landmarks for faces with large pose variations and heavy occlusions.
Experimental results on challenging benchmark datasets demonstrate the
superiority of our MMDN over state-of-the-art facial landmark detection
methods. The code is publicly available at
https://github.com/junwan2014/MMDN-master.
Related papers
- Fine-grained Dynamic Network for Generic Event Boundary Detection [9.17191007695011]
We propose a novel dynamic pipeline for generic event boundaries named DyBDet.
By introducing a multi-exit network architecture, DyBDet automatically learns the allocation to different video snippets.
Experiments on the challenging Kinetics-GEBD and TAPOS datasets demonstrate that adopting the dynamic strategy significantly benefits GEBD tasks.
arXiv Detail & Related papers (2024-07-05T06:02:46Z)
- Temporal Action Localization with Enhanced Instant Discriminability [66.76095239972094]
Temporal action detection (TAD) aims to detect all action boundaries and their corresponding categories in an untrimmed video.
We propose a one-stage framework named TriDet to resolve imprecise predictions of action boundaries by existing methods.
Experimental results demonstrate the robustness of TriDet and its state-of-the-art performance on multiple TAD datasets.
arXiv Detail & Related papers (2023-09-11T16:17:50Z)
- Small Object Detection via Coarse-to-fine Proposal Generation and Imitation Learning [52.06176253457522]
We propose a two-stage framework tailored for small object detection based on the Coarse-to-fine pipeline and Feature Imitation learning.
CFINet achieves state-of-the-art performance on the large-scale small object detection benchmarks, SODA-D and SODA-A.
arXiv Detail & Related papers (2023-08-18T13:13:09Z)
- COMICS: End-to-end Bi-grained Contrastive Learning for Multi-face Forgery Detection [56.7599217711363]
Most face forgery recognition methods can only process one face at a time.
We propose COMICS, an end-to-end framework for multi-face forgery detection.
arXiv Detail & Related papers (2023-08-03T03:37:13Z)
- HiDAnet: RGB-D Salient Object Detection via Hierarchical Depth Awareness [2.341385717236931]
We propose a novel Hierarchical Depth Awareness network (HiDAnet) for RGB-D saliency detection.
Our motivation comes from the observation that the multi-granularity properties of geometric priors correlate well with the neural network hierarchies.
Our HiDAnet performs favorably over the state-of-the-art methods by large margins.
arXiv Detail & Related papers (2023-01-18T10:00:59Z)
- MRDet: A Multi-Head Network for Accurate Oriented Object Detection in Aerial Images [51.227489316673484]
We propose an arbitrary-oriented region proposal network (AO-RPN) to generate oriented proposals transformed from horizontal anchors.
To obtain accurate bounding boxes, we decouple the detection task into multiple subtasks and propose a multi-head network.
Each head is specially designed to learn the features optimal for the corresponding task, which allows our network to detect objects accurately.
arXiv Detail & Related papers (2020-12-24T06:36:48Z)
- Robust Facial Landmark Detection by Cross-order Cross-semantic Deep Network [58.843211405385205]
We propose a cross-order cross-semantic deep network (CCDN) to boost semantic feature learning for robust facial landmark detection.
Specifically, a cross-order two-squeeze multi-excitation (CTM) module is proposed to introduce cross-order channel correlations for learning more discriminative representations.
A novel cross-order cross-semantic (COCS) regularizer is designed to drive the network to learn cross-order cross-semantic features from different activations for facial landmark detection.
arXiv Detail & Related papers (2020-11-16T08:19:26Z)
- Robust Face Alignment by Multi-order High-precision Hourglass Network [44.94500006611075]
This paper proposes a heatmap subpixel regression (HSR) method and a multi-order cross geometry-aware (MCG) model.
The HSR method is proposed to achieve high-precision landmark detection by a well-designed subpixel detection loss (SDL) and subpixel detection technology (SDT).
At the same time, the MCG model is able to use the proposed multi-order cross information to learn more discriminative representations for enhancing facial geometric constraints and context information.
arXiv Detail & Related papers (2020-10-17T05:40:30Z)
- Think about boundary: Fusing multi-level boundary information for landmark heatmap regression [51.48533538153833]
We study a two-stage but end-to-end approach for exploring the relationship between the facial boundary and landmarks.
Our approach produces boundary-aware landmark predictions and consists of two modules: the self-calibrated boundary estimation (SCBE) module and the boundary-aware landmark transform (BALT) module.
Our approach outperforms state-of-the-art methods in the literature.
arXiv Detail & Related papers (2020-08-25T10:14:13Z)