Robust Facial Landmark Detection by Cross-order Cross-semantic Deep
Network
- URL: http://arxiv.org/abs/2011.07777v1
- Date: Mon, 16 Nov 2020 08:19:26 GMT
- Title: Robust Facial Landmark Detection by Cross-order Cross-semantic Deep
Network
- Authors: Jun Wan, Zhihui Lai, Linlin Shen, Jie Zhou, Can Gao, Gang Xiao and
Xianxu Hou
- Abstract summary: We propose a cross-order cross-semantic deep network (CCDN) to boost semantic feature learning for robust facial landmark detection.
Specifically, a cross-order two-squeeze multi-excitation (CTM) module is proposed to introduce cross-order channel correlations for learning more discriminative representations.
A novel cross-order cross-semantic (COCS) regularizer is designed to drive the network to learn cross-order cross-semantic features from different activations for facial landmark detection.
- Score: 58.843211405385205
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Recently, convolutional neural network (CNN)-based facial landmark
detection methods have achieved great success. However, most existing
CNN-based facial landmark detection methods have not attempted to activate
multiple correlated facial parts and learn different semantic features from
them. As a result, they cannot accurately model the relationships among local
details or fully explore more discriminative and fine-grained semantic
features, and thus they suffer under partial occlusions and large pose
variations. To address these problems, we propose a cross-order cross-semantic
deep network (CCDN) to boost semantic feature learning for robust facial
landmark detection. Specifically, a cross-order two-squeeze multi-excitation
(CTM) module is proposed to introduce cross-order channel correlations for
learning more discriminative representations and multiple attention-specific
part activations. Moreover, a novel cross-order cross-semantic (COCS)
regularizer is designed to drive the network to learn cross-order
cross-semantic features from different activations for facial landmark
detection. By integrating the CTM module and the COCS regularizer, the proposed
CCDN can effectively activate and learn finer, complementary cross-order
cross-semantic features to improve the accuracy of facial landmark detection
under extremely challenging scenarios. Experimental results on challenging
benchmark datasets demonstrate the superiority of our CCDN over
state-of-the-art facial landmark detection methods.
Related papers
- UniForensics: Face Forgery Detection via General Facial Representation [60.5421627990707]
High-level semantic features are less susceptible to perturbations and not limited to forgery-specific artifacts, thus having stronger generalization.
We introduce UniForensics, a novel deepfake detection framework that leverages a transformer-based video network, with a meta-functional face classification for enriched facial representation.
arXiv Detail & Related papers (2024-07-26T20:51:54Z)
- Improving Vision Anomaly Detection with the Guidance of Language Modality [64.53005837237754]
This paper tackles the challenges for vision modality from a multimodal point of view.
We propose Cross-modal Guidance (CMG) to tackle the redundant information issue and sparse space issue.
To learn a more compact latent space for the vision anomaly detector, CMLE learns a correlation structure matrix from the language modality.
arXiv Detail & Related papers (2023-10-04T13:44:56Z)
- COMICS: End-to-end Bi-grained Contrastive Learning for Multi-face Forgery Detection [56.7599217711363]
Most face forgery recognition methods can only process one face at a time.
We propose COMICS, an end-to-end framework for multi-face forgery detection.
arXiv Detail & Related papers (2023-08-03T03:37:13Z)
- MC-LCR: Multi-modal contrastive classification by locally correlated representations for effective face forgery detection [11.124150983521158]
We propose a novel framework named Multi-modal Contrastive Classification by Locally Correlated Representations.
Our MC-LCR aims to amplify implicit local discrepancies between authentic and forged faces from both spatial and frequency domains.
We achieve state-of-the-art performance and demonstrate the robustness and generalization of our method.
arXiv Detail & Related papers (2021-10-07T09:24:12Z)
- MD-CSDNetwork: Multi-Domain Cross Stitched Network for Deepfake Detection [80.83725644958633]
Current deepfake generation methods leave discriminative artifacts in the frequency spectrum of fake images and videos.
We present a novel approach, termed MD-CSDNetwork, that combines features from the spatial and frequency domains to mine a shared discriminative representation.
arXiv Detail & Related papers (2021-09-15T14:11:53Z)
- Distract Your Attention: Multi-head Cross Attention Network for Facial Expression Recognition [4.500212131331687]
We present a novel facial expression recognition network, called Distract your Attention Network (DAN).
Our method is based on two key observations. Multiple classes share inherently similar underlying facial appearance, and their differences could be subtle.
We propose our DAN with three key components: Feature Clustering Network (FCN), Multi-head cross Attention Network (MAN), and Attention Fusion Network (AFN).
arXiv Detail & Related papers (2021-09-15T13:15:54Z)
- Hierarchical Deep CNN Feature Set-Based Representation Learning for Robust Cross-Resolution Face Recognition [59.29808528182607]
Cross-resolution face recognition (CRFR) is important in intelligent surveillance and biometric forensics.
Existing shallow learning-based and deep learning-based methods focus on mapping the HR-LR face pairs into a joint feature space.
In this study, we aim to fully exploit the multi-level deep convolutional neural network (CNN) feature set for robust CRFR.
arXiv Detail & Related papers (2021-03-25T14:03:42Z)
- Robust Facial Landmark Detection by Multi-order Multi-constraint Deep Networks [35.19368350816032]
We propose a Multi-order Multi-constraint Deep Network (MMDN) for more powerful feature correlations and shape constraints learning.
An Implicit Multi-order Correlating Geometry-aware (IMCG) model is proposed to introduce the multi-order spatial correlations and multi-order channel correlations.
An Explicit Probability-based Boundary-adaptive Regression (EPBR) method is developed to enhance the global shape constraints.
arXiv Detail & Related papers (2020-12-09T09:11:47Z)
- Cross-Correlated Attention Networks for Person Re-Identification [34.84287025161801]
We propose a new attention module called Cross-Correlated Attention (CCA).
CCA aims to overcome such limitations by maximizing the information gain between different attended regions.
We also propose a novel deep network that makes use of different attention mechanisms to learn robust and discriminative representations of person images.
arXiv Detail & Related papers (2020-06-17T01:47:23Z)
This list is automatically generated from the titles and abstracts of the papers in this site.