JÂA-Net: Joint Facial Action Unit Detection and Face Alignment via Adaptive Attention
- URL: http://arxiv.org/abs/2003.08834v3
- Date: Thu, 24 Sep 2020 03:08:58 GMT
- Title: JÂA-Net: Joint Facial Action Unit Detection and Face Alignment via Adaptive Attention
- Authors: Zhiwen Shao, Zhilei Liu, Jianfei Cai, Lizhuang Ma
- Abstract summary: We propose a novel end-to-end deep learning framework for joint AU detection and face alignment.
Our framework significantly outperforms the state-of-the-art AU detection methods on the challenging BP4D, DISFA, GFT and BP4D+ benchmarks.
- Score: 57.51255553918323
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Facial action unit (AU) detection and face alignment are two highly
correlated tasks, since facial landmarks can provide precise AU locations to
facilitate the extraction of meaningful local features for AU detection.
However, most existing AU detection works handle the two tasks independently by
treating face alignment as a preprocessing, and often use landmarks to
predefine a fixed region or attention for each AU. In this paper, we propose a
novel end-to-end deep learning framework for joint AU detection and face
alignment, which has not been explored before. In particular, multi-scale
shared feature is learned firstly, and high-level feature of face alignment is
fed into AU detection. Moreover, to extract precise local features, we propose
an adaptive attention learning module to refine the attention map of each AU
adaptively. Finally, the assembled local features are integrated with the face
alignment features and global features for AU detection. Extensive experiments
demonstrate that our framework (i) significantly outperforms the
state-of-the-art AU detection methods on the challenging BP4D, DISFA, GFT and
BP4D+ benchmarks, (ii) can adaptively capture the irregular region of each AU,
(iii) achieves competitive performance for face alignment, and (iv) also works
well under partial occlusions and non-frontal poses. The code for our method is
available at https://github.com/ZhiwenShao/PyTorch-JAANet.
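To make the adaptive-attention idea concrete, here is a minimal NumPy sketch (not the authors' implementation; the function names, the Gaussian initialization, and the clip-based refinement are all illustrative assumptions): an AU's attention map starts as a Gaussian around its landmark-predefined location, is refined by a learned per-pixel offset, and then weights local features before pooling.

```python
import numpy as np

def predefined_attention(h, w, center, sigma=3.0):
    """Initial AU attention: a Gaussian centred at the landmark-derived
    AU location (the usual landmark-predefined starting point)."""
    ys, xs = np.mgrid[0:h, 0:w]
    d2 = (ys - center[0]) ** 2 + (xs - center[1]) ** 2
    return np.exp(-d2 / (2.0 * sigma ** 2))

def refine_attention(init_att, learned_offset):
    """Adaptive refinement: add a learned per-pixel offset to the
    predefined map and clip to [0, 1] (a hypothetical simplification)."""
    return np.clip(init_att + learned_offset, 0.0, 1.0)

def attended_pool(feat, att):
    """Weight local features by the refined attention, then average-pool."""
    return (feat * att[None, :, :]).sum(axis=(1, 2)) / (att.sum() + 1e-8)

# toy usage
h, w, c = 16, 16, 4
att0 = predefined_attention(h, w, center=(8, 8))
offset = 0.1 * np.random.default_rng(0).standard_normal((h, w))  # stand-in for a learned residual
att = refine_attention(att0, offset)
feat = np.ones((c, h, w))
pooled = attended_pool(feat, att)   # one attended feature value per channel
```

In the actual framework the refinement is learned end-to-end inside the attention learning module; the random offset above merely stands in for that learned residual.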
Related papers
- Facial Action Unit Detection by Adaptively Constraining Self-Attention and Causally Deconfounding Sample [53.23474626420103]
Facial action unit (AU) detection remains a challenging task, due to the subtlety, dynamics, and diversity of AUs.
We propose a novel AU detection framework, called AC2D, that adaptively constrains the self-attention weight distribution.
Our method achieves competitive performance compared to state-of-the-art AU detection approaches on challenging benchmarks.
arXiv Detail & Related papers (2024-10-02T05:51:24Z)
- UniForensics: Face Forgery Detection via General Facial Representation [60.5421627990707]
High-level semantic features are less susceptible to perturbations and are not limited to forgery-specific artifacts, and thus generalize more strongly.
We introduce UniForensics, a novel deepfake detection framework that leverages a transformer-based video network, with a meta-functional face classification for enriched facial representation.
arXiv Detail & Related papers (2024-07-26T20:51:54Z)
- Self-supervised Facial Action Unit Detection with Region and Relation Learning [5.182661263082065]
We propose a novel self-supervised framework for AU detection with region and relation learning.
An improved Optimal Transport (OT) algorithm is introduced to exploit the correlation characteristics among AUs.
Swin Transformer is exploited to model the long-distance dependencies within each AU region during feature learning.
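As a rough sketch of how an OT step can relate AU features, the snippet below runs a generic entropy-regularized Sinkhorn iteration (a stand-in, not the paper's improved OT algorithm); the cosine-distance cost and uniform marginals are assumptions made for illustration.

```python
import numpy as np

def sinkhorn(cost, reg=0.1, n_iter=200):
    """Entropy-regularized optimal transport between uniform marginals
    via Sinkhorn iterations (a generic stand-in, not the paper's
    improved OT algorithm)."""
    n, m = cost.shape
    r = np.ones(n) / n                  # uniform source marginal
    c = np.ones(m) / m                  # uniform target marginal
    K = np.exp(-cost / reg)             # Gibbs kernel
    u, v = np.ones(n), np.ones(m)
    for _ in range(n_iter):
        u = r / (K @ v)                 # scale rows toward marginal r
        v = c / (K.T @ u)               # scale columns toward marginal c
    return u[:, None] * K * v[None, :]  # transport plan / coupling

# toy usage: relate 5 AUs via a cosine-distance cost between their features
rng = np.random.default_rng(0)
feats = rng.standard_normal((5, 8))
feats /= np.linalg.norm(feats, axis=1, keepdims=True)
cost = 1.0 - feats @ feats.T            # 0 on the diagonal
plan = sinkhorn(cost)                   # high mass = strongly related AUs
```

The resulting coupling can then be read as a soft correlation structure among the AUs.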
arXiv Detail & Related papers (2023-03-10T05:22:45Z)
- Centralized Feature Pyramid for Object Detection [53.501796194901964]
The visual feature pyramid has shown its superiority in both effectiveness and efficiency across a wide range of applications.
In this paper, we propose a Centralized Feature Pyramid (CFP) for object detection, which is based on a globally explicit centralized feature regulation.
arXiv Detail & Related papers (2022-10-05T08:32:54Z)
- MGRR-Net: Multi-level Graph Relational Reasoning Network for Facial Action Units Detection [16.261362598190807]
The Facial Action Coding System (FACS) encodes the action units (AUs) in facial images.
We argue that encoding AU features just from one perspective may not capture the rich contextual information between regional and global face features.
We propose a novel Multi-level Graph Relational Reasoning Network (termed MGRR-Net) for facial AU detection.
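A toy illustration of graph-based relational reasoning over AU region features follows (a hypothetical single message-passing round; MGRR-Net's actual multi-level graphs are considerably richer, and the adjacency here is random for demonstration):

```python
import numpy as np

def graph_reasoning(node_feats, adj):
    """One round of message passing over an AU relation graph: each node
    (an AU region feature) aggregates its neighbours' features, with a
    residual connection and ReLU (a toy stand-in for multi-level
    graph relational reasoning)."""
    deg = adj.sum(axis=1, keepdims=True)
    norm_adj = adj / np.maximum(deg, 1e-8)       # row-normalized adjacency
    messages = norm_adj @ node_feats             # neighbour aggregation
    return np.maximum(node_feats + messages, 0)  # residual + ReLU

# toy usage: 6 AU regions with 8-dim features and a random relation graph
rng = np.random.default_rng(0)
nodes = rng.standard_normal((6, 8))
adj = (rng.random((6, 6)) > 0.5).astype(float)   # hypothetical AU relations
np.fill_diagonal(adj, 0)
out = graph_reasoning(nodes, adj)                # refined per-AU features
```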
arXiv Detail & Related papers (2022-04-04T09:47:22Z)
- Adaptive Local-Global Relational Network for Facial Action Units Recognition and Facial Paralysis Estimation [22.85506776477092]
We propose a novel Adaptive Local-Global Network (ALGRNet) for facial AU recognition and apply it to facial paralysis estimation.
ALGRNet consists of three novel structures, including an adaptive region learning module that learns adaptive muscle regions based on the detected landmarks.
Experiments on the BP4D and DISFA AU datasets show that the proposed approach outperforms the state-of-the-art methods by a large margin.
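The landmark-defined starting point of such region learning can be sketched as a fixed crop around a detected landmark (the adaptive refinement itself is omitted; the function name and crop size are illustrative assumptions):

```python
import numpy as np

def landmark_region(feat_map, landmark, half=2):
    """Crop a local muscle region around a detected landmark; adaptive
    region learning would refine such regions, this is only the
    fixed-crop starting point."""
    c, h, w = feat_map.shape
    y, x = landmark
    y0, y1 = max(0, y - half), min(h, y + half + 1)
    x0, x1 = max(0, x - half), min(w, x + half + 1)
    return feat_map[:, y0:y1, x0:x1]

# toy usage: a 2-channel 8x8 feature map, crop around landmark (4, 4)
feat = np.arange(2 * 8 * 8, dtype=float).reshape(2, 8, 8)
region = landmark_region(feat, (4, 4))   # a 5x5 local patch per channel
```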
arXiv Detail & Related papers (2022-03-03T16:14:49Z)
- Point-Level Region Contrast for Object Detection Pre-Training [147.47349344401806]
We present point-level region contrast, a self-supervised pre-training approach for the task of object detection.
Our approach performs contrastive learning by directly sampling individual point pairs from different regions.
Compared to an aggregated representation per region, our approach is more robust to the change in input region quality.
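A minimal sketch of contrastive learning on sampled point pairs, using a standard InfoNCE loss over one anchor, one same-region positive, and several cross-region negatives (a simplification; the actual pre-training operates on dense feature maps, and the toy embeddings here are random):

```python
import numpy as np

def info_nce(anchor, positive, negatives, temp=0.1):
    """InfoNCE on point embeddings: pull the anchor toward its positive
    (a point sampled from the same region) and away from negatives
    (points from other regions)."""
    def sim(a, b):
        return (a @ b) / (np.linalg.norm(a) * np.linalg.norm(b))
    logits = np.array([sim(anchor, positive)] +
                      [sim(anchor, n) for n in negatives]) / temp
    logits -= logits.max()                  # numerical stability
    p = np.exp(logits) / np.exp(logits).sum()
    return -np.log(p[0])                    # small when the positive wins

# toy usage: the positive is a slightly perturbed copy of the anchor
rng = np.random.default_rng(1)
anchor = rng.standard_normal(16)
positive = anchor + 0.01 * rng.standard_normal(16)   # same-region point
negatives = [rng.standard_normal(16) for _ in range(8)]
loss = info_nce(anchor, positive, negatives)
```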
arXiv Detail & Related papers (2022-02-09T18:56:41Z)
- Meta Auxiliary Learning for Facial Action Unit Detection [84.22521265124806]
We consider learning AU detection and facial expression recognition in a multi-task manner.
However, the performance of the AU detection task is not always enhanced, due to negative transfer in the multi-task scenario.
We propose a Meta Auxiliary Learning method (MAL) that automatically selects highly related FE samples by learning adaptive weights for the training FE samples in a meta-learning manner.
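Conceptually, the weighted auxiliary objective looks like the sketch below, where the per-sample FE weights would be meta-learned rather than fixed (the function name, the softmax normalization, and all numbers are assumptions for illustration):

```python
import numpy as np

def weighted_aux_loss(primary_losses, aux_losses, aux_logits):
    """Combine the primary (AU) loss with per-sample weighted auxiliary
    (facial-expression) losses; in MAL the weights would be learned in
    a meta-learning loop, here they are simply given."""
    w = np.exp(aux_logits) / np.exp(aux_logits).sum()  # softmax-normalized
    return primary_losses.mean() + (w * aux_losses).sum()

# toy usage: 3 AU samples, 4 auxiliary FE samples
primary = np.array([0.8, 0.6, 0.7])
aux = np.array([1.2, 0.4, 2.0, 0.9])
logits = np.array([0.0, 2.0, -1.0, 0.5])   # hypothetical learned weights
total = weighted_aux_loss(primary, aux, logits)
```

Highly related FE samples receive larger weights, so they contribute more to the combined loss while unrelated samples are effectively down-weighted.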
arXiv Detail & Related papers (2021-05-14T02:28:40Z)
- HiCOMEX: Facial Action Unit Recognition Based on Hierarchy Intensity Distribution and COMEX Relation Learning [12.450173086494944]
We propose a novel framework for AU detection from a single input image.
Our algorithm uses facial landmarks to detect the features of local AUs.
Our experiments on the challenging BP4D and DISFA benchmarks yield F1-scores of 63.7% and 61.8% respectively.
arXiv Detail & Related papers (2020-09-23T01:49:56Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.