JÂA-Net: Joint Facial Action Unit Detection and Face
Alignment via Adaptive Attention
- URL: http://arxiv.org/abs/2003.08834v3
- Date: Thu, 24 Sep 2020 03:08:58 GMT
- Title: JÂA-Net: Joint Facial Action Unit Detection and Face
Alignment via Adaptive Attention
- Authors: Zhiwen Shao, Zhilei Liu, Jianfei Cai, Lizhuang Ma
- Abstract summary: We propose a novel end-to-end deep learning framework for joint AU detection and face alignment.
Our framework significantly outperforms the state-of-the-art AU detection methods on the challenging BP4D, DISFA, GFT and BP4D+ benchmarks.
- Score: 57.51255553918323
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Facial action unit (AU) detection and face alignment are two highly
correlated tasks, since facial landmarks can provide precise AU locations to
facilitate the extraction of meaningful local features for AU detection.
However, most existing AU detection works handle the two tasks independently by
treating face alignment as a preprocessing, and often use landmarks to
predefine a fixed region or attention for each AU. In this paper, we propose a
novel end-to-end deep learning framework for joint AU detection and face
alignment, which has not been explored before. In particular, multi-scale
shared feature is learned firstly, and high-level feature of face alignment is
fed into AU detection. Moreover, to extract precise local features, we propose
an adaptive attention learning module to refine the attention map of each AU
adaptively. Finally, the assembled local features are integrated with face
alignment feature and global feature for AU detection. Extensive experiments
demonstrate that our framework (i) significantly outperforms the
state-of-the-art AU detection methods on the challenging BP4D, DISFA, GFT and
BP4D+ benchmarks, (ii) can adaptively capture the irregular region of each AU,
(iii) achieves competitive performance for face alignment, and (iv) also works
well under partial occlusions and non-frontal poses. The code for our method is
available at https://github.com/ZhiwenShao/PyTorch-JAANet.
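The adaptive attention idea in the abstract can be illustrated with a minimal NumPy sketch: an initial attention map is predefined from a landmark-derived AU center, a learned offset refines it adaptively, and the refined map pools local features. The function names, the Gaussian initialization, and the additive refinement are illustrative assumptions, not the paper's actual implementation; see the linked repository for the real architecture.

```python
import numpy as np

def landmark_attention(h, w, center, sigma=3.0):
    # Predefined attention: a Gaussian bump around the landmark-derived AU center
    # (a common way to turn facial landmarks into an initial attention map).
    ys, xs = np.mgrid[0:h, 0:w]
    d2 = (ys - center[0]) ** 2 + (xs - center[1]) ** 2
    return np.exp(-d2 / (2 * sigma ** 2))

def refine_attention(init_att, learned_offset):
    # Adaptive refinement: perturb the predefined map with a learned per-pixel
    # offset, clip negatives, and renormalize so the peak is approximately 1.
    att = np.clip(init_att + learned_offset, 0.0, None)
    return att / (att.max() + 1e-8)

def pool_local_feature(feat_map, att):
    # Attention-weighted pooling of a C x H x W feature map into a C-dim
    # local feature vector for one AU.
    weighted = feat_map * att[None, :, :]
    return weighted.sum(axis=(1, 2)) / (att.sum() + 1e-8)

# Toy example: one AU on an 8-channel, 16x16 feature map.
rng = np.random.default_rng(0)
feat = rng.standard_normal((8, 16, 16))
init = landmark_attention(16, 16, center=(8, 8))
refined = refine_attention(init, rng.standard_normal((16, 16)) * 0.1)
local_feat = pool_local_feature(feat, refined)  # 8-dim local AU feature
```

In the full framework such per-AU local features would be assembled and concatenated with the face-alignment and global features before the final AU classifiers.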
Related papers
- Self-supervised Facial Action Unit Detection with Region and Relation
Learning [5.182661263082065]
We propose a novel self-supervised framework for AU detection with region and relation learning.
An improved Optimal Transport (OT) algorithm is introduced to exploit the correlation characteristics among AUs.
Swin Transformer is exploited to model the long-distance dependencies within each AU region during feature learning.
arXiv Detail & Related papers (2023-03-10T05:22:45Z) - Centralized Feature Pyramid for Object Detection [53.501796194901964]
Visual feature pyramid has shown its superiority in both effectiveness and efficiency in a wide range of applications.
In this paper, we propose a Centralized Feature Pyramid (CFP) for object detection, which is based on globally explicit centralized feature regulation.
arXiv Detail & Related papers (2022-10-05T08:32:54Z) - MGRR-Net: Multi-level Graph Relational Reasoning Network for Facial Action Units Detection [16.261362598190807]
The Facial Action Coding System (FACS) encodes the action units (AUs) in facial images.
We argue that encoding AU features just from one perspective may not capture the rich contextual information between regional and global face features.
We propose a novel Multi-level Graph Relational Reasoning Network (termed MGRR-Net) for facial AU detection.
arXiv Detail & Related papers (2022-04-04T09:47:22Z) - Adaptive Local-Global Relational Network for Facial Action Units
Recognition and Facial Paralysis Estimation [22.85506776477092]
We propose a novel Adaptive Local-Global Network (ALGRNet) for facial AU recognition and apply it to facial paralysis estimation.
ALGRNet consists of three novel modules, including an adaptive region learning module that learns adaptive muscle regions based on the detected landmarks.
Experiments on the BP4D and DISFA AU datasets show that the proposed approach outperforms the state-of-the-art methods by a large margin.
arXiv Detail & Related papers (2022-03-03T16:14:49Z) - Point-Level Region Contrast for Object Detection Pre-Training [147.47349344401806]
We present point-level region contrast, a self-supervised pre-training approach for the task of object detection.
Our approach performs contrastive learning by directly sampling individual point pairs from different regions.
Compared to an aggregated representation per region, our approach is more robust to the change in input region quality.
arXiv Detail & Related papers (2022-02-09T18:56:41Z) - Action Unit Detection with Joint Adaptive Attention and Graph Relation [3.98807633060402]
We present our submission to the Affective Behavior Analysis in-the-Wild (ABAW) 2021 competition.
The proposed method uses the pre-trained JAA model as the feature extractor.
Our model achieves 0.674 on the challenging Aff-Wild2 database.
arXiv Detail & Related papers (2021-07-09T12:33:38Z) - Meta Auxiliary Learning for Facial Action Unit Detection [84.22521265124806]
We consider learning AU detection and facial expression recognition in a multi-task manner.
The performance of the AU detection task cannot be always enhanced due to the negative transfer in the multi-task scenario.
We propose a Meta Auxiliary Learning method (MAL) that automatically selects highly related FE samples by learning adaptive weights for the training FE samples in a meta-learning manner.
arXiv Detail & Related papers (2021-05-14T02:28:40Z) - Generalizing Face Forgery Detection with High-frequency Features [63.33397573649408]
Current CNN-based detectors tend to overfit to method-specific color textures and thus fail to generalize.
We propose to utilize the high-frequency noises for face forgery detection.
The first is the multi-scale high-frequency feature extraction module that extracts high-frequency noises at multiple scales.
The second is the residual-guided spatial attention module that guides the low-level RGB feature extractor to concentrate more on forgery traces from a new perspective.
arXiv Detail & Related papers (2021-03-23T08:19:21Z) - Goal-Oriented Gaze Estimation for Zero-Shot Learning [62.52340838817908]
We introduce a novel goal-oriented gaze estimation module (GEM) to improve the discriminative attribute localization.
We aim to predict the actual human gaze location to get the visual attention regions for recognizing a novel object guided by attribute description.
This work implies the promising benefits of collecting human gaze dataset and automatic gaze estimation algorithms on high-level computer vision tasks.
arXiv Detail & Related papers (2021-03-05T02:14:57Z) - HiCOMEX: Facial Action Unit Recognition Based on Hierarchy Intensity
Distribution and COMEX Relation Learning [12.450173086494944]
We propose a novel framework for the AU detection from a single input image.
Our algorithm uses facial landmarks to detect the features of local AUs.
Our experiments on the challenging BP4D and DISFA benchmarks yield F1-scores of 63.7% and 61.8% respectively.
arXiv Detail & Related papers (2020-09-23T01:49:56Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information above and is not responsible for any consequences of its use.