JÂA-Net: Joint Facial Action Unit Detection and Face
Alignment via Adaptive Attention
- URL: http://arxiv.org/abs/2003.08834v3
- Date: Thu, 24 Sep 2020 03:08:58 GMT
- Title: JÂA-Net: Joint Facial Action Unit Detection and Face
Alignment via Adaptive Attention
- Authors: Zhiwen Shao, Zhilei Liu, Jianfei Cai, Lizhuang Ma
- Abstract summary: We propose a novel end-to-end deep learning framework for joint AU detection and face alignment.
Our framework significantly outperforms the state-of-the-art AU detection methods on the challenging BP4D, DISFA, GFT and BP4D+ benchmarks.
- Score: 57.51255553918323
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Facial action unit (AU) detection and face alignment are two highly
correlated tasks, since facial landmarks can provide precise AU locations to
facilitate the extraction of meaningful local features for AU detection.
However, most existing AU detection works handle the two tasks independently by
treating face alignment as a preprocessing, and often use landmarks to
predefine a fixed region or attention for each AU. In this paper, we propose a
novel end-to-end deep learning framework for joint AU detection and face
alignment, which has not been explored before. In particular, multi-scale
shared feature is learned firstly, and high-level feature of face alignment is
fed into AU detection. Moreover, to extract precise local features, we propose
an adaptive attention learning module to refine the attention map of each AU
adaptively. Finally, the assembled local features are integrated with face
alignment feature and global feature for AU detection. Extensive experiments
demonstrate that our framework (i) significantly outperforms the
state-of-the-art AU detection methods on the challenging BP4D, DISFA, GFT and
BP4D+ benchmarks, (ii) can adaptively capture the irregular region of each AU,
(iii) achieves competitive performance for face alignment, and (iv) also works
well under partial occlusions and non-frontal poses. The code for our method is
available at https://github.com/ZhiwenShao/PyTorch-JAANet.
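The adaptive attention idea in the abstract can be illustrated with a minimal NumPy sketch: an initial attention map is predefined from a landmark-derived AU center, a learned offset refines it adaptively, and the refined map pools local features. The function names, the Gaussian initialization, and the additive refinement are illustrative assumptions, not the paper's actual implementation; see the linked repository for the real architecture.

```python
import numpy as np

def landmark_attention(h, w, center, sigma=3.0):
    # Predefined attention: a Gaussian bump around the landmark-derived AU center
    # (a common way to turn facial landmarks into an initial attention map).
    ys, xs = np.mgrid[0:h, 0:w]
    d2 = (ys - center[0]) ** 2 + (xs - center[1]) ** 2
    return np.exp(-d2 / (2 * sigma ** 2))

def refine_attention(init_att, learned_offset):
    # Adaptive refinement: perturb the predefined map with a learned per-pixel
    # offset, clip negatives, and renormalize so the peak is approximately 1.
    att = np.clip(init_att + learned_offset, 0.0, None)
    return att / (att.max() + 1e-8)

def pool_local_feature(feat_map, att):
    # Attention-weighted pooling of a C x H x W feature map into a C-dim
    # local feature vector for one AU.
    weighted = feat_map * att[None, :, :]
    return weighted.sum(axis=(1, 2)) / (att.sum() + 1e-8)

# Toy example: one AU on an 8-channel, 16x16 feature map.
rng = np.random.default_rng(0)
feat = rng.standard_normal((8, 16, 16))
init = landmark_attention(16, 16, center=(8, 8))
refined = refine_attention(init, rng.standard_normal((16, 16)) * 0.1)
local_feat = pool_local_feature(feat, refined)  # 8-dim local AU feature
```

In the full framework such per-AU local features would be assembled and concatenated with the face-alignment and global features before the final AU classifiers.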
Related papers
- Self-supervised Facial Action Unit Detection with Region and Relation
Learning [5.182661263082065]
We propose a novel self-supervised framework for AU detection with region and relation learning.
An improved Optimal Transport (OT) algorithm is introduced to exploit the correlation characteristics among AUs.
Swin Transformer is exploited to model the long-distance dependencies within each AU region during feature learning.
arXiv Detail & Related papers (2023-03-10T05:22:45Z) - Centralized Feature Pyramid for Object Detection [53.501796194901964]
Visual feature pyramid has shown its superiority in both effectiveness and efficiency in a wide range of applications.
In this paper, we propose a Centralized Feature Pyramid (CFP) for object detection, which is based on globally explicit centralized feature regulation.
arXiv Detail & Related papers (2022-10-05T08:32:54Z) - MGRR-Net: Multi-level Graph Relational Reasoning Network for Facial Action Units Detection [16.261362598190807]
The Facial Action Coding System (FACS) encodes the action units (AUs) in facial images.
We argue that encoding AU features just from one perspective may not capture the rich contextual information between regional and global face features.
We propose a novel Multi-level Graph Relational Reasoning Network (termed MGRR-Net) for facial AU detection.
arXiv Detail & Related papers (2022-04-04T09:47:22Z) - Adaptive Local-Global Relational Network for Facial Action Units
Recognition and Facial Paralysis Estimation [22.85506776477092]
We propose a novel Adaptive Local-Global Network (ALGRNet) for facial AU recognition and apply it to facial paralysis estimation.
ALGRNet consists of three novel modules, including an adaptive region learning module that learns adaptive muscle regions based on the detected landmarks.
Experiments on the BP4D and DISFA AU datasets show that the proposed approach outperforms the state-of-the-art methods by a large margin.
arXiv Detail & Related papers (2022-03-03T16:14:49Z) - Point-Level Region Contrast for Object Detection Pre-Training [147.47349344401806]
We present point-level region contrast, a self-supervised pre-training approach for the task of object detection.
Our approach performs contrastive learning by directly sampling individual point pairs from different regions.
Compared to an aggregated representation per region, our approach is more robust to the change in input region quality.
arXiv Detail & Related papers (2022-02-09T18:56:41Z) - Action Unit Detection with Joint Adaptive Attention and Graph Relation [3.98807633060402]
We present our submission to the Affective Behavior Analysis in-the-Wild (ABAW) 2021 competition.
The proposed method uses the pre-trained JAA model as the feature extractor.
Our model achieves 0.674 on the challenging Aff-Wild2 database.
arXiv Detail & Related papers (2021-07-09T12:33:38Z) - Meta Auxiliary Learning for Facial Action Unit Detection [84.22521265124806]
We consider learning AU detection and facial expression recognition in a multi-task manner.
The performance of the AU detection task cannot be always enhanced due to the negative transfer in the multi-task scenario.
We propose a Meta Auxiliary Learning method (MAL) that automatically selects highly related FE samples by learning adaptive weights for the training FE samples in a meta-learning manner.
arXiv Detail & Related papers (2021-05-14T02:28:40Z) - Generalizing Face Forgery Detection with High-frequency Features [63.33397573649408]
Current CNN-based detectors tend to overfit to method-specific color textures and thus fail to generalize.
We propose to utilize the high-frequency noises for face forgery detection.
The first is the multi-scale high-frequency feature extraction module that extracts high-frequency noises at multiple scales.
The second is the residual-guided spatial attention module that guides the low-level RGB feature extractor to concentrate more on forgery traces from a new perspective.
arXiv Detail & Related papers (2021-03-23T08:19:21Z) - Goal-Oriented Gaze Estimation for Zero-Shot Learning [62.52340838817908]
We introduce a novel goal-oriented gaze estimation module (GEM) to improve the discriminative attribute localization.
We aim to predict the actual human gaze location to get the visual attention regions for recognizing a novel object guided by attribute description.
This work implies the promising benefits of collecting human gaze dataset and automatic gaze estimation algorithms on high-level computer vision tasks.
arXiv Detail & Related papers (2021-03-05T02:14:57Z) - HiCOMEX: Facial Action Unit Recognition Based on Hierarchy Intensity
Distribution and COMEX Relation Learning [12.450173086494944]
We propose a novel framework for the AU detection from a single input image.
Our algorithm uses facial landmarks to detect the features of local AUs.
Our experiments on the challenging BP4D and DISFA benchmarks yield F1-scores of 63.7% and 61.8% respectively.
arXiv Detail & Related papers (2020-09-23T01:49:56Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information above and is not responsible for any consequences of its use.