Adaptive Local-Global Relational Network for Facial Action Units
Recognition and Facial Paralysis Estimation
- URL: http://arxiv.org/abs/2203.01800v1
- Date: Thu, 3 Mar 2022 16:14:49 GMT
- Title: Adaptive Local-Global Relational Network for Facial Action Units
Recognition and Facial Paralysis Estimation
- Authors: Xuri Ge, Joemon M. Jose, Pengcheng Wang, Arunachalam Iyer, Xiao Liu,
and Hu Han
- Abstract summary: We propose a novel Adaptive Local-Global Relational Network (ALGRNet) for facial AU recognition and apply it to facial paralysis estimation.
ALGRNet consists of three novel structures: an adaptive region learning module that learns adaptive muscle regions from detected landmarks, a skip-BiLSTM module that models mutual-assistance and exclusion relationships among local AU features, and a feature fusion & refining module that combines local AU and global face features.
Experiments on the BP4D and DISFA AU datasets show that the proposed approach outperforms the state-of-the-art methods by a large margin.
- Score: 22.85506776477092
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Facial action units (AUs) refer to a unique set of facial muscle movements at
certain facial locations defined by the Facial Action Coding System (FACS),
which can be used for describing nearly any anatomically possible facial
expression. Many existing AU recognition approaches enhance the AU
representation by combining local features from multiple independent branches,
each corresponding to a different AU. Such approaches usually neglect the
potential mutual-assistance and exclusion relationships between AU branches,
or simply employ a pre-defined, fixed knowledge graph as a prior.
In addition, extracting features from pre-defined AU regions of regular shapes
limits the representation ability. In this paper, we propose a novel Adaptive
Local-Global Relational Network (ALGRNet) for facial AU recognition and apply
it to facial paralysis estimation. ALGRNet mainly consists of three novel
structures, i.e., an adaptive region learning module which learns the adaptive
muscle regions based on the detected landmarks, a skip-BiLSTM module which
models the latent mutual-assistance and exclusion relationships among local AU
features, and a feature fusion & refining module which explores the
complementarity between local AUs and the whole face to refine the local AU
features. To evaluate the proposed method, we applied ALGRNet to a facial
paralysis dataset collected and annotated by medical professionals.
Experiments on the BP4D and DISFA AU datasets show that the
proposed approach outperforms the state-of-the-art methods by a large margin.
We also demonstrate the effectiveness of the proposed ALGRNet in
facial paralysis estimation.
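As a reading aid, here is a minimal PyTorch sketch of how the three modules named in the abstract could be wired together. Every design detail below (class names, tensor shapes, pooling, wiring) is an illustrative assumption, not the authors' implementation.

```python
# Minimal sketch of the three ALGRNet modules named in the abstract.
# All shapes and wiring are assumptions for illustration only.
import torch
import torch.nn as nn

class AdaptiveRegionLearning(nn.Module):
    """Hypothetical stand-in: predicts one soft attention map per AU from a
    shared feature map, emulating 'adaptive muscle regions'."""
    def __init__(self, in_ch: int, num_aus: int):
        super().__init__()
        self.att = nn.Conv2d(in_ch, num_aus, kernel_size=1)

    def forward(self, feat):                          # feat: (B, C, H, W)
        masks = torch.sigmoid(self.att(feat))         # (B, K, H, W)
        local = torch.einsum('bchw,bkhw->bkc', feat, masks)
        area = masks.sum(dim=(2, 3)).unsqueeze(-1)    # (B, K, 1)
        return local / (area + 1e-6)                  # per-AU pooled features

class SkipBiLSTM(nn.Module):
    """BiLSTM over the sequence of per-AU features plus a residual (skip)
    path, standing in for the paper's skip-BiLSTM relation module.
    Assumes dim is even."""
    def __init__(self, dim: int):
        super().__init__()
        self.rnn = nn.LSTM(dim, dim // 2, bidirectional=True, batch_first=True)

    def forward(self, local):                         # local: (B, K, C)
        related, _ = self.rnn(local)                  # (B, K, C)
        return local + related                        # skip connection

class FuseRefine(nn.Module):
    """Refines each local AU feature with the global face feature."""
    def __init__(self, dim: int):
        super().__init__()
        self.proj = nn.Linear(2 * dim, dim)

    def forward(self, local, global_feat):            # (B, K, C), (B, C)
        g = global_feat.unsqueeze(1).expand_as(local)
        return self.proj(torch.cat([local, g], dim=-1))
```

A per-AU classifier (e.g., a shared `nn.Linear(dim, 1)` applied to each refined feature) would then produce the AU occurrence logits.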
Related papers
- Towards End-to-End Explainable Facial Action Unit Recognition via Vision-Language Joint Learning [48.70249675019288]
We propose an end-to-end Vision-Language joint learning network for explainable facial action unit (AU) recognition.
The proposed approach achieves superior performance over the state-of-the-art methods on most metrics.
arXiv Detail & Related papers (2024-08-01T15:35:44Z)
- Contrastive Learning of Person-independent Representations for Facial Action Unit Detection [70.60587475492065]
We formulate the self-supervised AU representation learning signals in two-fold.
We contrastively learn the AU representation within a video clip and devise a cross-identity reconstruction mechanism to learn person-independent representations.
Our method outperforms other contrastive learning methods and significantly closes the performance gap between the self-supervised and supervised AU detection approaches.
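For intuition, a within-clip contrastive objective of this kind is often realised with an InfoNCE loss; the sketch below is a generic assumption in that spirit, not the paper's exact formulation (temperature, frame pairing, and encoder are placeholders).

```python
# Generic within-clip InfoNCE sketch; an assumption in the spirit of the
# summary above, not the paper's exact objective.
import torch
import torch.nn.functional as F

def clip_infonce(anchor: torch.Tensor, positive: torch.Tensor,
                 temperature: float = 0.1) -> torch.Tensor:
    """anchor/positive: (B, D) embeddings of two frames from the same clip;
    the other clips in the batch act as negatives."""
    a = F.normalize(anchor, dim=-1)
    p = F.normalize(positive, dim=-1)
    logits = a @ p.t() / temperature               # (B, B) similarities
    targets = torch.arange(a.size(0), device=a.device)
    return F.cross_entropy(logits, targets)        # diagonal = positives
```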
arXiv Detail & Related papers (2024-03-06T01:49:28Z)
- Local Region Perception and Relationship Learning Combined with Feature Fusion for Facial Action Unit Detection [12.677143408225167]
We introduce our submission to the CVPR 2023 Competition on Affective Behavior Analysis in-the-wild (ABAW).
We propose an AU detection framework trained in a single stage. Specifically, to effectively extract facial local-region features related to AU detection, we use a local region perception module.
We also use a graph neural network-based relational learning module to capture the relationship between AUs.
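A graph-based relational module of this sort is commonly a message-passing layer over per-AU node features with a learned adjacency; the following is a generic stand-in under that assumption, not the submission's exact module.

```python
# Generic GCN-style message passing over AU nodes with a learnable adjacency;
# a stand-in for the relational learning module, not the paper's design.
import torch
import torch.nn as nn

class AURelationLayer(nn.Module):
    def __init__(self, num_aus: int, dim: int):
        super().__init__()
        self.adj = nn.Parameter(torch.eye(num_aus))   # learnable AU-AU graph
        self.lin = nn.Linear(dim, dim)

    def forward(self, x):                     # x: (B, K, D) per-AU features
        a = torch.softmax(self.adj, dim=-1)   # row-normalised edge weights
        return torch.relu(self.lin(a @ x))    # aggregate, then transform
```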
arXiv Detail & Related papers (2023-03-15T11:59:24Z)
- MGRR-Net: Multi-level Graph Relational Reasoning Network for Facial Action Units Detection [16.261362598190807]
The Facial Action Coding System (FACS) encodes the action units (AUs) in facial images.
We argue that encoding AU features just from one perspective may not capture the rich contextual information between regional and global face features.
We propose a novel Multi-level Graph Reasoning Network (termed MGRR-Net) for facial AU detection.
arXiv Detail & Related papers (2022-04-04T09:47:22Z)
- Weakly Supervised Regional and Temporal Learning for Facial Action Unit Recognition [36.350407471391065]
We propose two auxiliary AU-related tasks to bridge the gap between limited annotations and the model performance.
A single-image-based optical flow estimation task is proposed to leverage the dynamic change of facial muscles.
By incorporating semi-supervised learning, we propose an end-to-end trainable framework named weakly supervised regional and temporal learning.
arXiv Detail & Related papers (2022-04-01T12:02:01Z)
- Multi-Branch Deep Radial Basis Function Networks for Facial Emotion Recognition [80.35852245488043]
We propose a CNN based architecture enhanced with multiple branches formed by radial basis function (RBF) units.
RBF units capture local patterns shared by similar instances using an intermediate representation.
We show that it is the incorporation of local information that makes the proposed model competitive.
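As a reference point, the textbook RBF unit computes a Gaussian of the distance to a learned centre; the sketch below shows that canonical form (centre count and widths are placeholders), which may differ from the branch design actually used in the paper.

```python
# Canonical RBF unit layer: activations decay with distance to learned
# centres. Placeholder sizes; not necessarily the paper's branch design.
import torch
import torch.nn as nn

class RBFUnits(nn.Module):
    def __init__(self, in_dim: int, num_centres: int):
        super().__init__()
        self.centres = nn.Parameter(torch.randn(num_centres, in_dim))
        self.log_gamma = nn.Parameter(torch.zeros(num_centres))

    def forward(self, x):                             # x: (B, in_dim)
        d2 = torch.cdist(x, self.centres).pow(2)      # (B, M) squared dists
        return torch.exp(-self.log_gamma.exp() * d2)  # Gaussian activations
```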
arXiv Detail & Related papers (2021-09-07T21:05:56Z)
- Self-Supervised Regional and Temporal Auxiliary Tasks for Facial Action Unit Recognition [29.664359264758495]
We propose two auxiliary AU-related tasks to bridge the gap between limited annotations and the model performance.
To enhance the discrimination of regional features with AU relation embedding, we design an RoI inpainting task to recover randomly cropped AU patches.
A single-image-based optical flow estimation task is proposed to leverage the dynamic change of facial muscles.
Based on these two self-supervised auxiliary tasks, local features, mutual relation and motion cues of AUs are better captured in the backbone network.
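A pretext task like RoI inpainting typically reconstructs the masked patches and penalises error only inside the masked regions; the sketch below illustrates that assumed loss shape with placeholder tensors, not the paper's exact recipe.

```python
# Assumed RoI-inpainting pretext loss: reconstruct masked AU patches and
# penalise only inside the masked regions. Placeholder shapes throughout.
import torch

def roi_inpaint_loss(recon: torch.Tensor, target: torch.Tensor,
                     mask: torch.Tensor) -> torch.Tensor:
    """recon/target: (B, C, H, W) images; mask: (B, 1, H, W), 1 on the
    cropped AU patches to recover."""
    diff = (recon - target) ** 2
    return (diff * mask).sum() / (mask.sum() * diff.size(1) + 1e-6)
```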
arXiv Detail & Related papers (2021-07-30T02:39:45Z)
- AU-Expression Knowledge Constrained Representation Learning for Facial Expression Recognition [79.8779790682205]
We propose an AU-Expression Knowledge Constrained Representation Learning (AUE-CRL) framework to learn the AU representations without AU annotations and adaptively use representations to facilitate facial expression recognition.
We conduct experiments on the challenging uncontrolled datasets to demonstrate the superiority of the proposed framework over current state-of-the-art methods.
arXiv Detail & Related papers (2020-12-29T03:42:04Z)
- Adversarial Graph Representation Adaptation for Cross-Domain Facial Expression Recognition [86.25926461936412]
We propose a novel Adversarial Graph Representation Adaptation (AGRA) framework that unifies graph representation propagation with adversarial learning for cross-domain holistic-local feature co-adaptation.
We conduct extensive and fair experiments on several popular benchmarks and show that the proposed AGRA framework achieves superior performance over previous state-of-the-art methods.
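The adversarial ingredient in cross-domain adaptation of this kind is commonly implemented with a gradient-reversal layer feeding a domain classifier; the snippet below shows that standard building block as an assumption, not AGRA's exact architecture.

```python
# Standard gradient-reversal layer, a common building block for adversarial
# domain adaptation; shown as an assumption, not AGRA's exact design.
import torch

class GradReverse(torch.autograd.Function):
    @staticmethod
    def forward(ctx, x, lam: float):
        ctx.lam = lam
        return x.view_as(x)                   # identity on the forward pass

    @staticmethod
    def backward(ctx, grad_out):
        return -ctx.lam * grad_out, None      # reversed gradient to encoder

def grad_reverse(x, lam: float = 1.0):
    return GradReverse.apply(x, lam)
```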
arXiv Detail & Related papers (2020-08-03T13:27:24Z)
- J$\hat{\text{A}}$A-Net: Joint Facial Action Unit Detection and Face Alignment via Adaptive Attention [57.51255553918323]
We propose a novel end-to-end deep learning framework for joint AU detection and face alignment.
Our framework significantly outperforms the state-of-the-art AU detection methods on the challenging BP4D, DISFA, GFT and BP4D+ benchmarks.
arXiv Detail & Related papers (2020-03-18T12:50:19Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of this information and is not responsible for any consequences of its use.