Learning Multi-dimensional Edge Feature-based AU Relation Graph for Facial Action Unit Recognition
- URL: http://arxiv.org/abs/2205.01782v1
- Date: Mon, 2 May 2022 03:38:00 GMT
- Title: Learning Multi-dimensional Edge Feature-based AU Relation Graph for Facial Action Unit Recognition
- Authors: Cheng Luo, Siyang Song, Weicheng Xie, Linlin Shen, Hatice Gunes
- Abstract summary: The activations of Facial Action Units (AUs) mutually influence one another.
Existing approaches fail to specifically and explicitly represent such cues for each pair of AUs in each facial display.
This paper proposes an AU relationship modelling approach that deep learns a unique graph to explicitly describe the relationship between each pair of AUs.
- Score: 27.34564955127377
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The activations of Facial Action Units (AUs) mutually influence one another.
While the relationship between a pair of AUs can be complex and unique,
existing approaches fail to specifically and explicitly represent such cues for
each pair of AUs in each facial display. This paper proposes an AU relationship
modelling approach that deep learns a unique graph to explicitly describe the
relationship between each pair of AUs of the target facial display. Our
approach first encodes each AU's activation status and its association with
other AUs into a node feature. Then, it learns a pair of multi-dimensional edge
features to describe multiple task-specific relationship cues between each pair
of AUs. During both node and edge feature learning, our approach also considers
the influence of the unique facial display on AUs' relationship by taking the
full face representation as an input. Experimental results on BP4D and DISFA
datasets show that both node and edge feature learning modules provide large
performance improvements for CNN and transformer-based backbones, with our best
systems achieving the state-of-the-art AU recognition results. Our approach not
only has a strong capability in modelling relationship cues for AU recognition
but also can be easily incorporated into various backbones. Our PyTorch code is
made available.
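To make the described pipeline concrete, below is a minimal PyTorch sketch of the idea: per-AU node features projected from a full-face representation, plus a learned multi-dimensional edge feature for every ordered AU pair, with the full-face feature conditioning both. All names and dimensions (AURelationGraph, face_dim, node_dim, edge_dim, the per-AU linear projections, and the edge MLP) are illustrative assumptions, not the authors' released implementation.

```python
# Illustrative sketch only: hypothetical module/dimension names,
# not the authors' released PyTorch code.
import torch
import torch.nn as nn

class AURelationGraph(nn.Module):
    """One node feature per AU plus a multi-dimensional edge feature per
    ordered AU pair, both conditioned on the full-face representation."""

    def __init__(self, num_aus=12, face_dim=512, node_dim=128, edge_dim=4):
        super().__init__()
        # Per-AU projection: full-face feature -> AU-specific node feature.
        self.node_proj = nn.ModuleList(
            [nn.Linear(face_dim, node_dim) for _ in range(num_aus)]
        )
        # Edge MLP: (source node, target node, full face) -> edge feature,
        # so each AU pair gets its own relationship descriptor per face.
        self.edge_mlp = nn.Sequential(
            nn.Linear(2 * node_dim + face_dim, node_dim),
            nn.ReLU(),
            nn.Linear(node_dim, edge_dim),
        )
        # Per-AU occurrence classifier on the node features.
        self.classifier = nn.Linear(node_dim, 1)

    def forward(self, face_feat):                      # (B, face_dim)
        nodes = torch.stack(
            [proj(face_feat) for proj in self.node_proj], dim=1
        )                                              # (B, N, node_dim)
        B, N, D = nodes.shape
        src = nodes.unsqueeze(2).expand(B, N, N, D)    # edge source nodes
        tgt = nodes.unsqueeze(1).expand(B, N, N, D)    # edge target nodes
        face = face_feat[:, None, None, :].expand(B, N, N, -1)
        edges = self.edge_mlp(torch.cat([src, tgt, face], dim=-1))
        logits = self.classifier(nodes).squeeze(-1)    # (B, N) AU logits
        return logits, edges                           # edges: (B, N, N, edge_dim)
```

Training would pair the per-AU logits with binary cross-entropy; in a fuller model the (B, N, N, edge_dim) edge tensor would condition message passing between the nodes rather than only being returned.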
Related papers
- Multi-scale Dynamic and Hierarchical Relationship Modeling for Facial Action Units Recognition [38.62221940006509]
Human facial action units (AUs) are mutually related in a hierarchical manner.
AUs located in the same/close facial regions show stronger relationships than those of different facial regions.
This paper proposes a novel multi-scale AU model for occurrence recognition.
arXiv Detail & Related papers (2024-04-09T16:45:34Z)
- Contrastive Learning of Person-independent Representations for Facial Action Unit Detection [70.60587475492065]
We formulate the self-supervised AU representation learning signals in a two-fold manner.
We contrastively learn the AU representation within a video clip and devise a cross-identity reconstruction mechanism to learn person-independent representations.
Our method outperforms other contrastive learning methods and significantly closes the performance gap between the self-supervised and supervised AU detection approaches; a generic sketch of the within-clip contrastive loss follows this entry.
arXiv Detail & Related papers (2024-03-06T01:49:28Z)
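As a rough illustration of the within-clip contrastive signal summarized above, here is a standard InfoNCE-style loss. The function name and the choice of positives/negatives are assumptions, and the cross-identity reconstruction branch is omitted; this is not the paper's exact loss.

```python
# Generic InfoNCE-style loss; an assumed form, not the paper's exact loss.
import torch
import torch.nn.functional as F

def within_clip_info_nce(anchors, positives, temperature=0.1):
    """anchors/positives: (N, D) AU embeddings of two frames (or two
    augmentations) from the same clip; other clips act as negatives."""
    a = F.normalize(anchors, dim=-1)
    p = F.normalize(positives, dim=-1)
    logits = a @ p.t() / temperature             # (N, N) similarities
    labels = torch.arange(a.size(0), device=a.device)
    return F.cross_entropy(logits, labels)       # matched pairs on diagonal
```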
- Disentangled Interaction Representation for One-Stage Human-Object Interaction Detection [70.96299509159981]
Human-Object Interaction (HOI) detection is a core task for human-centric image understanding.
Recent one-stage methods adopt a transformer decoder to collect image-wide cues that are useful for interaction prediction.
Traditional two-stage methods benefit significantly from their ability to compose interaction features in a disentangled and explainable manner.
arXiv Detail & Related papers (2023-12-04T08:02:59Z)
- Attention Based Relation Network for Facial Action Units Recognition [8.522262699196412]
We propose a novel Attention Based Relation Network (ABRNet) for AU recognition.
ABRNet uses several relation learning layers to automatically capture different AU relations.
Our approach achieves state-of-the-art performance on the DISFA and DISFA+ datasets.
arXiv Detail & Related papers (2022-10-23T11:26:53Z)
- TransFA: Transformer-based Representation for Face Attribute Evaluation [87.09529826340304]
We propose a novel transformer-based representation for attribute evaluation method (TransFA).
The proposed TransFA achieves superior performance compared with state-of-the-art methods.
arXiv Detail & Related papers (2022-07-12T10:58:06Z)
- MGRR-Net: Multi-level Graph Relational Reasoning Network for Facial Action Units Detection [16.261362598190807]
The Facial Action Coding System (FACS) encodes the action units (AUs) in facial images.
We argue that encoding AU features just from one perspective may not capture the rich contextual information between regional and global face features.
We propose a novel Multi-level Graph Relational Reasoning Network (termed MGRR-Net) for facial AU detection.
arXiv Detail & Related papers (2022-04-04T09:47:22Z)
- AU-Expression Knowledge Constrained Representation Learning for Facial Expression Recognition [79.8779790682205]
We propose an AU-Expression Knowledge Constrained Representation Learning (AUE-CRL) framework to learn AU representations without AU annotations and to adaptively use them to facilitate facial expression recognition.
We conduct experiments on the challenging uncontrolled datasets to demonstrate the superiority of the proposed framework over current state-of-the-art methods.
arXiv Detail & Related papers (2020-12-29T03:42:04Z)
- Facial Action Unit Intensity Estimation via Semantic Correspondence Learning with Dynamic Graph Convolution [27.48620879003556]
We present a new learning framework that automatically learns the latent relationships of AUs by establishing semantic correspondences between feature maps.
In the heatmap regression-based network, feature maps preserve rich semantic information associated with AU intensities and locations.
This motivates us to model the correlation among feature channels, which implicitly represents the co-occurrence relationship of AU intensity levels; a minimal sketch of such a channel-level graph convolution follows this entry.
arXiv Detail & Related papers (2020-04-20T23:55:30Z)
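One plausible reading of "modelling the correlation among feature channels" is a graph convolution whose nodes are channels and whose adjacency is computed dynamically from the current feature map. The sketch below is an assumption in that spirit, with hypothetical names; it is not the paper's architecture.

```python
# Assumed channel-level dynamic graph convolution; names are hypothetical.
import torch
import torch.nn as nn
import torch.nn.functional as F

class ChannelGraphConv(nn.Module):
    """Treats each feature channel as a graph node; the adjacency is the
    softmax-normalized cosine similarity between channel vectors."""

    def __init__(self, channels):
        super().__init__()
        self.weight = nn.Linear(channels, channels, bias=False)

    def forward(self, x):                        # x: (B, C, H, W)
        B, C, H, W = x.shape
        feats = x.flatten(2)                     # (B, C, H*W): one vector per channel
        sim = F.normalize(feats, dim=-1)
        adj = torch.softmax(sim @ sim.transpose(1, 2), dim=-1)  # (B, C, C)
        out = adj @ feats                        # message passing over channels
        out = self.weight(out.transpose(1, 2)).transpose(1, 2)
        return out.reshape(B, C, H, W) + x       # residual connection
```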
- Multi-Granularity Reference-Aided Attentive Feature Aggregation for Video-based Person Re-identification [98.7585431239291]
Video-based person re-identification aims at matching the same person across video clips.
In this paper, we propose an attentive feature aggregation module, namely the Multi-Granularity Reference-Aided Attentive Feature Aggregation module (MG-RAFA).
Our framework achieves state-of-the-art performance on three benchmark datasets.
arXiv Detail & Related papers (2020-03-27T03:49:21Z)
- Deep Multi-task Multi-label CNN for Effective Facial Attribute Classification [53.58763562421771]
We propose a novel deep multi-task multi-label CNN, termed DMM-CNN, for effective Facial Attribute Classification (FAC).
Specifically, DMM-CNN jointly optimizes two closely related tasks (i.e., facial landmark detection and FAC) to improve FAC performance by taking advantage of multi-task learning.
Two different network architectures are designed to extract features for the two groups of attributes, and a novel dynamic weighting scheme is proposed to automatically assign the loss weight to each facial attribute during training; a generic sketch of such a weighting scheme follows this entry.
arXiv Detail & Related papers (2020-02-10T12:34:16Z)
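The dynamic weighting idea can be illustrated generically: weight each attribute's loss by its current difficulty so that harder attributes contribute more to the gradient. This is an assumed scheme for illustration; the actual DMM-CNN weighting may differ.

```python
# Assumed dynamic loss weighting for multi-label attribute classification.
import torch
import torch.nn.functional as F

def dynamically_weighted_bce(logits, targets, eps=1e-6):
    """logits, targets: (batch, num_attributes); targets are 0/1 floats.
    Harder attributes (higher current BCE) get proportionally larger,
    renormalized weights; weights are detached so they only rescale."""
    per_attr = F.binary_cross_entropy_with_logits(
        logits, targets, reduction="none"
    ).mean(dim=0)                                # (num_attributes,)
    weights = per_attr.detach()
    weights = weights / (weights.sum() + eps)
    return (weights * per_attr).sum()
```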