Contrastive Learning of Person-independent Representations for Facial
Action Unit Detection
- URL: http://arxiv.org/abs/2403.03400v1
- Date: Wed, 6 Mar 2024 01:49:28 GMT
- Title: Contrastive Learning of Person-independent Representations for Facial
Action Unit Detection
- Authors: Yong Li, Shiguang Shan
- Abstract summary: We formulate the self-supervised AU representation learning signals in a two-fold manner.
We contrastively learn the AU representation within a video clip and devise a cross-identity reconstruction mechanism to learn person-independent representations.
Our method outperforms other contrastive learning methods and significantly closes the performance gap between the self-supervised and supervised AU detection approaches.
- Score: 70.60587475492065
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Facial action unit (AU) detection, aiming to classify AU present in the
facial image, has long suffered from insufficient AU annotations. In this
paper, we aim to mitigate this data scarcity issue by learning AU
representations from a large number of unlabelled facial videos in a
contrastive learning paradigm. We formulate the self-supervised AU
representation learning signals in a two-fold manner: (1) the AU representation
should be frame-wise discriminative within a short video clip; (2) facial
frames sampled from different identities but showing analogous facial AUs
should have consistent AU representations. To achieve these goals, we propose
to contrastively learn the AU representation within a video clip and devise a
cross-identity reconstruction mechanism to learn person-independent
representations. Specifically, we adopt a margin-based temporal contrastive
learning paradigm to capture the temporal AU coherence and evolution
characteristics within a clip of consecutive input facial frames.
Moreover, the cross-identity reconstruction mechanism pushes faces from
different identities that show analogous AUs closer together in the latent
embedding space. Experimental results on three public AU datasets demonstrate
that the learned AU representation is discriminative for AU detection. Our
method outperforms other contrastive learning methods and significantly closes
the performance gap between the self-supervised and supervised AU detection
approaches.
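The margin-based temporal contrastive idea can be illustrated with a minimal NumPy sketch. This is a hedged simplification, not the authors' implementation: the function name, hinge formulation, and the choice of treating adjacent frames as positives against more distant frames in the same clip are illustrative assumptions consistent with the abstract's description.

```python
import numpy as np

def margin_temporal_contrastive_loss(embeddings, margin=0.5):
    """Illustrative hinge loss: within a clip, a frame's embedding should be
    closer to its temporal neighbour than to more distant frames by `margin`.

    embeddings: (T, D) array of per-frame embeddings for one clip.
    NOTE: a simplified sketch of a margin-based temporal contrastive
    objective, not the paper's exact loss.
    """
    T = embeddings.shape[0]
    # L2-normalise so distances are comparable across clips
    emb = embeddings / np.linalg.norm(embeddings, axis=1, keepdims=True)
    loss, count = 0.0, 0
    for t in range(T - 1):
        pos = np.linalg.norm(emb[t] - emb[t + 1])   # adjacent (positive) pair
        for k in range(t + 2, T):
            neg = np.linalg.norm(emb[t] - emb[k])   # more distant frame
            loss += max(0.0, pos - neg + margin)    # hinge: pos + margin < neg
            count += 1
    return loss / max(count, 1)
```

A clip whose embeddings evolve smoothly (distance growing with temporal gap) incurs zero loss at margin 0, whereas a clip with no temporal structure is penalised, which is the intended self-supervisory signal.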
Related papers
- A Probabilistic Model Behind Self-Supervised Learning [53.64989127914936]
In self-supervised learning (SSL), representations are learned via an auxiliary task without annotated labels.
We present a generative latent variable model for self-supervised learning.
We show that several families of discriminative SSL, including contrastive methods, induce a comparable distribution over representations.
arXiv Detail & Related papers (2024-02-02T13:31:17Z) - CIAO! A Contrastive Adaptation Mechanism for Non-Universal Facial
Expression Recognition [80.07590100872548]
We propose Contrastive Inhibitory Adaptation (CIAO), a mechanism that adapts the last layer of facial encoders to depict specific affective characteristics on different datasets.
CIAO improves facial expression recognition performance over six different datasets with distinct affective representations.
arXiv Detail & Related papers (2022-08-10T15:46:05Z) - Cross-subject Action Unit Detection with Meta Learning and
Transformer-based Relation Modeling [7.395396464857193]
The paper proposes a meta-learning-based cross-subject AU detection model to eliminate the identity-caused differences.
A transformer-based relation learning module is introduced to learn the latent relations of multiple AUs.
Our results show that on the two public datasets BP4D and DISFA, our method outperforms state-of-the-art approaches.
arXiv Detail & Related papers (2022-05-18T08:17:59Z) - Learning Multi-dimensional Edge Feature-based AU Relation Graph for
Facial Action Unit Recognition [27.34564955127377]
The activations of Facial Action Units (AUs) mutually influence one another.
Existing approaches fail to specifically and explicitly represent such cues for each pair of AUs in each facial display.
This paper proposes an AU relationship modelling approach that learns a unique graph to explicitly describe the relationship between each pair of AUs in each facial display.
arXiv Detail & Related papers (2022-05-02T03:38:00Z) - Evaluation of Self-taught Learning-based Representations for Facial
Emotion Recognition [62.30451764345482]
This work describes different strategies to generate unsupervised representations obtained through the concept of self-taught learning for facial emotion recognition.
The idea is to create complementary representations promoting diversity by varying the autoencoders' initialization, architecture, and training data.
Experimental results on Jaffe and Cohn-Kanade datasets using a leave-one-subject-out protocol show that FER methods based on the proposed diverse representations compare favorably against state-of-the-art approaches.
arXiv Detail & Related papers (2022-04-26T22:48:15Z) - Weakly Supervised Regional and Temporal Learning for Facial Action Unit
Recognition [36.350407471391065]
We propose two auxiliary AU related tasks to bridge the gap between limited annotations and the model performance.
A single image based optical flow estimation task is proposed to leverage the dynamic change of facial muscles.
By incorporating semi-supervised learning, we propose an end-to-end trainable framework named weakly supervised regional and temporal learning.
arXiv Detail & Related papers (2022-04-01T12:02:01Z) - Exploring Adversarial Learning for Deep Semi-Supervised Facial Action
Unit Recognition [38.589141957375226]
We propose a deep semi-supervised framework for facial action unit recognition from partially AU-labeled facial images.
The proposed approach successfully captures AU distributions through adversarial learning and outperforms state-of-the-art AU recognition work.
arXiv Detail & Related papers (2021-06-04T04:50:00Z) - AU-Expression Knowledge Constrained Representation Learning for Facial
Expression Recognition [79.8779790682205]
We propose an AU-Expression Knowledge Constrained Representation Learning (AUE-CRL) framework to learn the AU representations without AU annotations and adaptively use representations to facilitate facial expression recognition.
We conduct experiments on the challenging uncontrolled datasets to demonstrate the superiority of the proposed framework over current state-of-the-art methods.
arXiv Detail & Related papers (2020-12-29T03:42:04Z) - Fully Unsupervised Person Re-identification via Selective Contrastive
Learning [58.5284246878277]
Person re-identification (ReID) aims at searching the same identity person among images captured by various cameras.
We propose a novel selective contrastive learning framework for unsupervised feature learning.
Experimental results demonstrate the superiority of our method in unsupervised person ReID compared with the state-of-the-arts.
arXiv Detail & Related papers (2020-10-15T09:09:23Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this list (including all information) and is not responsible for any consequences arising from its use.