Multi-scale Promoted Self-adjusting Correlation Learning for Facial
Action Unit Detection
- URL: http://arxiv.org/abs/2308.07770v1
- Date: Tue, 15 Aug 2023 13:43:48 GMT
- Title: Multi-scale Promoted Self-adjusting Correlation Learning for Facial
Action Unit Detection
- Authors: Xin Liu, Kaishen Yuan, Xuesong Niu, Jingang Shi, Zitong Yu, Huanjing
Yue, Jingyu Yang
- Abstract summary: Facial Action Unit (AU) detection is a crucial task in affective computing and social robotics.
Previous methods used fixed AU correlations based on expert experience or statistical rules on specific benchmarks.
This paper proposes a novel self-adjusting AU-correlation learning (SACL) method with less computation for AU detection.
- Score: 37.841035367349434
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Facial Action Unit (AU) detection is a crucial task in affective computing
and social robotics as it helps to identify emotions expressed through facial
expressions. Anatomically, there are innumerable correlations between AUs,
which contain rich information and are vital for AU detection. Previous methods
used fixed AU correlations based on expert experience or statistical rules on
specific benchmarks, but it is challenging to comprehensively reflect complex
correlations between AUs via hand-crafted settings. There are alternative
methods that employ a fully connected graph to learn these dependencies
exhaustively. However, these approaches can result in a computational explosion
and a heavy dependence on large datasets. To address these challenges, this
paper proposes a novel self-adjusting AU-correlation learning (SACL) method
with less computation for AU detection. This method adaptively learns and
updates AU correlation graphs by efficiently leveraging the characteristics of
different levels of AU motion and emotion representation information extracted
in different stages of the network. Moreover, this paper explores the role of
multi-scale learning in correlation information extraction, and designs a simple
yet effective multi-scale feature learning (MSFL) method to promote better
performance in AU detection. By integrating AU correlation information with
multi-scale features, the proposed method obtains a more robust feature
representation for the final AU detection. Extensive experiments show that the
proposed method outperforms the state-of-the-art methods on widely used AU
detection benchmark datasets, with only 28.7% and 12.0% of the parameters and
FLOPs of the best method, respectively. The code for this method is available
at https://github.com/linuxsino/Self-adjusting-AU.
Related papers
- Anchor-aware Deep Metric Learning for Audio-visual Retrieval [11.675472891647255]
Metric learning aims at capturing the underlying data structure and enhancing the performance of tasks like audio-visual cross-modal retrieval (AV-CMR).
Recent works employ sampling methods to select impactful data points from the embedding space during training.
However, the model training fails to fully explore the space due to the scarcity of training data points.
We propose an innovative Anchor-aware Deep Metric Learning (AADML) method to address this challenge.
arXiv Detail & Related papers (2024-04-21T22:44:44Z)
- C-ICL: Contrastive In-context Learning for Information Extraction [54.39470114243744]
c-ICL is a novel few-shot technique that leverages both correct and incorrect sample constructions to create in-context learning demonstrations.
Our experiments on various datasets indicate that c-ICL outperforms previous few-shot in-context learning methods.
arXiv Detail & Related papers (2024-02-17T11:28:08Z)
- Learning Contrastive Feature Representations for Facial Action Unit Detection [13.834540490373818]
Facial action unit (AU) detection has long encountered the challenge of detecting subtle feature differences when AUs activate.
We introduce a novel contrastive learning framework aimed at AU detection that incorporates both self-supervised and supervised signals.
arXiv Detail & Related papers (2024-02-09T03:48:20Z)
- Local Region Perception and Relationship Learning Combined with Feature
Fusion for Facial Action Unit Detection [12.677143408225167]
We introduce our submission to the CVPR 2023 Competition on Affective Behavior Analysis in-the-wild (ABAW).
We propose a single-stage trained AU detection framework. Specifically, in order to effectively extract facial local region features related to AU detection, we use a local region perception module.
We also use a graph neural network-based relational learning module to capture the relationship between AUs.
arXiv Detail & Related papers (2023-03-15T11:59:24Z)
- Self-supervised Facial Action Unit Detection with Region and Relation
Learning [5.182661263082065]
We propose a novel self-supervised framework for AU detection with region and relation learning.
An improved Optimal Transport (OT) algorithm is introduced to exploit the correlation characteristics among AUs.
Swin Transformer is exploited to model the long-distance dependencies within each AU region during feature learning.
arXiv Detail & Related papers (2023-03-10T05:22:45Z)
- Improved Speech Emotion Recognition using Transfer Learning and
Spectrogram Augmentation [56.264157127549446]
Speech emotion recognition (SER) is a challenging task that plays a crucial role in natural human-computer interaction.
One of the main challenges in SER is data scarcity.
We propose a transfer learning strategy combined with spectrogram augmentation.
arXiv Detail & Related papers (2021-08-05T10:39:39Z)
- Combining Feature and Instance Attribution to Detect Artifacts [62.63504976810927]
We propose methods to facilitate identification of training data artifacts.
We show that this proposed training-feature attribution approach can be used to uncover artifacts in training data.
We execute a small user study to evaluate whether these methods are useful to NLP researchers in practice.
arXiv Detail & Related papers (2021-07-01T09:26:13Z)
- Meta Auxiliary Learning for Facial Action Unit Detection [84.22521265124806]
We consider learning AU detection and facial expression recognition in a multi-task manner.
The performance of the AU detection task cannot always be enhanced due to negative transfer in the multi-task scenario.
We propose a Meta Auxiliary Learning method (MAL) that automatically selects highly related FE samples by learning adaptive weights for the training FE samples in a meta-learning manner.
arXiv Detail & Related papers (2021-05-14T02:28:40Z)
- DEALIO: Data-Efficient Adversarial Learning for Imitation from
Observation [57.358212277226315]
In imitation learning from observation (IfO), a learning agent seeks to imitate a demonstrating agent using only observations of the demonstrated behavior without access to the control signals generated by the demonstrator.
Recent methods based on adversarial imitation learning have led to state-of-the-art performance on IfO problems, but they typically suffer from high sample complexity due to a reliance on data-inefficient, model-free reinforcement learning algorithms.
This issue makes them impractical to deploy in real-world settings, where gathering samples can incur high costs in terms of time, energy, and risk.
We propose a more data-efficient IfO algorithm.
arXiv Detail & Related papers (2021-03-31T23:46:32Z)
- Multi-Pretext Attention Network for Few-shot Learning with
Self-supervision [37.6064643502453]
We propose a novel augmentation-free method for self-supervised learning, which does not rely on any auxiliary sample.
Besides, we propose Multi-pretext Attention Network (MAN), which exploits a specific attention mechanism to combine the traditional augmentation-relied methods and our GC.
We evaluate our MAN extensively on miniImageNet and tieredImageNet datasets and the results demonstrate that the proposed method outperforms the state-of-the-art (SOTA) relevant methods.
arXiv Detail & Related papers (2021-03-10T10:48:37Z)