EmotiEffNet Facial Features in Uni-task Emotion Recognition in Video at
ABAW-5 competition
- URL: http://arxiv.org/abs/2303.09162v1
- Date: Thu, 16 Mar 2023 08:57:33 GMT
- Title: EmotiEffNet Facial Features in Uni-task Emotion Recognition in Video at
ABAW-5 competition
- Authors: Andrey V. Savchenko
- Abstract summary: We present the results of our team for the fifth Affective Behavior Analysis in-the-wild (ABAW) competition.
The usage of the pre-trained convolutional networks from the EmotiEffNet family for frame-level feature extraction is studied.
- Score: 7.056222499095849
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In this article, the results of our team for the fifth Affective Behavior
Analysis in-the-wild (ABAW) competition are presented. The usage of the
pre-trained convolutional networks from the EmotiEffNet family for frame-level
feature extraction is studied. In particular, we propose an ensemble of a
multi-layered perceptron and a LightAutoML-based classifier. Post-processing
that smooths the predictions over sequential frames is also implemented.
Experimental results for the large-scale Aff-Wild2 database demonstrate that,
compared to the baseline, our model achieves a much greater macro-averaged
F1-score for facial expression recognition and action unit detection, and
higher concordance correlation coefficients for valence/arousal estimation.
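A minimal sketch of the ensemble-plus-smoothing step, assuming per-frame class probabilities from the two models and a simple box filter; the array names, equal blend weights, and window size are illustrative, not the paper's exact settings:

```python
# Blend per-frame probabilities from two classifiers, then smooth over
# neighbouring frames. mlp_probs / laml_probs stand in for the outputs of
# the MLP and the LightAutoML classifier (assumptions, simulated here).
import numpy as np

rng = np.random.default_rng(0)
T, C = 300, 8                                  # frames x classes (assumed)
mlp_probs = rng.dirichlet(np.ones(C), size=T)
laml_probs = rng.dirichlet(np.ones(C), size=T)

blended = 0.5 * mlp_probs + 0.5 * laml_probs   # simple ensemble average

def smooth(probs: np.ndarray, window: int = 5) -> np.ndarray:
    """Box-filter each class score over neighbouring frames."""
    kernel = np.ones(window) / window
    return np.stack([np.convolve(probs[:, c], kernel, mode="same")
                     for c in range(probs.shape[1])], axis=1)

labels = smooth(blended).argmax(axis=1)        # final per-frame prediction
print(labels[:10])
```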
Related papers
- HSEmotion Team at ABAW-8 Competition: Audiovisual Ambivalence/Hesitancy, Emotional Mimicry Intensity and Facial Expression Recognition [16.860963320038902]
This article presents our results for the eighth Affective Behavior Analysis in-the-Wild (ABAW) competition.
We combine facial emotional descriptors extracted by pre-trained models with acoustic features and embeddings of texts recognized from speech.
The video-level prediction of emotional mimicry intensity is implemented by simply aggregating frame-level features and training a multi-layered perceptron.
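A minimal sketch of this aggregation step, assuming mean pooling over frames and a small scikit-learn MLP as stand-ins for the authors' exact configuration:

```python
# Mean-pool frame-level embeddings into one vector per clip, then fit an
# MLP regressor on the pooled features. Feature dimensionality, number of
# intensity targets, and the MLP shape are assumptions.
import numpy as np
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(0)
clips = [rng.normal(size=(rng.integers(30, 90), 512)) for _ in range(100)]
X = np.stack([frames.mean(axis=0) for frames in clips])  # mean-pool frames
y = rng.uniform(0, 1, size=(100, 6))                     # intensity targets

model = MLPRegressor(hidden_layer_sizes=(128,), max_iter=500)
model.fit(X, y)
print(model.predict(X[:2]))
```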
arXiv Detail & Related papers (2025-03-13T14:21:46Z) - HSEmotion Team at the 7th ABAW Challenge: Multi-Task Learning and Compound Facial Expression Recognition [16.860963320038902]
We describe the results of the HSEmotion team in two tasks of the seventh Affective Behavior Analysis in-the-wild (ABAW) competition.
We propose an efficient pipeline based on frame-level facial feature extractors pre-trained in multi-task settings.
We ensure the privacy awareness of our techniques by using lightweight neural network architectures.
arXiv Detail & Related papers (2024-07-18T05:47:49Z) - Self-Training with Pseudo-Label Scorer for Aspect Sentiment Quad Prediction [54.23208041792073]
Aspect Sentiment Quad Prediction (ASQP) aims to predict all quads (aspect term, aspect category, opinion term, sentiment polarity) for a given review.
A key challenge in the ASQP task is the scarcity of labeled data, which limits the performance of existing methods.
We propose a self-training framework with a pseudo-label scorer, wherein a scorer assesses the match between reviews and their pseudo-labels.
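A minimal sketch of the scorer-based filtering, where `label_model` and `scorer` are hypothetical stand-ins for the paper's trained components:

```python
# Keep only pseudo-labeled reviews whose predicted quads the scorer trusts.
from typing import Callable, List, Tuple

Quad = Tuple[str, str, str, str]  # (aspect, category, opinion, polarity)

def select_pseudo_labels(
    reviews: List[str],
    label_model: Callable[[str], List[Quad]],
    scorer: Callable[[str, Quad], float],
    threshold: float = 0.9,     # assumed cut-off for illustration
) -> List[Tuple[str, List[Quad]]]:
    """Retain a review only if every predicted quad scores highly."""
    selected = []
    for review in reviews:
        quads = label_model(review)
        if quads and all(scorer(review, q) >= threshold for q in quads):
            selected.append((review, quads))
    return selected
```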
arXiv Detail & Related papers (2024-06-26T05:30:21Z) - HSEmotion Team at the 6th ABAW Competition: Facial Expressions, Valence-Arousal and Emotion Intensity Prediction [16.860963320038902]
We study the possibility of using pre-trained deep models that extract reliable emotional features without the need to fine-tune the neural networks for a downstream task.
We introduce several lightweight models based on MobileViT, MobileFaceNet, EfficientNet, and DDAMFN architectures trained in multi-task scenarios to recognize facial expressions.
Our approach lets us significantly improve quality metrics on validation sets compared to existing non-ensemble techniques.
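A minimal sketch of the frozen-backbone setup, using torchvision's MobileNetV3 as a generic stand-in for the named architectures:

```python
# Extract emotional features from a frozen pre-trained backbone and train
# only a light classification head on top (no fine-tuning of the network).
import torch
from torchvision.models import mobilenet_v3_small, MobileNet_V3_Small_Weights

backbone = mobilenet_v3_small(weights=MobileNet_V3_Small_Weights.DEFAULT)
backbone.classifier = torch.nn.Identity()   # expose 576-d pooled features
backbone.eval()
for p in backbone.parameters():
    p.requires_grad = False                 # the backbone stays frozen

head = torch.nn.Linear(576, 8)              # 8 expression classes (assumed)

with torch.no_grad():
    feats = backbone(torch.randn(4, 3, 224, 224))
logits = head(feats)                        # only `head` would be trained
print(logits.shape)                         # torch.Size([4, 8])
```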
arXiv Detail & Related papers (2024-03-18T09:08:41Z) - Unveiling Backbone Effects in CLIP: Exploring Representational Synergies
and Variances [49.631908848868505]
Contrastive Language-Image Pretraining (CLIP) stands out as a prominent method for image representation learning.
We investigate the differences in CLIP performance among various neural architectures.
We propose a simple, yet effective approach to combine predictions from multiple backbones, leading to a notable performance boost of up to 6.34%.
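A minimal sketch of multi-backbone combination by probability averaging; the simulated arrays and equal weights are assumptions, not the paper's exact rule:

```python
# Average class-probability distributions produced by several backbones.
# In practice each array would hold the softmax over class-prompt
# similarities from one CLIP backbone (e.g. ResNet-based vs. ViT-based).
import numpy as np

rng = np.random.default_rng(0)
n_samples, n_classes = 16, 10
backbone_probs = [rng.dirichlet(np.ones(n_classes), size=n_samples)
                  for _ in range(3)]        # three backbones (assumed)

combined = np.mean(backbone_probs, axis=0)  # equal-weight ensemble
pred = combined.argmax(axis=1)
print(pred)
```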
arXiv Detail & Related papers (2023-12-22T03:01:41Z) - Multi-modal Facial Affective Analysis based on Masked Autoencoder [7.17338843593134]
We introduce our submission to the CVPR 2023 ABAW5 competition: Affective Behavior Analysis in-the-wild.
Our approach involves several key components. First, we utilize the visual information from a Masked Autoencoder (MAE) model that has been pre-trained on a large-scale face image dataset in a self-supervised manner.
Our approach achieves impressive results in the ABAW5 competition, with an average F1 score of 55.49% and 41.21% in the AU and EXPR tracks, respectively.
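A minimal sketch of MAE-based feature extraction via timm; the generic ImageNet MAE checkpoint is a stand-in for the paper's face-pretrained model:

```python
# Load a ViT encoder with MAE-style self-supervised pretraining and pull
# pooled embeddings. The '.mae' weight tag is timm's ImageNet MAE
# checkpoint, used here only as a stand-in (assumption).
import timm
import torch

encoder = timm.create_model("vit_base_patch16_224.mae",
                            pretrained=True, num_classes=0)
encoder.eval()

with torch.no_grad():
    emb = encoder(torch.randn(2, 3, 224, 224))  # pooled 768-d embeddings
print(emb.shape)  # torch.Size([2, 768])
```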
arXiv Detail & Related papers (2023-03-20T03:58:03Z) - Facial Affect Recognition based on Transformer Encoder and Audiovisual
Fusion for the ABAW5 Challenge [10.88275919652131]
We present our solutions for four sub-challenges of Valence-Arousal (VA) Estimation, Expression (Expr) Classification, Action Unit (AU) Detection and Emotional Reaction Intensity (ERI) Estimation.
The 5th ABAW competition focuses on facial affect recognition utilizing different modalities and datasets.
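A minimal sketch of audiovisual fusion with a transformer encoder; all dimensions and the concatenation-based fusion are illustrative assumptions:

```python
# Project per-frame visual and acoustic features to a shared width,
# concatenate along the time axis, and encode the joint sequence.
import torch
import torch.nn as nn

d_model = 256
vis_proj = nn.Linear(512, d_model)   # 512-d visual features (assumed)
aud_proj = nn.Linear(128, d_model)   # 128-d acoustic features (assumed)
encoder = nn.TransformerEncoder(
    nn.TransformerEncoderLayer(d_model, nhead=4, batch_first=True),
    num_layers=2,
)
va_head = nn.Linear(d_model, 2)      # valence/arousal regression head

vis = torch.randn(8, 50, 512)        # batch x frames x visual dim
aud = torch.randn(8, 50, 128)        # batch x frames x audio dim
tokens = torch.cat([vis_proj(vis), aud_proj(aud)], dim=1)
fused = encoder(tokens).mean(dim=1)  # pool over the joint sequence
print(va_head(fused).shape)          # torch.Size([8, 2])
```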
arXiv Detail & Related papers (2023-03-16T08:47:36Z) - Cluster-level pseudo-labelling for source-free cross-domain facial
expression recognition [94.56304526014875]
We propose the first Source-Free Unsupervised Domain Adaptation (SFUDA) method for Facial Expression Recognition (FER).
Our method exploits self-supervised pretraining to learn good feature representations from the target data.
We validate the effectiveness of our method in four adaptation setups, proving that it consistently outperforms existing SFUDA methods when applied to FER.
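A minimal sketch of cluster-level pseudo-labelling, assuming the source model's majority vote labels each cluster; `source_preds` is a hypothetical stand-in for the pre-adaptation classifier's outputs:

```python
# Cluster target-domain features and propagate one label per cluster
# (the majority of the source model's predictions among its members).
import numpy as np
from sklearn.cluster import KMeans

def cluster_pseudo_labels(feats: np.ndarray, source_preds: np.ndarray,
                          n_clusters: int = 7) -> np.ndarray:
    """Give every sample the majority source prediction of its cluster."""
    clusters = KMeans(n_clusters=n_clusters, n_init=10).fit_predict(feats)
    labels = np.empty(len(feats), dtype=int)
    for c in range(n_clusters):
        members = clusters == c
        labels[members] = np.bincount(source_preds[members]).argmax()
    return labels

rng = np.random.default_rng(0)
pseudo = cluster_pseudo_labels(rng.normal(size=(200, 64)),
                               rng.integers(0, 7, size=200))
print(pseudo[:10])
```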
arXiv Detail & Related papers (2022-10-11T08:24:50Z) - Frame-level Prediction of Facial Expressions, Valence, Arousal and
Action Units for Mobile Devices [7.056222499095849]
We propose a novel frame-level emotion recognition algorithm that extracts facial features with a single EfficientNet model pre-trained on AffectNet.
Our approach may be implemented even for video analytics on mobile devices.
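A minimal sketch of per-frame inference; torchvision's ImageNet EfficientNet-B0 and 'video.mp4' stand in for the AffectNet-pretrained model and a real clip:

```python
# Run a single EfficientNet on every video frame independently.
import cv2
import torch
from torchvision.models import efficientnet_b0, EfficientNet_B0_Weights

weights = EfficientNet_B0_Weights.DEFAULT
model = efficientnet_b0(weights=weights)
model.classifier[1] = torch.nn.Linear(1280, 8)  # 8 expressions (assumed)
model.eval()
preprocess = weights.transforms()

cap = cv2.VideoCapture("video.mp4")             # hypothetical input clip
while cap.isOpened():
    ok, frame = cap.read()
    if not ok:
        break
    rgb = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)
    x = preprocess(torch.from_numpy(rgb).permute(2, 0, 1)).unsqueeze(0)
    with torch.no_grad():
        print(model(x).argmax(dim=1).item())    # per-frame expression id
cap.release()
```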
arXiv Detail & Related papers (2022-03-25T03:53:27Z) - Consistency Regularization for Deep Face Anti-Spoofing [69.70647782777051]
Face anti-spoofing (FAS) plays a crucial role in securing face recognition systems.
We conjecture that encouraging feature consistency across different views may be a promising way to boost FAS models.
We propose both Embedding-level and Prediction-level Consistency Regularization (EPCR) for FAS.
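A minimal sketch of two-view consistency; the MSE/KL pairing and the loss weight are illustrative assumptions, not the exact EPCR formulation:

```python
# Penalize disagreement between two augmented views of the same face, at
# both the embedding level (MSE) and the prediction level (symmetric KL).
import torch
import torch.nn.functional as F

def consistency_loss(emb1, emb2, logits1, logits2, w: float = 1.0):
    """Embedding-level MSE plus prediction-level symmetric KL."""
    emb_term = F.mse_loss(emb1, emb2)
    p1 = F.log_softmax(logits1, dim=1)
    p2 = F.log_softmax(logits2, dim=1)
    pred_term = 0.5 * (F.kl_div(p1, p2.exp(), reduction="batchmean")
                       + F.kl_div(p2, p1.exp(), reduction="batchmean"))
    return emb_term + w * pred_term

loss = consistency_loss(torch.randn(4, 128), torch.randn(4, 128),
                        torch.randn(4, 2), torch.randn(4, 2))
print(loss.item())
```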
arXiv Detail & Related papers (2021-11-24T08:03:48Z) - No Fear of Heterogeneity: Classifier Calibration for Federated Learning
with Non-IID Data [78.69828864672978]
A central challenge in training classification models in the real-world federated system is learning with non-IID data.
We propose a novel and simple algorithm called Classifier Calibration with Virtual Representations (CCVR), which adjusts the classifier using virtual representations sampled from an approximated Gaussian mixture model.
Experimental results demonstrate that CCVR achieves state-of-the-art performance on popular federated learning benchmarks, including CIFAR-10, CIFAR-100, and CINIC-10.
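A minimal sketch of calibration with virtual representations, assuming per-class Gaussian statistics and a logistic-regression head; dimensions are illustrative:

```python
# Estimate per-class Gaussians from feature statistics, sample virtual
# features from them, and re-fit the classifier on the samples.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
n_classes, dim, per_class = 10, 64, 100

# Class-wise mean/covariance, e.g. aggregated from clients' local stats.
means = rng.normal(size=(n_classes, dim))
covs = [np.eye(dim) * 0.5 for _ in range(n_classes)]

virtual_X = np.concatenate([
    rng.multivariate_normal(means[c], covs[c], size=per_class)
    for c in range(n_classes)])
virtual_y = np.repeat(np.arange(n_classes), per_class)

calibrated_head = LogisticRegression(max_iter=1000)
calibrated_head.fit(virtual_X, virtual_y)   # replaces the biased classifier
print(calibrated_head.score(virtual_X, virtual_y))
```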
arXiv Detail & Related papers (2021-06-09T12:02:29Z) - Adversarial Feature Augmentation and Normalization for Visual
Recognition [109.6834687220478]
Recent advances in computer vision take advantage of adversarial data augmentation to ameliorate the generalization ability of classification models.
Here, we present an effective and efficient alternative that advocates adversarial augmentation on intermediate feature embeddings.
We validate the proposed approach across diverse visual recognition tasks with representative backbone networks.
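A minimal sketch of adversarial augmentation on intermediate features, using a single FGSM-style step and step size as illustrative assumptions:

```python
# Perturb a feature embedding in the direction that increases the task
# loss, then train on clean and perturbed copies together.
import torch
import torch.nn.functional as F

head = torch.nn.Linear(128, 8)              # classifier over features
feats = torch.randn(16, 128, requires_grad=True)
labels = torch.randint(0, 8, (16,))

loss = F.cross_entropy(head(feats), labels)
grad, = torch.autograd.grad(loss, feats)
adv_feats = feats + 0.1 * grad.sign()       # loss-increasing perturbation

total = (F.cross_entropy(head(feats), labels)
         + F.cross_entropy(head(adv_feats), labels))
print(total.item())
```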
arXiv Detail & Related papers (2021-03-22T20:36:34Z)