MIDAS: Mixing Ambiguous Data with Soft Labels for Dynamic Facial Expression Recognition
- URL: http://arxiv.org/abs/2503.00202v1
- Date: Fri, 28 Feb 2025 21:39:19 GMT
- Title: MIDAS: Mixing Ambiguous Data with Soft Labels for Dynamic Facial Expression Recognition
- Authors: Ryosuke Kawamura, Hideaki Hayashi, Noriko Takemura, Hajime Nagahara
- Abstract summary: We propose MIDAS, a data augmentation method for dynamic facial expression recognition (DFER). In MIDAS, the training data are augmented by convexly combining pairs of video frames and their corresponding emotion class labels. The results demonstrate that the model trained on the data augmented by MIDAS outperforms the existing state-of-the-art method trained on the original dataset.
- Score: 11.89503569570198
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Dynamic facial expression recognition (DFER) is an important task in the field of computer vision. To apply automatic DFER in practice, it is necessary to accurately recognize ambiguous facial expressions, which often appear in data in the wild. In this paper, we propose MIDAS, a data augmentation method for DFER, which augments ambiguous facial expression data with soft labels consisting of probabilities for multiple emotion classes. In MIDAS, the training data are augmented by convexly combining pairs of video frames and their corresponding emotion class labels, which can also be regarded as an extension of mixup to soft-labeled video data. This simple extension is remarkably effective in DFER with ambiguous facial expression data. To evaluate MIDAS, we conducted experiments on the DFEW dataset. The results demonstrate that the model trained on the data augmented by MIDAS outperforms the existing state-of-the-art method trained on the original dataset.
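At its core, MIDAS extends mixup to soft-labeled video data: a pair of clips and their soft emotion labels are combined with the same convex weight. The function below is a minimal sketch of that operation, not the authors' implementation; the name `midas_mix`, the Beta(alpha, alpha) sampling, and the default alpha=0.2 are assumptions carried over from standard mixup.

```python
import numpy as np

def midas_mix(frames_a, frames_b, label_a, label_b, alpha=0.2):
    """Convexly combine two clips and their soft labels (mixup-style).

    frames_*: arrays of shape (T, H, W, C); label_*: soft-label vectors
    over emotion classes. alpha parameterizes the Beta distribution, as
    in standard mixup; the paper's exact setting may differ.
    """
    lam = np.random.beta(alpha, alpha)  # mixing weight in (0, 1)
    mixed_frames = lam * frames_a + (1.0 - lam) * frames_b
    mixed_label = lam * label_a + (1.0 - lam) * label_b
    return mixed_frames, mixed_label
```

Because the same weight `lam` is applied to frames and labels, the mixed label remains a valid probability distribution whenever the input soft labels are.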
Related papers
- Disentangled Source-Free Personalization for Facial Expression Recognition with Neutral Target Data [49.25159192831934]
Source-free domain adaptation (SFDA) methods are employed to adapt a pre-trained source model using only unlabeled target domain data.
This paper introduces the Disentangled Source-Free Domain Adaptation (DSFDA) method to address the SFDA challenge posed by missing target expression data.
Our method learns to disentangle features related to expressions and identity while generating the missing non-neutral target data.
arXiv Detail & Related papers (2025-03-26T17:53:53Z)
- AffectNet+: A Database for Enhancing Facial Expression Recognition with Soft-Labels [2.644902054473556]
We propose a new approach to creating FER datasets through a labeling method in which an image is labeled with more than one emotion.
Finding smoother decision boundaries, enabling multi-labeling, and mitigating bias and imbalanced data are some of the advantages of our proposed method.
Building upon AffectNet, we introduce AffectNet+, the next-generation facial expression dataset.
arXiv Detail & Related papers (2024-10-29T19:57:10Z)
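As a toy illustration of labeling an image with more than one emotion, a soft label can be formed by normalizing annotator votes. This is an illustrative sketch only; the class list and the vote-normalization rule are assumptions, not AffectNet+'s actual labeling procedure.

```python
from collections import Counter

# Illustrative emotion classes (not necessarily AffectNet+'s label set).
EMOTIONS = ["neutral", "happy", "sad", "surprise", "fear", "disgust", "anger"]

def soft_label(votes):
    """Turn a list of per-annotator emotion names into a soft-label vector."""
    counts = Counter(votes)
    total = sum(counts.values())
    return [counts[e] / total for e in EMOTIONS]

# Two annotators saw "happy", one saw "surprise":
print(soft_label(["happy", "happy", "surprise"]))
# -> [0.0, 0.666..., 0.0, 0.333..., 0.0, 0.0, 0.0]
```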
- Static for Dynamic: Towards a Deeper Understanding of Dynamic Facial Expressions Using Static Expression Data [83.48170683672427]
We propose a unified dual-modal learning framework that integrates SFER data as a complementary resource for DFER.
S4D employs dual-modal self-supervised pre-training on facial images and videos using a shared Transformer (ViT) encoder-decoder architecture.
Experiments demonstrate that S4D achieves a deeper understanding of DFER, setting new state-of-the-art performance.
arXiv Detail & Related papers (2024-09-10T01:57:57Z)
- From Static to Dynamic: Adapting Landmark-Aware Image Models for Facial Expression Recognition in Videos [88.08209394979178]
Dynamic facial expression recognition (DFER) in the wild is still hindered by data limitations.
We introduce a novel Static-to-Dynamic model (S2D) that leverages existing SFER knowledge and dynamic information implicitly encoded in extracted facial landmark-aware features.
arXiv Detail & Related papers (2023-12-09T03:16:09Z)
- Exploring Large-scale Unlabeled Faces to Enhance Facial Expression Recognition [12.677143408225167]
We propose a semi-supervised learning framework that utilizes unlabeled face data to train expression recognition models effectively.
Our method uses a dynamic threshold module that can adaptively adjust the confidence threshold to fully utilize the face recognition data (a rough sketch follows this entry).
In the ABAW5 EXPR task, our method achieved excellent results on the official validation set.
arXiv Detail & Related papers (2023-03-15T13:43:06Z)
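The dynamic threshold module above is described only at a high level; one plausible reading is an exponential-moving-average threshold for selecting pseudo-labels. The sketch below is an assumption-laden illustration (the EMA update, initial value, and momentum are not from the paper).

```python
import numpy as np

class DynamicThreshold:
    """EMA-based confidence threshold for pseudo-label selection (illustrative)."""

    def __init__(self, init_threshold=0.9, momentum=0.99):
        self.threshold = init_threshold
        self.momentum = momentum

    def update(self, confidences):
        # Drift the threshold toward the current batch's mean confidence.
        batch_mean = float(np.mean(confidences))
        self.threshold = (self.momentum * self.threshold
                          + (1.0 - self.momentum) * batch_mean)
        return self.threshold

    def select(self, confidences):
        # Keep only samples confident enough to receive a pseudo-label.
        return np.asarray(confidences) >= self.threshold
```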
- Cluster-level pseudo-labelling for source-free cross-domain facial expression recognition [94.56304526014875]
We propose the first Source-Free Unsupervised Domain Adaptation (SFUDA) method for Facial Expression Recognition (FER).
Our method exploits self-supervised pretraining to learn good feature representations from the target data.
We validate the effectiveness of our method in four adaptation setups, proving that it consistently outperforms existing SFUDA methods when applied to FER.
arXiv Detail & Related papers (2022-10-11T08:24:50Z)
- Improved Speech Emotion Recognition using Transfer Learning and Spectrogram Augmentation [56.264157127549446]
Speech emotion recognition (SER) is a challenging task that plays a crucial role in natural human-computer interaction.
One of the main challenges in SER is data scarcity.
To mitigate this, we propose a transfer learning strategy combined with spectrogram augmentation (sketched after this entry).
arXiv Detail & Related papers (2021-08-05T10:39:39Z)
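The abstract does not specify the augmentation scheme; a common choice for spectrogram augmentation is SpecAugment-style frequency and time masking, sketched below under that assumption (mask widths are illustrative defaults, not the paper's settings).

```python
import numpy as np

def augment_spectrogram(spec, max_freq_mask=8, max_time_mask=20, rng=None):
    """Apply one frequency mask and one time mask (SpecAugment-style).

    spec: 2-D array of shape (freq_bins, time_steps).
    """
    rng = rng or np.random.default_rng()
    spec = spec.copy()
    f_bins, t_steps = spec.shape

    # Zero out a random band of frequency bins.
    f = rng.integers(0, max_freq_mask + 1)
    f0 = rng.integers(0, max(1, f_bins - f))
    spec[f0:f0 + f, :] = 0.0

    # Zero out a random span of time steps.
    t = rng.integers(0, max_time_mask + 1)
    t0 = rng.integers(0, max(1, t_steps - t))
    spec[:, t0:t0 + t] = 0.0
    return spec
```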
- Omni-supervised Facial Expression Recognition via Distilled Data [120.11782405714234]
We propose omni-supervised learning to exploit reliable samples in a large amount of unlabeled data for network training.
To keep the resulting training set manageable, we apply a dataset distillation strategy to compress it into several informative class-wise images.
We experimentally verify that the new dataset can significantly improve the ability of the learned FER model.
arXiv Detail & Related papers (2020-05-18T09:36:51Z)
This list is automatically generated from the titles and abstracts of the papers on this site.