Enhancing Facial Expression Recognition through Dual-Direction Attention Mixed Feature Networks and CLIP: Application to 8th ABAW Challenge
- URL: http://arxiv.org/abs/2503.12260v1
- Date: Sat, 15 Mar 2025 21:03:03 GMT
- Title: Enhancing Facial Expression Recognition through Dual-Direction Attention Mixed Feature Networks and CLIP: Application to 8th ABAW Challenge
- Authors: Josep Cabacas-Maso, Elena Ortega-Beltrán, Ismael Benito-Altamirano, Carles Ventura
- Abstract summary: We present our contribution to the 8th ABAW challenge at CVPR 2025. We tackle valence-arousal estimation, emotion recognition, and facial action unit detection as three independent challenges. Our approach leverages the well-known Dual-Direction Attention Mixed Feature Network (DDAMFN) for all three tasks, achieving results that surpass the proposed baselines.
- Score: 1.0374615809135401
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We present our contribution to the 8th ABAW challenge at CVPR 2025, where we tackle valence-arousal estimation, emotion recognition, and facial action unit detection as three independent challenges. Our approach leverages the well-known Dual-Direction Attention Mixed Feature Network (DDAMFN) for all three tasks, achieving results that surpass the proposed baselines. Additionally, we explore the use of CLIP for the emotion recognition challenge as an additional experiment. We provide insights into the architectural choices that contribute to the strong performance of our methods.
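The abstract mentions exploring CLIP for the emotion recognition challenge. CLIP's standard zero-shot recipe compares an image embedding against text-prompt embeddings by cosine similarity; the sketch below illustrates only that mechanism, with random vectors standing in for CLIP's image and text encoders (no CLIP model is loaded, and the class list follows the 8-class ABAW expression taxonomy, not necessarily the authors' exact setup).

```python
import numpy as np

# The 8 expression classes used in the ABAW expression challenge.
CLASSES = ["neutral", "anger", "disgust", "fear",
           "happiness", "sadness", "surprise", "other"]

def zero_shot_classify(image_emb, text_embs):
    """Pick the class whose text embedding has the highest cosine
    similarity with the image embedding (CLIP's zero-shot rule)."""
    image_emb = image_emb / np.linalg.norm(image_emb)
    text_embs = text_embs / np.linalg.norm(text_embs, axis=1, keepdims=True)
    logits = text_embs @ image_emb  # one similarity score per class
    return CLASSES[int(np.argmax(logits))]

# Stand-ins for encoded prompts like "a photo of a happy face".
rng = np.random.default_rng(0)
text_embs = rng.standard_normal((len(CLASSES), 512))
# Fake image embedding lying close to the "happiness" prompt.
image_emb = text_embs[4] + 0.1 * rng.standard_normal(512)
print(zero_shot_classify(image_emb, text_embs))  # → happiness
```

In a real pipeline the embeddings would come from CLIP's image and text encoders; everything else (normalisation, dot product, argmax) is exactly this computation.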
Related papers
- Design of an Expression Recognition Solution Based on the Global Channel-Spatial Attention Mechanism and Proportional Criterion Fusion [11.506800500772734]
This paper aims to introduce the method we will adopt in the 8th Affective and Behavioral Analysis in the Wild (ABAW) Competition.
Based on the residual hybrid convolutional neural network and the multi-branch convolutional neural network respectively, we design feature extraction models for image and audio sequences.
In the facial expression recognition task of the 8th ABAW Competition, our method ranked third on the official validation set.
arXiv Detail & Related papers (2025-03-15T00:59:34Z) - Beyond the Destination: A Novel Benchmark for Exploration-Aware Embodied Question Answering [87.76784654371312]
Embodied Question Answering requires agents to dynamically explore 3D environments, actively gather visual information, and perform multi-step reasoning to answer questions. Existing datasets often introduce biases or prior knowledge, leading to disembodied reasoning. We construct the largest dataset designed specifically to evaluate both exploration and reasoning capabilities.
arXiv Detail & Related papers (2025-03-14T06:29:47Z) - Enhancing Facial Expression Recognition through Dual-Direction Attention Mixed Feature Networks: Application to 7th ABAW Challenge [1.0374615809135401]
We present our contribution to the 7th ABAW challenge at ECCV 2024.
By utilizing a Dual-Direction Attention Mixed Feature Network (DDAMFN) for multitask facial expression recognition, we achieve results far beyond the proposed baseline for the Multi-Task ABAW challenge.
arXiv Detail & Related papers (2024-07-17T08:11:37Z) - Facial Affect Recognition based on Multi Architecture Encoder and Feature Fusion for the ABAW7 Challenge [9.638373386602874]
We present our approach to addressing the challenges of the 7th ABAW competition.
The competition comprises three sub-challenges: Valence Arousal (VA) estimation, Expression (Expr) classification, and Action Unit (AU) detection.
arXiv Detail & Related papers (2024-07-17T02:01:34Z) - Affective Behavior Analysis using Task-adaptive and AU-assisted Graph Network [18.304164382834617]
We present our solution and experimental results for the Multi-Task Learning Challenge of the 7th Affective Behavior Analysis in-the-wild (ABAW7) Competition.
This challenge consists of three tasks: action unit detection, facial expression recognition, and valence-arousal estimation.
arXiv Detail & Related papers (2024-07-16T12:33:22Z) - Facial Affective Behavior Analysis with Instruction Tuning [58.332959295770614]
Facial affective behavior analysis (FABA) is crucial for understanding human mental states from images.
Traditional approaches primarily deploy models to discriminate among discrete emotion categories, and lack the fine granularity and reasoning capability for complex facial behaviors.
We introduce an instruction-following dataset for two FABA tasks, emotion and action unit recognition, and a benchmark FABA-Bench with a new metric considering both recognition and generation ability.
We also introduce a facial prior expert module with face structure knowledge and a low-rank adaptation module into pre-trained MLLM.
arXiv Detail & Related papers (2024-04-07T19:23:28Z) - The 6th Affective Behavior Analysis in-the-wild (ABAW) Competition [53.718777420180395]
This paper describes the 6th Affective Behavior Analysis in-the-wild (ABAW) Competition.
The 6th ABAW Competition addresses contemporary challenges in understanding human emotions and behaviors.
arXiv Detail & Related papers (2024-02-29T16:49:38Z) - Learning Diversified Feature Representations for Facial Expression Recognition in the Wild [97.14064057840089]
We propose a mechanism to diversify the features extracted by CNN layers of state-of-the-art facial expression recognition architectures.
Experimental results on three well-known facial expression recognition in-the-wild datasets, AffectNet, FER+, and RAF-DB, show the effectiveness of our method.
arXiv Detail & Related papers (2022-10-17T19:25:28Z) - Prior Aided Streaming Network for Multi-task Affective Recognition at the 2nd ABAW2 Competition [9.188777864190204]
We introduce our submission to the 2nd Affective Behavior Analysis in-the-wild (ABAW2) Competition.
In dealing with different emotion representations, we propose a multi-task streaming network.
We leverage an advanced facial expression embedding as prior knowledge.
arXiv Detail & Related papers (2021-07-08T09:35:08Z) - $M^3$T: Multi-Modal Continuous Valence-Arousal Estimation in the Wild [86.40973759048957]
This report describes a multi-modal multi-task ($M^3$T) approach underlying our submission to the valence-arousal estimation track of the Affective Behavior Analysis in-the-wild (ABAW) Challenge.
In the proposed $M^3$T framework, we fuse both visual features from videos and acoustic features from the audio tracks to estimate the valence and arousal.
We evaluated the $M^3$T framework on the validation set provided by ABAW, and it significantly outperforms the baseline method.
arXiv Detail & Related papers (2020-02-07T18:53:13Z) - Analysing Affective Behavior in the First ABAW 2020 Competition [49.90617840789334]
The Affective Behavior Analysis in-the-wild (ABAW) 2020 Competition is the first Competition aiming at automatic analysis of the three main behavior tasks.
We describe this Competition, to be held in conjunction with the IEEE Conference on Face and Gesture Recognition, May 2020, in Buenos Aires, Argentina.
We outline the evaluation metrics, present both the baseline system and the top-3 performing teams' methodologies per Challenge and finally present their obtained results.
arXiv Detail & Related papers (2020-01-30T15:41:14Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of this information and is not responsible for any consequences of its use.