Facial Affect Recognition based on Multi Architecture Encoder and Feature Fusion for the ABAW7 Challenge
- URL: http://arxiv.org/abs/2407.12258v2
- Date: Fri, 26 Jul 2024 08:42:10 GMT
- Title: Facial Affect Recognition based on Multi Architecture Encoder and Feature Fusion for the ABAW7 Challenge
- Authors: Kang Shen, Xuxiong Liu, Boyan Wang, Jun Yao, Xin Liu, Yujie Guan, Yu Wang, Gengchen Li, Xiao Sun
- Abstract summary: We present our approach to addressing the challenges of the 7th ABAW competition.
The competition comprises three sub-challenges: Valence Arousal (VA) estimation, Expression (Expr) classification, and Action Unit (AU) detection.
- Score: 9.638373386602874
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In this paper, we present our approach to addressing the challenges of the 7th ABAW competition. The competition comprises three sub-challenges: Valence Arousal (VA) estimation, Expression (Expr) classification, and Action Unit (AU) detection. To tackle these challenges, we employ state-of-the-art models to extract powerful visual features. Subsequently, a Transformer Encoder is utilized to integrate these features for the VA, Expr, and AU sub-challenges. To mitigate the impact of varying feature dimensions, we introduce an affine module to align the features to a common dimension. Overall, our results significantly outperform the baselines.
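The pipeline the abstract describes — align heterogeneous backbone features with an affine module, then fuse them with a Transformer Encoder — can be sketched as follows. This is a minimal numpy illustration under stated assumptions, not the authors' implementation: the feature dimensions are hypothetical, and a single-head self-attention step stands in for the full Transformer Encoder.

```python
import numpy as np

def affine_align(x, W, b):
    """Affine module: project features (T, d_in) to a common dimension."""
    return x @ W + b

def self_attention(X):
    """Single-head scaled dot-product self-attention over fused tokens (T, d),
    standing in for the paper's Transformer Encoder."""
    d = X.shape[-1]
    scores = X @ X.T / np.sqrt(d)
    scores = scores - scores.max(axis=-1, keepdims=True)  # numerical stability
    attn = np.exp(scores)
    attn = attn / attn.sum(axis=-1, keepdims=True)
    return attn @ X

rng = np.random.default_rng(0)
common_dim = 8
# two hypothetical visual feature streams with mismatched dimensions
feat_a = rng.normal(size=(4, 16))  # 4 tokens, 16-dim backbone features
feat_b = rng.normal(size=(4, 32))  # 4 tokens, 32-dim backbone features

Wa, ba = rng.normal(size=(16, common_dim)), np.zeros(common_dim)
Wb, bb = rng.normal(size=(32, common_dim)), np.zeros(common_dim)

# align both streams, concatenate as one token sequence, then fuse
fused = np.concatenate([affine_align(feat_a, Wa, ba),
                        affine_align(feat_b, Wb, bb)], axis=0)
out = self_attention(fused)
```

The point of the affine step is visible in the shapes: 16- and 32-dim streams both land in the 8-dim common space, so they can be concatenated and attended over jointly.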
Related papers
- Emotion Recognition with CLIP and Sequential Learning [5.66758879852618]
We present our innovative methodology for tackling the Valence-Arousal (VA) Estimation Challenge, the Expression Recognition Challenge, and the Action Unit (AU) Detection Challenge.
Our approach introduces a novel framework aimed at enhancing continuous emotion recognition.
arXiv Detail & Related papers (2025-03-13T01:02:06Z) - Affective Behavior Analysis using Task-adaptive and AU-assisted Graph Network [18.304164382834617]
We present our solution and experiment results for the Multi-Task Learning Challenge of the 7th Affective Behavior Analysis in-the-wild (ABAW7) Competition.
This challenge consists of three tasks: action unit detection, facial expression recognition, and valence-arousal estimation.
arXiv Detail & Related papers (2024-07-16T12:33:22Z) - Deep Content Understanding Toward Entity and Aspect Target Sentiment Analysis on Foundation Models [0.8602553195689513]
Entity-Aspect Sentiment Triplet Extraction (EASTE) is a novel Aspect-Based Sentiment Analysis task.
Our research aims to achieve high performance on the EASTE task and investigates the impact of model size, type, and adaptation techniques on task performance.
Ultimately, we provide detailed insights and achieve state-of-the-art results in complex sentiment analysis.
arXiv Detail & Related papers (2024-07-04T16:48:14Z) - The 6th Affective Behavior Analysis in-the-wild (ABAW) Competition [53.718777420180395]
This paper describes the 6th Affective Behavior Analysis in-the-wild (ABAW) Competition.
The 6th ABAW Competition addresses contemporary challenges in understanding human emotions and behaviors.
arXiv Detail & Related papers (2024-02-29T16:49:38Z) - CONTRASTE: Supervised Contrastive Pre-training With Aspect-based Prompts For Aspect Sentiment Triplet Extraction [13.077459544929598]
We present a novel pre-training strategy using CONTRastive learning to enhance the ASTE performance.
We also demonstrate the advantage of our proposed technique on other ABSA tasks such as ACOS, TASD, and AESC.
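CONTRASTE's exact objective is not reproduced here, but the generic supervised contrastive loss it builds on can be sketched. In the numpy sketch below, all embeddings, labels, and dimensions are hypothetical; each anchor is pulled toward samples sharing its label and pushed from the rest.

```python
import numpy as np

def supcon_loss(z, labels, tau=0.1):
    """Supervised contrastive loss over embeddings z (N, d):
    for each anchor, positives are the other samples with the same label."""
    z = z / np.linalg.norm(z, axis=1, keepdims=True)   # L2-normalize
    sim = z @ z.T / tau
    n = len(z)
    eye = np.eye(n, dtype=bool)
    sim = np.where(eye, -np.inf, sim)                  # exclude self-pairs
    # row-wise log-softmax over all other samples (stable logsumexp)
    m = sim.max(axis=1, keepdims=True)
    log_prob = sim - (m + np.log(np.exp(sim - m).sum(axis=1, keepdims=True)))
    labels = np.asarray(labels)
    pos = (labels[:, None] == labels[None, :]) & ~eye  # positive-pair mask
    per_anchor = -np.where(pos, log_prob, 0.0).sum(1) / np.maximum(pos.sum(1), 1)
    return per_anchor[pos.sum(1) > 0].mean()

rng = np.random.default_rng(0)
z = rng.normal(size=(6, 16))   # hypothetical pre-training embeddings
labels = [0, 0, 1, 1, 2, 2]    # hypothetical aspect-sentiment pseudo-classes
loss = supcon_loss(z, labels)
```

The loss is a negative log-probability of the positives, so it is always positive and shrinks as same-label embeddings cluster together.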
arXiv Detail & Related papers (2023-10-24T07:40:09Z) - LAMBO: Large AI Model Empowered Edge Intelligence [71.56135386994119]
Next-generation edge intelligence is anticipated to benefit various applications via offloading techniques.
Traditional offloading architectures face several issues, including heterogeneous constraints, partial perception, uncertain generalization, and lack of tractability.
We propose a Large AI Model-Based Offloading (LAMBO) framework with over one billion parameters for solving these problems.
arXiv Detail & Related papers (2023-08-29T07:25:42Z) - Facial Affect Recognition based on Transformer Encoder and Audiovisual Fusion for the ABAW5 Challenge [10.88275919652131]
We present our solutions for four sub-challenges of Valence-Arousal (VA) Estimation, Expression (Expr) Classification, Action Unit (AU) Detection and Emotional Reaction Intensity (ERI) Estimation.
The 5th ABAW competition focuses on facial affect recognition utilizing different modalities and datasets.
arXiv Detail & Related papers (2023-03-16T08:47:36Z) - Part-guided Relational Transformers for Fine-grained Visual Recognition [59.20531172172135]
We propose a framework to learn the discriminative part features and explore correlations with a feature transformation module.
Our proposed approach does not rely on additional part branches and reaches state-of-the-art performance on three fine-grained object recognition benchmarks.
arXiv Detail & Related papers (2022-12-28T03:45:56Z) - Adversarial Feature Augmentation and Normalization for Visual Recognition [109.6834687220478]
Recent advances in computer vision take advantage of adversarial data augmentation to ameliorate the generalization ability of classification models.
Here, we present an effective and efficient alternative that advocates adversarial augmentation on intermediate feature embeddings.
We validate the proposed approach across diverse visual recognition tasks with representative backbone networks.
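The shift this abstract describes is from perturbing pixels to perturbing intermediate embeddings. A minimal FGSM-style sketch in feature space, assuming a hypothetical linear softmax head so the cross-entropy gradient has a closed form (the paper's actual augmentation scheme is not reproduced here):

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def adversarial_features(f, W, y_onehot, eps=0.1):
    """FGSM-style perturbation applied to intermediate embeddings f.
    For a linear softmax head (logits = f @ W), the cross-entropy
    gradient w.r.t. the features is (p - y) @ W.T."""
    p = softmax(f @ W)
    grad = (p - y_onehot) @ W.T
    return f + eps * np.sign(grad)  # ascend the loss in feature space

rng = np.random.default_rng(0)
f = rng.normal(size=(4, 8))   # hypothetical intermediate embeddings (batch, dim)
W = rng.normal(size=(8, 3))   # hypothetical linear classifier head
y = np.eye(3)[[0, 1, 2, 0]]   # one-hot labels
f_adv = adversarial_features(f, W, y)
```

Training then mixes `f` and `f_adv` so the classifier sees loss-increasing feature perturbations, which is cheaper than crafting adversarial images.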
arXiv Detail & Related papers (2021-03-22T20:36:34Z) - Hierarchical Variational Autoencoder for Visual Counterfactuals [79.86967775454316]
Conditional Variational Autoencoders (VAEs) are gathering significant attention as an Explainable Artificial Intelligence (XAI) tool.
In this paper we show how relaxing the effect of the posterior leads to successful counterfactuals.
We introduce VAEX, a Hierarchical VAE designed for this approach that can visually audit a classifier in applications.
arXiv Detail & Related papers (2021-02-01T14:07:11Z) - Generative Partial Visual-Tactile Fused Object Clustering [81.17645983141773]
We propose a Generative Partial Visual-Tactile Fused (i.e., GPVTF) framework for object clustering.
A conditional cross-modal clustering generative adversarial network is then developed to synthesize one modality conditioning on the other modality.
To this end, two pseudo-label based KL-divergence losses are employed to update the corresponding modality-specific encoders.
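A minimal sketch of a pseudo-label KL objective of this kind, assuming (hypothetically) that each modality encoder outputs soft cluster assignments and is supervised by sharpened assignments from the other modality; the actual GPVTF losses are not reproduced here:

```python
import numpy as np

def kl_div(p, q, eps=1e-12):
    """Mean row-wise KL(p || q) between soft cluster assignments."""
    p, q = p + eps, q + eps
    return (p * np.log(p / q)).sum(axis=-1).mean()

def sharpen(p, T=0.5):
    """Sharpen soft assignments into pseudo-label targets (lower T = harder)."""
    p = p ** (1.0 / T)
    return p / p.sum(axis=-1, keepdims=True)

# hypothetical soft cluster assignments from the two modality encoders
visual = np.array([[0.7, 0.2, 0.1],
                   [0.1, 0.8, 0.1]])
tactile = np.array([[0.6, 0.3, 0.1],
                    [0.2, 0.6, 0.2]])

# each encoder is pulled toward the sharpened pseudo-labels of the other modality
loss_v = kl_div(sharpen(tactile), visual)
loss_t = kl_div(sharpen(visual), tactile)
```

The two losses are symmetric in role: each pushes one modality-specific encoder toward cluster structure discovered in the other, which is what lets partial observations in one modality borrow supervision from the other.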
arXiv Detail & Related papers (2020-12-28T02:37:03Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the content (including all information) and is not responsible for any consequences of its use.