ExpLLM: Towards Chain of Thought for Facial Expression Recognition
- URL: http://arxiv.org/abs/2409.02828v1
- Date: Wed, 4 Sep 2024 15:50:16 GMT
- Title: ExpLLM: Towards Chain of Thought for Facial Expression Recognition
- Authors: Xing Lan, Jian Xue, Ji Qi, Dongmei Jiang, Ke Lu, Tat-Seng Chua
- Abstract summary: We propose a novel method called ExpLLM to generate an accurate chain of thought (CoT) for facial expression recognition.
Specifically, we have designed the CoT mechanism from three key perspectives: key observations, overall emotional interpretation, and conclusion.
In experiments on the RAF-DB and AffectNet datasets, ExpLLM outperforms current state-of-the-art FER methods.
- Score: 61.49849866937758
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Facial expression recognition (FER) is a critical task in multimedia with significant implications across various domains. However, analyzing the causes of facial expressions is essential for accurately recognizing them. Current approaches, such as those based on facial action units (AUs), typically provide AU names and intensities but lack insight into the interactions and relationships between AUs and the overall expression. In this paper, we propose a novel method called ExpLLM, which leverages large language models to generate an accurate chain of thought (CoT) for facial expression recognition. Specifically, we have designed the CoT mechanism from three key perspectives: key observations, overall emotional interpretation, and conclusion. The key observations describe the AU's name, intensity, and associated emotions. The overall emotional interpretation provides an analysis based on multiple AUs and their interactions, identifying the dominant emotions and their relationships. Finally, the conclusion presents the final expression label derived from the preceding analysis. Furthermore, we also introduce the Exp-CoT Engine, designed to construct this expression CoT and generate instruction-description data for training our ExpLLM. Extensive experiments on the RAF-DB and AffectNet datasets demonstrate that ExpLLM outperforms current state-of-the-art FER methods. ExpLLM also surpasses the latest GPT-4o in expression CoT generation, particularly in recognizing micro-expressions where GPT-4o frequently fails.
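As a rough illustration of the three-part CoT described in the abstract (key observations, overall emotional interpretation, conclusion), the sketch below shows one way such a chain could be held as structured data. The class names, field names, and example values are illustrative assumptions, not the schema actually used by ExpLLM or the Exp-CoT Engine.

```python
from dataclasses import dataclass
from typing import List

@dataclass
class KeyObservation:
    au_name: str                     # e.g. "AU12 (Lip Corner Puller)"
    intensity: str                   # e.g. "slight", "marked"
    associated_emotions: List[str]   # emotions this AU typically signals

@dataclass
class ExpressionCoT:
    key_observations: List[KeyObservation]  # step 1: per-AU evidence
    overall_interpretation: str              # step 2: AU interactions, dominant emotions
    conclusion: str                          # step 3: final expression label

# Hypothetical example for a smiling face.
example = ExpressionCoT(
    key_observations=[
        KeyObservation("AU6 (Cheek Raiser)", "marked", ["happiness"]),
        KeyObservation("AU12 (Lip Corner Puller)", "marked", ["happiness"]),
    ],
    overall_interpretation="AU6 and AU12 co-occur, jointly indicating a genuine smile "
                           "with no conflicting units suggesting negative affect.",
    conclusion="happiness",
)
print(example.conclusion)
```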
Related papers
- Smile upon the Face but Sadness in the Eyes: Emotion Recognition based on Facial Expressions and Eye Behaviors [63.194053817609024]
We introduce eye behaviors as important emotional cues for the creation of a new Eye-behavior-aided Multimodal Emotion Recognition (EMER) dataset.
For the first time, we provide annotations for both Emotion Recognition (ER) and Facial Expression Recognition (FER) in the EMER dataset.
We specifically design a new EMERT architecture to concurrently enhance performance in both ER and FER.
arXiv Detail & Related papers (2024-11-08T04:53:55Z)
- EmoVIT: Revolutionizing Emotion Insights with Visual Instruction Tuning [26.95442405140093]
We focus on enhancing the model's proficiency in understanding and adhering to instructions related to emotional contexts.
We introduce a novel GPT-assisted pipeline for generating emotion visual instruction data.
Our proposed EmoVIT architecture incorporates emotion-specific instruction data, leveraging the powerful capabilities of Large Language Models.
arXiv Detail & Related papers (2024-04-25T15:15:36Z)
- A Hybrid Approach To Aspect Based Sentiment Analysis Using Transfer Learning [3.30307212568497]
We propose a hybrid approach for Aspect Based Sentiment Analysis using transfer learning.
The approach focuses on generating weakly-supervised annotations by exploiting the strengths of both large language models (LLM) and traditional syntactic dependencies.
arXiv Detail & Related papers (2024-03-25T23:02:33Z)
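As a loose sketch of the hybrid idea in the entry above, the snippet below uses spaCy dependency relations to propose aspect terms and a stubbed-out LLM call to attach sentiment labels. The function names and the amod-based heuristic are assumptions for illustration, not the paper's actual pipeline, and the snippet assumes the en_core_web_sm model is installed.

```python
import spacy

nlp = spacy.load("en_core_web_sm")  # assumes the small English model is installed

def llm_sentiment(aspect: str, sentence: str) -> str:
    """Stand-in for an LLM call that labels the sentiment expressed toward `aspect`."""
    return "positive"  # placeholder label

def weak_aspect_labels(review: str):
    """Propose (aspect, sentiment) pairs from syntactic cues plus the LLM stub."""
    doc = nlp(review)
    pairs = []
    for token in doc:
        # Adjectival modifiers of nouns are a classic syntactic cue for aspect terms,
        # e.g. "dim screen" -> aspect "screen".
        if token.dep_ == "amod" and token.head.pos_ == "NOUN":
            aspect = token.head.text
            pairs.append((aspect, llm_sentiment(aspect, review)))
    return pairs

print(weak_aspect_labels("A great camera but a dim screen."))
```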
- Understanding Before Recommendation: Semantic Aspect-Aware Review Exploitation via Large Language Models [53.337728969143086]
Recommendation systems harness user-item interactions like clicks and reviews to learn their representations.
Previous studies improve recommendation accuracy and interpretability by modeling user preferences across various aspects and intents.
We introduce a chain-based prompting approach to uncover semantic aspect-aware interactions.
arXiv Detail & Related papers (2023-12-26T15:44:09Z)
- CLIPER: A Unified Vision-Language Framework for In-the-Wild Facial Expression Recognition [1.8604727699812171]
We propose a unified framework for both static and dynamic facial expression recognition based on CLIP.
We introduce multiple expression text descriptors (METD) to learn fine-grained expression representations that make CLIPER more interpretable.
arXiv Detail & Related papers (2023-03-01T02:59:55Z)
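The snippet below is a minimal numpy sketch of the multiple-text-descriptor idea: each expression class gets several descriptor embeddings and is scored by its mean cosine similarity to the image embedding. The embeddings are random placeholders standing in for CLIP-style encoder outputs, so the numbers are meaningless; only the scoring scheme is illustrated.

```python
import numpy as np

rng = np.random.default_rng(0)
dim = 512  # placeholder embedding width

# Several descriptor embeddings per expression class (e.g. "a smiling face",
# "raised cheeks", ...). Random stand-ins for text-encoder outputs.
descriptors = {
    "happiness": rng.normal(size=(3, dim)),
    "sadness":   rng.normal(size=(3, dim)),
    "surprise":  rng.normal(size=(3, dim)),
}
image_embedding = rng.normal(size=dim)  # stand-in for an image-encoder output

def cosine(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

# Class score = mean similarity over that class's descriptors.
scores = {
    label: float(np.mean([cosine(image_embedding, d) for d in descs]))
    for label, descs in descriptors.items()
}
print(max(scores, key=scores.get), scores)
```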
- Stimuli-Aware Visual Emotion Analysis [75.68305830514007]
We propose a stimuli-aware visual emotion analysis (VEA) method consisting of three stages, namely stimuli selection, feature extraction and emotion prediction.
To the best of our knowledge, this is the first work to introduce a stimuli selection process into VEA in an end-to-end network.
Experiments demonstrate that the proposed method consistently outperforms the state-of-the-art approaches on four public visual emotion datasets.
arXiv Detail & Related papers (2021-09-04T08:14:52Z)
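As a rough skeleton of the three stages named in the entry above (stimuli selection, feature extraction, emotion prediction), the snippet below wires placeholder functions into a pipeline. Every function body is a toy stand-in; the actual method is an end-to-end network.

```python
from typing import Dict, List
import numpy as np

def select_stimuli(image: np.ndarray) -> List[np.ndarray]:
    """Pick emotion-evoking regions (faces, salient objects, ...) from the image."""
    return [image]  # toy: treat the whole image as a single stimulus

def extract_features(stimuli: List[np.ndarray]) -> np.ndarray:
    """Encode each stimulus; a CNN backbone would normally sit here."""
    return np.stack([s.mean(axis=(0, 1)) for s in stimuli])  # toy per-channel means

def predict_emotion(features: np.ndarray) -> Dict[str, float]:
    """Map aggregated features to emotion scores; a classifier head in practice."""
    logits = features.mean(axis=0)[:2]
    probs = np.exp(logits) / np.exp(logits).sum()
    return {"positive": float(probs[0]), "negative": float(probs[1])}

image = np.random.rand(224, 224, 3)
print(predict_emotion(extract_features(select_stimuli(image))))
```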
- AU-Expression Knowledge Constrained Representation Learning for Facial Expression Recognition [79.8779790682205]
We propose an AU-Expression Knowledge Constrained Representation Learning (AUE-CRL) framework to learn AU representations without AU annotations and to adaptively use these representations to facilitate facial expression recognition.
We conduct experiments on the challenging uncontrolled datasets to demonstrate the superiority of the proposed framework over current state-of-the-art methods.
arXiv Detail & Related papers (2020-12-29T03:42:04Z)
- RAF-AU Database: In-the-Wild Facial Expressions with Subjective Emotion Judgement and Objective AU Annotations [36.93475723886278]
We develop the RAF-AU database, which employs a sign-based (i.e., AUs) and judgement-based (i.e., perceived emotion) approach to annotating blended facial expressions in the wild.
We also conduct a preliminary investigation of which key AUs contribute most to a perceived emotion, and the relationship between AUs and facial expressions.
arXiv Detail & Related papers (2020-08-12T09:29:16Z)
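For context on how AUs map onto perceived emotions, the snippet below scores emotions against commonly cited EMFACS-style prototype AU combinations. These are approximate textbook prototypes, not the mapping reported by the RAF-AU study itself.

```python
# Commonly cited EMFACS-style prototypes (approximate), keyed by AU number.
EMOTION_PROTOTYPES = {
    "happiness": {6, 12},        # cheek raiser + lip corner puller
    "sadness":   {1, 4, 15},     # inner brow raiser + brow lowerer + lip corner depressor
    "surprise":  {1, 2, 5, 26},  # brow raisers + upper lid raiser + jaw drop
    "anger":     {4, 5, 7, 23},  # brow lowerer + lid raiser/tightener + lip tightener
}

def score_emotions(observed_aus: set) -> dict:
    """Rank emotions by how much of each prototype the observed AUs cover."""
    return {
        emotion: len(observed_aus & proto) / len(proto)
        for emotion, proto in EMOTION_PROTOTYPES.items()
    }

print(score_emotions({6, 12, 25}))  # e.g. AUs detected on a smiling face
```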
- ICE-GAN: Identity-aware and Capsule-Enhanced GAN with Graph-based Reasoning for Micro-Expression Recognition and Synthesis [26.414187427071063]
We propose a novel Identity-aware and Capsule-Enhanced Generative Adversarial Network with graph-based reasoning (ICE-GAN).
The generator produces synthetic faces with controllable micro-expressions and identity-aware features, whose long-range dependencies are captured through the graph reasoning module (GRM).
Our ICE-GAN was evaluated on the Micro-Expression Grand Challenge 2019 (MEGC 2019), achieving a significant improvement (12.9%) over the winner and surpassing other state-of-the-art methods.
arXiv Detail & Related papers (2020-05-09T05:37:44Z)
- Learning to Augment Expressions for Few-shot Fine-grained Facial Expression Recognition [98.83578105374535]
We present a novel Fine-grained Facial Expression Database - F2ED.
It includes more than 200k images with 54 facial expressions from 119 persons.
Considering that uneven data distribution and a lack of samples are common in real-world scenarios, we evaluate several few-shot expression learning tasks.
We propose a unified task-driven framework, the Compositional Generative Adversarial Network (Comp-GAN), which learns to synthesize facial images.
arXiv Detail & Related papers (2020-01-17T03:26:32Z)
This list is automatically generated from the titles and abstracts of the papers on this site.