Interpretable Image Emotion Recognition: A Domain Adaptation Approach Using Facial Expressions
- URL: http://arxiv.org/abs/2011.08388v3
- Date: Wed, 28 Aug 2024 22:05:07 GMT
- Title: Interpretable Image Emotion Recognition: A Domain Adaptation Approach Using Facial Expressions
- Authors: Puneet Kumar, Balasubramanian Raman
- Abstract summary: This paper proposes a feature-based domain adaptation technique for identifying emotions in generic images.
It addresses the challenge of the limited availability of pre-trained models and well-annotated datasets for Image Emotion Recognition (IER).
The proposed IER system demonstrated emotion classification accuracies of 60.98% for the IAPSa dataset, 58.86% for the ArtPhoto dataset, 69.13% for the FI dataset, and 58.06% for the EMOTIC dataset.
- Score: 11.808447247077902
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: This paper proposes a feature-based domain adaptation technique for identifying emotions in generic images, encompassing both facial and non-facial objects, as well as non-human components. This approach addresses the challenge of the limited availability of pre-trained models and well-annotated datasets for Image Emotion Recognition (IER). Initially, a deep-learning-based Facial Expression Recognition (FER) system is developed, classifying facial images into discrete emotion classes. Maintaining the same network architecture, this FER system is then adapted to recognize emotions in generic images through the application of discrepancy loss, enabling the model to effectively learn IER features while classifying emotions into categories such as 'happy,' 'sad,' 'hate,' and 'anger.' Additionally, a novel interpretability method, Divide and Conquer based Shap (DnCShap), is introduced to elucidate the visual features most relevant for emotion recognition. The proposed IER system demonstrated emotion classification accuracies of 60.98% for the IAPSa dataset, 58.86% for the ArtPhoto dataset, 69.13% for the FI dataset, and 58.06% for the EMOTIC dataset. The system effectively identifies the important visual features leading to specific emotion classifications and provides detailed embedding plots to explain the predictions, enhancing the understanding and trust in AI-driven emotion recognition systems.
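As a rough illustration of the adaptation idea described in the abstract, the sketch below trains one network on labeled facial (FER) and generic (IER) images while adding a discrepancy term that pulls the two feature distributions together. The tiny architecture, the linear-kernel MMD used as the discrepancy term, and all names are illustrative assumptions, not the authors' exact implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def mmd_discrepancy(src_feat, tgt_feat):
    """Linear-kernel MMD between mean source and target features.
    Used here as a stand-in for the paper's discrepancy loss."""
    return (src_feat.mean(dim=0) - tgt_feat.mean(dim=0)).pow(2).sum()

class EmotionNet(nn.Module):
    """Shared backbone + classifier; the same architecture is reused for FER and IER."""
    def __init__(self, num_classes=4):  # e.g. happy, sad, hate, anger
        super().__init__()
        self.backbone = nn.Sequential(
            nn.Conv2d(3, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
        )
        self.classifier = nn.Linear(64, num_classes)

    def forward(self, x):
        feat = self.backbone(x)
        return feat, self.classifier(feat)

model = EmotionNet()
opt = torch.optim.Adam(model.parameters(), lr=1e-4)

def adaptation_step(face_imgs, face_labels, generic_imgs, generic_labels, lam=0.5):
    """One training step: supervised losses on both domains plus the discrepancy loss."""
    src_feat, src_logits = model(face_imgs)
    tgt_feat, tgt_logits = model(generic_imgs)
    loss = (F.cross_entropy(src_logits, face_labels)
            + F.cross_entropy(tgt_logits, generic_labels)
            + lam * mmd_discrepancy(src_feat, tgt_feat))
    opt.zero_grad(); loss.backward(); opt.step()
    return loss.item()

# Dummy usage with random tensors in place of real FER / IER batches.
faces, face_y = torch.randn(8, 3, 64, 64), torch.randint(0, 4, (8,))
generic, generic_y = torch.randn(8, 3, 64, 64), torch.randint(0, 4, (8,))
adaptation_step(faces, face_y, generic, generic_y)
```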
Related papers
- Emotion Detection through Body Gesture and Face [0.0]
The project addresses the challenge of emotion recognition by focusing on non-facial cues, specifically hand and body gestures.
Traditional emotion recognition systems mainly rely on facial expression analysis and often ignore the rich emotional information conveyed through body language.
The project aims to contribute to the field of affective computing by enhancing the ability of machines to interpret and respond to human emotions in a more comprehensive and nuanced way.
arXiv Detail & Related papers (2024-07-13T15:15:50Z)
- Alleviating Catastrophic Forgetting in Facial Expression Recognition with Emotion-Centered Models [49.3179290313959]
The proposed method, emotion-centered generative replay (ECgr), tackles this challenge by integrating synthetic images from generative adversarial networks.
ECgr incorporates a quality assurance algorithm to ensure the fidelity of generated images.
The experimental results on four diverse facial expression datasets demonstrate that incorporating images generated by our pseudo-rehearsal method improves training on both the target and the source datasets.
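A minimal sketch of a pseudo-rehearsal step of this kind is shown below, assuming a conditional generator trained on the earlier dataset and a simple confidence filter standing in for the paper's quality assurance algorithm; every component here is a placeholder, not ECgr's actual design.

```python
import torch
import torch.nn.functional as F

@torch.no_grad()
def replay_batch(generator, old_classifier, n=32, z_dim=100, num_old_classes=7, conf_thresh=0.8):
    """Sample synthetic images of previously seen emotions and keep only those
    the old classifier recognizes confidently (a stand-in quality check)."""
    z = torch.randn(n, z_dim)
    labels = torch.randint(0, num_old_classes, (n,))
    fake = generator(z, labels)                      # placeholder conditional GAN
    probs = F.softmax(old_classifier(fake), dim=1)
    conf, pred = probs.max(dim=1)
    keep = (conf > conf_thresh) & (pred == labels)   # discard low-fidelity samples
    return fake[keep], labels[keep]

def rehearsal_step(model, opt, new_imgs, new_labels, generator, old_classifier):
    """Train on the new dataset while rehearsing replayed synthetic images."""
    replay_x, replay_y = replay_batch(generator, old_classifier)
    x = torch.cat([new_imgs, replay_x])
    y = torch.cat([new_labels, replay_y])
    loss = F.cross_entropy(model(x), y)
    opt.zero_grad(); loss.backward(); opt.step()
    return loss.item()
```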
arXiv Detail & Related papers (2024-04-18T15:28:34Z)
- Music Recommendation Based on Facial Emotion Recognition [0.0]
This paper presents a comprehensive approach to enhancing the user experience through the integration of emotion recognition, music recommendation, and explainable AI using GRAD-CAM.
The proposed methodology utilizes a ResNet50 model trained on the Facial Expression Recognition dataset, consisting of real images of individuals expressing various emotions.
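For reference, here is a minimal Grad-CAM sketch on a torchvision ResNet50, in the spirit of the explainability component mentioned above; it assumes expression-trained weights are loaded separately, and the paper's exact pipeline may differ.

```python
import torch
import torch.nn.functional as F
from torchvision import models

model = models.resnet50(weights=None)      # assume FER-trained weights are loaded here
model.eval()

feats, grads = {}, {}
def save_feat(m, i, o): feats["a"] = o
def save_grad(m, gi, go): grads["g"] = go[0]
model.layer4.register_forward_hook(save_feat)           # last conv stage
model.layer4.register_full_backward_hook(save_grad)

def grad_cam(img, class_idx=None):
    """Heatmap of regions driving the predicted emotion for a (1, 3, 224, 224) image."""
    logits = model(img)
    if class_idx is None:
        class_idx = logits.argmax(dim=1).item()
    model.zero_grad()
    logits[0, class_idx].backward()
    weights = grads["g"].mean(dim=(2, 3), keepdim=True)  # global-average-pooled gradients
    cam = F.relu((weights * feats["a"]).sum(dim=1, keepdim=True))
    cam = F.interpolate(cam, size=img.shape[-2:], mode="bilinear", align_corners=False)
    return (cam - cam.min()) / (cam.max() - cam.min() + 1e-8)
```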
arXiv Detail & Related papers (2024-04-06T15:14:25Z)
- StyleEDL: Style-Guided High-order Attention Network for Image Emotion Distribution Learning [69.06749934902464]
We propose a style-guided high-order attention network for image emotion distribution learning termed StyleEDL.
StyleEDL interactively learns stylistic-aware representations of images by exploring the hierarchical stylistic information of visual contents.
In addition, we introduce a stylistic graph convolutional network to dynamically generate the content-dependent emotion representations.
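A toy sketch of the "content-dependent" idea is given below: a graph convolution over per-emotion nodes whose adjacency is predicted from the image feature, so the emotion representations change per image. This is a loose stand-in, not StyleEDL's actual architecture; node count and dimensions are made up.

```python
import torch
import torch.nn as nn

class ContentDependentGCN(nn.Module):
    """Graph convolution over emotion nodes with an image-conditioned adjacency."""
    def __init__(self, num_emotions=8, dim=256):
        super().__init__()
        self.nodes = nn.Parameter(torch.randn(num_emotions, dim))    # one node per emotion
        self.adj_predictor = nn.Linear(dim, num_emotions * num_emotions)
        self.gc = nn.Linear(dim, dim)

    def forward(self, img_feat):
        # img_feat: (B, dim) global feature of the image
        B, E = img_feat.size(0), self.nodes.size(0)
        adj = torch.softmax(self.adj_predictor(img_feat).view(B, E, E), dim=-1)
        nodes = self.nodes.unsqueeze(0).expand(B, -1, -1)             # (B, E, dim)
        return torch.relu(adj @ self.gc(nodes))                       # content-dependent emotion reps
```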
arXiv Detail & Related papers (2023-08-06T03:22:46Z)
- SAFER: Situation Aware Facial Emotion Recognition [0.0]
We present SAFER, a novel system for emotion recognition from facial expressions.
It employs state-of-the-art deep learning techniques to extract various features from facial images.
It can adapt to unseen and varied facial expressions, making it suitable for real-world applications.
arXiv Detail & Related papers (2023-06-14T20:42:26Z)
- Interpretable Multimodal Emotion Recognition using Hybrid Fusion of Speech and Image Data [15.676632465869346]
A new interpretability technique has been developed to identify the important speech & image features leading to the prediction of particular emotion classes.
The proposed system has achieved 83.29% accuracy for emotion recognition.
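"Hybrid fusion" is commonly read as combining feature-level and decision-level fusion; the sketch below follows that generic reading and is not the authors' actual fusion design. Dimensions and heads are illustrative.

```python
import torch
import torch.nn as nn

class HybridFusion(nn.Module):
    """Early (feature concatenation) plus late (per-modality logit averaging) fusion."""
    def __init__(self, speech_dim=128, image_dim=512, num_classes=4):
        super().__init__()
        self.speech_head = nn.Linear(speech_dim, num_classes)
        self.image_head = nn.Linear(image_dim, num_classes)
        self.joint_head = nn.Linear(speech_dim + image_dim, num_classes)

    def forward(self, speech_feat, image_feat):
        early = self.joint_head(torch.cat([speech_feat, image_feat], dim=-1))
        late = (self.speech_head(speech_feat) + self.image_head(image_feat)) / 2
        return (early + late) / 2             # combine early- and late-fusion predictions
```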
arXiv Detail & Related papers (2022-08-25T04:43:34Z)
- CIAO! A Contrastive Adaptation Mechanism for Non-Universal Facial Expression Recognition [80.07590100872548]
We propose Contrastive Inhibitory AdaptatiOn (CIAO), a mechanism that adapts the last layer of facial encoders to depict specific affective characteristics on different datasets.
CIAO presents an improvement in facial expression recognition performance across six different datasets with distinct affective representations.
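One way to picture "adapting the last layer" is a small per-dataset head on a frozen encoder trained with a contrastive objective; the sketch below uses a generic supervised contrastive loss as a stand-in, since CIAO's inhibitory formulation is not spelled out in this summary.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class AdapterHead(nn.Module):
    """Per-dataset projection head on top of a frozen facial encoder (illustrative)."""
    def __init__(self, in_dim=512, out_dim=128):
        super().__init__()
        self.proj = nn.Sequential(nn.Linear(in_dim, 256), nn.ReLU(), nn.Linear(256, out_dim))

    def forward(self, feat):
        return F.normalize(self.proj(feat), dim=1)

def supervised_contrastive(z, labels, temp=0.1):
    """Generic SupCon loss: same-emotion embeddings attract, others repel."""
    sim = z @ z.t() / temp
    self_mask = torch.eye(len(z), dtype=torch.bool, device=z.device)
    sim = sim.masked_fill(self_mask, -1e9)                  # ignore self-similarity
    pos_mask = (labels.unsqueeze(0) == labels.unsqueeze(1)) & ~self_mask
    log_prob = sim - torch.logsumexp(sim, dim=1, keepdim=True)
    per_anchor = (log_prob * pos_mask).sum(dim=1) / pos_mask.sum(dim=1).clamp(min=1)
    return -per_anchor.mean()

# Only the adapter head would be optimized per dataset; the encoder stays frozen.
head = AdapterHead()
z = head(torch.randn(16, 512))              # stand-in for frozen-encoder features
loss = supervised_contrastive(z, torch.randint(0, 7, (16,)))
```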
arXiv Detail & Related papers (2022-08-10T15:46:05Z)
- Emotion Separation and Recognition from a Facial Expression by Generating the Poker Face with Vision Transformers [57.1091606948826]
We propose a novel FER model, named Poker Face Vision Transformer or PF-ViT, to address these challenges.
PF-ViT aims to separate and recognize the disturbance-agnostic emotion from a static facial image by generating its corresponding poker face.
PF-ViT utilizes vanilla Vision Transformers, and its components are pre-trained as Masked Autoencoders on a large facial expression dataset.
arXiv Detail & Related papers (2022-07-22T13:39:06Z)
- SOLVER: Scene-Object Interrelated Visual Emotion Reasoning Network [83.27291945217424]
We propose a novel Scene-Object interreLated Visual Emotion Reasoning network (SOLVER) to predict emotions from images.
To mine the emotional relationships between distinct objects, we first build up an Emotion Graph based on semantic concepts and visual features.
We also design a Scene-Object Fusion Module to integrate scenes and objects, which exploits scene features to guide the fusion process of object features with the proposed scene-based attention mechanism.
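A generic cross-attention fusion, sketched below, conveys the flavor of a scene feature guiding the aggregation of object features; it is a stand-in, not the paper's Scene-Object Fusion Module, and all dimensions are illustrative.

```python
import torch
import torch.nn as nn

class SceneObjectFusion(nn.Module):
    """Scene feature attends over detected-object features (generic cross-attention)."""
    def __init__(self, dim=512):
        super().__init__()
        self.q = nn.Linear(dim, dim)
        self.k = nn.Linear(dim, dim)
        self.v = nn.Linear(dim, dim)

    def forward(self, scene_feat, obj_feats):
        # scene_feat: (B, dim); obj_feats: (B, N_objects, dim)
        q = self.q(scene_feat).unsqueeze(1)                          # (B, 1, dim)
        attn = torch.softmax(q @ self.k(obj_feats).transpose(1, 2)
                             / scene_feat.size(-1) ** 0.5, dim=-1)   # (B, 1, N)
        fused_objects = (attn @ self.v(obj_feats)).squeeze(1)        # (B, dim)
        return torch.cat([scene_feat, fused_objects], dim=-1)        # scene + object context
```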
arXiv Detail & Related papers (2021-10-24T02:41:41Z)
- Joint Deep Learning of Facial Expression Synthesis and Recognition [97.19528464266824]
We propose a novel joint deep learning of facial expression synthesis and recognition method for effective FER.
The proposed method involves a two-stage learning procedure. Firstly, a facial expression synthesis generative adversarial network (FESGAN) is pre-trained to generate facial images with different facial expressions.
In order to alleviate the problem of data bias between the real images and the synthetic images, we propose an intra-class loss with a novel real data-guided back-propagation (RDBP) algorithm.
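A minimal intra-class loss in this spirit is sketched below: synthetic-image features are pulled toward per-class means of real-image features, and detaching the real means (so gradients flow only through the synthetic branch) is one plausible reading of "real data-guided back-propagation"; the paper's exact RDBP rule may differ.

```python
import torch
import torch.nn.functional as F

def intra_class_loss(real_feats, real_labels, fake_feats, fake_labels, num_classes=7):
    """Pull synthetic-image features toward per-class means of real-image features."""
    loss, used = fake_feats.new_zeros(()), 0
    for c in range(num_classes):
        real_c = real_feats[real_labels == c]
        fake_c = fake_feats[fake_labels == c]
        if len(real_c) == 0 or len(fake_c) == 0:
            continue
        center = real_c.mean(dim=0).detach()     # gradient flows only through the fakes
        loss = loss + F.mse_loss(fake_c, center.expand_as(fake_c))
        used += 1
    return loss / max(used, 1)
```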
arXiv Detail & Related papers (2020-02-06T10:56:00Z)
This list is automatically generated from the titles and abstracts of the papers on this site.