GiMeFive: Towards Interpretable Facial Emotion Classification
- URL: http://arxiv.org/abs/2402.15662v1
- Date: Sat, 24 Feb 2024 00:37:37 GMT
- Title: GiMeFive: Towards Interpretable Facial Emotion Classification
- Authors: Jiawen Wang and Leah Kawka
- Abstract summary: Deep convolutional neural networks have been shown to successfully recognize facial emotions.
We propose our model GiMeFive with interpretations, i.e., via layer activations and gradient-weighted class activation mapping.
Empirical results show that our model outperforms the previous methods in terms of accuracy.
- Score: 1.1468563069298348
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Deep convolutional neural networks have been shown to successfully recognize
facial emotions in recent years in the realm of computer vision. However, the
existing detection approaches are not always reliable or explainable. Here, we
propose our model GiMeFive with interpretations, i.e., via layer activations
and gradient-weighted class activation mapping. We compare against
the state-of-the-art methods to classify the six facial emotions. Empirical
results show that our model outperforms the previous methods in terms of
accuracy on two Facial Emotion Recognition (FER) benchmarks and our aggregated
FER GiMeFive dataset. Furthermore, we explain our work on real-world image and video
examples, as well as real-time live camera streams. Our code and supplementary
material are available at https://github.com/werywjw/SEP-CVDL.
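As a concrete illustration of the gradient-weighted class activation mapping (Grad-CAM) interpretation named in the abstract, here is a minimal PyTorch sketch that computes a class activation map from the last convolutional block of a backbone. The ResNet-18 stand-in, the choice of `layer4`, and the input size are assumptions for illustration, not GiMeFive's actual architecture.

```python
# Minimal Grad-CAM sketch. The backbone, target layer, and input size are
# placeholder assumptions, not GiMeFive's actual architecture.
import torch
import torch.nn.functional as F
from torchvision import models

model = models.resnet18(weights=None)  # stand-in for the trained FER backbone
model.eval()

activations, gradients = {}, {}

def save_activation(module, inputs, output):
    activations["feat"] = output.detach()

def save_gradient(module, grad_input, grad_output):
    gradients["feat"] = grad_output[0].detach()

model.layer4.register_forward_hook(save_activation)       # last conv block
model.layer4.register_full_backward_hook(save_gradient)

x = torch.randn(1, 3, 224, 224)        # a preprocessed face crop
scores = model(x)
scores[0, scores.argmax()].backward()  # gradient of the top-scoring class

# Grad-CAM: spatially pool the gradients into channel weights, then form a
# ReLU-ed weighted sum of the activation maps.
weights = gradients["feat"].mean(dim=(2, 3), keepdim=True)
cam = F.relu((weights * activations["feat"]).sum(dim=1))
cam = cam / (cam.max() + 1e-8)         # normalize to [0, 1] for visualization
```

The resulting `cam` can be upsampled to the input resolution and overlaid on the face crop as a heatmap.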
Related papers
- Authentic Emotion Mapping: Benchmarking Facial Expressions in Real News [21.707761612280304]
We present a novel benchmark for Emotion Recognition using facial landmarks extracted from realistic news videos.
Traditional methods relying on RGB images are resource-intensive, whereas our approach with Facial Landmark Emotion Recognition (FLER) offers a simplified yet effective alternative.
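A minimal sketch of the landmark-based idea, assuming a standard 68-point landmark layout and an arbitrary six-class output; FLER's published configuration may differ:

```python
# Hypothetical landmark-based emotion classifier in the spirit of FLER: an MLP
# over flattened (x, y) coordinates instead of RGB pixels.
import torch
import torch.nn as nn

NUM_LANDMARKS = 68  # assumption: a standard 68-point facial landmark layout
NUM_EMOTIONS = 6    # assumption: six basic emotion classes

classifier = nn.Sequential(
    nn.Flatten(),                       # (B, 68, 2) -> (B, 136)
    nn.Linear(NUM_LANDMARKS * 2, 128),
    nn.ReLU(),
    nn.Linear(128, NUM_EMOTIONS),
)

landmarks = torch.randn(4, NUM_LANDMARKS, 2)  # a batch of landmark sets
logits = classifier(landmarks)                # (4, 6) emotion scores
```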
arXiv Detail & Related papers (2024-04-21T00:14:03Z)
- PERI: Part Aware Emotion Recognition In The Wild [4.206175795966693]
This paper focuses on emotion recognition using visual features.
We create part-aware spatial (PAS) images by extracting key regions from the input image, using a mask generated from both body pose and facial landmarks.
We report results on the publicly available in-the-wild EMOTIC dataset.
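One plausible reading of the PAS construction is to keep only pixels near pose and landmark keypoints and zero out the rest. The circular mask and its radius below are hypothetical, not PERI's exact recipe:

```python
# Illustrative PAS-style masking: keep pixels near keypoints, zero the rest.
# The circular mask and radius are hypothetical, not PERI's exact recipe.
import numpy as np

def pas_image(image: np.ndarray, keypoints: np.ndarray, radius: int = 20) -> np.ndarray:
    h, w = image.shape[:2]
    yy, xx = np.mgrid[0:h, 0:w]
    mask = np.zeros((h, w), dtype=bool)
    for x, y in keypoints:                        # keypoints in (x, y) pixels
        mask |= (xx - x) ** 2 + (yy - y) ** 2 <= radius ** 2
    return image * mask[..., None]                # broadcast over color channels

img = np.random.rand(256, 256, 3)                     # a dummy RGB image
kps = np.array([[128, 100], [90, 140], [166, 140]])   # e.g. nose and eye centers
pas = pas_image(img, kps)
```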
arXiv Detail & Related papers (2022-10-18T20:01:40Z)
- Emotion Separation and Recognition from a Facial Expression by Generating the Poker Face with Vision Transformers [57.1091606948826]
We propose a novel FER model, named Poker Face Vision Transformer or PF-ViT, to address these challenges.
PF-ViT aims to separate and recognize the disturbance-agnostic emotion from a static facial image by generating its corresponding poker face.
PF-ViT utilizes vanilla Vision Transformers, and its components are pre-trained as Masked Autoencoders on a large facial expression dataset.
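The masked-autoencoder pre-training hides most patch tokens from the encoder. A toy version of that random masking step, assuming ViT-Base token shapes and a 25% keep ratio:

```python
# Toy illustration of the masked-autoencoder pre-training step: randomly keep
# a small fraction of patch tokens and feed only those to the encoder.
import torch

def random_mask(patches: torch.Tensor, keep_ratio: float = 0.25) -> torch.Tensor:
    # patches: (B, N, D) token sequence from a ViT patch embedding
    b, n, d = patches.shape
    n_keep = int(n * keep_ratio)
    idx = torch.rand(b, n).argsort(dim=1)[:, :n_keep]  # random subset per sample
    return torch.gather(patches, 1, idx.unsqueeze(-1).expand(-1, -1, d))

tokens = torch.randn(2, 196, 768)  # 14x14 patches of a 224x224 face, ViT-Base width
visible = random_mask(tokens)      # (2, 49, 768) visible tokens for the encoder
```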
arXiv Detail & Related papers (2022-07-22T13:39:06Z)
- EMOCA: Emotion Driven Monocular Face Capture and Animation [59.15004328155593]
We introduce a novel deep perceptual emotion consistency loss during training, which helps ensure that the reconstructed 3D expression matches the expression depicted in the input image.
On the task of in-the-wild emotion recognition, our purely geometric approach is on par with the best image-based methods, highlighting the value of 3D geometry in analyzing human behavior.
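A hedged sketch of what such a perceptual emotion consistency loss can look like: a frozen emotion feature extractor should produce similar features for the input image and the rendered reconstruction. The `emo_net` stand-in and the MSE distance are illustrative assumptions:

```python
# Sketch of a perceptual emotion consistency loss. The feature extractor and
# distance are assumptions; EMOCA's actual loss may be defined differently.
import torch
import torch.nn as nn
import torch.nn.functional as F

def emotion_consistency_loss(emo_net: nn.Module,
                             image: torch.Tensor,
                             rendering: torch.Tensor) -> torch.Tensor:
    with torch.no_grad():
        target = emo_net(image)      # emotion features of the real input image
    pred = emo_net(rendering)        # features of the rendered reconstruction
    return F.mse_loss(pred, target)

# Stand-in emotion feature extractor, frozen as the loss assumes.
emo_net = nn.Sequential(nn.Flatten(), nn.Linear(3 * 64 * 64, 32))
for p in emo_net.parameters():
    p.requires_grad_(False)

image = torch.randn(1, 3, 64, 64)
rendering = torch.randn(1, 3, 64, 64, requires_grad=True)  # differentiable render
emotion_consistency_loss(emo_net, image, rendering).backward()
```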
arXiv Detail & Related papers (2022-04-24T15:58:35Z)
- Exploiting Emotional Dependencies with Graph Convolutional Networks for Facial Expression Recognition [31.40575057347465]
This paper proposes a novel multi-task learning framework to recognize facial expressions in-the-wild.
A shared feature representation is learned for both discrete and continuous recognition in an MTL setting.
The results of our experiments show that our method outperforms the current state-of-the-art methods on discrete FER.
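Setting the graph convolutions aside, the shared-representation idea can be sketched as one backbone feeding two heads, one for discrete classes and one for continuous valence/arousal; all layer sizes below are placeholders:

```python
# Minimal multi-task setup over a shared representation. Layer sizes, the
# seven-class output, and the joint loss weighting are illustrative only.
import torch
import torch.nn as nn
import torch.nn.functional as F

class SharedMTL(nn.Module):
    def __init__(self, feat_dim: int = 512, num_classes: int = 7):
        super().__init__()
        self.backbone = nn.Sequential(nn.Linear(2048, feat_dim), nn.ReLU())
        self.cls_head = nn.Linear(feat_dim, num_classes)  # discrete emotions
        self.va_head = nn.Linear(feat_dim, 2)             # valence, arousal

    def forward(self, x):
        h = self.backbone(x)          # shared features drive both tasks
        return self.cls_head(h), self.va_head(h)

model = SharedMTL()
logits, va = model(torch.randn(8, 2048))
loss = F.cross_entropy(logits, torch.randint(0, 7, (8,))) \
     + F.mse_loss(va, torch.zeros(8, 2))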
arXiv Detail & Related papers (2021-06-07T10:20:05Z)
- A Multi-resolution Approach to Expression Recognition in the Wild [9.118706387430883]
We propose a multi-resolution approach to solve the Facial Expression Recognition task.
We ground our intuition in the observation that face images are often acquired at different resolutions.
To this end, we use a ResNet-like architecture, equipped with Squeeze-and-Excitation blocks, trained on the Affect-in-the-Wild 2 dataset.
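For reference, a standard Squeeze-and-Excitation block of the kind this architecture plugs into its ResNet; the reduction ratio of 16 follows the original SE paper and may not match this work:

```python
# Standard Squeeze-and-Excitation block: global average pooling (squeeze)
# followed by a small bottleneck MLP that reweights channels (excite).
import torch
import torch.nn as nn

class SEBlock(nn.Module):
    def __init__(self, channels: int, reduction: int = 16):
        super().__init__()
        self.fc = nn.Sequential(
            nn.Linear(channels, channels // reduction),
            nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels),
            nn.Sigmoid(),
        )

    def forward(self, x):                      # x: (B, C, H, W)
        s = x.mean(dim=(2, 3))                 # squeeze: global average pool
        w = self.fc(s).unsqueeze(-1).unsqueeze(-1)
        return x * w                           # excite: channel-wise reweighting

out = SEBlock(64)(torch.randn(2, 64, 56, 56))  # same shape as the input
```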
arXiv Detail & Related papers (2021-03-09T21:21:02Z)
- Continuous Emotion Recognition with Spatiotemporal Convolutional Neural Networks [82.54695985117783]
We investigate the suitability of state-of-the-art deep learning architectures for continuous emotion recognition using long video sequences captured in-the-wild.
We have developed and evaluated convolutional recurrent neural networks combining 2D-CNNs and long short-term memory (LSTM) units, as well as inflated 3D-CNN models, which are built by inflating the weights of a pre-trained 2D-CNN model during fine-tuning.
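A compact sketch of the 2D-CNN + LSTM pattern described here: per-frame CNN features feed a recurrent layer that emits per-frame predictions. The ResNet-18 backbone and the two-dimensional (e.g., valence/arousal) output are stand-ins:

```python
# 2D-CNN + LSTM sketch for video emotion recognition. Backbone and sizes are
# placeholders, not the architectures evaluated in the paper.
import torch
import torch.nn as nn
from torchvision import models

class CNNLSTM(nn.Module):
    def __init__(self, hidden: int = 256, out_dim: int = 2):
        super().__init__()
        cnn = models.resnet18(weights=None)
        cnn.fc = nn.Identity()                  # 512-d per-frame features
        self.cnn = cnn
        self.lstm = nn.LSTM(512, hidden, batch_first=True)
        self.head = nn.Linear(hidden, out_dim)  # e.g. valence/arousal

    def forward(self, clip):                    # clip: (B, T, 3, H, W)
        b, t = clip.shape[:2]
        feats = self.cnn(clip.flatten(0, 1)).view(b, t, -1)  # per-frame CNN
        seq, _ = self.lstm(feats)                            # temporal model
        return self.head(seq)                                # (B, T, out_dim)

preds = CNNLSTM()(torch.randn(1, 8, 3, 224, 224))  # (1, 8, 2)
```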
arXiv Detail & Related papers (2020-11-18T13:42:05Z)
- Synthetic Expressions are Better Than Real for Learning to Detect Facial Actions [4.4532095214807965]
Our approach reconstructs the 3D shape of the face from each video frame, aligns the 3D mesh to a canonical view, and then trains a GAN-based network to synthesize novel images with facial action units of interest.
The network trained on synthesized facial expressions outperformed the one trained on actual facial expressions and surpassed current state-of-the-art approaches.
arXiv Detail & Related papers (2020-10-21T13:11:45Z)
- The FaceChannel: A Fast & Furious Deep Neural Network for Facial Expression Recognition [71.24825724518847]
Current state-of-the-art models for automatic Facial Expression Recognition (FER) are based on very deep neural networks that are effective but rather expensive to train.
We formalize the FaceChannel, a light-weight neural network with far fewer parameters than common deep neural networks.
We demonstrate how our model achieves a comparable, if not better, performance to the current state-of-the-art in FER.
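To make the parameter-count claim concrete, here is a deliberately tiny FER-style CNN with roughly five thousand parameters. It is not the FaceChannel topology, only an illustration of how small such a network can be:

```python
# A deliberately tiny FER-style CNN, illustrating the light-weight design
# space; the actual FaceChannel topology is different.
import torch
import torch.nn as nn

tiny_fer = nn.Sequential(
    nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
    nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
    nn.AdaptiveAvgPool2d(1), nn.Flatten(),
    nn.Linear(32, 7),                  # 7 basic expression classes (assumed)
)

n_params = sum(p.numel() for p in tiny_fer.parameters())  # ~5k parameters
logits = tiny_fer(torch.randn(1, 1, 64, 64))              # grayscale face crop
```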
arXiv Detail & Related papers (2020-09-15T09:25:37Z)
- EmotiCon: Context-Aware Multimodal Emotion Recognition using Frege's Principle [71.47160118286226]
We present EmotiCon, a learning-based algorithm for context-aware perceived human emotion recognition from videos and images.
Motivated by Frege's Context Principle from psychology, our approach combines three interpretations of context for emotion recognition.
We report an Average Precision (AP) score of 35.48 across 26 classes, an improvement of 7-8 AP points over prior methods.
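A minimal late-fusion sketch of combining three context streams into one 26-class prediction; EmotiCon's actual streams and fusion are considerably more involved:

```python
# Late fusion over three hypothetical context streams; a simple average of
# per-stream logits, not EmotiCon's actual fusion mechanism.
import torch

NUM_CLASSES = 26                       # the 26 discrete classes reported above
face_logits = torch.randn(1, NUM_CLASSES)   # stream 1: face/gait cues
scene_logits = torch.randn(1, NUM_CLASSES)  # stream 2: scene/semantic context
socio_logits = torch.randn(1, NUM_CLASSES)  # stream 3: socio-dynamic context

fused = torch.stack([face_logits, scene_logits, socio_logits]).mean(dim=0)
probs = fused.softmax(dim=-1)          # fused per-class probabilities
```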
arXiv Detail & Related papers (2020-03-14T19:55:21Z)
- Learning to Augment Expressions for Few-shot Fine-grained Facial Expression Recognition [98.83578105374535]
We present a novel Fine-grained Facial Expression Database - F2ED.
It includes more than 200k images with 54 facial expressions from 119 persons.
Considering that uneven data distribution and a lack of samples are common in real-world scenarios, we evaluate several few-shot expression learning tasks.
We propose a unified task-driven framework - Compositional Generative Adversarial Network (Comp-GAN) learning to synthesize facial images.
arXiv Detail & Related papers (2020-01-17T03:26:32Z)