Real-time Facial Expression Recognition "In The Wild" by Disentangling
3D Expression from Identity
- URL: http://arxiv.org/abs/2005.05509v1
- Date: Tue, 12 May 2020 01:32:55 GMT
- Title: Real-time Facial Expression Recognition "In The Wild" by Disentangling
3D Expression from Identity
- Authors: Mohammad Rami Koujan, Luma Alharbawee, Giorgos Giannakakis, Nicolas
Pugeault, Anastasios Roussos
- Abstract summary: This paper proposes a novel method for human emotion recognition from a single RGB image.
We construct a large-scale dataset of facial videos, rich in facial dynamics, identities, expressions, appearance and 3D pose variations.
Our proposed framework runs at 50 frames per second and is capable of robustly estimating parameters of 3D expression variation.
- Score: 6.974241731162878
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Human emotions analysis has been the focus of many studies, especially in the
field of Affective Computing, and is important for many applications, e.g.
human-computer intelligent interaction, stress analysis, interactive games,
animations, etc. Solutions for automatic emotion analysis have also benefited
from the development of deep learning approaches and the availability of vast
amounts of visual facial data on the internet. This paper proposes a novel
method for human emotion recognition from a single RGB image. We construct a
large-scale dataset of facial videos (FaceVid), rich in facial
dynamics, identities, expressions, appearance and 3D pose variations. We use
this dataset to train a deep Convolutional Neural Network for estimating
expression parameters of a 3D Morphable Model and combine it with an effective
back-end emotion classifier. Our proposed framework runs at 50 frames per
second and is capable of robustly estimating parameters of 3D expression
variation and accurately recognizing facial expressions from in-the-wild
images. We present extensive experimental evaluation that shows that the
proposed method outperforms the compared techniques in estimating the 3D
expression parameters and achieves state-of-the-art performance in recognising
the basic emotions from facial images, as well as recognising stress from
facial videos.
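The abstract describes a two-stage pipeline: a CNN regresses the expression parameters of a 3D Morphable Model (3DMM) from an image, and a back-end classifier maps those parameters to an emotion label. The following is a minimal illustrative sketch of that structure only; the dimensionality, the linear stand-in for the CNN, and the nearest-prototype classifier are all assumptions, not the paper's actual components.

```python
import numpy as np

# Hypothetical sketch of the two-stage pipeline from the abstract:
# stage 1 estimates 3DMM expression parameters from image features,
# stage 2 classifies the emotion from those parameters.

N_EXPR_PARAMS = 28  # assumed 3DMM expression dimensionality (illustrative)
EMOTIONS = ["neutral", "happy", "sad", "surprise", "fear", "disgust", "anger"]

rng = np.random.default_rng(0)
# A fixed random linear map stands in for the paper's trained deep CNN.
W = rng.standard_normal((N_EXPR_PARAMS, 128))


def estimate_expression_params(image_features: np.ndarray) -> np.ndarray:
    """Stage 1 stand-in: project image features into 3DMM expression space."""
    return W @ image_features


def classify_emotion(expr_params: np.ndarray, prototypes: np.ndarray) -> str:
    """Stage 2 stand-in: nearest-prototype classification in expression space."""
    dists = np.linalg.norm(prototypes - expr_params, axis=1)
    return EMOTIONS[int(np.argmin(dists))]


# Toy run on random data in place of a real image's feature vector.
features = rng.standard_normal(128)
prototypes = rng.standard_normal((len(EMOTIONS), N_EXPR_PARAMS))
params = estimate_expression_params(features)
label = classify_emotion(params, prototypes)
```

The disentangling idea is that only expression parameters, not identity or pose, reach the classifier, which is what makes the second stage robust to in-the-wild identity variation.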
Related papers
- Emo3D: Metric and Benchmarking Dataset for 3D Facial Expression Generation from Emotion Description [3.52270271101496]
"Emo3D" is an extensive "Text-Image-Expression dataset" spanning a wide spectrum of human emotions.
We generate a diverse array of textual descriptions, facilitating the capture of a broad spectrum of emotional expressions.
"Emo3D" has great applications in animation design, virtual reality, and emotional human-computer interaction.
arXiv Detail & Related papers (2024-10-02T21:31:24Z) - Emotion Separation and Recognition from a Facial Expression by Generating the Poker Face with Vision Transformers [57.1091606948826]
We propose a novel FER model, named Poker Face Vision Transformer or PF-ViT, to address these challenges.
PF-ViT aims to separate and recognize the disturbance-agnostic emotion from a static facial image via generating its corresponding poker face.
PF-ViT utilizes vanilla Vision Transformers, and its components are pre-trained as Masked Autoencoders on a large facial expression dataset.
arXiv Detail & Related papers (2022-07-22T13:39:06Z) - EMOCA: Emotion Driven Monocular Face Capture and Animation [59.15004328155593]
We introduce a novel deep perceptual emotion consistency loss during training, which helps ensure that the reconstructed 3D expression matches the expression depicted in the input image.
On the task of in-the-wild emotion recognition, our purely geometric approach is on par with the best image-based methods, highlighting the value of 3D geometry in analyzing human behavior.
arXiv Detail & Related papers (2022-04-24T15:58:35Z) - Neural Emotion Director: Speech-preserving semantic control of facial
expressions in "in-the-wild" videos [31.746152261362777]
We introduce a novel deep learning method for photo-realistic manipulation of the emotional state of actors in "in-the-wild" videos.
The proposed method is based on a parametric 3D face representation of the actor in the input scene that offers a reliable disentanglement of the facial identity from the head pose and facial expressions.
It then uses a novel deep domain translation framework that alters the facial expressions in a consistent and plausible manner, taking into account their dynamics.
arXiv Detail & Related papers (2021-12-01T15:55:04Z) - Deep Semantic Manipulation of Facial Videos [5.048861360606916]
This paper proposes the first method to perform photorealistic manipulation of facial expressions in videos.
Our method supports semantic video manipulation based on neural rendering and 3D-based facial expression modelling.
The proposed method is based on a disentangled representation and estimation of the 3D facial shape and activity.
arXiv Detail & Related papers (2021-11-15T16:55:16Z) - Image-to-Video Generation via 3D Facial Dynamics [78.01476554323179]
We present a versatile model, FaceAnime, for various video generation tasks from still images.
Our model is versatile for various AR/VR and entertainment applications, such as face video generation and prediction.
arXiv Detail & Related papers (2021-05-31T02:30:11Z) - Continuous Emotion Recognition with Spatiotemporal Convolutional Neural
Networks [82.54695985117783]
We investigate the suitability of state-of-the-art deep learning architectures for continuous emotion recognition using long video sequences captured in-the-wild.
We have developed and evaluated convolutional recurrent neural networks combining 2D-CNNs and long short-term memory units, and inflated 3D-CNN models, which are built by inflating the weights of a pre-trained 2D-CNN model during fine-tuning.
arXiv Detail & Related papers (2020-11-18T13:42:05Z) - DeepFaceFlow: In-the-wild Dense 3D Facial Motion Estimation [56.56575063461169]
DeepFaceFlow is a robust, fast, and highly-accurate framework for the estimation of 3D non-rigid facial flow.
Our framework was trained and tested on two very large-scale facial video datasets.
Given registered pairs of images, our framework generates 3D flow maps at 60 fps.
arXiv Detail & Related papers (2020-05-14T23:56:48Z) - Learning to Augment Expressions for Few-shot Fine-grained Facial
Expression Recognition [98.83578105374535]
We present a novel Fine-grained Facial Expression Database - F2ED.
It includes more than 200k images with 54 facial expressions from 119 persons.
Considering the phenomenon of uneven data distribution and lack of samples is common in real-world scenarios, we evaluate several tasks of few-shot expression learning.
We propose a unified task-driven framework - Compositional Generative Adversarial Network (Comp-GAN) learning to synthesize facial images.
arXiv Detail & Related papers (2020-01-17T03:26:32Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.