Real-time Facial Expression Recognition "In The Wild" by Disentangling
3D Expression from Identity
- URL: http://arxiv.org/abs/2005.05509v1
- Date: Tue, 12 May 2020 01:32:55 GMT
- Title: Real-time Facial Expression Recognition "In The Wild" by Disentangling
3D Expression from Identity
- Authors: Mohammad Rami Koujan, Luma Alharbawee, Giorgos Giannakakis, Nicolas
Pugeault, Anastasios Roussos
- Abstract summary: This paper proposes a novel method for human emotion recognition from a single RGB image.
We construct a large-scale dataset of facial videos, rich in facial dynamics, identities, expressions, appearance and 3D pose variations.
Our proposed framework runs at 50 frames per second and is capable of robustly estimating parameters of 3D expression variation.
- Score: 6.974241731162878
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Human emotions analysis has been the focus of many studies, especially in the
field of Affective Computing, and is important for many applications, e.g.
human-computer intelligent interaction, stress analysis, interactive games,
animations, etc. Solutions for automatic emotion analysis have also benefited
from the development of deep learning approaches and the availability of vast
amounts of visual facial data on the internet. This paper proposes a novel
method for human emotion recognition from a single RGB image. We construct a
large-scale dataset of facial videos (FaceVid), rich in facial
dynamics, identities, expressions, appearance and 3D pose variations. We use
this dataset to train a deep Convolutional Neural Network for estimating
expression parameters of a 3D Morphable Model and combine it with an effective
back-end emotion classifier. Our proposed framework runs at 50 frames per
second and is capable of robustly estimating parameters of 3D expression
variation and accurately recognizing facial expressions from in-the-wild
images. We present extensive experimental evaluation that shows that the
proposed method outperforms the compared techniques in estimating the 3D
expression parameters and achieves state-of-the-art performance in recognising
the basic emotions from facial images, as well as recognising stress from
facial videos.
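The abstract describes a two-stage pipeline: a CNN regresses the expression parameters of a 3D Morphable Model (3DMM) from an image, and a back-end classifier maps those parameters to an emotion label. The following is a minimal illustrative sketch of that structure only; the dimensionality, the linear stand-in for the CNN, and the nearest-prototype classifier are all assumptions, not the paper's actual components.

```python
import numpy as np

# Hypothetical sketch of the two-stage pipeline from the abstract:
# stage 1 estimates 3DMM expression parameters from image features,
# stage 2 classifies the emotion from those parameters.

N_EXPR_PARAMS = 28  # assumed 3DMM expression dimensionality (illustrative)
EMOTIONS = ["neutral", "happy", "sad", "surprise", "fear", "disgust", "anger"]

rng = np.random.default_rng(0)
# A fixed random linear map stands in for the paper's trained deep CNN.
W = rng.standard_normal((N_EXPR_PARAMS, 128))


def estimate_expression_params(image_features: np.ndarray) -> np.ndarray:
    """Stage 1 stand-in: project image features into 3DMM expression space."""
    return W @ image_features


def classify_emotion(expr_params: np.ndarray, prototypes: np.ndarray) -> str:
    """Stage 2 stand-in: nearest-prototype classification in expression space."""
    dists = np.linalg.norm(prototypes - expr_params, axis=1)
    return EMOTIONS[int(np.argmin(dists))]


# Toy run on random data in place of a real image's feature vector.
features = rng.standard_normal(128)
prototypes = rng.standard_normal((len(EMOTIONS), N_EXPR_PARAMS))
params = estimate_expression_params(features)
label = classify_emotion(params, prototypes)
```

The disentangling idea is that only expression parameters, not identity or pose, reach the classifier, which is what makes the second stage robust to in-the-wild identity variation.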
Related papers
- Emo3D: Metric and Benchmarking Dataset for 3D Facial Expression Generation from Emotion Description [3.52270271101496]
"Emo3D" is an extensive "Text-Image-Expression dataset" spanning a wide spectrum of human emotions.
We generate a diverse array of textual descriptions, facilitating the capture of a broad spectrum of emotional expressions.
"Emo3D" has great applications in animation design, virtual reality, and emotional human-computer interaction.
arXiv Detail & Related papers (2024-10-02T21:31:24Z) - Emotion Separation and Recognition from a Facial Expression by Generating the Poker Face with Vision Transformers [57.1091606948826]
We propose a novel FER model, named Poker Face Vision Transformer or PF-ViT, to address these challenges.
PF-ViT aims to separate and recognize the disturbance-agnostic emotion from a static facial image via generating its corresponding poker face.
PF-ViT utilizes vanilla Vision Transformers, and its components are pre-trained as Masked Autoencoders on a large facial expression dataset.
arXiv Detail & Related papers (2022-07-22T13:39:06Z) - EMOCA: Emotion Driven Monocular Face Capture and Animation [59.15004328155593]
We introduce a novel deep perceptual emotion consistency loss during training, which helps ensure that the reconstructed 3D expression matches the expression depicted in the input image.
On the task of in-the-wild emotion recognition, our purely geometric approach is on par with the best image-based methods, highlighting the value of 3D geometry in analyzing human behavior.
arXiv Detail & Related papers (2022-04-24T15:58:35Z) - Neural Emotion Director: Speech-preserving semantic control of facial
expressions in "in-the-wild" videos [31.746152261362777]
We introduce a novel deep learning method for photo-realistic manipulation of the emotional state of actors in "in-the-wild" videos.
The proposed method is based on a parametric 3D face representation of the actor in the input scene that offers a reliable disentanglement of the facial identity from the head pose and facial expressions.
It then uses a novel deep domain translation framework that alters the facial expressions in a consistent and plausible manner, taking into account their dynamics.
arXiv Detail & Related papers (2021-12-01T15:55:04Z) - Deep Semantic Manipulation of Facial Videos [5.048861360606916]
This paper proposes the first method to perform photorealistic manipulation of facial expressions in videos.
Our method supports semantic video manipulation based on neural rendering and 3D-based facial expression modelling.
The proposed method is based on a disentangled representation and estimation of the 3D facial shape and activity.
arXiv Detail & Related papers (2021-11-15T16:55:16Z) - Image-to-Video Generation via 3D Facial Dynamics [78.01476554323179]
We present a versatile model, FaceAnime, for various video generation tasks from still images.
Our model is versatile for various AR/VR and entertainment applications, such as face video generation and prediction.
arXiv Detail & Related papers (2021-05-31T02:30:11Z) - Continuous Emotion Recognition with Spatiotemporal Convolutional Neural
Networks [82.54695985117783]
We investigate the suitability of state-of-the-art deep learning architectures for continuous emotion recognition using long video sequences captured in-the-wild.
We have developed and evaluated convolutional recurrent neural networks combining 2D-CNNs and long short-term memory units, and inflated 3D-CNN models, which are built by inflating the weights of a pre-trained 2D-CNN model during fine-tuning.
arXiv Detail & Related papers (2020-11-18T13:42:05Z) - DeepFaceFlow: In-the-wild Dense 3D Facial Motion Estimation [56.56575063461169]
DeepFaceFlow is a robust, fast, and highly-accurate framework for the estimation of 3D non-rigid facial flow.
Our framework was trained and tested on two very large-scale facial video datasets.
Given registered pairs of images, our framework generates 3D flow maps at 60 fps.
arXiv Detail & Related papers (2020-05-14T23:56:48Z) - Learning to Augment Expressions for Few-shot Fine-grained Facial
Expression Recognition [98.83578105374535]
We present a novel Fine-grained Facial Expression Database - F2ED.
It includes more than 200k images with 54 facial expressions from 119 persons.
Considering the phenomenon of uneven data distribution and lack of samples is common in real-world scenarios, we evaluate several tasks of few-shot expression learning.
We propose a unified task-driven framework - Compositional Generative Adversarial Network (Comp-GAN) learning to synthesize facial images.
arXiv Detail & Related papers (2020-01-17T03:26:32Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.