EMOCA: Emotion Driven Monocular Face Capture and Animation
- URL: http://arxiv.org/abs/2204.11312v1
- Date: Sun, 24 Apr 2022 15:58:35 GMT
- Title: EMOCA: Emotion Driven Monocular Face Capture and Animation
- Authors: Radek Danecek, Michael J. Black, Timo Bolkart
- Abstract summary: We introduce a novel deep perceptual emotion consistency loss during training, which helps ensure that the reconstructed 3D expression matches the expression depicted in the input image.
On the task of in-the-wild emotion recognition, our purely geometric approach is on par with the best image-based methods, highlighting the value of 3D geometry in analyzing human behavior.
- Score: 59.15004328155593
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: As 3D facial avatars become more widely used for communication, it is
critical that they faithfully convey emotion. Unfortunately, the best recent
methods that regress parametric 3D face models from monocular images are unable
to capture the full spectrum of facial expression, such as subtle or extreme
emotions. We find the standard reconstruction metrics used for training
(landmark reprojection error, photometric error, and face recognition loss) are
insufficient to capture high-fidelity expressions. The result is facial
geometries that do not match the emotional content of the input image. We
address this with EMOCA (EMOtion Capture and Animation), by introducing a novel
deep perceptual emotion consistency loss during training, which helps ensure
that the reconstructed 3D expression matches the expression depicted in the
input image. While EMOCA achieves 3D reconstruction errors that are on par with
the current best methods, it significantly outperforms them in terms of the
quality of the reconstructed expression and the perceived emotional content. We
also directly regress levels of valence and arousal and classify basic
expressions from the estimated 3D face parameters. On the task of in-the-wild
emotion recognition, our purely geometric approach is on par with the best
image-based methods, highlighting the value of 3D geometry in analyzing human
behavior. The model and code are publicly available at
https://emoca.is.tue.mpg.de.
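To make the two quantitative ideas in the abstract concrete, here is a minimal, hypothetical PyTorch sketch of (a) a perceptual emotion consistency loss that compares emotion features of the rendered reconstruction against those of the input image, and (b) a small head that regresses valence/arousal and classifies basic expressions from estimated 3D expression parameters. The network name `emonet`, all dimensions, and the architecture are illustrative assumptions, not EMOCA's actual implementation.

```python
# Hypothetical sketch; assumes a frozen, pretrained emotion recognition
# network `emonet` mapping a face image (B, 3, H, W) to an emotion embedding.
import torch
import torch.nn as nn
import torch.nn.functional as F

class EmotionConsistencyLoss(nn.Module):
    def __init__(self, emonet: nn.Module):
        super().__init__()
        self.emonet = emonet.eval()          # frozen emotion feature extractor
        for p in self.emonet.parameters():
            p.requires_grad_(False)

    def forward(self, rendered: torch.Tensor, target: torch.Tensor) -> torch.Tensor:
        # rendered: differentiably rendered 3D reconstruction, (B, 3, H, W)
        # target:   input image, (B, 3, H, W)
        feat_rec = self.emonet(rendered)     # emotion features of reconstruction
        feat_tgt = self.emonet(target)       # emotion features of input
        # Penalize mismatch in perceived emotion rather than raw pixels.
        return F.mse_loss(feat_rec, feat_tgt)

class ValenceArousalHead(nn.Module):
    """Regress valence/arousal and classify basic expressions directly
    from estimated 3D face (expression) parameters. Sizes are assumptions."""
    def __init__(self, n_exp_params: int = 50, n_classes: int = 8):
        super().__init__()
        self.backbone = nn.Sequential(
            nn.Linear(n_exp_params, 256), nn.ReLU(),
            nn.Linear(256, 256), nn.ReLU(),
        )
        self.va = nn.Linear(256, 2)            # (valence, arousal)
        self.expr = nn.Linear(256, n_classes)  # basic expression logits

    def forward(self, exp_params: torch.Tensor):
        h = self.backbone(exp_params)
        # tanh bounds valence/arousal to [-1, 1], a common normalization
        return torch.tanh(self.va(h)), self.expr(h)
```

In this reading, the consistency loss back-propagates through the differentiable renderer into the expression parameters, so reconstructions are penalized for mismatched perceived emotion rather than mismatched pixels.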
Related papers
- Emo3D: Metric and Benchmarking Dataset for 3D Facial Expression Generation from Emotion Description [3.52270271101496]
"Emo3D" is an extensive "Text-Image-Expression dataset" spanning a wide spectrum of human emotions.
We generate a diverse array of textual descriptions, facilitating the capture of a broad spectrum of emotional expressions.
"Emo3D" has great applications in animation design, virtual reality, and emotional human-computer interaction.
arXiv Detail & Related papers (2024-10-02T21:31:24Z)
- 3D Facial Expressions through Analysis-by-Neural-Synthesis [30.2749903946587]
SMIRK (Spatial Modeling for Image-based Reconstruction of Kinesics) faithfully reconstructs expressive 3D faces from images.
We identify two key limitations in existing methods: shortcomings in their self-supervised training formulation, and a lack of expression diversity in the training images.
Our qualitative, quantitative, and particularly our perceptual evaluations demonstrate that SMIRK achieves new state-of-the-art performance on accurate expression reconstruction.
arXiv Detail & Related papers (2024-04-05T14:00:07Z)
- Controllable Dynamic Appearance for Neural 3D Portraits [54.29179484318194]
We propose CoDyNeRF, a system that enables the creation of fully controllable 3D portraits in real-world capture conditions.
CoDyNeRF learns to approximate illumination dependent effects via a dynamic appearance model.
We demonstrate the effectiveness of our method on free view synthesis of a portrait scene with explicit head pose and expression controls.
arXiv Detail & Related papers (2023-09-20T02:24:40Z)
- Emotional Speech-Driven Animation with Content-Emotion Disentanglement [51.34635009347183]
We propose EMOTE, which generates 3D talking-head avatars that maintain lip-sync from speech while enabling explicit control over the expression of emotion.
EMOTE produces speech-driven facial animations with better lip-sync than state-of-the-art methods trained on the same data.
arXiv Detail & Related papers (2023-06-15T09:31:31Z)
- EmoTalk: Speech-Driven Emotional Disentanglement for 3D Face Animation [28.964917860664492]
Speech-driven 3D face animation aims to generate realistic facial expressions that match the speech content and emotion.
This paper proposes an end-to-end neural network to disentangle different emotions in speech so as to generate rich 3D facial expressions.
Our approach outperforms state-of-the-art methods and exhibits more diverse facial movements.
arXiv Detail & Related papers (2023-03-20T13:22:04Z)
- 3D-TalkEmo: Learning to Synthesize 3D Emotional Talking Head [13.305263646852087]
We introduce 3D-TalkEmo, a deep neural network that generates 3D talking head animation with various emotions.
We also create a large 3D dataset with synchronized audio and video, a rich corpus, and a variety of emotional states across different subjects.
arXiv Detail & Related papers (2021-04-25T02:48:19Z)
- FaceDet3D: Facial Expressions with 3D Geometric Detail Prediction [62.5557724039217]
Facial expressions induce a variety of high-level details on the 3D face geometry.
3D Morphable Models (3DMMs) of the human face fail to capture such fine details in their PCA-based representations.
We introduce FaceDet3D, a first-of-its-kind method that generates - from a single image - geometric facial details consistent with any desired target expression.
arXiv Detail & Related papers (2020-12-14T23:07:38Z)
- DeepFaceFlow: In-the-wild Dense 3D Facial Motion Estimation [56.56575063461169]
DeepFaceFlow is a robust, fast, and highly-accurate framework for the estimation of 3D non-rigid facial flow.
Our framework was trained and tested on two very large-scale facial video datasets.
Given registered pairs of images, our framework generates 3D flow maps at 60 fps.
arXiv Detail & Related papers (2020-05-14T23:56:48Z)
- Real-time Facial Expression Recognition "In The Wild" by Disentangling 3D Expression from Identity [6.974241731162878]
This paper proposes a novel method for human emotion recognition from a single RGB image.
We construct a large-scale dataset of facial videos, rich in facial dynamics, identities, expressions, appearance and 3D pose variations.
Our proposed framework runs at 50 frames per second and is capable of robustly estimating parameters of 3D expression variation.
arXiv Detail & Related papers (2020-05-12T01:32:55Z)
- Deep 3D Portrait from a Single Image [54.634207317528364]
We present a learning-based approach for recovering the 3D geometry of the human head from a single portrait image.
A two-step geometry learning scheme is proposed to learn 3D head reconstruction from in-the-wild face images.
We evaluate the accuracy of our method both in 3D and with pose manipulation tasks on 2D images.
arXiv Detail & Related papers (2020-04-24T08:55:37Z)