eMotion-GAN: A Motion-based GAN for Photorealistic and Facial Expression Preserving Frontal View Synthesis
- URL: http://arxiv.org/abs/2404.09940v1
- Date: Mon, 15 Apr 2024 17:08:53 GMT
- Title: eMotion-GAN: A Motion-based GAN for Photorealistic and Facial Expression Preserving Frontal View Synthesis
- Authors: Omar Ikne, Benjamin Allaert, Ioan Marius Bilasco, Hazem Wannous
- Abstract summary: We present eMotion-GAN, a novel deep learning approach designed for frontal view synthesis.
Considering the motion induced by head variation as noise and the motion induced by facial expression as the relevant information, our model is trained to filter out the noisy motion.
The filtered motion is then mapped onto a neutral frontal face to generate the corresponding expressive frontal face.
- Score: 3.2498796510544636
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Many existing facial expression recognition (FER) systems encounter substantial performance degradation when faced with variations in head pose. Numerous frontalization methods have been proposed to enhance these systems' performance under such conditions. However, they often introduce undesirable deformations, rendering them less suitable for precise facial expression analysis. In this paper, we present eMotion-GAN, a novel deep learning approach designed for frontal view synthesis while preserving facial expressions within the motion domain. Considering the motion induced by head variation as noise and the motion induced by facial expression as the relevant information, our model is trained to filter out the noisy motion in order to retain only the motion related to facial expression. The filtered motion is then mapped onto a neutral frontal face to generate the corresponding expressive frontal face. We conducted extensive evaluations using several widely recognized dynamic FER datasets, which encompass sequences exhibiting various degrees of head pose variations in both intensity and orientation. Our results demonstrate the effectiveness of our approach in significantly reducing the FER performance gap between frontal and non-frontal faces. Specifically, we achieved a FER improvement of up to +5% for small pose variations and up to +20% for larger pose variations. Code available at https://github.com/o-ikne/eMotion-GAN.git.
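The abstract describes a two-stage pipeline: a generator filters head-pose motion out of the facial optical flow, and the retained expression motion is warped onto a neutral frontal face. Below is a minimal PyTorch sketch of that idea; the module names, network shapes, and warping routine are illustrative assumptions, not the authors' released implementation (the real model is a GAN with adversarial training, omitted here).

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class MotionFilter(nn.Module):
    # Hypothetical generator: maps raw facial optical flow (expression +
    # head-pose motion) to expression-only frontal flow.
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(2, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 2, 3, padding=1),  # 2-channel flow out
        )

    def forward(self, flow):  # flow: (B, 2, H, W) optical flow
        return self.net(flow)

def warp_with_flow(image, flow):
    # Backward-warp a neutral frontal face with the filtered expression
    # flow to synthesize the corresponding expressive frontal face.
    b, _, h, w = image.shape
    ys, xs = torch.meshgrid(torch.arange(h), torch.arange(w), indexing="ij")
    base = torch.stack((xs, ys), dim=-1).float()        # (H, W, 2) in (x, y) order
    pos = base.unsqueeze(0) + flow.permute(0, 2, 3, 1)  # displaced sampling positions
    gx = 2.0 * pos[..., 0] / (w - 1) - 1.0              # normalize to [-1, 1]
    gy = 2.0 * pos[..., 1] / (h - 1) - 1.0
    return F.grid_sample(image, torch.stack((gx, gy), dim=-1), align_corners=True)

# Illustrative use, with raw_flow from any optical-flow estimator between a
# neutral and an expressive non-frontal frame:
# expressive_frontal = warp_with_flow(neutral_face, MotionFilter()(raw_flow))
```

Working in the motion domain this way sidesteps the texture deformations that image-domain frontalization tends to introduce, since only displacement fields are filtered and transferred.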
Related papers
- GaussianHeads: End-to-End Learning of Drivable Gaussian Head Avatars from Coarse-to-fine Representations [54.94362657501809]
We propose a new method to generate highly dynamic and deformable human head avatars from multi-view imagery in real-time.
At the core of our method is a hierarchical representation of head models that allows to capture the complex dynamics of facial expressions and head movements.
We train this coarse-to-fine facial avatar model along with the head pose as a learnable parameter in an end-to-end framework.
arXiv Detail & Related papers (2024-09-18T13:05:43Z)
- Realistic and Efficient Face Swapping: A Unified Approach with Diffusion Models [69.50286698375386]
We propose a novel approach that better harnesses diffusion models for face-swapping.
We introduce a mask shuffling technique during inpainting training, which allows us to create a so-called universal model for swapping.
Ours is a relatively unified approach and so it is resilient to errors in other off-the-shelf models.
arXiv Detail & Related papers (2024-09-11T13:43:53Z)
- TalkingGaussian: Structure-Persistent 3D Talking Head Synthesis via Gaussian Splatting [21.474938045227702]
We introduce TalkingGaussian, a deformation-based radiance fields framework for high-fidelity talking head synthesis.
Our method renders high-quality lip-synchronized talking head videos, with better facial fidelity and higher efficiency compared with previous methods.
arXiv Detail & Related papers (2024-04-23T17:55:07Z)
- From Static to Dynamic: Adapting Landmark-Aware Image Models for Facial Expression Recognition in Videos [88.08209394979178]
Dynamic facial expression recognition (DFER) in the wild is still hindered by data limitations.
We introduce a novel Static-to-Dynamic model (S2D) that leverages existing SFER knowledge and dynamic information implicitly encoded in extracted facial landmark-aware features.
arXiv Detail & Related papers (2023-12-09T03:16:09Z)
- GaFET: Learning Geometry-aware Facial Expression Translation from In-The-Wild Images [55.431697263581626]
We introduce a novel Geometry-aware Facial Expression Translation framework, which is based on parametric 3D facial representations and can stably decouple expression.
We achieve higher-quality and more accurate facial expression transfer results than state-of-the-art methods, and demonstrate applicability to various poses and complex textures.
arXiv Detail & Related papers (2023-08-07T09:03:35Z)
- One-Shot High-Fidelity Talking-Head Synthesis with Deformable Neural Radiance Field [81.07651217942679]
Talking head generation aims to generate faces that maintain the identity information of the source image and imitate the motion of the driving image.
We propose HiDe-NeRF, which achieves high-fidelity and free-view talking-head synthesis.
arXiv Detail & Related papers (2023-04-11T09:47:35Z)
- Expression-preserving face frontalization improves visually assisted speech processing [35.647888055229956]
The main contribution of this paper is a frontalization methodology that preserves non-rigid facial deformations.
We show that the method, when incorporated into deep learning pipelines, improves word recognition and speech intelligibility scores by a considerable margin.
arXiv Detail & Related papers (2022-04-06T13:22:24Z)
- PoseFace: Pose-Invariant Features and Pose-Adaptive Loss for Face Recognition [42.62320574369969]
We propose an efficient PoseFace framework that uses facial landmarks to disentangle pose-invariant features and exploits a pose-adaptive loss to handle the data-imbalance issue adaptively.
arXiv Detail & Related papers (2021-07-25T03:50:47Z)
- A NIR-to-VIS face recognition via part adaptive and relation attention module [4.822208985805956]
In the face recognition application scenario, we need to process facial images captured in various conditions, such as at night by near-infrared (NIR) surveillance cameras.
The illumination difference between NIR and visible-light (VIS) causes a domain gap between facial images, and the variations in pose and emotion also make facial matching more difficult.
We propose a part relation attention module that crops facial parts obtained through a semantic mask and performs relational modeling using each of these representative features.
arXiv Detail & Related papers (2021-02-01T08:13:39Z)
- InterFaceGAN: Interpreting the Disentangled Face Representation Learned by GANs [73.27299786083424]
We propose a framework called InterFaceGAN to interpret the disentangled face representation learned by state-of-the-art GAN models.
We first find that GANs learn various semantics in some linear subspaces of the latent space.
We then conduct a detailed study on the correlation between different semantics and manage to better disentangle them via subspace projection; a minimal sketch of this projection follows the entry below.
arXiv Detail & Related papers (2020-05-18T18:01:22Z)
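The conditional manipulation mentioned in the InterFaceGAN entry can be sketched in a few lines: move a latent code along a learned semantic direction, after projecting out a correlated direction so that only the target attribute changes. The variable names and boundary vectors below are illustrative assumptions, not the released implementation.

```python
import numpy as np

def edit_latent(z, n_target, n_condition=None, alpha=3.0):
    # Move latent code z along the semantic direction n_target. If a
    # correlated direction n_condition is given, first project it out so
    # the edit changes the target attribute while holding the other fixed.
    n = n_target / np.linalg.norm(n_target)
    if n_condition is not None:
        c = n_condition / np.linalg.norm(n_condition)
        n = n - (n @ c) * c          # subspace projection: remove correlated part
        n = n / np.linalg.norm(n)
    return z + alpha * n             # linear walk in the GAN's latent space

# Illustrative use, with boundaries assumed to be learned by a linear
# classifier on labeled latent samples:
# z_smile = edit_latent(z, n_smile, n_condition=n_pose, alpha=2.0)
```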
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the content (including all information) and is not responsible for any consequences of its use.