Facial Expression Video Generation Based-On Spatio-temporal
Convolutional GAN: FEV-GAN
- URL: http://arxiv.org/abs/2210.11182v1
- Date: Thu, 20 Oct 2022 11:54:32 GMT
- Title: Facial Expression Video Generation Based-On Spatio-temporal
Convolutional GAN: FEV-GAN
- Authors: Hamza Bouzid, Lahoucine Ballihi
- Abstract summary: We present a novel approach for generating videos of the six basic facial expressions.
Our approach is based on Spatio-temporal Convolutional GANs, which are known to model both content and motion in the same network.
The code and the pre-trained model will soon be made publicly available.
- Score: 1.279257604152629
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Facial expression generation has always been an intriguing task for
scientists and researchers all over the globe. In this context, we present our
novel approach for generating videos of the six basic facial expressions.
Starting from a single neutral facial image and a label indicating the desired
facial expression, we aim to synthesize a video of the given identity
performing the specified facial expression. Our approach, referred to as
FEV-GAN (Facial Expression Video GAN), is based on Spatio-temporal
Convolutional GANs, which are known to model both content and motion in the same
network. Previous methods based on such a network have shown a good ability to
generate coherent videos with smooth temporal evolution. However, they still
suffer from low image quality and low identity preservation capability. In this
work, we address this problem by using a generator composed of two image
encoders. The first one is pre-trained for facial identity feature extraction
and the second for spatial feature extraction. We have qualitatively and
quantitatively evaluated our model on two international facial expression
benchmark databases: MUG and Oulu-CASIA NIR&VIS. The experimental results
analysis demonstrates the effectiveness of our approach in generating videos of
the six basic facial expressions while preserving the input identity. The
analysis also proves that the use of both identity and spatial features
enhances the decoder's ability to preserve the identity and generate
high-quality videos. The code and the pre-trained model will soon be made
publicly available.
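The data flow described in the abstract (identity features and spatial features from two encoders, combined with an expression label and decoded into a video) can be illustrated with a small shape-tracing sketch. Everything numeric here is an assumption made for illustration, not taken from the paper: the feature sizes, the 2x4x4 starting volume, the channel plan, and the kernel/stride/padding values are hypothetical, as are the names `one_hot`, `deconv_out`, and `decoder_shapes`. Only the transposed-convolution output-size formula and the list of six basic expressions are standard.

```python
# Illustrative sketch only: all sizes and names are assumptions, not the authors' architecture.

# The six basic facial expressions (standard Ekman categories).
EXPRESSIONS = ["anger", "disgust", "fear", "happiness", "sadness", "surprise"]


def one_hot(label):
    """Encode the desired expression label as a one-hot list of length six."""
    if label not in EXPRESSIONS:
        raise ValueError(f"unknown expression: {label}")
    return [1.0 if name == label else 0.0 for name in EXPRESSIONS]


def deconv_out(size, kernel, stride, pad):
    """Output length of a transposed convolution along one axis
    (dilation 1, no output padding)."""
    return (size - 1) * stride - 2 * pad + kernel


def decoder_shapes(id_dim=256, spatial_dim=256, channels=(512, 256, 128, 3)):
    """Trace (channels, frames, height, width) through a hypothetical
    spatio-temporal (3D) transposed-convolution decoder."""
    # Fused code: identity features + spatial features + expression label.
    latent = id_dim + spatial_dim + len(EXPRESSIONS)
    # Assumed: the fused code is arranged as a small 2x4x4 volume before decoding.
    frames, height, width = 2, 4, 4
    shapes = [(latent, frames, height, width)]
    for out_channels in channels:
        # Kernel 4, stride 2, padding 1 doubles every axis, time included.
        frames = deconv_out(frames, 4, 2, 1)
        height = deconv_out(height, 4, 2, 1)
        width = deconv_out(width, 4, 2, 1)
        shapes.append((out_channels, frames, height, width))
    return shapes


if __name__ == "__main__":
    print(one_hot("happiness"))
    for shape in decoder_shapes():
        print(shape)
```

Under these assumed settings, the decoder grows the temporal and spatial axes together, which is the sense in which a spatio-temporal convolutional generator models content and motion in one network.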
Related papers
- OSDFace: One-Step Diffusion Model for Face Restoration [72.5045389847792]
Diffusion models have demonstrated impressive performance in face restoration.
We propose OSDFace, a novel one-step diffusion model for face restoration.
Results demonstrate that OSDFace surpasses current state-of-the-art (SOTA) methods in both visual quality and quantitative metrics.
arXiv Detail & Related papers (2024-11-26T07:07:48Z)
- G3FA: Geometry-guided GAN for Face Animation [14.488117084637631]
We introduce Geometry-guided GAN for Face Animation (G3FA) to tackle this limitation.
Our novel approach empowers the face animation model to incorporate 3D information using only 2D images.
In our face reenactment model, we leverage 2D motion warping to capture motion dynamics.
arXiv Detail & Related papers (2024-08-23T13:13:24Z)
- G2Face: High-Fidelity Reversible Face Anonymization via Generative and Geometric Priors [71.69161292330504]
Reversible face anonymization seeks to replace sensitive identity information in facial images with synthesized alternatives.
This paper introduces G2Face, which leverages both generative and geometric priors to enhance identity manipulation.
Our method outperforms existing state-of-the-art techniques in face anonymization and recovery, while preserving high data utility.
arXiv Detail & Related papers (2024-08-18T12:36:47Z)
- Identity-Preserving Talking Face Generation with Landmark and Appearance
Priors [106.79923577700345]
Existing person-generic methods have difficulty in generating realistic and lip-synced videos.
We propose a two-stage framework consisting of audio-to-landmark generation and landmark-to-video rendering procedures.
Our method can produce more realistic, lip-synced, and identity-preserving videos than existing person-generic talking face generation methods.
arXiv Detail & Related papers (2023-05-15T01:31:32Z)
- StyleFaceV: Face Video Generation via Decomposing and Recomposing
Pretrained StyleGAN3 [43.43545400625567]
We propose a principled framework named StyleFaceV, which produces high-fidelity identity-preserving face videos with vivid movements.
Our core insight is to decompose appearance and pose information and recompose them in the latent space of StyleGAN3 to produce stable and dynamic results.
arXiv Detail & Related papers (2022-08-16T17:47:03Z)
- Image-to-Video Generation via 3D Facial Dynamics [78.01476554323179]
We present a versatile model, FaceAnime, for various video generation tasks from still images.
Our model is versatile for various AR/VR and entertainment applications, such as face video retargeting and face video prediction.
arXiv Detail & Related papers (2021-05-31T02:30:11Z)
- Continuous Emotion Recognition with Spatiotemporal Convolutional Neural
Networks [82.54695985117783]
We investigate the suitability of state-of-the-art deep learning architectures for continuous emotion recognition using long video sequences captured in-the-wild.
We have developed and evaluated convolutional recurrent neural networks combining 2D-CNNs and long short-term memory units, and inflated 3D-CNN models, which are built by inflating the weights of a pre-trained 2D-CNN model during fine-tuning.
arXiv Detail & Related papers (2020-11-18T13:42:05Z)
- Video-based Facial Expression Recognition using Graph Convolutional
Networks [57.980827038988735]
We introduce a Graph Convolutional Network (GCN) layer into a common CNN-RNN based model for video-based facial expression recognition.
We evaluate our method on three widely used datasets, CK+, Oulu-CASIA and MMI, and on one challenging in-the-wild dataset, AFEW 8.0.
arXiv Detail & Related papers (2020-10-26T07:31:51Z)
- Synthetic Expressions are Better Than Real for Learning to Detect Facial
Actions [4.4532095214807965]
Our approach reconstructs the 3D shape of the face from each video frame, aligns the 3D mesh to a canonical view, and then trains a GAN-based network to synthesize novel images with facial action units of interest.
The network trained on synthesized facial expressions outperformed the one trained on actual facial expressions and surpassed current state-of-the-art approaches.
arXiv Detail & Related papers (2020-10-21T13:11:45Z)
- Head2Head++: Deep Facial Attributes Re-Targeting [6.230979482947681]
We leverage the 3D geometry of faces and Generative Adversarial Networks (GANs) to design a novel deep learning architecture for the task of facial and head reenactment.
We manage to capture the complex non-rigid facial motion from the driving monocular performances and synthesise temporally consistent videos.
Our system performs end-to-end reenactment at nearly real-time speed (18 fps).
arXiv Detail & Related papers (2020-06-17T23:38:37Z)
This list is automatically generated from the titles and abstracts of the papers in this site.