Image-to-Video Generation via 3D Facial Dynamics
- URL: http://arxiv.org/abs/2105.14678v1
- Date: Mon, 31 May 2021 02:30:11 GMT
- Title: Image-to-Video Generation via 3D Facial Dynamics
- Authors: Xiaoguang Tu, Yingtian Zou, Jian Zhao, Wenjie Ai, Jian Dong, Yuan Yao,
Zhikang Wang, Guodong Guo, Zhifeng Li, Wei Liu, and Jiashi Feng
- Abstract summary: We present a versatile model, FaceAnime, for various video generation tasks from still images.
Our model is versatile for various AR/VR and entertainment applications, such as face video retargeting and face video prediction.
- Score: 78.01476554323179
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We present a versatile model, FaceAnime, for various video generation tasks
from still images. Video generation from a single face image is an interesting
problem, typically tackled by using Generative Adversarial Networks (GANs) to
integrate information from the input face image and a sequence of sparse facial
landmarks. However, the generated face images usually suffer from quality loss,
image distortion, identity change, and expression mismatch, owing to the weak
representation capacity of sparse facial landmarks. In this paper, we propose
to "imagine" a face video from a single face image according to the
reconstructed 3D face dynamics, aiming to generate a realistic and
identity-preserving face video with precisely predicted pose and facial
expression. The 3D dynamics reveal changes in facial expression and motion, and
serve as a strong prior for guiding highly realistic face video generation. In
particular, we explore face video prediction and exploit a well-designed 3D
dynamic prediction network to predict a 3D dynamic sequence from a single face
image. The 3D dynamics are then rendered by a sparse texture mapping algorithm
to recover structural details and sparse textures for generating face frames.
Our model is versatile for various AR/VR and entertainment applications, such
as face video retargeting and face video prediction. Extensive experiments
demonstrate its effectiveness in generating high-fidelity, identity-preserving,
and visually pleasing face video clips from a single source face image.
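To make the two-stage design concrete, here is a minimal, hypothetical PyTorch-style sketch of the pipeline the abstract describes: a 3D dynamic prediction network maps a single source image to a sequence of 3D dynamics codes, a sparse texture mapping step renders each code into a guidance map, and a GAN generator synthesizes the final frames. All module names, tensor shapes, and the renderer/generator interfaces are illustrative assumptions, not the authors' released implementation.

```python
import torch
import torch.nn as nn

class Dynamics3DPredictor(nn.Module):
    """Hypothetical stand-in for the 3D dynamic prediction network: maps a
    single source image to a sequence of 3D dynamics codes (e.g. 3DMM-style
    pose/expression parameters)."""
    def __init__(self, code_dim=62, seq_len=16):
        super().__init__()
        self.seq_len, self.code_dim = seq_len, code_dim
        self.encoder = nn.Sequential(
            nn.Conv2d(3, 64, kernel_size=4, stride=4), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(64, 256), nn.ReLU(),
        )
        self.head = nn.Linear(256, seq_len * code_dim)

    def forward(self, src_img):                    # (B, 3, H, W)
        feat = self.encoder(src_img)               # (B, 256)
        codes = self.head(feat)                    # (B, T * code_dim)
        return codes.view(-1, self.seq_len, self.code_dim)

def generate_video(src_img, predictor, render_sparse_texture, generator):
    """Predict 3D dynamics from one image, render per-frame sparse-texture
    guidance, then synthesize identity-preserving RGB frames."""
    dyn_seq = predictor(src_img)                   # (B, T, code_dim)
    frames = []
    for t in range(dyn_seq.shape[1]):
        # render_sparse_texture and generator are assumed callables standing
        # in for the paper's sparse texture mapping and GAN generator.
        guidance = render_sparse_texture(src_img, dyn_seq[:, t])
        frames.append(generator(src_img, guidance))
    return torch.stack(frames, dim=1)              # (B, T, 3, H, W)
```

The design point the sketch captures is the separation of concerns: identity comes from the single source image, while pose and expression come entirely from the predicted 3D dynamics sequence.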
Related papers
- Single Image, Any Face: Generalisable 3D Face Generation [59.9369171926757]
We propose a novel model, Gen3D-Face, which generates 3D human faces from unconstrained single-image input.
To the best of our knowledge, this is the first attempt and benchmark for creating photorealistic 3D human face avatars from single images.
arXiv Detail & Related papers (2024-09-25T14:56:37Z)
- GaussianHeads: End-to-End Learning of Drivable Gaussian Head Avatars from Coarse-to-fine Representations [54.94362657501809]
We propose a new method to generate highly dynamic and deformable human head avatars from multi-view imagery in real-time.
At the core of our method is a hierarchical representation of head models that captures the complex dynamics of facial expressions and head movements.
We train this coarse-to-fine facial avatar model along with the head pose as a learnable parameter in an end-to-end framework.
arXiv Detail & Related papers (2024-09-18T13:05:43Z)
- G3FA: Geometry-guided GAN for Face Animation [14.488117084637631]
We introduce Geometry-guided GAN for Face Animation (G3FA) to tackle the lack of 3D geometric information in face animation models.
Our novel approach empowers the face animation model to incorporate 3D information using only 2D images.
In our face reenactment model, we leverage 2D motion warping to capture motion dynamics.
arXiv Detail & Related papers (2024-08-23T13:13:24Z)
- NeuFace: Realistic 3D Neural Face Rendering from Multi-view Images [18.489290898059462]
This paper presents a novel 3D face rendering model, namely NeuFace, to learn accurate and physically-meaningful underlying 3D representations.
We introduce an approximated BRDF integration and a simple yet new low-rank prior, which effectively reduce ambiguities and boost the performance of the estimated facial BRDFs.
arXiv Detail & Related papers (2023-03-24T15:57:39Z)
- StyleFaceV: Face Video Generation via Decomposing and Recomposing Pretrained StyleGAN3 [43.43545400625567]
We propose a principled framework named StyleFaceV, which produces high-fidelity identity-preserving face videos with vivid movements.
Our core insight is to decompose appearance and pose information and recompose them in the latent space of StyleGAN3 to produce stable and dynamic results.
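A minimal sketch of this decompose-and-recompose idea is below, assuming an appearance latent extracted from a source frame and per-frame pose latents; the additive recomposition operator and the frozen StyleGAN3-style synthesis network are placeholder assumptions, not the paper's actual operators.

```python
import torch

def recompose_video(appearance_code, pose_codes, synthesis):
    """appearance_code: (B, D) identity/appearance latent from the source frame.
    pose_codes: (B, T, D) per-frame pose/motion latents from a driving signal.
    synthesis: a pretrained StyleGAN3-style synthesis network (assumed frozen)."""
    frames = []
    for t in range(pose_codes.shape[1]):
        # Additive recomposition in latent space is an assumption here; the
        # paper's actual recomposition operator may differ.
        w = appearance_code + pose_codes[:, t]
        frames.append(synthesis(w))
    return torch.stack(frames, dim=1)  # (B, T, 3, H, W) face video
```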
arXiv Detail & Related papers (2022-08-16T17:47:03Z)
- Video2StyleGAN: Encoding Video in Latent Space for Manipulation [63.03250800510085]
We propose a novel network to encode face videos into the latent space of StyleGAN for semantic face video manipulation.
Our approach can significantly outperform existing single image methods, while achieving real-time (66 fps) speed.
arXiv Detail & Related papers (2022-06-27T06:48:15Z)
- SAFA: Structure Aware Face Animation [9.58882272014749]
We propose a structure aware face animation (SAFA) method which constructs specific geometric structures to model different components of a face image.
We use a 3D morphable model (3DMM) to model the face, multiple affine transforms to model the other foreground components like hair and beard, and an identity transform to model the background.
The 3DMM geometric embedding not only helps generate a realistic structure for the driving scene, but also contributes to better perception of occluded areas in the generated image.
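This layered modeling lends itself to a simple composition step; the following is a hedged sketch assuming per-layer sampling grids (3DMM-driven for the face, affine for parts like hair and beard) and soft layer masks. Names and interfaces are illustrative assumptions, not SAFA's actual code.

```python
import torch.nn.functional as F

def compose_frame(src_img, face_grid, affine_grids, masks):
    """src_img: (B, 3, H, W) source image.
    face_grid: (B, H, W, 2) grid_sample-format grid derived from 3DMM motion.
    affine_grids: list of (B, H, W, 2) grids for other foreground components.
    masks: list of (B, 1, H, W) soft masks, one per layer including the
    background, assumed to sum to 1 at each pixel."""
    layers = [F.grid_sample(src_img, face_grid, align_corners=False)]
    for grid in affine_grids:
        layers.append(F.grid_sample(src_img, grid, align_corners=False))
    layers.append(src_img)  # identity transform models the static background
    assert len(masks) == len(layers)
    # Alpha-composite the warped layers into the output frame.
    return sum(m * layer for m, layer in zip(masks, layers))
```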
arXiv Detail & Related papers (2021-11-09T03:22:38Z)
- FaceDet3D: Facial Expressions with 3D Geometric Detail Prediction [62.5557724039217]
Facial expressions induce a variety of high-level details on the 3D face geometry.
3D Morphable Models (3DMMs) of the human face fail to capture such fine details in their PCA-based representations.
We introduce FaceDet3D, a first-of-its-kind method that generates, from a single image, geometric facial details consistent with any desired target expression.
arXiv Detail & Related papers (2020-12-14T23:07:38Z)
- Head2Head++: Deep Facial Attributes Re-Targeting [6.230979482947681]
We leverage the 3D geometry of faces and Generative Adversarial Networks (GANs) to design a novel deep learning architecture for the task of facial and head reenactment.
We manage to capture the complex non-rigid facial motion from the driving monocular performances and synthesise temporally consistent videos.
Our system performs end-to-end reenactment at near real-time speed (18 fps).
arXiv Detail & Related papers (2020-06-17T23:38:37Z)
- DeepFaceFlow: In-the-wild Dense 3D Facial Motion Estimation [56.56575063461169]
DeepFaceFlow is a robust, fast, and highly accurate framework for the estimation of 3D non-rigid facial flow.
Our framework was trained and tested on two very large-scale facial video datasets.
Given registered pairs of images, our framework generates 3D flow maps at 60 fps.
arXiv Detail & Related papers (2020-05-14T23:56:48Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of this information and is not responsible for any consequences arising from its use.