Head2Head: Video-based Neural Head Synthesis
- URL: http://arxiv.org/abs/2005.10954v1
- Date: Fri, 22 May 2020 00:44:43 GMT
- Title: Head2Head: Video-based Neural Head Synthesis
- Authors: Mohammad Rami Koujan, Michail Christos Doukas, Anastasios Roussos,
Stefanos Zafeiriou
- Abstract summary: We propose a novel machine learning architecture for facial reenactment.
We show that the proposed method can transfer facial expressions, pose and gaze of a source actor to a target video in a photo-realistic fashion more accurately than state-of-the-art methods.
- Score: 50.32988828989691
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In this paper, we propose a novel machine learning architecture for facial
reenactment. In particular, contrary to the model-based approaches or recent
frame-based methods that use Deep Convolutional Neural Networks (DCNNs) to
generate individual frames, we propose a novel method that (a) exploits the
special structure of facial motion (paying particular attention to mouth
motion) and (b) enforces temporal consistency. We demonstrate that the proposed
method can transfer facial expressions, pose and gaze of a source actor to a
target video in a photo-realistic fashion more accurately than state-of-the-art
methods.
Related papers
- DiffusionAct: Controllable Diffusion Autoencoder for One-shot Face Reenactment [34.821255203019554]
Video-driven neural face reenactment aims to synthesize realistic facial images that successfully preserve the identity and appearance of a source face.
Recent advances in Diffusion Probabilistic Models (DPMs) enable the generation of high-quality realistic images.
We present DiffusionAct, a novel method that leverages the photo-realistic image generation of diffusion models to perform neural face reenactment.
arXiv Detail & Related papers (2024-03-25T21:46:53Z)
- One-shot Neural Face Reenactment via Finding Directions in GAN's Latent Space [37.357842761713705]
We present a framework for neural face/head reenactment whose goal is to transfer the 3D head orientation and expression of a target face to a source face.
Our method features several favorable properties including using a single source image (one-shot) and enabling cross-person reenactment.
arXiv Detail & Related papers (2024-02-05T22:12:42Z)
- Diffusion Priors for Dynamic View Synthesis from Monocular Videos [59.42406064983643]
Dynamic novel view synthesis aims to capture the temporal evolution of visual content within videos.
We first finetune a pretrained RGB-D diffusion model on the video frames using a customization technique.
We distill the knowledge from the finetuned model into a 4D representation encompassing both dynamic and static Neural Radiance Fields.
arXiv Detail & Related papers (2024-01-10T23:26:41Z) - HyperReenact: One-Shot Reenactment via Jointly Learning to Refine and
Retarget Faces [47.27033282706179]
We present our method for neural face reenactment, called HyperReenact, that aims to generate realistic talking head images of a source identity.
Our method operates under the one-shot setting (i.e., using a single source frame) and allows for cross-subject reenactment, without requiring subject-specific fine-tuning.
We compare our method both quantitatively and qualitatively against several state-of-the-art techniques on the standard benchmarks of VoxCeleb1 and VoxCeleb2.
arXiv Detail & Related papers (2023-07-20T11:59:42Z) - High-fidelity Facial Avatar Reconstruction from Monocular Video with
Generative Priors [29.293166730794606]
We propose a new method for NeRF-based facial avatar reconstruction that utilizes a 3D-aware generative prior.
Compared with existing works, we obtain superior novel view synthesis results and faithful face reenactment performance.
arXiv Detail & Related papers (2022-11-28T04:49:46Z) - Dynamic Neural Portraits [58.480811535222834]
We present Dynamic Neural Portraits, a novel approach to the problem of full-head reenactment.
Our method generates photo-realistic video portraits by explicitly controlling head pose, facial expressions and eye gaze.
Our experiments demonstrate that the proposed method is 270 times faster than recent NeRF-based reenactment methods.
arXiv Detail & Related papers (2022-11-25T10:06:14Z) - UniFaceGAN: A Unified Framework for Temporally Consistent Facial Video
Editing [78.26925404508994]
We propose a unified temporally consistent facial video editing framework termed UniFaceGAN.
Our framework is designed to handle face swapping and face reenactment simultaneously.
Compared with the state-of-the-art facial image editing methods, our framework generates video portraits that are more photo-realistic and temporally smooth.
arXiv Detail & Related papers (2021-08-12T10:35:22Z) - Head2Head++: Deep Facial Attributes Re-Targeting [6.230979482947681]
We leverage the 3D geometry of faces and Generative Adversarial Networks (GANs) to design a novel deep learning architecture for the task of facial and head reenactment.
We manage to capture the complex non-rigid facial motion from the driving monocular performances and synthesise temporally consistent videos.
Our system performs end-to-end reenactment at nearly real-time speed (18 fps).
arXiv Detail & Related papers (2020-06-17T23:38:37Z) - Neural Human Video Rendering by Learning Dynamic Textures and
Rendering-to-Video Translation [99.64565200170897]
We propose a novel human video synthesis method by explicitly disentangling the learning of time-coherent fine-scale details from the embedding of the human in 2D screen space.
We show several applications of our approach, such as human reenactment and novel view synthesis from monocular video, where we show significant improvement over the state of the art both qualitatively and quantitatively.
arXiv Detail & Related papers (2020-01-14T18:06:27Z)