StyleAvatar: Real-time Photo-realistic Portrait Avatar from a Single Video
- URL: http://arxiv.org/abs/2305.00942v1
- Date: Mon, 1 May 2023 16:54:35 GMT
- Title: StyleAvatar: Real-time Photo-realistic Portrait Avatar from a Single Video
- Authors: Lizhen Wang, Xiaochen Zhao, Jingxiang Sun, Yuxiang Zhang, Hongwen Zhang, Tao Yu, Yebin Liu
- Abstract summary: StyleAvatar is a real-time photo-realistic portrait avatar reconstruction method using StyleGAN-based networks.
Results and experiments demonstrate the superiority of our method in terms of image quality, full portrait video generation, and real-time re-animation.
- Score: 39.176852832054045
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Face reenactment methods attempt to restore and re-animate portrait videos as
realistically as possible. Existing methods face a trade-off between quality and controllability: 2D GAN-based methods achieve higher image quality but offer less fine-grained control of facial attributes than their 3D counterparts. In
this work, we propose StyleAvatar, a real-time photo-realistic portrait avatar
reconstruction method using StyleGAN-based networks, which can generate
high-fidelity portrait avatars with faithful expression control. We expand the
capabilities of StyleGAN by introducing a compositional representation and a
sliding window augmentation method, which enable faster convergence and improve
translation generalization. Specifically, we divide the portrait scene into three parts for adaptive adjustment: the facial region, the non-facial foreground region, and the background. In addition, our network combines the strengths of UNet, StyleGAN, and time coding for video learning, enabling high-quality video generation. We further propose a sliding window augmentation method and a pre-training strategy to improve translation generalization and training performance, respectively. The proposed network can converge within
two hours while ensuring high image quality and a forward rendering time of
only 20 milliseconds. We also present a real-time live system, which pushes this research toward practical applications. Results and experiments demonstrate
the superiority of our method in terms of image quality, full portrait video
generation, and real-time re-animation compared to existing facial reenactment
methods. Training and inference code for this paper is available at https://github.com/LizhenWangT/StyleAvatar.
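
The abstract names two ideas without detailing them: a compositional three-part representation and a sliding window augmentation. Below is a minimal PyTorch sketch of how such a composition and crop augmentation could look; the function names, mask conventions, window size, and crop schedule are illustrative assumptions, not the authors' implementation (see the linked repository for the actual code).

import torch

def compose_portrait(face, foreground, background, face_mask, fg_mask):
    # Hypothetical alpha composition of the three adaptively adjusted
    # regions named in the abstract: face over non-facial foreground
    # over background. Images are (B, C, H, W); masks are (B, 1, H, W)
    # with values in [0, 1].
    out = face * face_mask
    out = out + foreground * (1 - face_mask) * fg_mask
    out = out + background * (1 - face_mask) * (1 - fg_mask)
    return out

def sliding_window_crop(frame, window=512, step=64, t=0):
    # Hypothetical sliding window augmentation: shift a fixed-size crop
    # across the full frame over time so the network sees the portrait
    # at many translations, one plausible way to encourage the
    # translation generalization the abstract mentions.
    _, _, h, w = frame.shape
    y = (t * step) % (h - window + 1)
    x = (t * step) % (w - window + 1)
    return frame[:, :, y:y + window, x:x + window]

# Example: sweep 512x512 crops across a 720p frame.
frame = torch.rand(1, 3, 720, 1280)
crops = [sliding_window_crop(frame, t=t) for t in range(4)]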
Related papers
- TextToon: Real-Time Text Toonify Head Avatar from Single Video [34.07760625281835]
We propose TextToon, a method to generate a drivable toonified avatar.
Given a short monocular video sequence and a written instruction about the avatar style, our model can generate a high-fidelity toonified avatar.
arXiv Detail & Related papers (2024-09-23T15:04:45Z)
- G3FA: Geometry-guided GAN for Face Animation [14.488117084637631]
We introduce Geometry-guided GAN for Face Animation (G3FA) to tackle this limitation.
Our novel approach empowers the face animation model to incorporate 3D information using only 2D images.
In our face reenactment model, we leverage 2D motion warping to capture motion dynamics.
arXiv Detail & Related papers (2024-08-23T13:13:24Z)
- Dynamic Neural Portraits [58.480811535222834]
We present Dynamic Neural Portraits, a novel approach to the problem of full-head reenactment.
Our method generates photo-realistic video portraits by explicitly controlling head pose, facial expressions and eye gaze.
Our experiments demonstrate that the proposed method is 270 times faster than recent NeRF-based reenactment methods.
arXiv Detail & Related papers (2022-11-25T10:06:14Z)
- Explicitly Controllable 3D-Aware Portrait Generation [42.30481422714532]
We propose a 3D portrait generation network that produces consistent portraits according to semantic parameters regarding pose, identity, expression and lighting.
Our method outperforms prior arts in extensive experiments, producing realistic portraits with vivid expression in natural lighting when viewed in free viewpoint.
arXiv Detail & Related papers (2022-09-12T17:40:08Z)
- StyleFaceV: Face Video Generation via Decomposing and Recomposing Pretrained StyleGAN3 [43.43545400625567]
We propose a principled framework named StyleFaceV, which produces high-fidelity identity-preserving face videos with vivid movements.
Our core insight is to decompose appearance and pose information and recompose them in the latent space of StyleGAN3 to produce stable and dynamic results.
arXiv Detail & Related papers (2022-08-16T17:47:03Z)
- Video2StyleGAN: Encoding Video in Latent Space for Manipulation [63.03250800510085]
We propose a novel network to encode face videos into the latent space of StyleGAN for semantic face video manipulation.
Our approach can significantly outperform existing single image methods, while achieving real-time (66 fps) speed.
arXiv Detail & Related papers (2022-06-27T06:48:15Z)
- StyleHEAT: One-Shot High-Resolution Editable Talking Face Generation via Pretrained StyleGAN [49.917296433657484]
One-shot talking face generation aims at synthesizing a high-quality talking face video from an arbitrary portrait image.
In this work, we investigate the latent feature space of a pre-trained StyleGAN and discover some excellent spatial transformation properties.
We propose a novel unified framework based on a pre-trained StyleGAN that enables a set of powerful functionalities.
arXiv Detail & Related papers (2022-03-08T12:06:12Z)
- UniFaceGAN: A Unified Framework for Temporally Consistent Facial Video Editing [78.26925404508994]
We propose a unified temporally consistent facial video editing framework termed UniFaceGAN.
Our framework is designed to handle face swapping and face reenactment simultaneously.
Compared with the state-of-the-art facial image editing methods, our framework generates video portraits that are more photo-realistic and temporally smooth.
arXiv Detail & Related papers (2021-08-12T10:35:22Z)
- Pixel Codec Avatars [99.36561532588831]
Pixel Codec Avatars (PiCA) is a deep generative model of 3D human faces.
On a single Oculus Quest 2 mobile VR headset, five avatars are rendered in real time in the same scene.
arXiv Detail & Related papers (2021-04-09T23:17:36Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the content (including all information) and is not responsible for any consequences of its use.