PIRenderer: Controllable Portrait Image Generation via Semantic Neural
Rendering
- URL: http://arxiv.org/abs/2109.08379v1
- Date: Fri, 17 Sep 2021 07:24:16 GMT
- Title: PIRenderer: Controllable Portrait Image Generation via Semantic Neural
Rendering
- Authors: Yurui Ren and Ge Li and Yuanqi Chen and Thomas H. Li and Shan Liu
- Abstract summary: A Portrait Image Neural Renderer (PIRenderer) is proposed to control face motions with the parameters of three-dimensional morphable face models (3DMMs).
The proposed model can generate photo-realistic portrait images with accurate movements according to intuitive modifications.
Our model can generate coherent videos with convincing movements from only a single reference image and a driving audio stream.
- Score: 56.762094966235566
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Generating portrait images by controlling the motions of existing faces is an
important task of great consequence to social media industries. For easy use
and intuitive control, semantically meaningful and fully disentangled
parameters should be used as modifications. However, many existing techniques
do not provide such fine-grained controls, or use indirect editing methods,
i.e., mimicking the motions of other individuals. In this paper, a Portrait
Image Neural
Renderer (PIRenderer) is proposed to control the face motions with the
parameters of three-dimensional morphable face models (3DMMs). The proposed
model can generate photo-realistic portrait images with accurate movements
according to intuitive modifications. Experiments on both direct and indirect
editing tasks demonstrate the superiority of this model. Meanwhile, we further
extend this model to tackle the audio-driven facial reenactment task by
extracting sequential motions from audio inputs. We show that our model can
generate coherent videos with convincing movements from only a single reference
image and a driving audio stream. Our source code is available at
https://github.com/RenYurui/PIRender.
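For intuition, the control scheme can be pictured as a motion descriptor assembled from 3DMM coefficient groups that conditions an image generator. Below is a minimal, hypothetical sketch; the module name `MappingNetwork`, the dimension split (64 expression, 3 rotation, 3 translation), and the latent size are illustrative assumptions, not the released PIRender code.

```python
import torch
import torch.nn as nn

# Hypothetical sketch of 3DMM-parameterized control: a disentangled motion
# descriptor p (expression, rotation, translation) is mapped to a latent code
# that would condition a warping/editing generator on a reference image.
class MappingNetwork(nn.Module):
    """Map a 3DMM motion descriptor p to a latent code z (assumed sizes)."""
    def __init__(self, p_dim=70, z_dim=256):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(p_dim, z_dim),
            nn.LeakyReLU(0.1),
            nn.Linear(z_dim, z_dim),
        )

    def forward(self, p):
        return self.net(p)

# Intuitive, direct editing: change one coefficient group while the others
# stay fixed, e.g., turn the head without altering the expression.
expression  = torch.zeros(1, 64)  # expression coefficients (assumed 64-dim)
rotation    = torch.zeros(1, 3)   # head pose, e.g., Euler angles
translation = torch.zeros(1, 3)   # head translation
rotation[0, 1] = 0.3              # a direct, semantically meaningful edit

p = torch.cat([expression, rotation, translation], dim=1)
z = MappingNetwork()(p)           # z would modulate the renderer's layers
print(z.shape)                    # torch.Size([1, 256])
```

For the audio-driven extension described in the abstract, a separate network would predict a sequence of such descriptors from the audio stream, which the same renderer then turns into coherent video frames.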
Related papers
- G3FA: Geometry-guided GAN for Face Animation [14.488117084637631]
We introduce Geometry-guided GAN for Face Animation (G3FA) to address the lack of explicit 3D geometric information in 2D face animation models.
Our novel approach empowers the face animation model to incorporate 3D information using only 2D images.
In our face reenactment model, we leverage 2D motion warping to capture motion dynamics.
arXiv Detail & Related papers (2024-08-23T13:13:24Z)
- GSmoothFace: Generalized Smooth Talking Face Generation via Fine Grained 3D Face Guidance [83.43852715997596]
GSmoothFace is a novel two-stage generalized talking face generation model guided by a fine-grained 3D face model.
It can synthesize smooth lip dynamics while preserving the speaker's identity.
Both quantitative and qualitative experiments confirm the superiority of our method in terms of realism, lip synchronization, and visual quality.
arXiv Detail & Related papers (2023-12-12T16:00:55Z)
- High-Fidelity and Freely Controllable Talking Head Video Generation [31.08828907637289]
We propose a novel model that produces high-fidelity talking head videos with free control over head pose and expression.
We introduce a novel motion-aware multi-scale feature alignment module to effectively transfer the motion without face distortion.
We evaluate our model on challenging datasets and demonstrate its state-of-the-art performance.
arXiv Detail & Related papers (2023-04-20T09:02:41Z)
- Dynamic Neural Portraits [58.480811535222834]
We present Dynamic Neural Portraits, a novel approach to the problem of full-head reenactment.
Our method generates photo-realistic video portraits by explicitly controlling head pose, facial expressions and eye gaze.
Our experiments demonstrate that the proposed method is 270 times faster than recent NeRF-based reenactment methods.
arXiv Detail & Related papers (2022-11-25T10:06:14Z)
- Explicitly Controllable 3D-Aware Portrait Generation [42.30481422714532]
We propose a 3D portrait generation network that produces consistent portraits according to semantic parameters regarding pose, identity, expression and lighting.
Our method outperforms prior art in extensive experiments, producing realistic portraits with vivid expressions under natural lighting when viewed from free viewpoints.
arXiv Detail & Related papers (2022-09-12T17:40:08Z)
- Semantic-Aware Implicit Neural Audio-Driven Video Portrait Generation [61.8546794105462]
We propose Semantic-aware Speaking Portrait NeRF (SSP-NeRF), which creates delicate audio-driven portraits using a single unified neural radiance field (NeRF).
We first propose a Semantic-Aware Dynamic Ray Sampling module with an additional parsing branch that facilitates audio-driven volume rendering.
To enable portrait rendering in one unified neural radiance field, a Torso Deformation module is designed to stabilize the large-scale non-rigid torso motions.
arXiv Detail & Related papers (2022-01-19T18:54:41Z)
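As a rough illustration of the audio-driven volume rendering idea in the SSP-NeRF entry above, the sketch below conditions a NeRF-style radiance MLP on a per-frame audio feature. All names, dimensions, and the audio feature itself are assumptions for illustration; the actual model adds semantic-aware ray sampling and a torso deformation module on top of this.

```python
import torch
import torch.nn as nn

# Illustrative audio-conditioned radiance field (not the SSP-NeRF code):
# each sampled 3D point is decoded to density and color given the current
# audio feature, so rendered frames track the speech signal.
class AudioConditionedNeRF(nn.Module):
    def __init__(self, pos_dim=3, dir_dim=3, audio_dim=64, hidden=256):
        super().__init__()
        self.backbone = nn.Sequential(
            nn.Linear(pos_dim + audio_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
        )
        self.sigma_head = nn.Linear(hidden, 1)          # volume density
        self.rgb_head = nn.Linear(hidden + dir_dim, 3)  # view-dependent color

    def forward(self, x, d, audio):
        h = self.backbone(torch.cat([x, audio], dim=-1))
        sigma = self.sigma_head(h)
        rgb = torch.sigmoid(self.rgb_head(torch.cat([h, d], dim=-1)))
        return rgb, sigma

# One batch of ray samples: positions x, view directions d, audio features.
x, d = torch.randn(1024, 3), torch.randn(1024, 3)
audio = torch.randn(1024, 64)   # assumed per-frame speech embedding
rgb, sigma = AudioConditionedNeRF()(x, d, audio)
```

A semantic-aware sampler would then concentrate these ray samples on the dynamic face regions identified by the parsing branch.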
- FreeStyleGAN: Free-view Editable Portrait Rendering with the Camera Manifold [5.462226912969161]
Current Generative Adversarial Networks (GANs) produce photorealistic renderings of portrait images.
We show how our approach enables the integration of a pre-trained StyleGAN into standard 3D rendering pipelines.
Our solution provides the first truly free-viewpoint rendering of realistic faces at interactive rates.
arXiv Detail & Related papers (2021-09-20T08:59:21Z)
- FLAME-in-NeRF: Neural control of Radiance Fields for Free View Face Animation [37.39945646282971]
This paper presents a neural rendering method for controllable portrait video synthesis.
We leverage the expression space of a 3D morphable face model (3DMM) to represent the distribution of human facial expressions.
We demonstrate the effectiveness of our method on free view synthesis of portrait videos with photorealistic expression controls.
arXiv Detail & Related papers (2021-08-10T20:41:15Z)
- Pose-Controllable Talking Face Generation by Implicitly Modularized Audio-Visual Representation [96.66010515343106]
We propose a clean yet effective framework to generate pose-controllable talking faces.
We operate on raw face images, using only a single photo as an identity reference.
Our model has multiple advanced capabilities including extreme view robustness and talking face frontalization.
arXiv Detail & Related papers (2021-04-22T15:10:26Z)
- PIE: Portrait Image Embedding for Semantic Control [82.69061225574774]
We present the first approach for embedding real portrait images in the latent space of StyleGAN.
We use StyleRig, a pretrained neural network that maps the control space of a 3D morphable face model to the latent space of the GAN.
An identity preservation energy term allows spatially coherent edits while maintaining facial integrity.
arXiv Detail & Related papers (2020-09-20T17:53:51Z)
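To make the PIE/StyleRig pipeline concrete, here is a minimal hypothetical sketch of rig-based latent editing: a small network maps an embedded latent code plus a desired 3DMM parameter change to an offset in the GAN's latent space. The class name, dimensions, and architecture are illustrative assumptions, not the published implementation.

```python
import torch
import torch.nn as nn

# Hypothetical rig-style latent editor (not the published StyleRig/PIE code):
# semantic 3DMM edits are realized as offsets in the GAN latent space, so the
# edited code can be decoded by a frozen StyleGAN generator.
class RigNet(nn.Module):
    def __init__(self, w_dim=512, p_dim=70):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(w_dim + p_dim, 512),
            nn.ReLU(),
            nn.Linear(512, w_dim),
        )

    def forward(self, w, delta_p):
        # Predict a latent offset conditioned on the current code and the
        # requested 3DMM change (pose, expression, illumination, ...).
        return w + self.net(torch.cat([w, delta_p], dim=1))

w = torch.randn(1, 512)        # latent embedding of a real portrait
delta_p = torch.zeros(1, 70)   # desired 3DMM edit, e.g., a head turn
w_edit = RigNet()(w, delta_p)  # decode w_edit with the frozen generator
```

An identity preservation term, as described in the entry above, would be enforced while optimizing the embedding rather than in this forward pass.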