FreeStyleGAN: Free-view Editable Portrait Rendering with the Camera
Manifold
- URL: http://arxiv.org/abs/2109.09378v1
- Date: Mon, 20 Sep 2021 08:59:21 GMT
- Title: FreeStyleGAN: Free-view Editable Portrait Rendering with the Camera
Manifold
- Authors: Thomas Leimkühler, George Drettakis
- Abstract summary: Current Generative Adversarial Networks (GANs) produce photorealistic renderings of portrait images.
We show how our approach enables the integration of a pre-trained StyleGAN into standard 3D rendering pipelines.
Our solution provides the first truly free-viewpoint rendering of realistic faces at interactive rates.
- Score: 5.462226912969161
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Current Generative Adversarial Networks (GANs) produce photorealistic
renderings of portrait images. Embedding real images into the latent space of
such models enables high-level image editing. While recent methods provide
considerable semantic control over the (re-)generated images, they can only
generate a limited set of viewpoints and cannot explicitly control the camera.
Such 3D camera control is required for 3D virtual and mixed reality
applications. In our solution, we use a few images of a face to perform 3D
reconstruction, and we introduce the notion of the GAN camera manifold, the key
element allowing us to precisely define the range of images that the GAN can
reproduce in a stable manner. We train a small face-specific neural implicit
representation network to map a captured face to this manifold and complement
it with a warping scheme to obtain free-viewpoint novel-view synthesis. We show
how our approach - due to its precise camera control - enables the integration
of a pre-trained StyleGAN into standard 3D rendering pipelines, allowing, e.g.,
stereo rendering or consistent insertion of faces in synthetic 3D environments.
Our solution provides the first truly free-viewpoint rendering of realistic
faces at interactive rates, using only a small number of casual photos as
input, while simultaneously allowing semantic editing capabilities, such as
facial expression or lighting changes.
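The two moving parts above, the camera manifold and the warp, can be sketched in a few lines. The following Python is a minimal, hypothetical illustration, not the paper's implementation: the manifold bounds, the function names, and the `gan`/`latent_net`/`warp` placeholders are all assumptions.

```python
import numpy as np

def project_to_camera_manifold(cam_pos, face_center,
                               d_range=(0.8, 2.0),
                               yaw_max=np.pi / 4, pitch_max=np.pi / 6):
    """Clamp a free camera position to an assumed stable region around the
    face. The paper defines this region precisely as the GAN camera
    manifold; the bounds used here are illustrative placeholders."""
    offset = cam_pos - face_center
    dist = np.clip(np.linalg.norm(offset), *d_range)
    direction = offset / np.linalg.norm(offset)
    # Spherical angles of the viewing direction, clamped to the assumed
    # range the GAN can reproduce stably.
    yaw = np.clip(np.arctan2(direction[0], direction[2]), -yaw_max, yaw_max)
    pitch = np.clip(np.arcsin(np.clip(direction[1], -1.0, 1.0)),
                    -pitch_max, pitch_max)
    on_manifold_dir = np.array([np.cos(pitch) * np.sin(yaw),
                                np.sin(pitch),
                                np.cos(pitch) * np.cos(yaw)])
    return face_center + dist * on_manifold_dir

def render_free_viewpoint(gan, latent_net, warp, target_cam, face_center):
    """Sketch of the full pipeline: generate on the manifold, warp off it.
    gan, latent_net, and warp stand in for the pre-trained StyleGAN, the
    face-specific camera-to-latent network, and the warping scheme."""
    manifold_cam = project_to_camera_manifold(target_cam, face_center)
    image = gan(latent_net(manifold_cam))          # on-manifold rendering
    return warp(image, manifold_cam, target_cam)   # free-viewpoint result
```

Because the projection is an ordinary function of the camera, it can be evaluated per eye for stereo rendering or per frame inside a standard 3D engine, which is what makes the pipeline integration described in the abstract possible.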
Related papers
- DiffPortrait3D: Controllable Diffusion for Zero-Shot Portrait View Synthesis [18.64688172651478]
We present DiffPortrait3D, a conditional diffusion model capable of synthesizing 3D-consistent photo-realistic novel views.
Given a single RGB input, we aim to synthesize plausible but consistent facial details rendered from novel camera views.
We demonstrate state-of-the-art results both qualitatively and quantitatively on our challenging in-the-wild and multi-view benchmarks.
arXiv Detail & Related papers (2023-12-20T13:31:11Z)
- Text2Control3D: Controllable 3D Avatar Generation in Neural Radiance Fields using Geometry-Guided Text-to-Image Diffusion Model [39.64952340472541]
We propose a controllable text-to-3D avatar generation method whose facial expression is controllable.
Our main strategy is to construct the 3D avatar in Neural Radiance Fields (NeRF) optimized with a set of controlled viewpoint-aware images.
We demonstrate the empirical results and discuss the effectiveness of our method.
arXiv Detail & Related papers (2023-09-07T08:14:46Z)
- Single-Shot Implicit Morphable Faces with Consistent Texture Parameterization [91.52882218901627]
We propose a novel method for constructing implicit 3D morphable face models that are both generalizable and intuitive for editing.
Our method improves upon photo-realism, geometry, and expression accuracy compared to state-of-the-art methods.
arXiv Detail & Related papers (2023-05-04T17:58:40Z)
- 3D GAN Inversion with Pose Optimization [26.140281977885376]
We introduce a generalizable 3D GAN inversion method that infers camera viewpoint and latent code simultaneously to enable multi-view consistent semantic image editing.
We conduct extensive experiments on image reconstruction and editing both quantitatively and qualitatively, and further compare our results with 2D GAN-based editing.
arXiv Detail & Related papers (2022-10-13T19:06:58Z)
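The joint inference of camera viewpoint and latent code in the 3D GAN inversion entry above is, at its core, a two-variable optimization. A minimal sketch, assuming a differentiable pose-conditioned generator `G`, a 512-dimensional latent, and a (yaw, pitch) pose parameterization, all of which are placeholders rather than the paper's actual interface:

```python
import torch
import torch.nn.functional as F

def invert_with_pose(G, target, steps=500, lr=1e-2):
    """Jointly optimize a latent code and a camera pose so that the
    pose-conditioned generator G reproduces the target image."""
    z = torch.randn(1, 512, requires_grad=True)     # latent code (assumed size)
    pose = torch.zeros(1, 2, requires_grad=True)    # (yaw, pitch) in radians
    opt = torch.optim.Adam([z, pose], lr=lr)
    for _ in range(steps):
        opt.zero_grad()
        loss = F.mse_loss(G(z, pose), target)       # pixel reconstruction loss
        loss.backward()                             # gradients flow to z AND pose
        opt.step()
    return z.detach(), pose.detach()
```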
- Explicitly Controllable 3D-Aware Portrait Generation [42.30481422714532]
We propose a 3D portrait generation network that produces consistent portraits according to semantic parameters regarding pose, identity, expression and lighting.
Our method outperforms prior art in extensive experiments, producing realistic portraits with vivid expressions in natural lighting when viewed from free viewpoints.
arXiv Detail & Related papers (2022-09-12T17:40:08Z)
- Vision Transformer for NeRF-Based View Synthesis from a Single Input Image [49.956005709863355]
We propose to leverage both the global and local features to form an expressive 3D representation.
To synthesize a novel view, we train a multilayer perceptron (MLP) network conditioned on the learned 3D representation to perform volume rendering.
Our method can render novel views from only a single input image and generalize across multiple object categories using a single model.
arXiv Detail & Related papers (2022-07-12T17:52:04Z)
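The MLP-based volume rendering in the view-synthesis entry above follows the standard NeRF recipe: sample points along a ray, query density and color, and alpha-composite. A toy single-ray version, where `mlp` and the learned feature `feats` are assumed interfaces rather than the paper's network:

```python
import torch

def render_ray(mlp, feats, origin, direction, n=64, near=0.5, far=2.5):
    """Composite density/color samples along one ray (classic NeRF-style
    volume rendering; mlp maps 3D points plus learned features to a
    per-point density and RGB, an assumed interface)."""
    t = torch.linspace(near, far, n).unsqueeze(1)        # (n, 1) depths
    points = origin + t * direction                      # (n, 3) ray samples
    sigma, rgb = mlp(points, feats)                      # (n,), (n, 3)
    alpha = 1.0 - torch.exp(-sigma * (far - near) / n)   # per-sample opacity
    trans = torch.cumprod(
        torch.cat([torch.ones(1), 1.0 - alpha + 1e-10])[:-1], dim=0)
    weights = alpha * trans                              # compositing weights
    return (weights.unsqueeze(1) * rgb).sum(dim=0)       # final pixel color
```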
- MOST-GAN: 3D Morphable StyleGAN for Disentangled Face Image Manipulation [69.35523133292389]
We propose a framework that a priori models physical attributes of the face explicitly, thus providing disentanglement by design.
Our method, MOST-GAN, integrates the expressive power and photorealism of style-based GANs with the physical disentanglement and flexibility of nonlinear 3D morphable models.
It achieves photorealistic manipulation of portrait images with fully disentangled 3D control over their physical attributes, enabling extreme manipulation of lighting, facial expression, and pose variations up to full profile view.
arXiv Detail & Related papers (2021-11-01T15:53:36Z)
- PIRenderer: Controllable Portrait Image Generation via Semantic Neural Rendering [56.762094966235566]
A Portrait Image Neural Renderer is proposed to control face motion with the parameters of three-dimensional morphable face models.
The proposed model can generate photo-realistic portrait images with accurate movements according to intuitive modifications.
Our model can generate coherent videos with convincing movements from only a single reference image and a driving audio stream.
arXiv Detail & Related papers (2021-09-17T07:24:16Z)
- PIE: Portrait Image Embedding for Semantic Control [82.69061225574774]
We present the first approach for embedding real portrait images in the latent space of StyleGAN.
We use StyleRig, a pretrained neural network that maps the control space of a 3D morphable face model to the latent space of the GAN.
An identity preservation energy term allows spatially coherent edits while maintaining facial integrity.
arXiv Detail & Related papers (2020-09-20T17:53:51Z)
- StyleRig: Rigging StyleGAN for 3D Control over Portrait Images [81.43265493604302]
StyleGAN generates portrait images of faces with eyes, teeth, hair, and context (neck, shoulders, background).
StyleGAN lacks a rig-like control over semantic face parameters that are interpretable in 3D, such as face pose, expressions, and scene illumination.
We present the first method to provide a face rig-like control over a pretrained and fixed StyleGAN via a 3DMM.
arXiv Detail & Related papers (2020-03-31T21:20:34Z)
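PIE and StyleRig above share one mechanism: a learned map from 3DMM parameters (pose, expression, illumination) into the latent space of a frozen StyleGAN. A minimal sketch of that idea; the dimensions, the two-layer architecture, and the additive-offset formulation are assumptions, not the published RigNet:

```python
import torch
import torch.nn as nn

class RigNet(nn.Module):
    """Toy 3DMM-to-latent rig: given a portrait's latent w and target 3DMM
    parameters p, predict a latent offset so that the frozen generator
    shows the same face under the new pose/expression/lighting."""
    def __init__(self, w_dim=512, p_dim=80, hidden=256):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(w_dim + p_dim, hidden),
            nn.ReLU(),
            nn.Linear(hidden, w_dim),
        )

    def forward(self, w, p):
        return w + self.net(torch.cat([w, p], dim=-1))  # edited latent

# Usage with placeholder tensors; in practice w comes from embedding a real
# photo (as in PIE) and the edited latent is fed to the frozen StyleGAN.
w_edit = RigNet()(torch.randn(1, 512), torch.randn(1, 80))
```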
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the content (including all information) and is not responsible for any consequences arising from its use.