Explicitly Controllable 3D-Aware Portrait Generation
- URL: http://arxiv.org/abs/2209.05434v1
- Date: Mon, 12 Sep 2022 17:40:08 GMT
- Title: Explicitly Controllable 3D-Aware Portrait Generation
- Authors: Junshu Tang, Bo Zhang, Binxin Yang, Ting Zhang, Dong Chen, Lizhuang
Ma, Fang Wen
- Abstract summary: We propose a 3D portrait generation network that produces consistent portraits according to semantic parameters regarding pose, identity, expression and lighting.
Our method outperforms prior art in extensive experiments, producing realistic portraits with vivid expressions under natural lighting when viewed from free viewpoints.
- Score: 42.30481422714532
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: In contrast to the traditional avatar creation pipeline, which is a
costly process, contemporary generative approaches directly learn the data
distribution from photographs, and the state of the art now yields highly
photo-realistic images. While many works extend unconditional generative
models to achieve some degree of controllability, it
is still challenging to ensure multi-view consistency, especially under large
poses. In this work, we propose a 3D portrait generation network that produces
3D consistent portraits while being controllable according to semantic
parameters regarding pose, identity, expression and lighting. The generative
network uses neural scene representation to model portraits in 3D, whose
generation is guided by a parametric face model that supports explicit control.
While the latent disentanglement can be further enhanced by contrasting images
with partially different attributes, there still exists noticeable
inconsistency in non-face areas, e.g., hair and background, when animating
expressions. We solve this by proposing a volume blending strategy in which we
form a composite output by blending the dynamic and static radiance fields,
with two parts segmented from the jointly learned semantic field. Our method
outperforms prior art in extensive experiments, producing realistic portraits
with vivid expressions under natural lighting when viewed from free viewpoints. The
proposed method also generalizes to real images as well as out-of-domain
cartoon faces, showing great promise for real applications.
Additional video results and code will be available on the project webpage.
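The volume blending step is concrete enough to sketch. Below is a minimal NumPy illustration of the idea as the abstract states it: a per-sample mask from the semantic field blends the densities and colors of a dynamic (face) radiance field and a static (hair, background) one, and the composite is alpha-composited along a ray. The function name blend_and_render, the shapes, and the toy inputs are hypothetical placeholders, not the authors' implementation.

    # Sketch of the volume blending strategy: composite a dynamic and a
    # static radiance field with a semantic mask, then volume-render one ray.
    # All names and shapes are illustrative assumptions, not the paper's code.
    import numpy as np

    def blend_and_render(sigma_dyn, rgb_dyn, sigma_sta, rgb_sta, face_prob, deltas):
        """Blend two radiance fields per sample and alpha-composite one ray.

        sigma_dyn, sigma_sta: (N,) densities from the dynamic/static fields
        rgb_dyn, rgb_sta:     (N, 3) colors from the two fields
        face_prob:            (N,) semantic-field probability of the face region
        deltas:               (N,) distances between consecutive ray samples
        """
        # Pointwise blend of density and color with the semantic mask.
        sigma = face_prob * sigma_dyn + (1.0 - face_prob) * sigma_sta
        rgb = face_prob[:, None] * rgb_dyn + (1.0 - face_prob[:, None]) * rgb_sta

        # Standard volume rendering: opacity, transmittance, weights.
        alpha = 1.0 - np.exp(-sigma * deltas)
        trans = np.cumprod(np.concatenate([[1.0], 1.0 - alpha[:-1]]))
        weights = alpha * trans
        return (weights[:, None] * rgb).sum(axis=0)  # rendered pixel color

    # Toy usage: random field outputs for an 8-sample ray.
    rng = np.random.default_rng(0)
    n = 8
    pixel = blend_and_render(
        rng.uniform(0, 5, n), rng.uniform(0, 1, (n, 3)),
        rng.uniform(0, 5, n), rng.uniform(0, 1, (n, 3)),
        rng.uniform(0, 1, n), np.full(n, 0.1),
    )
    print(pixel)

Because only the dynamic field depends on the expression parameters, samples with low face_prob are untouched when an expression is animated, which is what suppresses the inconsistency in hair and background that the abstract describes.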
Related papers
- DiffPortrait3D: Controllable Diffusion for Zero-Shot Portrait View Synthesis [18.64688172651478]
We present DiffPortrait3D, a conditional diffusion model capable of synthesizing 3D-consistent photo-realistic novel views.
Given a single RGB input, we aim to synthesize plausible but consistent facial details rendered from novel camera views.
We demonstrate state-of-the-art results both qualitatively and quantitatively on our challenging in-the-wild and multi-view benchmarks.
arXiv Detail & Related papers (2023-12-20T13:31:11Z)
- AniPortraitGAN: Animatable 3D Portrait Generation from 2D Image Collections [78.81539337399391]
We present an animatable 3D-aware GAN that generates portrait images with controllable facial expression, head pose, and shoulder movements.
It is a generative model trained on unstructured 2D image collections without using 3D or video data.
A dual-camera rendering and adversarial learning scheme is proposed to improve the quality of the generated faces.
arXiv Detail & Related papers (2023-09-05T12:44:57Z)
- Single-Shot Implicit Morphable Faces with Consistent Texture Parameterization [91.52882218901627]
We propose a novel method for constructing implicit 3D morphable face models that are both generalizable and intuitive for editing.
Our method improves upon photo-realism, geometry, and expression accuracy compared to state-of-the-art methods.
arXiv Detail & Related papers (2023-05-04T17:58:40Z)
- FaceLit: Neural 3D Relightable Faces [28.0806453092185]
FaceLit is capable of generating a 3D face that can be rendered at various user-defined lighting conditions and views.
We show state-of-the-art photorealism among 3D-aware GANs on the FFHQ dataset, achieving an FID score of 3.5.
arXiv Detail & Related papers (2023-03-27T17:59:10Z)
- GAUDI: A Neural Architect for Immersive 3D Scene Generation [67.97817314857917]
GAUDI is a generative model capable of capturing the distribution of complex and realistic 3D scenes that can be rendered immersively from a moving camera.
We show that GAUDI obtains state-of-the-art performance in the unconditional generative setting across multiple datasets.
arXiv Detail & Related papers (2022-07-27T19:10:32Z)
- 3D GAN Inversion for Controllable Portrait Image Animation [45.55581298551192]
We leverage newly developed 3D GANs, which allow explicit control over the pose of the image subject with multi-view consistency.
The proposed technique for portrait image animation outperforms previous methods in terms of image quality, identity preservation, and pose transfer.
arXiv Detail & Related papers (2022-03-25T04:06:06Z)
- MOST-GAN: 3D Morphable StyleGAN for Disentangled Face Image Manipulation [69.35523133292389]
We propose a framework that a priori models physical attributes of the face explicitly, thus providing disentanglement by design.
Our method, MOST-GAN, integrates the expressive power and photorealism of style-based GANs with the physical disentanglement and flexibility of nonlinear 3D morphable models.
It achieves photorealistic manipulation of portrait images with fully disentangled 3D control over their physical attributes, enabling extreme manipulation of lighting, facial expression, and pose variations up to full profile view.
arXiv Detail & Related papers (2021-11-01T15:53:36Z)
- FreeStyleGAN: Free-view Editable Portrait Rendering with the Camera Manifold [5.462226912969161]
Current Generative Adversarial Networks (GANs) produce photorealistic renderings of portrait images.
We show how our approach enables the integration of a pre-trained StyleGAN into standard 3D rendering pipelines.
Our solution proposes the first truly free-viewpoint rendering of realistic faces at interactive rates.
arXiv Detail & Related papers (2021-09-20T08:59:21Z) - PIRenderer: Controllable Portrait Image Generation via Semantic Neural
Rendering [56.762094966235566]
A Portrait Image Neural Renderer is proposed to control the face motions with the parameters of three-dimensional morphable face models.
The proposed model can generate photo-realistic portrait images with accurate movements according to intuitive modifications.
Our model can generate coherent videos with convincing movements from only a single reference image and a driving audio stream.
arXiv Detail & Related papers (2021-09-17T07:24:16Z)
This list is automatically generated from the titles and abstracts of the papers on this site.