3D GAN Inversion for Controllable Portrait Image Animation
- URL: http://arxiv.org/abs/2203.13441v1
- Date: Fri, 25 Mar 2022 04:06:06 GMT
- Title: 3D GAN Inversion for Controllable Portrait Image Animation
- Authors: Connor Z. Lin, David B. Lindell, Eric R. Chan, and Gordon Wetzstein
- Abstract summary: We leverage newly developed 3D GANs, which allow explicit control over the pose of the image subject with multi-view consistency.
The proposed technique for portrait image animation outperforms previous methods in terms of image quality, identity preservation, and pose transfer.
- Score: 45.55581298551192
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Millions of images of human faces are captured every single day, but these
photographs portray the likeness of an individual with a fixed pose,
expression, and appearance. Portrait image animation enables the post-capture
adjustment of these attributes from a single image while maintaining a
photorealistic reconstruction of the subject's likeness or identity. Still,
current methods for portrait image animation are typically based on 2D warping
operations or manipulations of a 2D generative adversarial network (GAN) and
lack explicit mechanisms to enforce multi-view consistency. Thus, these methods
may significantly alter the identity of the subject, especially when the
viewpoint relative to the camera is changed. In this work, we leverage newly
developed 3D GANs, which allow explicit control over the pose of the image
subject with multi-view consistency. We propose a supervision strategy to
flexibly manipulate expressions with 3D morphable models, and we show that the
proposed method also supports editing appearance attributes, such as age or
hairstyle, by interpolating within the latent space of the GAN. The proposed
technique for portrait image animation outperforms previous methods in terms of
image quality, identity preservation, and pose transfer while also supporting
attribute editing.
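The two ingredients named in the abstract, inverting a portrait into the latent space of a pretrained 3D GAN and then editing attributes by interpolating latent codes, can be illustrated with a short sketch. This is a minimal illustration under stated assumptions, not the authors' released code: `G`, `w_avg`, and `age_direction` are hypothetical stand-ins for a pretrained 3D-aware generator, its average latent code, and a precomputed latent edit direction.

```python
# Sketch of optimization-based GAN inversion followed by latent-space
# attribute editing. `G`, `w_avg`, and `age_direction` are assumptions,
# not names from the paper's code.
import torch
import torch.nn.functional as F

def invert(G, target, w_init, steps=500, lr=0.01):
    """Optimize a latent code w so that G(w) reconstructs `target`."""
    w = w_init.detach().clone().requires_grad_(True)
    opt = torch.optim.Adam([w], lr=lr)
    for _ in range(steps):
        opt.zero_grad()
        recon = G(w)                      # render a portrait from w
        loss = F.mse_loss(recon, target)  # pixel-space reconstruction loss
        loss.backward()
        opt.step()
    return w.detach()

def edit(G, w, direction, alpha):
    """Interpolate along a latent direction (e.g., age) and re-render."""
    return G(w + alpha * direction)

# Hypothetical usage, assuming `img` is the target portrait tensor:
#   w = invert(G, img, w_avg)
#   older = edit(G, w, age_direction, alpha=1.5)  # larger alpha = stronger edit
```

In practice, inversion pipelines of this kind usually add perceptual and latent-regularization losses on top of the plain pixel loss; with a 3D-aware generator, explicit pose control then comes from the generator's camera conditioning rather than from the latent code itself.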
Related papers
- DiffPortrait3D: Controllable Diffusion for Zero-Shot Portrait View Synthesis [18.64688172651478]
We present DiffPortrait3D, a conditional diffusion model capable of synthesizing 3D-consistent photo-realistic novel views.
Given a single RGB input, we aim to synthesize plausible but consistent facial details rendered from novel camera views.
We demonstrate state-of-the-art results both qualitatively and quantitatively on our challenging in-the-wild and multi-view benchmarks.
arXiv Detail & Related papers (2023-12-20T13:31:11Z)
- AniPortraitGAN: Animatable 3D Portrait Generation from 2D Image Collections [78.81539337399391]
We present an animatable 3D-aware GAN that generates portrait images with controllable facial expression, head pose, and shoulder movements.
It is a generative model trained on unstructured 2D image collections without using 3D or video data.
A dual-camera rendering and adversarial learning scheme is proposed to improve the quality of the generated faces.
arXiv Detail & Related papers (2023-09-05T12:44:57Z)
- Ray Conditioning: Trading Photo-consistency for Photo-realism in Multi-view Image Generation [10.300893339754827]
We propose ray conditioning, a geometry-free alternative that relaxes the photo-consistency constraint.
Our method generates multi-view images by conditioning a 2D GAN on a light field prior.
With explicit viewpoint control, state-of-the-art photo-realism, and identity consistency, our method is particularly suited to the viewpoint editing task.
arXiv Detail & Related papers (2023-04-26T16:54:10Z)
- Explicitly Controllable 3D-Aware Portrait Generation [42.30481422714532]
We propose a 3D portrait generation network that produces consistent portraits according to semantic parameters regarding pose, identity, expression and lighting.
Our method outperforms prior art in extensive experiments, producing realistic portraits with vivid expressions under natural lighting when viewed from free viewpoints.
arXiv Detail & Related papers (2022-09-12T17:40:08Z)
- MOST-GAN: 3D Morphable StyleGAN for Disentangled Face Image Manipulation [69.35523133292389]
We propose a framework that a priori models physical attributes of the face explicitly, thus providing disentanglement by design.
Our method, MOST-GAN, integrates the expressive power and photorealism of style-based GANs with the physical disentanglement and flexibility of nonlinear 3D morphable models.
It achieves photorealistic manipulation of portrait images with fully disentangled 3D control over their physical attributes, enabling extreme manipulation of lighting, facial expression, and pose variations up to full profile view.
arXiv Detail & Related papers (2021-11-01T15:53:36Z)
- Pixel Sampling for Style Preserving Face Pose Editing [53.14006941396712]
We present a novel two-stage approach to resolve this dilemma, casting the task of face pose manipulation as face inpainting.
By selectively sampling pixels from the input face and slightly adjusting their relative locations, the editing result faithfully preserves both the identity information and the image style.
With 3D facial landmarks as guidance, our method can manipulate face pose in three degrees of freedom, i.e., yaw, pitch, and roll, enabling more flexible face pose editing.
arXiv Detail & Related papers (2021-06-14T11:29:29Z)
- Enjoy Your Editing: Controllable GANs for Image Editing via Latent Space Navigation [136.53288628437355]
Controllable semantic image editing enables a user to change entire image attributes with few clicks.
Current approaches often suffer from attribute edits that are entangled, global image identity changes, and diminished photo-realism.
We propose quantitative evaluation strategies for measuring controllable editing performance, unlike prior work which primarily focuses on qualitative evaluation.
arXiv Detail & Related papers (2021-02-01T21:38:36Z)
- PIE: Portrait Image Embedding for Semantic Control [82.69061225574774]
We present the first approach for embedding real portrait images in the latent space of StyleGAN.
We use StyleRig, a pretrained neural network that maps the control space of a 3D morphable face model to the latent space of the GAN.
An identity preservation energy term allows spatially coherent edits while maintaining facial integrity.
arXiv Detail & Related papers (2020-09-20T17:53:51Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences arising from its use.