Designing an Encoder for StyleGAN Image Manipulation
- URL: http://arxiv.org/abs/2102.02766v1
- Date: Thu, 4 Feb 2021 17:52:38 GMT
- Title: Designing an Encoder for StyleGAN Image Manipulation
- Authors: Omer Tov, Yuval Alaluf, Yotam Nitzan, Or Patashnik, Daniel Cohen-Or
- Abstract summary: We study the latent space of StyleGAN, the state-of-the-art unconditional generator.
We identify and analyze the existence of a distortion-editability tradeoff and a distortion-perception tradeoff within the StyleGAN latent space.
We present an encoder based on our two principles that is specifically designed for facilitating editing on real images.
- Score: 38.909059126878354
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Recently, there has been a surge of diverse methods for performing image
editing by employing pre-trained unconditional generators. Applying these
methods on real images, however, remains a challenge, as it necessarily
requires the inversion of the images into their latent space. To successfully
invert a real image, one needs to find a latent code that reconstructs the
input image accurately, and more importantly, allows for its meaningful
manipulation. In this paper, we carefully study the latent space of StyleGAN,
the state-of-the-art unconditional generator. We identify and analyze the
existence of a distortion-editability tradeoff and a distortion-perception
tradeoff within the StyleGAN latent space. We then suggest two principles for
designing encoders in a manner that allows one to control the proximity of the
inversions to regions that StyleGAN was originally trained on. We present an
encoder based on our two principles that is specifically designed for
facilitating editing on real images by balancing these tradeoffs. By evaluating
its performance qualitatively and quantitatively on numerous challenging
domains, including cars and horses, we show that our inversion method, followed
by common editing techniques, achieves superior real-image editing quality,
with only a small reconstruction accuracy drop.
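To make the inversion problem the abstract describes concrete, here is a minimal sketch of the standard optimization-based formulation: a latent code is optimized so that a frozen generator reproduces the target image. This is an illustration only, not the paper's encoder; the `Generator` stand-in, shapes, and hyperparameters are assumptions.

```python
import torch
import torch.nn.functional as F

class Generator(torch.nn.Module):
    """Stand-in for a pretrained StyleGAN generator."""
    def __init__(self, w_dim: int = 512):
        super().__init__()
        self.net = torch.nn.Linear(w_dim, 3 * 64 * 64)

    def forward(self, w):
        return self.net(w).view(-1, 3, 64, 64)

def invert(G, target, steps=500, lr=0.05):
    """Optimize a latent code w so that G(w) reconstructs the target image."""
    w = torch.zeros(1, 512, requires_grad=True)   # init, e.g. at the mean latent
    opt = torch.optim.Adam([w], lr=lr)
    for _ in range(steps):
        opt.zero_grad()
        loss = F.mse_loss(G(w), target)   # distortion term; real pipelines also
        loss.backward()                   # add a perceptual (e.g. LPIPS) term
        opt.step()
    return w.detach()

G = Generator().eval()
for p in G.parameters():
    p.requires_grad_(False)               # generator stays frozen
target = torch.rand(1, 3, 64, 64)         # stand-in for a real input image
w_inv = invert(G, target)
```

The distortion-editability tradeoff shows up here directly: driving the reconstruction loss to zero can push `w_inv` far from the regions the generator was trained on, which degrades subsequent editing.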
Related papers
- A Compact and Semantic Latent Space for Disentangled and Controllable Image Editing [4.8201607588546]
We propose an auto-encoder which re-organizes the latent space of StyleGAN, so that each attribute which we wish to edit corresponds to an axis of the new latent space.
We show that our approach has greater disentanglement than competing methods, while maintaining fidelity to the original image with respect to identity.
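As an illustration of what such an axis-aligned latent space buys, an edit reduces to shifting the code along a single coordinate. The sketch below assumes hypothetical shapes and an arbitrary axis index; it is not the paper's implementation.

```python
import torch

def edit_along_axis(w: torch.Tensor, axis: int, strength: float) -> torch.Tensor:
    """Shift a latent code along one attribute axis; all other axes are untouched."""
    w_edit = w.clone()
    w_edit[:, axis] += strength
    return w_edit

w = torch.randn(1, 512)                             # an inverted latent code
w_edit = edit_along_axis(w, axis=3, strength=2.0)   # axis 3 is hypothetical
```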
arXiv Detail & Related papers (2023-12-13T16:18:45Z) - In-Domain GAN Inversion for Faithful Reconstruction and Editability [132.68255553099834]
We propose in-domain GAN inversion, which consists of a domain-guided encoder and a domain-regularized optimizer to regularize the inverted code in the native latent space of the pre-trained GAN model.
We make comprehensive analyses on the effects of the encoder structure, the starting inversion point, as well as the inversion parameter space, and observe the trade-off between the reconstruction quality and the editing property.
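A hedged sketch of that two-stage recipe: the domain-guided encoder supplies the starting point, and the refinement is regularized to stay near it. `G`, `E`, and `lambda_reg` are illustrative stand-ins, not the paper's actual networks or regularizer.

```python
import torch
import torch.nn.functional as F

G = torch.nn.Linear(512, 3 * 64 * 64)     # stand-in pretrained generator
E = torch.nn.Linear(3 * 64 * 64, 512)     # stand-in domain-guided encoder
for p in list(G.parameters()) + list(E.parameters()):
    p.requires_grad_(False)               # both networks stay frozen

def in_domain_invert(target, steps=300, lr=0.01, lambda_reg=2.0):
    flat = target.flatten(1)
    w0 = E(flat).detach()                 # stage 1: encoder initialization
    w = w0.clone().requires_grad_(True)
    opt = torch.optim.Adam([w], lr=lr)
    for _ in range(steps):                # stage 2: domain-regularized refinement
        opt.zero_grad()
        loss = F.mse_loss(G(w), flat) + lambda_reg * F.mse_loss(w, w0)
        loss.backward()
        opt.step()
    return w.detach()

w_inv = in_domain_invert(torch.rand(1, 3, 64, 64))
```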
arXiv Detail & Related papers (2023-09-25T08:42:06Z) - Overparameterization Improves StyleGAN Inversion [66.8300251627992]
Existing inversion approaches obtain promising yet imperfect results.
We show that overparameterizing the latent space allows us to obtain near-perfect image reconstruction without the need for encoders.
Our approach also retains editability, which we demonstrate by realistically interpolating between images.
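Interpolation between inverted codes is the simplest editability check: each blended latent should still decode to a realistic image. A minimal sketch, with assumed shapes; a pretrained generator would render each blend.

```python
import torch

def interpolate(w_a: torch.Tensor, w_b: torch.Tensor, num_steps: int = 8):
    """Linearly blend two latent codes from endpoint a to endpoint b."""
    return [torch.lerp(w_a, w_b, float(a)) for a in torch.linspace(0, 1, num_steps)]

w_a, w_b = torch.randn(1, 512), torch.randn(1, 512)   # two inverted latents
frames = interpolate(w_a, w_b)                        # feed each through G to render
```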
arXiv Detail & Related papers (2022-05-12T18:42:43Z) - Expanding the Latent Space of StyleGAN for Real Face Editing [4.1715767752637145]
A surge of face editing techniques has been proposed that employ a pretrained StyleGAN for semantic manipulation.
To successfully edit a real image, one must first convert the input image into StyleGAN's latent variables.
We present a method to expand the latent space of StyleGAN with additional content features to break down the trade-off between low distortion and high editability.
arXiv Detail & Related papers (2022-04-26T18:27:53Z) - High-Fidelity GAN Inversion for Image Attribute Editing [61.966946442222735]
We present a novel high-fidelity generative adversarial network (GAN) inversion framework that enables attribute editing with image-specific details well-preserved.
With a low bit-rate latent code, previous works have difficulties in preserving high-fidelity details in reconstructed and edited images.
We propose a distortion consultation approach that employs a distortion map as a reference for high-fidelity reconstruction.
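A heavily hedged reading of the "distortion consultation" idea, as a simple residual: the map records image-specific details the low bit-rate code missed, so they can be consulted at render time. This toy version is exact by construction; the paper instead fuses the map with the edited output, which this sketch does not attempt.

```python
import torch

def distortion_map(target: torch.Tensor, reconstruction: torch.Tensor) -> torch.Tensor:
    """Residual details the low bit-rate latent code failed to capture."""
    return target - reconstruction

target = torch.rand(1, 3, 64, 64)
recon = torch.rand(1, 3, 64, 64)          # stand-in for G(w_inv)
detail_ref = distortion_map(target, recon)
restored = recon + detail_ref             # trivially exact here; editing requires
                                          # aligning the map with the edited image
```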
arXiv Detail & Related papers (2021-09-14T11:23:48Z) - Pivotal Tuning for Latent-based Editing of Real Images [40.22151052441958]
A surge of advanced facial editing techniques has been proposed that leverage the generative power of a pre-trained StyleGAN.
To successfully edit an image this way, one must first project (or invert) the image into the pre-trained generator's domain.
As a result, it remains challenging to apply ID-preserving facial latent-space editing to faces that lie outside the generator's domain.
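A sketch of the pivotal-tuning idea named in the title: after projecting the image to a "pivot" latent, the generator's weights, not the latent, are briefly fine-tuned so the pivot reconstructs the image, while nearby latents keep their editing semantics. The stand-in `G` and hyperparameters are assumptions.

```python
import torch
import torch.nn.functional as F

G = torch.nn.Linear(512, 3 * 64 * 64)     # stand-in pretrained generator

def pivotal_tune(G, w_pivot, target, steps=200, lr=3e-4):
    """Fine-tune generator weights around a fixed pivot latent."""
    opt = torch.optim.Adam(G.parameters(), lr=lr)   # weights, not the latent
    flat = target.flatten(1)
    for _ in range(steps):
        opt.zero_grad()
        loss = F.mse_loss(G(w_pivot), flat)
        loss.backward()
        opt.step()
    return G

w_pivot = torch.randn(1, 512)             # from a prior inversion step
G_tuned = pivotal_tune(G, w_pivot, torch.rand(1, 3, 64, 64))
```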
arXiv Detail & Related papers (2021-06-10T13:47:59Z) - Enjoy Your Editing: Controllable GANs for Image Editing via Latent Space
Navigation [136.53288628437355]
Controllable semantic image editing enables a user to change entire image attributes with few clicks.
Current approaches often suffer from attribute edits that are entangled, global image identity changes, and diminished photo-realism.
We propose quantitative evaluation strategies for measuring controllable editing performance, unlike prior work which primarily focuses on qualitative evaluation.
arXiv Detail & Related papers (2021-02-01T21:38:36Z) - In-Domain GAN Inversion for Real Image Editing [56.924323432048304]
A common practice for feeding a real image to a trained GAN generator is to first invert it back to a latent code.
Existing inversion methods typically focus on reconstructing the target image by pixel values yet fail to land the inverted code in the semantic domain of the original latent space.
We propose an in-domain GAN inversion approach, which faithfully reconstructs the input image and ensures the inverted code to be semantically meaningful for editing.
arXiv Detail & Related papers (2020-03-31T18:20:18Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of this list (including all information) and is not responsible for any consequences.