In-Domain GAN Inversion for Real Image Editing
- URL: http://arxiv.org/abs/2004.00049v3
- Date: Thu, 16 Jul 2020 09:47:36 GMT
- Title: In-Domain GAN Inversion for Real Image Editing
- Authors: Jiapeng Zhu, Yujun Shen, Deli Zhao, Bolei Zhou
- Abstract summary: A common practice for feeding a real image to a trained GAN generator is to first invert it back to a latent code.
Existing inversion methods typically focus on reconstructing the target image by pixel values yet fail to land the inverted code in the semantic domain of the original latent space.
We propose an in-domain GAN inversion approach, which faithfully reconstructs the input image and ensures that the inverted code is semantically meaningful for editing.
- Score: 56.924323432048304
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Recent work has shown that a variety of semantics emerge in the latent space
of Generative Adversarial Networks (GANs) when trained to synthesize
images. However, it is difficult to use these learned semantics for real image
editing. A common practice of feeding a real image to a trained GAN generator
is to invert it back to a latent code. However, existing inversion methods
typically focus on reconstructing the target image by pixel values yet fail to
land the inverted code in the semantic domain of the original latent space. As
a result, the reconstructed image cannot well support semantic editing through
varying the inverted code. To solve this problem, we propose an in-domain GAN
inversion approach, which not only faithfully reconstructs the input image but
also ensures that the inverted code is semantically meaningful for editing. We
first learn a novel domain-guided encoder to project a given image to the
native latent space of GANs. We then propose domain-regularized optimization by
involving the encoder as a regularizer to fine-tune the code produced by the
encoder and better recover the target image. Extensive experiments suggest that
our inversion method achieves satisfactory real-image reconstruction and, more
importantly, facilitates various image editing tasks, significantly
outperforming the state of the art.
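To make the two-step pipeline concrete, below is a minimal, hypothetical PyTorch sketch of domain-regularized optimization: the domain-guided encoder E supplies an in-domain starting code, and per-image refinement keeps the code in-domain by using the encoder as a regularizer. The generator G, encoder E, perceptual feature extractor F, and all loss weights are illustrative assumptions, not the authors' released implementation.

```python
import torch

def in_domain_invert(G, E, F, x, steps=100, lr=0.01,
                     lam_percep=5e-5, lam_dom=2.0):
    """Hypothetical sketch of in-domain inversion: encoder
    initialization followed by domain-regularized optimization.
    G: pre-trained generator, E: domain-guided encoder,
    F: perceptual feature extractor (e.g., a VGG backbone),
    x: target image batch. Loss weights are illustrative."""
    # Step 1: project the image to the native latent space.
    with torch.no_grad():
        z0 = E(x)
    z = z0.clone().requires_grad_(True)
    opt = torch.optim.Adam([z], lr=lr)

    # Step 2: fine-tune the code to better recover the target,
    # while the encoder keeps it semantically in-domain.
    for _ in range(steps):
        x_rec = G(z)
        loss = torch.nn.functional.mse_loss(x_rec, x)  # pixel term
        loss = loss + lam_percep * torch.nn.functional.mse_loss(F(x_rec), F(x))  # perceptual term
        loss = loss + lam_dom * torch.nn.functional.mse_loss(E(x_rec), z)  # domain regularizer
        opt.zero_grad()
        loss.backward()
        opt.step()
    return z.detach()
```

Once a code is recovered, semantic editing amounts to moving it along a learned direction, e.g. G(z + alpha * d) for some direction d.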
Related papers
- In-Domain GAN Inversion for Faithful Reconstruction and Editability [132.68255553099834]
We propose in-domain GAN inversion, which consists of a domain-guided encoder and domain-regularized optimization to constrain the inverted code to the native latent space of the pre-trained GAN model.
We make comprehensive analyses on the effects of the encoder structure, the starting inversion point, as well as the inversion parameter space, and observe the trade-off between the reconstruction quality and the editing property.
arXiv Detail & Related papers (2023-09-25T08:42:06Z)
- Gradient Adjusting Networks for Domain Inversion [82.72289618025084]
StyleGAN2 was demonstrated to be a powerful image generation engine that supports semantic editing.
We present a per-image optimization method that tunes the StyleGAN2 generator by making a local edit to the generator's weights.
Our experiments show a sizable gap in performance over the current state of the art in this very active domain.
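For intuition, a hedged sketch of per-image generator tuning in this spirit (an assumed simplification, not the Gradient Adjusting Networks architecture): hold a previously inverted code fixed and briefly fine-tune the generator's weights so the reconstruction of that one image improves.

```python
import torch

def tune_generator_per_image(G, z, x, steps=50, lr=1e-4):
    """Illustrative per-image fine-tuning: with the inverted code z
    fixed, locally adjust G's weights toward the target image x.
    The real method's edit is more structured than plain Adam steps."""
    opt = torch.optim.Adam(G.parameters(), lr=lr)
    for _ in range(steps):
        loss = torch.nn.functional.l1_loss(G(z), x)
        opt.zero_grad()
        loss.backward()
        opt.step()
    return G
```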
arXiv Detail & Related papers (2023-02-22T14:47:57Z)
- High-Fidelity Image Inpainting with GAN Inversion [23.49170140410603]
In this paper, we propose a novel GAN inversion model for image inpainting, dubbed InvertFill.
Within the encoder, the pre-modulation network leverages multi-scale structures to encode more discriminative semantics into style vectors.
To reconstruct faithful and photorealistic images, a simple yet effective Soft-update Mean Latent module is designed to capture more diverse in-domain patterns that synthesize high-fidelity textures for large corruptions.
arXiv Detail & Related papers (2022-08-25T03:39:24Z)
- FlexIT: Towards Flexible Semantic Image Translation [59.09398209706869]
We propose FlexIT, a novel method which can take any input image and a user-defined text instruction for editing.
First, FlexIT combines the input image and text into a single target point in the CLIP multimodal embedding space.
We iteratively transform the input image toward the target point, ensuring coherence and quality with a variety of novel regularization terms.
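As a rough illustration of that loop (using OpenAI's CLIP package; the blend weight, regularizer, and direct pixel-space optimization are simplifications of FlexIT, which operates in an autoencoder latent space):

```python
import torch
import clip  # OpenAI CLIP: pip install git+https://github.com/openai/CLIP.git

model, _ = clip.load("ViT-B/32", device="cpu")  # fp32 on CPU for simplicity

def clip_target_edit(x, text, steps=100, lr=0.05, alpha=0.7, lam_reg=10.0):
    """Move an image toward a mixed image+text target in CLIP space.
    x: image batch preprocessed to CLIP's input resolution, on the
    same device as the model. alpha and lam_reg are illustrative."""
    with torch.no_grad():
        e_img = model.encode_image(x).float()
        e_txt = model.encode_text(clip.tokenize([text]).to(x.device)).float()
        # Single target point combining image content and text instruction.
        target = (1 - alpha) * e_img + alpha * e_txt
        target = target / target.norm(dim=-1, keepdim=True)

    x_edit = x.clone().requires_grad_(True)
    opt = torch.optim.Adam([x_edit], lr=lr)
    for _ in range(steps):
        e = model.encode_image(x_edit).float()
        e = e / e.norm(dim=-1, keepdim=True)
        loss = 1 - (e * target).sum(dim=-1).mean()  # cosine distance to target
        loss = loss + lam_reg * torch.nn.functional.mse_loss(x_edit, x)  # stay near input
        opt.zero_grad()
        loss.backward()
        opt.step()
    return x_edit.detach()
```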
arXiv Detail & Related papers (2022-03-09T13:34:38Z)
- GAN Inversion for Out-of-Range Images with Geometric Transformations [22.914126221037222]
We propose BDInvert, a novel GAN inversion approach to semantic editing of out-of-range images.
Our experiments show that BDInvert effectively supports semantic editing of out-of-range images with geometric transformations.
arXiv Detail & Related papers (2021-08-20T04:38:40Z)
- Force-in-domain GAN inversion [0.0]
Various semantics emerge in the latent space of Generative Adversarial Networks (GANs) when they are trained to generate images.
An in-domain GAN inversion approach was recently proposed to constrain the inverted code to lie within the latent space.
We propose a force-in-domain GAN based on the in-domain GAN, which uses a discriminator to force the inverted code into the latent space.
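A hedged reading of that objective (assuming the discriminator scores latent codes, with the actual setup deferred to the paper): fold a generator-side adversarial term on the code into the usual reconstruction loss.

```python
import torch

def force_in_domain_loss(G, E, D_z, x, lam_adv=0.1):
    """Illustrative loss: reconstruction plus an adversarial term from
    a code discriminator D_z (assumed to return real-code logits),
    pushing inverted codes toward the native latent distribution."""
    z = E(x)
    rec = torch.nn.functional.mse_loss(G(z), x)
    adv = torch.nn.functional.softplus(-D_z(z)).mean()  # non-saturating GAN loss
    return rec + lam_adv * adv
```

D_z itself would be trained alternately to separate codes sampled from the prior from inverted ones, as in standard GAN training.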
arXiv Detail & Related papers (2021-07-13T13:03:53Z)
- Designing an Encoder for StyleGAN Image Manipulation [38.909059126878354]
We study the latent space of StyleGAN, the state-of-the-art unconditional generator.
We identify and analyze the existence of a distortion-editability tradeoff and a distortion-perception tradeoff within the StyleGAN latent space.
We present an encoder based on our two principles that is specifically designed for facilitating editing on real images.
arXiv Detail & Related papers (2021-02-04T17:52:38Z)
- Image-to-image Mapping with Many Domains by Sparse Attribute Transfer [71.28847881318013]
Unsupervised image-to-image translation consists of learning a pair of mappings between two domains without known pairwise correspondences between points.
Current convention is to approach this task with cycle-consistent GANs.
We propose an alternate approach that directly restricts the generator to performing a simple sparse transformation in a latent layer.
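Read literally, that restriction can be sketched as a learnable, L1-penalized additive shift in a latent layer (an illustrative reading, not the paper's exact formulation):

```python
import torch

class SparseShift(torch.nn.Module):
    """Illustrative sparse transformation for a latent layer: an
    additive shift whose L1 penalty encourages only a few latent
    coordinates to change between the two domains."""
    def __init__(self, dim):
        super().__init__()
        self.shift = torch.nn.Parameter(torch.zeros(dim))

    def forward(self, h):
        return h + self.shift

    def sparsity_penalty(self):
        return self.shift.abs().mean()
```

During training, the penalty would be added to the translation objective so the mapping between domains stays sparse.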
arXiv Detail & Related papers (2020-06-23T19:52:23Z)
- Exploiting Deep Generative Prior for Versatile Image Restoration and Manipulation [181.08127307338654]
This work presents an effective way to exploit the image prior captured by a generative adversarial network (GAN) trained on large-scale natural images.
The deep generative prior (DGP) provides compelling results for restoring the missing semantics (e.g., color, patches, resolution) of various degraded images.
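The general recipe behind such GAN-prior restoration can be sketched as follows (a simplified stand-in for DGP, which fine-tunes the generator progressively rather than all at once; the degradation operator and the G.z_dim attribute are assumptions):

```python
import torch

def gan_prior_restore(G, y, degrade, steps=200, lr=0.01, tune_generator=True):
    """Hedged sketch of restoration with a generative prior.
    y: degraded observation; degrade: differentiable degradation
    operator (e.g., a grayscale projection for colorization).
    G.z_dim is an assumed attribute giving the latent dimension."""
    z = torch.randn(y.shape[0], G.z_dim, device=y.device, requires_grad=True)
    params = [z] + (list(G.parameters()) if tune_generator else [])
    opt = torch.optim.Adam(params, lr=lr)
    for _ in range(steps):
        # Match the degraded reconstruction to the observation; the
        # generator's prior fills in the missing content (e.g., color).
        loss = torch.nn.functional.mse_loss(degrade(G(z)), y)
        opt.zero_grad()
        loss.backward()
        opt.step()
    return G(z).detach()
```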
arXiv Detail & Related papers (2020-03-30T17:45:07Z)