In-Domain GAN Inversion for Faithful Reconstruction and Editability
- URL: http://arxiv.org/abs/2309.13956v1
- Date: Mon, 25 Sep 2023 08:42:06 GMT
- Title: In-Domain GAN Inversion for Faithful Reconstruction and Editability
- Authors: Jiapeng Zhu, Yujun Shen, Yinghao Xu, Deli Zhao, Qifeng Chen, Bolei Zhou
- Abstract summary: We propose in-domain GAN inversion, which consists of a domain-guided encoder and a domain-regularized optimizer, to regularize the inverted code in the native latent space of the pre-trained GAN model.
We make comprehensive analyses on the effects of the encoder structure, the starting inversion point, as well as the inversion parameter space, and observe the trade-off between the reconstruction quality and the editing property.
- Score: 132.68255553099834
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Generative Adversarial Networks (GANs) have significantly advanced image
synthesis through mapping randomly sampled latent codes to high-fidelity
synthesized images. However, applying well-trained GANs to real image editing
remains challenging. A common solution is to find an approximate latent code
that can adequately recover the input image to edit, which is also known as GAN
inversion. To invert a GAN model, prior works typically focus on reconstructing
the target image at the pixel level, yet few studies are conducted on whether
the inverted result can well support manipulation at the semantic level. This
work fills in this gap by proposing in-domain GAN inversion, which consists of
a domain-guided encoder and a domain-regularized optimizer, to regularize the
inverted code in the native latent space of the pre-trained GAN model. In this
way, we manage to sufficiently reuse the knowledge learned by GANs for image
reconstruction, facilitating a wide range of editing applications without any
retraining. We further make comprehensive analyses on the effects of the
encoder structure, the starting inversion point, as well as the inversion
parameter space, and observe the trade-off between the reconstruction quality
and the editing property. Such a trade-off sheds light on how a GAN model
represents an image with various semantics encoded in the learned latent
distribution. Code, models, and demo are available at the project page:
https://genforce.github.io/idinvert/.
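
As a rough illustration of the domain-regularized optimizer mentioned in the abstract, the sketch below refines a latent code so that the generated image matches the target while the code stays close to what the encoder recovers from that generated image. The abstract does not state the objective, so the form used here (squared pixel error plus a weighted squared code-consistency term) is an assumption, and any perceptual loss is omitted. The linear `generate` and `encode` functions, the weight `lam`, and the helper `invert` are hypothetical toy stand-ins, not the authors' networks or code.

```python
import numpy as np

rng = np.random.default_rng(0)
LATENT_DIM, IMAGE_DIM = 3, 8

# Toy linear stand-ins (assumptions): a real setup would use a pre-trained
# GAN generator and a learned domain-guided encoder.
A = rng.normal(size=(IMAGE_DIM, LATENT_DIM))  # "generator" weights
B = np.linalg.pinv(A) + 0.05 * rng.normal(size=(LATENT_DIM, IMAGE_DIM))  # imperfect "encoder"


def generate(z):
    """Map a latent code to a (flattened) toy image."""
    return A @ z


def encode(x):
    """Map a toy image back to a latent code."""
    return B @ x


def loss(z, x, lam):
    """Assumed objective: pixel error plus weighted code-consistency error."""
    pixel = np.sum((x - generate(z)) ** 2)
    domain = np.sum((z - encode(generate(z))) ** 2)
    return pixel + lam * domain


def invert(x, lam=2.0, steps=200):
    """Gradient descent on the assumed objective, starting from encode(x)."""
    M = np.eye(LATENT_DIM) - B @ A  # z - encode(generate(z)) == M @ z here
    hessian = 2.0 * (A.T @ A + lam * M.T @ M)
    # Step size 1/L is safe for this quadratic toy; real networks would
    # need a tuned optimizer instead.
    lr = 1.0 / np.linalg.eigvalsh(hessian).max()
    z = encode(x)
    for _ in range(steps):
        grad = 2.0 * A.T @ (generate(z) - x) + 2.0 * lam * M.T @ (M @ z)
        z = z - lr * grad
    return z


# A noisy target image that the toy generator cannot reproduce exactly.
x_target = generate(rng.normal(size=LATENT_DIM)) + 0.1 * rng.normal(size=IMAGE_DIM)
z_init = encode(x_target)
z_inv = invert(x_target)
print("loss at encoder output:", loss(z_init, x_target, 2.0))
print("loss after optimization:", loss(z_inv, x_target, 2.0))
```

In this toy setting, raising `lam` pulls the result toward codes the encoder agrees with at some cost in pixel error, which is one simple way to picture the trade-off between reconstruction quality and editability that the abstract describes.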
Related papers
- High-Fidelity Image Inpainting with GAN Inversion [23.49170140410603]
In this paper, we propose a novel GAN inversion model for image inpainting, dubbed InvertFill.
Within the encoder, the pre-modulation network leverages multi-scale structures to encode more discriminative semantics into style vectors.
To reconstruct faithful and photorealistic images, a simple yet effective Soft-update Mean Latent module is designed to capture more diverse in-domain patterns that synthesize high-fidelity textures for large corruptions.
arXiv Detail & Related papers (2022-08-25T03:39:24Z)
- Cycle Encoding of a StyleGAN Encoder for Improved Reconstruction and Editability [76.6724135757723]
GAN inversion aims to invert an input image into the latent space of a pre-trained GAN.
Despite the recent advances in GAN inversion, there remain challenges to mitigate the tradeoff between distortion and editability.
We propose a two-step approach that first inverts the input image into a latent code, called pivot code, and then alters the generator so that the input image can be accurately mapped into the pivot code.
arXiv Detail & Related papers (2022-07-19T16:10:16Z)
- Editing Out-of-domain GAN Inversion via Differential Activations [56.62964029959131]
We propose a novel GAN prior based editing framework to tackle the out-of-domain inversion problem with a composition-decomposition paradigm.
With the aid of the generated Diff-CAM mask, a coarse reconstruction can intuitively be composited by the paired original and edited images.
In the decomposition phase, we further present a GAN prior based deghosting network for separating the final fine edited image from the coarse reconstruction.
arXiv Detail & Related papers (2022-07-17T10:34:58Z)
- High-Fidelity GAN Inversion for Image Attribute Editing [61.966946442222735]
We present a novel high-fidelity generative adversarial network (GAN) inversion framework that enables attribute editing with image-specific details well-preserved.
With a low bit-rate latent code, previous works have difficulties in preserving high-fidelity details in reconstructed and edited images.
We propose a distortion consultation approach that employs a distortion map as a reference for high-fidelity reconstruction.
arXiv Detail & Related papers (2021-09-14T11:23:48Z)
- GAN Inversion for Out-of-Range Images with Geometric Transformations [22.914126221037222]
We propose BDInvert, a novel GAN inversion approach to semantic editing of out-of-range images.
Our experiments show that BDInvert effectively supports semantic editing of out-of-range images with geometric transformations.
arXiv Detail & Related papers (2021-08-20T04:38:40Z)
- Force-in-domain GAN inversion [0.0]
Various semantics emerge in the latent space of Generative Adversarial Networks (GANs) when being trained to generate images.
An in-domain GAN inversion approach was recently proposed to constrain the inverted code within the latent space.
We propose a force-in-domain GAN based on the in-domain GAN, which utilizes a discriminator to force the inverted code within the latent space.
arXiv Detail & Related papers (2021-07-13T13:03:53Z)
- GAN Inversion: A Survey [125.62848237531945]
GAN inversion aims to invert a given image back into the latent space of a pretrained GAN model.
GAN inversion plays an essential role in enabling the pretrained GAN models such as StyleGAN and BigGAN to be used for real image editing applications.
arXiv Detail & Related papers (2021-01-14T14:11:00Z)
- In-Domain GAN Inversion for Real Image Editing [56.924323432048304]
A common practice of feeding a real image to a trained GAN generator is to invert it back to a latent code.
Existing inversion methods typically focus on reconstructing the target image by pixel values yet fail to land the inverted code in the semantic domain of the original latent space.
We propose an in-domain GAN inversion approach, which faithfully reconstructs the input image and ensures the inverted code to be semantically meaningful for editing.
arXiv Detail & Related papers (2020-03-31T18:20:18Z)
- Toward a Controllable Disentanglement Network [22.968760397814993]
This paper addresses two crucial problems of learning disentangled image representations, namely controlling the degree of disentanglement during image editing, and balancing the disentanglement strength and the reconstruction quality.
By exploring the real-valued space of the soft target representation, we are able to synthesize novel images with the designated properties.
arXiv Detail & Related papers (2020-01-22T16:54:07Z)
This list is automatically generated from the titles and abstracts of the papers in this site.