Balancing Reconstruction and Editing Quality of GAN Inversion for Real Image Editing with StyleGAN Prior Latent Space
- URL: http://arxiv.org/abs/2306.00241v1
- Date: Wed, 31 May 2023 23:27:07 GMT
- Title: Balancing Reconstruction and Editing Quality of GAN Inversion for Real Image Editing with StyleGAN Prior Latent Space
- Authors: Kai Katsumata, Duc Minh Vo, Bei Liu, Hideki Nakayama
- Abstract summary: We revisit StyleGANs' hyperspherical prior $\mathcal{Z}$ and $\mathcal{Z}^+$ and integrate them into seminal GAN inversion methods to improve editing quality.
Our extensions achieve sophisticated editing quality with the aid of the StyleGAN prior.
- Score: 27.035594402482886
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The exploration of the latent space in StyleGANs and GAN inversion exemplify
impressive real-world image editing, yet the trade-off between reconstruction
quality and editing quality remains an open problem. In this study, we revisit
StyleGANs' hyperspherical prior $\mathcal{Z}$ and $\mathcal{Z}^+$ and integrate
them into seminal GAN inversion methods to improve editing quality. Besides
faithful reconstruction, our extensions achieve sophisticated editing quality
with the aid of the StyleGAN prior. We project the real images into the
proposed space to obtain the inverted codes, by which we then move along
$\mathcal{Z}^{+}$, enabling semantic editing without sacrificing image quality.
Comprehensive experiments show that $\mathcal{Z}^{+}$ can replace the most
commonly-used $\mathcal{W}$, $\mathcal{W}^{+}$, and $\mathcal{S}$ spaces while
preserving reconstruction quality, resulting in reduced distortion of edited
images.
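The editing recipe the abstract describes, inverting a real image into the hyperspherical $\mathcal{Z}^+$ space and then moving the inverted code along a semantic direction, can be sketched roughly as below. This is a minimal illustration under stated assumptions: `ToyGenerator` stands in for a pretrained StyleGAN taking one latent code per synthesis layer, and the loss, step count, and edit `direction` are placeholders rather than the authors' configuration.

```python
import torch
import torch.nn.functional as F

NUM_LAYERS, Z_DIM = 18, 512

class ToyGenerator(torch.nn.Module):
    """Stand-in for a pretrained StyleGAN taking one latent per synthesis layer."""
    def __init__(self):
        super().__init__()
        self.proj = torch.nn.Linear(NUM_LAYERS * Z_DIM, 3 * 32 * 32)

    def forward(self, z_plus):                 # z_plus: (B, NUM_LAYERS, Z_DIM)
        return self.proj(z_plus.flatten(1)).view(-1, 3, 32, 32)

def invert(G, target, steps=200, lr=0.05):
    """Project a real image into Z+ by gradient descent on a reconstruction loss."""
    z_plus = torch.randn(1, NUM_LAYERS, Z_DIM, requires_grad=True)
    opt = torch.optim.Adam([z_plus], lr=lr)
    for _ in range(steps):
        opt.zero_grad()
        loss = F.mse_loss(G(z_plus), target)   # a perceptual term is typical too
        loss.backward()
        opt.step()
        with torch.no_grad():                  # keep every per-layer code on the
            z_plus.data = Z_DIM ** 0.5 * F.normalize(z_plus.data, dim=-1)  # sphere
    return z_plus.detach()

G = ToyGenerator()
target = torch.rand(1, 3, 32, 32)                     # the "real" image to invert
z_plus = invert(G, target)
direction = F.normalize(torch.randn(Z_DIM), dim=0)    # hypothetical semantic direction
edited = G(z_plus + 3.0 * direction)                  # shift every layer's code along Z+
```

The renormalization after each step is what keeps the codes on StyleGAN's hyperspherical prior; dropping it would reduce the sketch to ordinary unconstrained per-layer optimization.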
Related papers
- The Devil is in the Details: StyleFeatureEditor for Detail-Rich StyleGAN Inversion and High Quality Image Editing [3.58736715327935]
We introduce StyleFeatureEditor, a novel method that enables editing in both w-latents and F-latents.
We also present a new training pipeline specifically designed to train our model to accurately edit F-latents.
Our method is compared with state-of-the-art encoding approaches, demonstrating that our model excels in terms of reconstruction quality.
arXiv Detail & Related papers (2024-06-15T11:28:32Z) - In-Domain GAN Inversion for Faithful Reconstruction and Editability [132.68255553099834]
We propose in-domain GAN inversion, which consists of a domain-guided encoder and domain-regularized optimization to keep the inverted code in the native latent space of the pre-trained GAN model.
We make comprehensive analyses on the effects of the encoder structure, the starting inversion point, as well as the inversion parameter space, and observe the trade-off between the reconstruction quality and the editing property.
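A hedged sketch of the two components this summary names, with toy modules in place of the paper's networks: the domain-guided encoder supplies the starting code, and the optimization adds a regularizer that re-encodes the reconstruction so the code stays in-domain. `ToyGenerator`, `ToyEncoder`, and the weight `lam` are illustrative assumptions.

```python
import torch
import torch.nn.functional as F

Z_DIM = 512

class ToyGenerator(torch.nn.Module):
    def __init__(self):
        super().__init__()
        self.proj = torch.nn.Linear(Z_DIM, 3 * 32 * 32)
    def forward(self, z):
        return self.proj(z).view(-1, 3, 32, 32)

class ToyEncoder(torch.nn.Module):
    def __init__(self):
        super().__init__()
        self.proj = torch.nn.Linear(3 * 32 * 32, Z_DIM)
    def forward(self, x):
        return self.proj(x.flatten(1))

def in_domain_invert(G, E, target, steps=200, lr=0.01, lam=2.0):
    z = E(target).detach().requires_grad_(True)   # domain-guided starting point
    opt = torch.optim.Adam([z], lr=lr)
    for _ in range(steps):
        opt.zero_grad()
        recon = G(z)
        # Reconstruction term plus the domain regularizer: the re-encoded
        # reconstruction should agree with the current code.
        loss = F.mse_loss(recon, target) + lam * F.mse_loss(E(recon), z)
        loss.backward()
        opt.step()
    return z.detach()

G, E = ToyGenerator(), ToyEncoder()
z = in_domain_invert(G, E, torch.rand(1, 3, 32, 32))
```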
arXiv Detail & Related papers (2023-09-25T08:42:06Z) - Revisiting Latent Space of GAN Inversion for Real Image Editing [27.035594402482886]
In this study, we revisit StyleGANs' hyperspherical prior $\mathcal{Z}$ and combine it with highly capable latent spaces to build combined spaces that faithfully invert real images.
We show that $\mathcal{Z}^+$ can replace the most commonly-used $\mathcal{W}$, $\mathcal{W}^+$, and $\mathcal{S}$ spaces while preserving reconstruction quality, resulting in reduced distortion of edited images.
arXiv Detail & Related papers (2023-07-18T06:27:44Z) - Designing a Better Asymmetric VQGAN for StableDiffusion [73.21783102003398]
A revolutionary text-to-image generator, StableDiffusion, learns a diffusion model in the latent space via a VQGAN.
We propose a new asymmetric VQGAN with two simple designs.
It can be widely used in StableDiffusion-based inpainting and local editing methods.
arXiv Detail & Related papers (2023-06-07T17:56:02Z) - Make It So: Steering StyleGAN for Any Image Inversion and Editing [16.337519991964367]
StyleGAN's disentangled style representation enables powerful image editing by manipulating the latent variables.
Existing GAN inversion methods struggle to maintain editing directions and produce realistic results.
We propose Make It So, a novel GAN inversion method that operates in the $\mathcal{Z}$ (noise) space rather than the typical $\mathcal{W}$ (latent style) space.
arXiv Detail & Related papers (2023-04-27T17:59:24Z) - StyleRes: Transforming the Residuals for Real Image Editing with StyleGAN [4.7590051176368915]
Inverting real images into StyleGAN's latent space is an extensively studied problem.
The trade-off between image reconstruction fidelity and image editing quality remains an open challenge.
We present a novel image inversion framework and a training pipeline to achieve high-fidelity image inversion with high-quality editing.
arXiv Detail & Related papers (2022-12-29T16:14:09Z) - Delving StyleGAN Inversion for Image Editing: A Foundation Latent Space Viewpoint [76.00222741383375]
GAN inversion and editing via StyleGAN maps an input image into the embedding spaces ($\mathcal{W}$, $\mathcal{W}^+$, and $\mathcal{F}$) to simultaneously maintain image fidelity and meaningful manipulation.
Recent GAN inversion methods typically explore $\mathcal{W}^+$ and $\mathcal{F}$ rather than $\mathcal{W}$ to improve reconstruction fidelity while maintaining editability.
We introduce contrastive learning to align $\mathcal{W}$ and the image space for precise latent code discovery.
arXiv Detail & Related papers (2022-11-21T13:35:32Z) - Overparameterization Improves StyleGAN Inversion [66.8300251627992]
Existing inversion approaches obtain promising yet imperfect results.
We show that overparameterizing the latent space allows us to obtain near-perfect image reconstruction without the need for encoders.
Our approach also retains editability, which we demonstrate by realistically interpolating between images.
arXiv Detail & Related papers (2022-05-12T18:42:43Z) - High-Fidelity GAN Inversion for Image Attribute Editing [61.966946442222735]
We present a novel high-fidelity generative adversarial network (GAN) inversion framework that enables attribute editing with image-specific details well-preserved.
With a low bit-rate latent code, previous works have difficulties in preserving high-fidelity details in reconstructed and edited images.
We propose a distortion consultation approach that employs a distortion map as a reference for high-fidelity reconstruction.
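As a rough illustration of the consultation idea (not the paper's implementation): the residual between the input and a coarse reconstruction serves as the distortion map, and a small fusion module consults it to restore detail. Every function below is a toy stand-in, and the single-convolution fusion is an assumption.

```python
import torch

torch.manual_seed(0)
W_e = torch.randn(3 * 32 * 32, 512) / 512 ** 0.5    # toy encoder weights
W_g = torch.randn(512, 3 * 32 * 32) / 512 ** 0.5    # toy generator weights

def E(x):   # toy low-bit-rate encoder: image -> latent code
    return x.flatten(1) @ W_e

def G(z):   # toy generator: latent code -> image
    return (z @ W_g).view(-1, 3, 32, 32)

fusion = torch.nn.Conv2d(6, 3, kernel_size=3, padding=1)  # consultation branch

def reconstruct(x):
    coarse = G(E(x))             # coarse pass: fine detail is lost in the code
    distortion = x - coarse      # distortion map: what the latent failed to keep
    # Consult the distortion map alongside the coarse image to restore detail.
    return coarse + fusion(torch.cat([coarse, distortion], dim=1))

x = torch.rand(1, 3, 32, 32)
print(reconstruct(x).shape)      # torch.Size([1, 3, 32, 32])
```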
arXiv Detail & Related papers (2021-09-14T11:23:48Z) - Transforming the Latent Space of StyleGAN for Real Face Editing [35.93066942205814]
We propose to expand the latent space by replacing the fully-connected layers in StyleGAN's mapping network with attention-based transformers.
This simple and effective technique integrates the $W$ and $W^+$ spaces and transforms them into one new latent space called $W^{++}$.
Our modified StyleGAN maintains the state-of-the-art generation quality of the original StyleGAN with moderately better diversity.
But more importantly, the proposed $W^{++}$ space achieves superior performance in both reconstruction quality and editing quality.
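One plausible reading of this design, sketched with placeholder choices (the token scheme, depth, and head count are guesses, not the paper's architecture): route $z$ through a transformer encoder that emits one code per synthesis layer instead of the fully-connected mapping network.

```python
import torch

Z_DIM, NUM_LAYERS = 512, 18

class AttentionMapping(torch.nn.Module):
    """Attention-based replacement for StyleGAN's fully-connected mapping net."""
    def __init__(self):
        super().__init__()
        layer = torch.nn.TransformerEncoderLayer(
            d_model=Z_DIM, nhead=8, batch_first=True)
        self.encoder = torch.nn.TransformerEncoder(layer, num_layers=2)
        # One learned token per synthesis layer (a hypothetical design choice).
        self.tokens = torch.nn.Parameter(torch.randn(NUM_LAYERS, Z_DIM))

    def forward(self, z):                                 # z: (B, Z_DIM)
        seq = self.tokens.unsqueeze(0) + z.unsqueeze(1)   # (B, NUM_LAYERS, Z_DIM)
        return self.encoder(seq)                          # one code per layer

w_pp = AttentionMapping()(torch.randn(2, Z_DIM))
print(w_pp.shape)   # torch.Size([2, 18, 512])
```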
arXiv Detail & Related papers (2021-05-29T06:42:23Z) - In-Domain GAN Inversion for Real Image Editing [56.924323432048304]
To feed a real image to a trained GAN generator, a common practice is to invert it back to a latent code.
Existing inversion methods typically focus on reconstructing the target image by pixel values yet fail to land the inverted code in the semantic domain of the original latent space.
We propose an in-domain GAN inversion approach, which faithfully reconstructs the input image and ensures the inverted code to be semantically meaningful for editing.
arXiv Detail & Related papers (2020-03-31T18:20:18Z)