Cycle Encoding of a StyleGAN Encoder for Improved Reconstruction and
Editability
- URL: http://arxiv.org/abs/2207.09367v1
- Date: Tue, 19 Jul 2022 16:10:16 GMT
- Title: Cycle Encoding of a StyleGAN Encoder for Improved Reconstruction and
Editability
- Authors: Xudong Mao, Liujuan Cao, Aurele T. Gnanha, Zhenguo Yang, Qing Li,
Rongrong Ji
- Abstract summary: GAN inversion aims to invert an input image into the latent space of a pre-trained GAN.
Despite the recent advances in GAN inversion, there remain challenges to mitigate the tradeoff between distortion and editability.
The recently proposed pivotal tuning model uses a two-step approach that first inverts the input image into a latent code, called pivot code, and then alters the generator so that the input image can be accurately mapped into the pivot code.
- Score: 76.6724135757723
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: GAN inversion aims to invert an input image into the latent space of a
pre-trained GAN. Despite the recent advances in GAN inversion, there remain
challenges to mitigate the tradeoff between distortion and editability, i.e.
reconstructing the input image accurately and editing the inverted image with a
small visual quality drop. The recently proposed pivotal tuning model makes
significant progress towards reconstruction and editability, by using a
two-step approach that first inverts the input image into a latent code, called
pivot code, and then alters the generator so that the input image can be
accurately mapped into the pivot code. Here, we show that both reconstruction
and editability can be improved by a proper design of the pivot code. We
present a simple yet effective method, named cycle encoding, for a high-quality
pivot code. The key idea of our method is to progressively train an encoder in
varying spaces according to a cycle scheme: W->W+->W. This training methodology
preserves the properties of both W and W+ spaces, i.e. high editability of W
and low distortion of W+. To further decrease the distortion, we also propose
to refine the pivot code with an optimization-based method, where a
regularization term is introduced to reduce the degradation in editability.
Qualitative and quantitative comparisons to several state-of-the-art methods
demonstrate the superiority of our approach.
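The refinement step described in the abstract can be illustrated with a small toy sketch. Everything in it is an assumption made for illustration, not the authors' implementation: the generator is a random linear map `A` rather than StyleGAN, the regularizer is a plain L2 penalty that pulls the refined code back toward the initial pivot code (the abstract does not state the form of its regularization term), and the helper names `to_w_plus`, `layer_spread`, `recon_error` and `refine_pivot` are hypothetical. It shows only the general idea: a W code is one vector shared by all layers, a W+ code has one vector per layer, and a regularized optimization trades reconstruction error against drift away from the starting code.

```python
import numpy as np

def to_w_plus(w, num_layers):
    """Repeat one W code of shape (dim,) for every layer: shape (num_layers, dim)."""
    return np.tile(w, (num_layers, 1))

def layer_spread(w_plus):
    """Mean squared deviation of the per-layer codes from their average.
    Zero (up to rounding) when all layers share one code, i.e. the code lies in W."""
    return float(np.mean((w_plus - w_plus.mean(axis=0, keepdims=True)) ** 2))

def recon_error(w_plus, target, A):
    """Squared error of the toy linear generator G(w+) = flatten(w+) @ A."""
    return float(np.sum((w_plus.ravel() @ A - target) ** 2))

def refine_pivot(w0, target, A, lam, steps=200):
    """Gradient descent on ||G(w) - target||^2 + lam * ||w - w0||^2, starting at w0."""
    # Step size 1/L, where L bounds the curvature of this quadratic objective.
    lr = 1.0 / (2.0 * (np.linalg.norm(A, 2) ** 2 + lam))
    w = w0.copy()
    for _ in range(steps):
        residual = w.ravel() @ A - target            # toy reconstruction residual
        grad = 2.0 * (A @ residual).reshape(w.shape)  # gradient of the data term
        grad += 2.0 * lam * (w - w0)                  # gradient of the L2 regularizer
        w -= lr * grad
    return w

# Toy sizes (assumed): 4 layers, 8-dim codes, 16 "pixels".
rng = np.random.default_rng(0)
num_layers, dim, pixels = 4, 8, 16
A = rng.normal(size=(num_layers * dim, pixels))    # stand-in for a generator
target = rng.normal(size=pixels)                   # stand-in for the input image
w0 = to_w_plus(rng.normal(size=dim), num_layers)   # initial pivot code, lies in W

weak = refine_pivot(w0, target, A, lam=0.0)     # no regularization
strong = refine_pivot(w0, target, A, lam=10.0)  # strong pull toward w0

print("error before / weak / strong:",
      recon_error(w0, target, A),
      recon_error(weak, target, A),
      recon_error(strong, target, A))
print("drift from w0, weak vs strong:",
      np.linalg.norm(weak - w0), np.linalg.norm(strong - w0))
```

In this toy setting both runs lower the reconstruction error relative to the starting code, while the larger `lam` keeps the refined code closer to `w0`. That is the kind of distortion-versus-drift tradeoff a regularized refinement is meant to control; how this maps onto editability in a real StyleGAN is beyond what the sketch can show.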
Related papers
- In-Domain GAN Inversion for Faithful Reconstruction and Editability [132.68255553099834]
We propose in-domain GAN inversion, which consists of a domain-guided encoder and a domain-regularized optimizer to regularize the inverted code in the native latent space of the pre-trained GAN model.
We make comprehensive analyses on the effects of the encoder structure, the starting inversion point, as well as the inversion parameter space, and observe the trade-off between the reconstruction quality and the editing property.
arXiv Detail & Related papers (2023-09-25T08:42:06Z)
- LSAP: Rethinking Inversion Fidelity, Perception and Editability in GAN
Latent Space [42.56147568941768]
We introduce Normalized Style Space and $\mathcal{S^N}$ Cosine Distance (SNCD) to measure disalignment of inversion methods.
Since our proposed SNCD is differentiable, it can be optimized in both encoder-based and optimization-based embedding methods to conduct a uniform solution.
arXiv Detail & Related papers (2022-09-26T14:55:21Z)
- Overparameterization Improves StyleGAN Inversion [66.8300251627992]
Existing inversion approaches obtain promising yet imperfect results.
We show that overparameterizing the latent space allows us to obtain near-perfect image reconstruction without the need for encoders.
Our approach also retains editability, which we demonstrate by realistically interpolating between images.
arXiv Detail & Related papers (2022-05-12T18:42:43Z)
- High-Fidelity GAN Inversion for Image Attribute Editing [61.966946442222735]
We present a novel high-fidelity generative adversarial network (GAN) inversion framework that enables attribute editing with image-specific details well-preserved.
With a low bit-rate latent code, previous works have difficulties in preserving high-fidelity details in reconstructed and edited images.
We propose a distortion consultation approach that employs a distortion map as a reference for high-fidelity reconstruction.
arXiv Detail & Related papers (2021-09-14T11:23:48Z)
- ReStyle: A Residual-Based StyleGAN Encoder via Iterative Refinement [46.48263482909809]
We present a novel inversion scheme that extends current encoder-based inversion methods by introducing an iterative refinement mechanism.
Our residual-based encoder, named ReStyle, attains improved accuracy compared to current state-of-the-art encoder-based methods with a negligible increase in inference time.
arXiv Detail & Related papers (2021-04-06T17:47:13Z)
- In-Domain GAN Inversion for Real Image Editing [56.924323432048304]
A common practice of feeding a real image to a trained GAN generator is to invert it back to a latent code.
Existing inversion methods typically focus on reconstructing the target image by pixel values yet fail to land the inverted code in the semantic domain of the original latent space.
We propose an in-domain GAN inversion approach, which faithfully reconstructs the input image and ensures the inverted code to be semantically meaningful for editing.
arXiv Detail & Related papers (2020-03-31T18:20:18Z)