ReStyle: A Residual-Based StyleGAN Encoder via Iterative Refinement
- URL: http://arxiv.org/abs/2104.02699v1
- Date: Tue, 6 Apr 2021 17:47:13 GMT
- Title: ReStyle: A Residual-Based StyleGAN Encoder via Iterative Refinement
- Authors: Yuval Alaluf, Or Patashnik, Daniel Cohen-Or
- Abstract summary: We present a novel inversion scheme that extends current encoder-based inversion methods by introducing an iterative refinement mechanism.
Our residual-based encoder, named ReStyle, attains improved accuracy compared to current state-of-the-art encoder-based methods with a negligible increase in inference time.
- Score: 46.48263482909809
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Recently, the power of unconditional image synthesis has significantly
advanced through the use of Generative Adversarial Networks (GANs). The task of
inverting an image into its corresponding latent code of the trained GAN is of
utmost importance as it allows for the manipulation of real images, leveraging
the rich semantics learned by the network. Recognizing the limitations of
current inversion approaches, in this work we present a novel inversion scheme
that extends current encoder-based inversion methods by introducing an
iterative refinement mechanism. Instead of directly predicting the latent code
of a given real image using a single pass, the encoder is tasked with
predicting a residual with respect to the current estimate of the inverted
latent code in a self-correcting manner. Our residual-based encoder, named
ReStyle, attains improved accuracy compared to current state-of-the-art
encoder-based methods with a negligible increase in inference time. We analyze
the behavior of ReStyle to gain valuable insights into its iterative nature. We
then evaluate the performance of our residual encoder and analyze its
robustness compared to optimization-based inversion and state-of-the-art
encoders.
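A minimal sketch of the iterative residual scheme described above, written in PyTorch; the `encoder` and `generator` modules, the channel-wise concatenation of the image with its current reconstruction, and the average-latent starting point are illustrative assumptions, not the released ReStyle implementation.

```python
import torch

@torch.no_grad()
def restyle_invert(x, encoder, generator, w_avg, n_iters=5):
    """Iterative residual inversion: each step predicts a correction to the
    current latent estimate from the input image and its current
    reconstruction. All modules here are placeholders for illustration."""
    w = w_avg.clone()                              # start from the average latent code
    y = generator(w)                               # initial reconstruction
    for _ in range(n_iters):
        delta = encoder(torch.cat([x, y], dim=1))  # residual w.r.t. the current estimate
        w = w + delta                              # self-correcting update of the latent
        y = generator(w)                           # re-synthesize with the refined code
    return w, y
```

In this form, inversion costs a handful of feed-forward passes rather than a per-image optimization, which matches the abstract's claim of a negligible increase in inference time over single-pass encoders.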
Related papers
- InstantIR: Blind Image Restoration with Instant Generative Reference [10.703499573064537]
We introduce Instant-reference Image Restoration (InstantIR), a novel diffusion-based BIR method.
We first extract a compact representation of the input via a pre-trained vision encoder.
At each generation step, this representation is used to decode the current diffusion latent and instantiate it in the generative prior.
The degraded image is then encoded with this reference, providing a robust generation condition.
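Read as a recipe, that summary suggests a per-step loop in which the current diffusion latent is decoded into a reference that then conditions the next denoising update. A very loose sketch, where every callable is a hypothetical stand-in rather than the InstantIR API:

```python
import torch

@torch.no_grad()
def restore_with_instant_reference(x_degraded, vision_encoder, decode_reference,
                                   denoise, z_init, timesteps):
    """Loose sketch of per-step 'instant reference' conditioning; all callables
    are placeholders, not the InstantIR implementation."""
    rep = vision_encoder(x_degraded)              # compact representation of the degraded input
    z = z_init                                    # current diffusion latent
    for t in timesteps:
        reference = decode_reference(z, rep)      # instantiate the latent in the generative prior
        z = denoise(z, t, x_degraded, reference)  # next update, conditioned on image + reference
    return z
```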
arXiv Detail & Related papers (2024-10-09T05:15:29Z)
- ε-VAE: Denoising as Visual Decoding [61.29255979767292]
In generative modeling, tokenization simplifies complex data into compact, structured representations, creating a more efficient, learnable space.
Current visual tokenization methods rely on a traditional autoencoder framework, where the encoder compresses data into latent representations, and the decoder reconstructs the original input.
We propose denoising as decoding, shifting from single-step reconstruction to iterative refinement. Specifically, we replace the decoder with a diffusion process that iteratively refines noise to recover the original image, guided by the latents provided by the encoder.
We evaluate our approach by assessing both reconstruction (rFID) and generation quality.
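In the spirit of that summary, decoding becomes a small denoising loop conditioned on the encoder's latents; the sketch below uses a placeholder `denoiser` and schedule and is not the ε-VAE reference code.

```python
import torch

@torch.no_grad()
def denoising_decode(z_latent, denoiser, timesteps, img_shape):
    """'Denoising as decoding': refine pure noise into an image over several
    steps, guided by the encoder's latent (hypothetical modules)."""
    x = torch.randn(img_shape)             # start from noise instead of a single decoder pass
    for t in timesteps:                    # e.g. a reversed diffusion schedule
        x = denoiser(x, t, cond=z_latent)  # one refinement step guided by the latent
    return x
```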
arXiv Detail & Related papers (2024-10-05T08:27:53Z)
- In-Domain GAN Inversion for Faithful Reconstruction and Editability [132.68255553099834]
We propose in-domain GAN inversion, which consists of a domain-guided encoder and domain-regularized optimization to regularize the inverted code within the native latent space of the pre-trained GAN model.
We comprehensively analyze the effects of the encoder structure, the starting inversion point, and the inversion parameter space, and observe the trade-off between reconstruction quality and editability.
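A minimal sketch of that two-part idea, assuming a placeholder encoder/generator pair and an MSE stand-in for the paper's full losses: the encoder supplies the starting point, and an encoder-based term keeps the optimized code in-domain during refinement.

```python
import torch
import torch.nn.functional as F

def in_domain_invert(x, encoder, generator, n_steps=200, lr=0.01, lam=2.0):
    """Encoder-initialized, encoder-regularized latent optimization
    (placeholder modules and loss weights, not the authors' implementation)."""
    with torch.no_grad():
        z = encoder(x)                      # domain-guided encoder gives the starting point
    z = z.clone().requires_grad_(True)
    opt = torch.optim.Adam([z], lr=lr)
    for _ in range(n_steps):
        y = generator(z)
        rec = F.mse_loss(y, x)              # reconstruction term (perceptual term omitted)
        dom = F.mse_loss(encoder(y), z)     # keep the code consistent with the encoder's domain
        loss = rec + lam * dom
        opt.zero_grad()
        loss.backward()
        opt.step()
    return z.detach()
```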
arXiv Detail & Related papers (2023-09-25T08:42:06Z)
- Cycle Encoding of a StyleGAN Encoder for Improved Reconstruction and Editability [76.6724135757723]
GAN inversion aims to invert an input image into the latent space of a pre-trained GAN.
Despite the recent advances in GAN inversion, there remain challenges to mitigate the tradeoff between distortion and editability.
We propose a two-step approach that first inverts the input image into a latent code, called the pivot code, and then alters the generator so that the input image can be accurately reconstructed from the pivot code.
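A condensed sketch of that two-step procedure, with placeholder modules and a plain reconstruction loss standing in for the paper's actual objectives and training details:

```python
import torch
import torch.nn.functional as F

def pivot_tune(x, encoder, generator, n_steps=300, lr=3e-4):
    """Two-step idea: (1) obtain a pivot code with a fixed encoder,
    (2) fine-tune the generator so the pivot code reproduces the input
    (hypothetical modules, not the paper's exact setup)."""
    with torch.no_grad():
        w_pivot = encoder(x)                  # step 1: invert to the pivot code
    opt = torch.optim.Adam(generator.parameters(), lr=lr)
    for _ in range(n_steps):                  # step 2: adapt the generator around the pivot
        y = generator(w_pivot)
        loss = F.mse_loss(y, x)               # reconstruction objective (perceptual terms omitted)
        opt.zero_grad()
        loss.backward()
        opt.step()
    return w_pivot, generator
```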
arXiv Detail & Related papers (2022-07-19T16:10:16Z)
- Feature-Style Encoder for Style-Based GAN Inversion [1.9116784879310027]
We propose a novel architecture for GAN inversion, which we call Feature-Style encoder.
Our model achieves accurate inversion of real images from the latent space of a pre-trained style-based GAN model.
Thanks to its encoder structure, the model allows fast and accurate image editing.
arXiv Detail & Related papers (2022-02-04T15:19:34Z)
- MetaSDF: Meta-learning Signed Distance Functions [85.81290552559817]
Generalizing across shapes with neural implicit representations amounts to learning priors over the respective function space.
We formalize learning of a shape space as a meta-learning problem and leverage gradient-based meta-learning algorithms to solve this task.
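The gradient-based meta-learning framing can be sketched as a MAML-style double loop over shapes: an inner loop specializes a shared SDF network to each shape's samples, and the outer step updates the shared initialization. Everything below (module, loss, hyper-parameters) is an illustrative assumption, not the MetaSDF code.

```python
import torch
import torch.nn.functional as F

def meta_sdf_step(sdf_net, meta_opt, shape_batch, inner_lr=1e-2, inner_steps=5):
    """One meta-training step: adapt shared SDF weights to each shape in the
    batch, then update the shared initialization with the post-adaptation loss."""
    meta_loss = 0.0
    for coords, sdf_gt in shape_batch:                       # one shape's sampled points and SDF values
        fast = {k: v.clone() for k, v in sdf_net.named_parameters()}
        for _ in range(inner_steps):                         # inner adaptation on this shape
            pred = torch.func.functional_call(sdf_net, fast, (coords,))
            inner_loss = F.l1_loss(pred, sdf_gt)
            grads = torch.autograd.grad(inner_loss, list(fast.values()), create_graph=True)
            fast = {k: w - inner_lr * g for (k, w), g in zip(fast.items(), grads)}
        pred = torch.func.functional_call(sdf_net, fast, (coords,))
        meta_loss = meta_loss + F.l1_loss(pred, sdf_gt)      # outer loss after adaptation
    meta_opt.zero_grad()
    meta_loss.backward()                                     # flows back to the shared initialization
    meta_opt.step()
    return meta_loss.item()
```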
arXiv Detail & Related papers (2020-06-17T05:14:53Z)
- In-Domain GAN Inversion for Real Image Editing [56.924323432048304]
A common practice of feeding a real image to a trained GAN generator is to invert it back to a latent code.
Existing inversion methods typically focus on reconstructing the target image by pixel values yet fail to land the inverted code in the semantic domain of the original latent space.
We propose an in-domain GAN inversion approach that faithfully reconstructs the input image and ensures the inverted code is semantically meaningful for editing.
arXiv Detail & Related papers (2020-03-31T18:20:18Z)