ReStyle: A Residual-Based StyleGAN Encoder via Iterative Refinement
- URL: http://arxiv.org/abs/2104.02699v1
- Date: Tue, 6 Apr 2021 17:47:13 GMT
- Title: ReStyle: A Residual-Based StyleGAN Encoder via Iterative Refinement
- Authors: Yuval Alaluf, Or Patashnik, Daniel Cohen-Or
- Abstract summary: We present a novel inversion scheme that extends current encoder-based inversion methods by introducing an iterative refinement mechanism.
Our residual-based encoder, named ReStyle, attains improved accuracy compared to current state-of-the-art encoder-based methods with a negligible increase in inference time.
- Score: 46.48263482909809
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Recently, the power of unconditional image synthesis has significantly
advanced through the use of Generative Adversarial Networks (GANs). The task of
inverting an image into its corresponding latent code of the trained GAN is of
utmost importance as it allows for the manipulation of real images, leveraging
the rich semantics learned by the network. Recognizing the limitations of
current inversion approaches, in this work we present a novel inversion scheme
that extends current encoder-based inversion methods by introducing an
iterative refinement mechanism. Instead of directly predicting the latent code
of a given real image using a single pass, the encoder is tasked with
predicting a residual with respect to the current estimate of the inverted
latent code in a self-correcting manner. Our residual-based encoder, named
ReStyle, attains improved accuracy compared to current state-of-the-art
encoder-based methods with a negligible increase in inference time. We analyze
the behavior of ReStyle to gain valuable insights into its iterative nature. We
then evaluate the performance of our residual encoder and analyze its
robustness compared to optimization-based inversion and state-of-the-art
encoders.
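The iterative refinement loop described in the abstract can be sketched in a toy form. Here the "generator" is a fixed linear map and the residual "encoder" is its pseudo-inverse; both are illustrative stand-ins for StyleGAN and the deep ReStyle encoder, and all names in the snippet are assumptions, not the paper's implementation:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy "generator": maps a latent code (4,) to an "image" (16,).
G = rng.normal(size=(16, 4))

def generate(w):
    return G @ w

# Toy residual "encoder": predicts a latent update from the difference
# between the target image and the current reconstruction.
G_pinv = np.linalg.pinv(G)

def predict_residual(x_target, x_current):
    return G_pinv @ (x_target - x_current)

w_true = rng.normal(size=4)
x_target = generate(w_true)

w = np.zeros(4)                 # start from a zero/average latent estimate
for step in range(5):           # a handful of refinement iterations
    x_hat = generate(w)
    # Self-correcting update: add the predicted residual to the estimate.
    w = w + predict_residual(x_target, x_hat)

print(np.allclose(generate(w), x_target))
```

In this linear setting the loop converges immediately; the point of the sketch is the structure of the update, where each pass refines the current estimate rather than predicting the latent code from scratch.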
Related papers
- In-Domain GAN Inversion for Faithful Reconstruction and Editability [132.68255553099834]
We propose in-domain GAN inversion, which consists of a domain-guided encoder and a domain-regularized optimization to regularize the inverted code in the native latent space of the pre-trained GAN model.
We make comprehensive analyses on the effects of the encoder structure, the starting inversion point, as well as the inversion parameter space, and observe the trade-off between the reconstruction quality and the editing property.
arXiv Detail & Related papers (2023-09-25T08:42:06Z) - Joint Hierarchical Priors and Adaptive Spatial Resolution for Efficient Neural Image Compression [11.25130799452367]
We propose an absolute image compression transformer (ICT) for neural image compression (NIC).
ICT captures both global and local contexts from the latent representations and better parameterizes the distribution of the quantized latents.
Our framework significantly improves the trade-off between coding efficiency and decoder complexity over the versatile video coding (VVC) reference encoder (VTM-18.0) and the neural SwinT-ChARM.
arXiv Detail & Related papers (2023-07-05T13:17:14Z) - Cycle Encoding of a StyleGAN Encoder for Improved Reconstruction and Editability [76.6724135757723]
GAN inversion aims to invert an input image into the latent space of a pre-trained GAN.
Despite the recent advances in GAN inversion, there remain challenges to mitigate the tradeoff between distortion and editability.
We propose a two-step approach that first inverts the input image into a latent code, called pivot code, and then alters the generator so that the input image can be accurately mapped into the pivot code.
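A minimal sketch of this two-step scheme under a toy linear-generator assumption (the variable names and the rank-one generator update below are illustrative, not the paper's actual tuning procedure):

```python
import numpy as np

rng = np.random.default_rng(1)

G = rng.normal(size=(8, 3))     # toy linear "generator": latent (3,) -> image (8,)
x = rng.normal(size=8)          # input "image", generally outside G's range

# Step 1: invert the input into a pivot code with the generator frozen
# (least-squares inversion stands in for encoder- or optimization-based inversion).
w_pivot, *_ = np.linalg.lstsq(G, x, rcond=None)

# Step 2: alter the generator so the pivot code maps exactly to the input.
# A minimum-norm rank-one correction plays the role of generator fine-tuning.
residual = x - G @ w_pivot
G_tuned = G + np.outer(residual, w_pivot) / (w_pivot @ w_pivot)

print(np.allclose(G_tuned @ w_pivot, x))
```

The split mirrors the trade-off the entry describes: the frozen-generator inversion keeps the pivot code in a well-behaved region of latent space (editability), while the generator adjustment recovers exact reconstruction (low distortion).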
arXiv Detail & Related papers (2022-07-19T16:10:16Z) - Neural Data-Dependent Transform for Learned Image Compression [72.86505042102155]
We build a neural data-dependent transform and introduce a continuous online mode decision mechanism to jointly optimize the coding efficiency for each individual image.
The experimental results show the effectiveness of the proposed neural-syntax design and the continuous online mode decision mechanism.
arXiv Detail & Related papers (2022-03-09T14:56:48Z) - Feature-Style Encoder for Style-Based GAN Inversion [1.9116784879310027]
We propose a novel architecture for GAN inversion, which we call Feature-Style encoder.
Our model achieves accurate inversion of real images from the latent space of a pre-trained style-based GAN model.
Thanks to its encoder structure, the model allows fast and accurate image editing.
arXiv Detail & Related papers (2022-02-04T15:19:34Z) - Inverting Adversarially Robust Networks for Image Synthesis [37.927552662984034]
We propose the use of robust representations as a perceptual primitive for feature inversion models.
We empirically show that adopting robust representations as an image prior significantly improves the reconstruction accuracy of CNN-based feature inversion models.
Following these findings, we propose an encoding-decoding network based on robust representations and show its advantages for applications such as anomaly detection, style transfer and image denoising.
arXiv Detail & Related papers (2021-06-13T05:51:00Z) - MetaSDF: Meta-learning Signed Distance Functions [85.81290552559817]
Generalizing across shapes with neural implicit representations amounts to learning priors over the respective function space.
We formalize learning of a shape space as a meta-learning problem and leverage gradient-based meta-learning algorithms to solve this task.
arXiv Detail & Related papers (2020-06-17T05:14:53Z) - In-Domain GAN Inversion for Real Image Editing [56.924323432048304]
A common practice of feeding a real image to a trained GAN generator is to invert it back to a latent code.
Existing inversion methods typically focus on reconstructing the target image by pixel values yet fail to land the inverted code in the semantic domain of the original latent space.
We propose an in-domain GAN inversion approach, which faithfully reconstructs the input image and ensures the inverted code to be semantically meaningful for editing.
arXiv Detail & Related papers (2020-03-31T18:20:18Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.