Overparameterization Improves StyleGAN Inversion
- URL: http://arxiv.org/abs/2205.06304v1
- Date: Thu, 12 May 2022 18:42:43 GMT
- Title: Overparameterization Improves StyleGAN Inversion
- Authors: Yohan Poirier-Ginter, Alexandre Lessard, Ryan Smith, Jean-François Lalonde
- Abstract summary: Existing inversion approaches obtain promising yet imperfect results.
We show that overparameterizing the latent space allows us to obtain near-perfect image reconstruction without the need for encoders.
Our approach also retains editability, which we demonstrate by realistically interpolating between images.
- Score: 66.8300251627992
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Deep generative models like StyleGAN hold the promise of semantic image
editing: modifying images by their content, rather than their pixel values.
Unfortunately, working with arbitrary images requires inverting the StyleGAN
generator, which has remained challenging so far. Existing inversion approaches
obtain promising yet imperfect results, having to trade off reconstruction
quality against downstream editability. To improve quality, these
approaches must resort to various techniques that extend the model latent space
after training. Taking a step back, we observe that these methods essentially
all propose, in one way or another, to increase the number of free parameters.
This suggests that inversion might be difficult because it is underconstrained.
In this work, we address this directly and dramatically overparameterize the
latent space, before training, with simple changes to the original StyleGAN
architecture. Our overparameterization increases the available degrees of
freedom, which in turn facilitates inversion. We show that this allows us to
obtain near-perfect image reconstruction without the need for encoders or for
altering the latent space after training. Our approach also retains
editability, which we demonstrate by realistically interpolating between
images.
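To make the setting concrete, here is a minimal sketch of optimization-based GAN inversion of the kind the abstract alludes to: a latent code with one independent vector per generator layer (more free parameters, so the problem is less underconstrained) is fitted to a target image by gradient descent. The generator `G`, its dimensions, and all hyperparameters are illustrative assumptions, not the authors' implementation.

```python
import torch
import torch.nn.functional as F

def invert(G, target, num_layers=18, latent_dim=512, steps=1000, lr=0.01):
    """Fit an overparameterized latent code to `target` by gradient descent.

    `G` is assumed to be a pretrained StyleGAN-like generator mapping a
    (1, num_layers, latent_dim) code -- one vector per layer, analogous to
    the extended W+ space -- to an image tensor of the same shape as `target`.
    """
    # Zero init is a placeholder; starting from the average latent is common.
    w = torch.zeros(1, num_layers, latent_dim, requires_grad=True)
    opt = torch.optim.Adam([w], lr=lr)
    for _ in range(steps):
        opt.zero_grad()
        recon = G(w)                       # synthesize from the current code
        loss = F.mse_loss(recon, target)   # pixel loss; a perceptual term
        loss.backward()                    # such as LPIPS is often added
        opt.step()
    return w.detach()

def interpolate(G, img_a, img_b, num_frames=8):
    """Demonstrate editability: decode a linear path between two inversions."""
    w_a, w_b = invert(G, img_a), invert(G, img_b)
    return [G((1 - t) * w_a + t * w_b) for t in torch.linspace(0, 1, num_frames)]
```

Interpolation decoding to plausible intermediate images is the editability check the abstract describes: if the inverted codes sat outside the well-behaved region of latent space, the midpoints would decode to artifacts.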
Related papers
- Latent Space Editing in Transformer-Based Flow Matching [53.75073756305241]
Flow Matching with a transformer backbone offers the potential for scalable and high-quality generative modeling.
We introduce an editing space, $u$-space, that can be manipulated in a controllable, accumulative, and composable manner.
Lastly, we put forth a straightforward yet powerful method for achieving fine-grained and nuanced editing using text prompts.
arXiv Detail & Related papers (2023-12-17T21:49:59Z)
- In-Domain GAN Inversion for Faithful Reconstruction and Editability [132.68255553099834]
We propose in-domain GAN inversion, which consists of a domain-guided encoder and a domain-regularized optimizer to regularize the inverted code in the native latent space of the pre-trained GAN model.
We make comprehensive analyses on the effects of the encoder structure, the starting inversion point, as well as the inversion parameter space, and observe the trade-off between the reconstruction quality and the editing property.
arXiv Detail & Related papers (2023-09-25T08:42:06Z)
- Robust GAN inversion [5.1359892878090845]
We propose an approach which works in the native latent space $W$ and tunes the generator network to restore missing image details.
We demonstrate the effectiveness of our approach on two complex datasets: Flickr-Faces-HQ and LSUN Church.
arXiv Detail & Related papers (2023-08-31T07:47:11Z)
- Robust Unsupervised StyleGAN Image Restoration [5.33024001730262]
GAN-based image restoration inverts the generative process to repair images corrupted by known degradations.
We make StyleGAN image restoration robust, working across a wide range of degradation levels.
Our proposed approach relies on a 3-phase progressive latent space extension and a conservative optimizer.
arXiv Detail & Related papers (2023-02-13T22:45:54Z)
- StyleRes: Transforming the Residuals for Real Image Editing with StyleGAN [4.7590051176368915]
Inverting real images into StyleGAN's latent space is an extensively studied problem.
The trade-off between image reconstruction fidelity and image editing quality remains an open challenge.
We present a novel image inversion framework and a training pipeline to achieve high-fidelity image inversion with high-quality editing.
arXiv Detail & Related papers (2022-12-29T16:14:09Z)
- InvGAN: Invertible GANs [88.58338626299837]
InvGAN, short for Invertible GAN, successfully embeds real images into the latent space of a high-quality generative model.
This allows us to perform image inpainting, merging, and online data augmentation.
arXiv Detail & Related papers (2021-12-08T21:39:00Z)
- HyperStyle: StyleGAN Inversion with HyperNetworks for Real Image Editing [2.362412515574206]
HyperStyle learns to modulate StyleGAN's weights to faithfully express a given image in editable regions of the latent space (a generic sketch of this weight-modulation idea appears after this list).
HyperStyle yields reconstructions comparable to those of optimization techniques, with the near real-time inference of encoders.
arXiv Detail & Related papers (2021-11-30T18:56:30Z)
- Controllable Person Image Synthesis with Spatially-Adaptive Warped Normalization [72.65828901909708]
Controllable person image generation aims to produce realistic human images with desirable attributes.
We introduce a novel Spatially-Adaptive Warped Normalization (SAWN), which integrates a learned flow-field to warp modulation parameters.
We propose a novel self-training part replacement strategy to refine the pretrained model for the texture-transfer task.
arXiv Detail & Related papers (2021-05-31T07:07:44Z)
- Designing an Encoder for StyleGAN Image Manipulation [38.909059126878354]
We study the latent space of StyleGAN, the state-of-the-art unconditional generator.
We identify and analyze the existence of a distortion-editability tradeoff and a distortion-perception tradeoff within the StyleGAN latent space.
We present an encoder based on our two principles that is specifically designed for facilitating editing on real images.
arXiv Detail & Related papers (2021-02-04T17:52:38Z)
- In-Domain GAN Inversion for Real Image Editing [56.924323432048304]
A common practice for applying a trained GAN generator to a real image is to first invert the image back to a latent code.
Existing inversion methods typically focus on reconstructing the target image pixel-wise, yet fail to land the inverted code in the semantic domain of the original latent space.
We propose an in-domain GAN inversion approach, which faithfully reconstructs the input image and ensures the inverted code is semantically meaningful for editing.
arXiv Detail & Related papers (2020-03-31T18:20:18Z)
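As referenced in the HyperStyle entry above, the following is a generic sketch of hypernetwork-style weight modulation: a small network maps an image embedding to per-layer offsets that rescale a pretrained generator's convolution weights, so a zero offset recovers the original generator. All module names, shapes, and the rescaling rule are illustrative assumptions, not HyperStyle's actual architecture.

```python
import torch
import torch.nn as nn

class WeightModulator(nn.Module):
    """Hypothetical hypernetwork head: image embedding -> per-layer offsets."""

    def __init__(self, embed_dim=512, num_layers=18, channels=512):
        super().__init__()
        # One linear head per generator layer, predicting a per-channel offset.
        self.heads = nn.ModuleList(
            nn.Linear(embed_dim, channels) for _ in range(num_layers)
        )

    def forward(self, embedding):
        # `embedding` is a single image's feature vector (no batch dimension,
        # for simplicity). Offsets near zero leave the weights unchanged.
        return [head(embedding) for head in self.heads]

def modulate(weight, offset):
    """Rescale one conv weight (out_ch, in_ch, k, k) channel-wise.

    w' = w * (1 + delta): delta = 0 recovers the original generator, so the
    hypernetwork only needs to learn small, image-specific corrections.
    """
    return weight * (1.0 + offset.view(-1, 1, 1, 1))
```

Predicting multiplicative offsets around 1, rather than raw weights, keeps the modulated generator close to the pretrained one, which is what preserves editability while improving per-image fidelity.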
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.