Improved StyleGAN Embedding: Where are the Good Latents?
- URL: http://arxiv.org/abs/2012.09036v2
- Date: Mon, 5 Apr 2021 00:01:21 GMT
- Title: Improved StyleGAN Embedding: Where are the Good Latents?
- Authors: Peihao Zhu, Rameen Abdal, Yipeng Qin, John Femiani, Peter Wonka
- Abstract summary: StyleGAN is able to produce photorealistic images that are almost indistinguishable from real ones.
The reverse problem of finding an embedding for a given image poses a challenge.
In this paper, we address the problem of finding an embedding that both reconstructs images and also supports image editing tasks.
- Score: 43.780075713984935
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: StyleGAN is able to produce photorealistic images that are almost
indistinguishable from real ones. The reverse problem of finding an embedding
for a given image poses a challenge. Embeddings that reconstruct an image well
are not always robust to editing operations. In this paper, we address the
problem of finding an embedding that both reconstructs images and also supports
image editing tasks. First, we introduce a new normalized space to analyze the
diversity and the quality of the reconstructed latent codes. This space can
help answer the question of where good latent codes are located in latent
space. Second, we propose an improved embedding algorithm using a novel
regularization method based on our analysis. Finally, we analyze the quality of
different embedding algorithms. We compare our results with the current
state-of-the-art methods and achieve a better trade-off between reconstruction
quality and editing quality.
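The abstract describes an optimization-based embedding whose regularizer keeps latent codes in a well-behaved region of latent space. A minimal numpy sketch of that general idea, using a toy linear generator in place of StyleGAN (the generator `A`, the mean latent `w_mean`, and all dimensions and hyperparameters are illustrative assumptions, not the paper's actual method):

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy linear "generator" standing in for StyleGAN: maps an 8-d latent
# code to a 16-d "image". Purely illustrative.
A = rng.normal(size=(16, 8))
w_mean = np.zeros(8)  # stand-in for the average latent code


def generate(w):
    return A @ w


def embed(x, lam, lr=0.005, steps=2000):
    """Minimize ||G(w) - x||^2 + lam * ||w - w_mean||^2 by gradient descent.

    The second term is a latent regularizer: it pulls the code toward the
    average latent, trading some reconstruction fidelity for codes that
    stay in a well-populated (and hence more editable) region.
    """
    w = w_mean.copy()
    for _ in range(steps):
        grad = 2 * A.T @ (generate(w) - x) + 2 * lam * (w - w_mean)
        w -= lr * grad
    return w


# Invert an "image" produced by an unknown latent code.
w_true = rng.normal(size=8)
x = generate(w_true)
w_plain = embed(x, lam=0.0)  # pure reconstruction objective
w_reg = embed(x, lam=1.0)    # regularized toward the average latent

# w_plain reconstructs x more closely, while w_reg stays nearer w_mean:
# the same reconstruction/editability trade-off the paper studies.
print(np.linalg.norm(generate(w_plain) - x))
print(np.linalg.norm(w_reg - w_mean))
```

Sweeping `lam` traces out the trade-off curve: larger values bias the embedding toward the average latent at the cost of pixel-level fidelity.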
Related papers
- ENTED: Enhanced Neural Texture Extraction and Distribution for Reference-based Blind Face Restoration [51.205673783866146]
We present ENTED, a new framework for blind face restoration that aims to restore high-quality and realistic portrait images.
We utilize a texture extraction and distribution framework to transfer high-quality texture features between the degraded input and reference image.
The StyleGAN-like architecture in our framework requires high-quality latent codes to generate realistic images.
arXiv Detail & Related papers (2024-01-13T04:54:59Z)
- Warping the Residuals for Image Editing with StyleGAN [5.733811543584874]
StyleGAN models offer editing capabilities through their semantically interpretable latent space organization.
Many works have been proposed for inverting images into StyleGAN's latent space.
We present a novel image inversion architecture that extracts high-rate latent features and includes a flow estimation module.
arXiv Detail & Related papers (2023-12-18T18:24:18Z)
- Diverse Inpainting and Editing with GAN Inversion [4.234367850767171]
Recent inversion methods have shown that real images can be inverted into StyleGAN's latent space.
In this paper, we tackle an even more difficult task: inverting erased images into the GAN's latent space for realistic inpainting and editing.
arXiv Detail & Related papers (2023-07-27T17:41:36Z)
- StyleDiffusion: Prompt-Embedding Inversion for Text-Based Editing [86.92711729969488]
We exploit the capabilities of pretrained diffusion models for image editing.
Prior methods either finetune the model or invert the image into the latent space of the pretrained model.
They suffer from two problems: unsatisfying results for selected regions, and unexpected changes in non-selected regions.
arXiv Detail & Related papers (2023-03-28T00:16:45Z)
- StyleRes: Transforming the Residuals for Real Image Editing with StyleGAN [4.7590051176368915]
Inverting real images into StyleGAN's latent space is an extensively studied problem.
The trade-off between image reconstruction fidelity and image editing quality remains an open challenge.
We present a novel image inversion framework and a training pipeline to achieve high-fidelity image inversion with high-quality editing.
arXiv Detail & Related papers (2022-12-29T16:14:09Z)
- Overparameterization Improves StyleGAN Inversion [66.8300251627992]
Existing inversion approaches obtain promising yet imperfect results.
We show that overparameterizing the latent representation allows us to obtain near-perfect image reconstruction without the need for encoders.
Our approach also retains editability, which we demonstrate by realistically interpolating between images.
arXiv Detail & Related papers (2022-05-12T18:42:43Z)
- ARCH++: Animation-Ready Clothed Human Reconstruction Revisited [82.83445332309238]
We present ARCH++, an image-based method to reconstruct 3D avatars with arbitrary clothing styles.
Our reconstructed avatars are animation-ready and highly realistic, in both the visible regions from input views and the unseen regions.
arXiv Detail & Related papers (2021-08-17T19:27:12Z)
- Using latent space regression to analyze and leverage compositionality in GANs [33.381584322411626]
We investigate regression into the latent space as a probe to understand the compositional properties of GANs.
We find that combining the regressor and a pretrained generator provides a strong image prior, allowing us to create composite images.
We find that the regression approach enables more localized editing of individual image parts compared to direct editing in the latent space.
arXiv Detail & Related papers (2021-03-18T17:58:01Z)
- Designing an Encoder for StyleGAN Image Manipulation [38.909059126878354]
We study the latent space of StyleGAN, the state-of-the-art unconditional generator.
We identify and analyze the existence of a distortion-editability tradeoff and a distortion-perception tradeoff within the StyleGAN latent space.
We present an encoder based on our two principles that is specifically designed for facilitating editing on real images.
arXiv Detail & Related papers (2021-02-04T17:52:38Z)
- In-Domain GAN Inversion for Real Image Editing [56.924323432048304]
A common practice for applying a trained GAN generator to a real image is to first invert the image back to a latent code.
Existing inversion methods typically focus on reconstructing the target image at the pixel level, yet fail to land the inverted code in the semantic domain of the original latent space.
We propose an in-domain GAN inversion approach, which faithfully reconstructs the input image and ensures the inverted code to be semantically meaningful for editing.
arXiv Detail & Related papers (2020-03-31T18:20:18Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this list (including all information) and is not responsible for any consequences of its use.