Make It So: Steering StyleGAN for Any Image Inversion and Editing
- URL: http://arxiv.org/abs/2304.14403v1
- Date: Thu, 27 Apr 2023 17:59:24 GMT
- Title: Make It So: Steering StyleGAN for Any Image Inversion and Editing
- Authors: Anand Bhattad, Viraj Shah, Derek Hoiem, D.A. Forsyth
- Abstract summary: StyleGAN's disentangled style representation enables powerful image editing by manipulating the latent variables.
Existing GAN inversion methods struggle to maintain editing directions and produce realistic results.
We propose Make It So, a novel GAN inversion method that operates in the $\mathcal{Z}$ (noise) space rather than the typical $\mathcal{W}$ (latent style) space.
- Score: 16.337519991964367
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: StyleGAN's disentangled style representation enables powerful image editing
by manipulating the latent variables, but accurately mapping real-world images
to their latent variables (GAN inversion) remains a challenge. Existing GAN
inversion methods struggle to maintain editing directions and produce realistic
results.
To address these limitations, we propose Make It So, a novel GAN inversion
method that operates in the $\mathcal{Z}$ (noise) space rather than the typical
$\mathcal{W}$ (latent style) space. Make It So preserves editing capabilities,
even for out-of-domain images. This is a crucial property that was overlooked
in prior methods. Our quantitative evaluations demonstrate that Make It So
outperforms the state-of-the-art method PTI~\cite{roich2021pivotal} by a factor
of five in inversion accuracy and achieves ten times better edit quality for
complex indoor scenes.
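
To make the setting concrete, below is a minimal sketch of optimization-based GAN inversion in the noise space. The generator interface (`G.z_dim`, `G.mapping`, `G.synthesis`) follows common StyleGAN2 codebases and, along with the `lpips` perceptual loss and the hyperparameters, is an illustrative assumption rather than the authors' released implementation.

```python
import torch
import lpips  # perceptual similarity loss; pip install lpips

def invert_in_z(G, target, steps=500, lr=0.05, device="cuda"):
    """Optimize a noise-space code z so that G(z) reconstructs `target`.

    G      -- pretrained StyleGAN generator with .mapping and .synthesis
    target -- image tensor of shape (1, 3, H, W), values in [-1, 1]
    """
    percep = lpips.LPIPS(net="vgg").to(device)
    z = torch.randn(1, G.z_dim, device=device, requires_grad=True)
    opt = torch.optim.Adam([z], lr=lr)
    for _ in range(steps):
        w = G.mapping(z, None)   # noise space -> style space
        recon = G.synthesis(w)   # style space -> image
        loss = (percep(recon, target).mean()
                + torch.nn.functional.mse_loss(recon, target))
        opt.zero_grad()
        loss.backward()
        opt.step()
    return z.detach()
```

Running the same loop with a free variable `w` in place of `G.mapping(z, None)` gives the conventional $\mathcal{W}$-space baseline; the paper's argument is that keeping the optimization in $\mathcal{Z}$ preserves the editing directions that $\mathcal{W}$-space inversion tends to break.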
Related papers
- The Devil is in the Details: StyleFeatureEditor for Detail-Rich StyleGAN Inversion and High Quality Image Editing [3.58736715327935]
We introduce StyleFeatureEditor, a novel method that enables editing in both w-latents and F-latents.
We also present a new training pipeline specifically designed to train our model to accurately edit F-latents.
Our method is compared with state-of-the-art encoding approaches, demonstrating that our model excels in terms of reconstruction quality.
arXiv Detail & Related papers (2024-06-15T11:28:32Z)
- ZONE: Zero-Shot Instruction-Guided Local Editing [56.56213730578504]
We propose a Zero-shot instructiON-guided local image Editing approach, termed ZONE.
We first convert the editing intent from the user-provided instruction into specific image editing regions through InstructPix2Pix.
We then propose a Region-IoU scheme for precise image layer extraction from an off-the-shelf segment model.
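The Region-IoU selection is simple to sketch: score each candidate mask from the segmentation model by its overlap with the instruction-derived edit region and keep the best match. The following NumPy illustration is one plausible reading of that step, not ZONE's actual code.

```python
import numpy as np

def select_edit_layer(edit_region, segment_masks):
    """Pick the segmentation mask that best overlaps the edit region.

    edit_region   -- boolean array (H, W), e.g. derived via InstructPix2Pix
    segment_masks -- list of boolean arrays (H, W) from a segmentation model
    """
    def iou(a, b):
        inter = np.logical_and(a, b).sum()
        union = np.logical_or(a, b).sum()
        return inter / union if union else 0.0

    scores = [iou(edit_region, m) for m in segment_masks]
    best = int(np.argmax(scores))
    return segment_masks[best], scores[best]
```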
arXiv Detail & Related papers (2023-12-28T02:54:34Z)
- Warping the Residuals for Image Editing with StyleGAN [5.733811543584874]
StyleGAN models show editing capabilities via their semantically interpretable latent organizations.
Many works have been proposed for inverting images into StyleGAN's latent space.
We present a novel image inversion architecture that extracts high-rate latent features and includes a flow estimation module.
arXiv Detail & Related papers (2023-12-18T18:24:18Z)
- Balancing Reconstruction and Editing Quality of GAN Inversion for Real Image Editing with StyleGAN Prior Latent Space [27.035594402482886]
We revisit StyleGANs' hyperspherical prior $\mathcal{Z}$ and $\mathcal{Z}+$ and integrate them into seminal GAN inversion methods to improve editing quality.
Our extensions achieve sophisticated editing quality with the aid of the StyleGAN prior.
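A high-dimensional standard Gaussian, which is how StyleGAN samples $z$, concentrates near a hypersphere of radius $\sqrt{d}$. One simple way to exploit such a prior during inversion is projected gradient descent that renormalizes the code after every update; the sketch below shows that projection and is an interpretation of the hyperspherical prior, not necessarily this paper's exact procedure.

```python
import torch

def project_to_hypersphere(z):
    """Renormalize a latent code onto the sphere of radius sqrt(dim),
    where a high-dimensional standard Gaussian concentrates."""
    d = z.shape[-1]
    return z * (d ** 0.5) / z.norm(dim=-1, keepdim=True)

# Typical use inside an inversion loop (illustrative):
#   opt.step()
#   with torch.no_grad():
#       z.copy_(project_to_hypersphere(z))
```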
arXiv Detail & Related papers (2023-05-31T23:27:07Z)
- Gradient Adjusting Networks for Domain Inversion [82.72289618025084]
StyleGAN2 was demonstrated to be a powerful image generation engine that supports semantic editing.
We present a per-image optimization method that tunes a StyleGAN2 generator by applying a local edit to its weights.
Our experiments show a sizable gap in performance over the current state of the art in this very active domain.
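Per-image generator tuning in the spirit of PTI (cited in the abstract above) freezes an inverted pivot latent and briefly fine-tunes the generator so that the pivot reproduces the target image. The sketch below illustrates that general recipe under assumed StyleGAN2 interfaces; it is not necessarily this paper's exact scheme.

```python
import torch

def tune_generator(G, w_pivot, target, steps=300, lr=3e-4):
    """Fine-tune generator weights so G.synthesis(w_pivot) matches `target`.
    The pivot latent `w_pivot` stays fixed; only the weights move."""
    opt = torch.optim.Adam(G.synthesis.parameters(), lr=lr)
    for _ in range(steps):
        recon = G.synthesis(w_pivot)
        loss = torch.nn.functional.mse_loss(recon, target)
        opt.zero_grad()
        loss.backward()
        opt.step()
    return G
```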
arXiv Detail & Related papers (2023-02-22T14:47:57Z)
- $S^2$-Flow: Joint Semantic and Style Editing of Facial Images [16.47093005910139]
Generative adversarial networks (GANs) have motivated investigations into their application for image editing.
GANs are often limited in the control they provide for performing specific edits.
We propose a method to disentangle a GAN's latent space into semantic and style spaces.
arXiv Detail & Related papers (2022-11-22T12:00:02Z)
- Spatially-Adaptive Multilayer Selection for GAN Inversion and Editing [57.46189236379433]
We propose a new method to invert and edit complex images in the latent space of GANs, such as StyleGAN2.
Our key idea is to explore inversion with a collection of layers, spatially adapting the inversion process to the difficulty of the image.
arXiv Detail & Related papers (2022-06-16T17:57:49Z)
- Overparameterization Improves StyleGAN Inversion [66.8300251627992]
Existing inversion approaches obtain promising yet imperfect results.
We show that overparameterizing the latent representation allows us to obtain near-perfect image reconstruction without the need for encoders.
Our approach also retains editability, which we demonstrate by realistically interpolating between images.
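The editability check mentioned here, interpolating between inverted images, reduces to walking the line between two latent codes; a short illustrative sketch (generator interface assumed):

```python
import torch

def interpolate_latents(G, w_a, w_b, num=8):
    """Render images along the straight line between two inverted latents."""
    frames = []
    for t in torch.linspace(0.0, 1.0, num):
        w = (1 - t) * w_a + t * w_b   # linear blend in latent space
        frames.append(G.synthesis(w))
    return torch.cat(frames)          # (num, 3, H, W)
```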
arXiv Detail & Related papers (2022-05-12T18:42:43Z)
- Designing an Encoder for StyleGAN Image Manipulation [38.909059126878354]
We study the latent space of StyleGAN, the state-of-the-art unconditional generator.
We identify and analyze the existence of a distortion-editability tradeoff and a distortion-perception tradeoff within the StyleGAN latent space.
We present an encoder based on our two principles that is specifically designed for facilitating editing on real images.
arXiv Detail & Related papers (2021-02-04T17:52:38Z)
- PIE: Portrait Image Embedding for Semantic Control [82.69061225574774]
We present the first approach for embedding real portrait images in the latent space of StyleGAN.
We use StyleRig, a pretrained neural network that maps the control space of a 3D morphable face model to the latent space of the GAN.
An identity preservation energy term allows spatially coherent edits while maintaining facial integrity.
arXiv Detail & Related papers (2020-09-20T17:53:51Z)
- In-Domain GAN Inversion for Real Image Editing [56.924323432048304]
A common practice for feeding a real image to a trained GAN generator is to first invert it back to a latent code.
Existing inversion methods typically focus on reconstructing the target image by pixel values yet fail to land the inverted code in the semantic domain of the original latent space.
We propose an in-domain GAN inversion approach that faithfully reconstructs the input image and ensures that the inverted code is semantically meaningful for editing.
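One common reading of in-domain inversion is encoder-regularized optimization: the code must both reconstruct the pixels and stay where a domain-guided encoder `E` would map the reconstruction. The sketch below follows that reading; the loss weight `lam` and the interfaces are assumptions, not the paper's exact configuration.

```python
import torch

def in_domain_invert(G, E, target, steps=500, lr=0.01, lam=2.0):
    """Optimize a latent so G reconstructs `target` while the code stays
    consistent with the domain-guided encoder E (semantic regularizer)."""
    w = E(target).detach().clone().requires_grad_(True)  # encoder init
    opt = torch.optim.Adam([w], lr=lr)
    for _ in range(steps):
        recon = G.synthesis(w)
        rec_loss = torch.nn.functional.mse_loss(recon, target)
        dom_loss = torch.nn.functional.mse_loss(w, E(recon))  # stay in-domain
        loss = rec_loss + lam * dom_loss
        opt.zero_grad()
        loss.backward()
        opt.step()
    return w.detach()
```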
arXiv Detail & Related papers (2020-03-31T18:20:18Z)