StyleMapGAN: Exploiting Spatial Dimensions of Latent in GAN for
Real-time Image Editing
- URL: http://arxiv.org/abs/2104.14754v1
- Date: Fri, 30 Apr 2021 04:43:24 GMT
- Title: StyleMapGAN: Exploiting Spatial Dimensions of Latent in GAN for
Real-time Image Editing
- Authors: Hyunsu Kim, Yunjey Choi, Junho Kim, Sungjoo Yoo, Youngjung Uh
- Abstract summary: Generative adversarial networks (GANs) synthesize realistic images from random latent vectors.
Editing real images with GANs suffers from either i) time-consuming optimization for projecting real images to the latent vectors, or ii) inaccurate embedding through an encoder.
We propose StyleMapGAN: the intermediate latent space has spatial dimensions, and a spatially variant modulation replaces AdaIN.
- Score: 19.495153059077367
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Generative adversarial networks (GANs) synthesize realistic images from
random latent vectors. Although manipulating the latent vectors controls the
synthesized outputs, editing real images with GANs suffers from either i)
time-consuming optimization for projecting real images to the latent vectors,
or ii) inaccurate embedding through an encoder. We propose StyleMapGAN: the
intermediate latent space has spatial dimensions, and a spatially variant
modulation replaces AdaIN. It makes the embedding through an encoder more
accurate than existing optimization-based methods while maintaining the
properties of GANs. Experimental results demonstrate that our method
significantly outperforms state-of-the-art models in various image manipulation
tasks such as local editing and image interpolation. Last but not least,
conventional editing methods on GANs are still valid on our StyleMapGAN. Source
code is available at https://github.com/naver-ai/StyleMapGAN.
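To make the core idea concrete, below is a minimal PyTorch sketch of a spatially variant modulation layer: instead of AdaIN's per-channel scale and shift, a low-resolution stylemap is upsampled and turned into per-pixel scale and shift maps. This is an illustrative sketch, not the authors' implementation; the module name, channel sizes, and 1x1-conv design are assumptions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SpatiallyVariantModulation(nn.Module):
    """Sketch: per-pixel scale/shift predicted from a spatial stylemap,
    in contrast to AdaIN's per-channel (B, C, 1, 1) statistics."""

    def __init__(self, feat_channels, style_channels):
        super().__init__()
        self.norm = nn.InstanceNorm2d(feat_channels, affine=False)
        # 1x1 convs turn the stylemap into per-pixel gamma and beta maps.
        self.to_gamma = nn.Conv2d(style_channels, feat_channels, kernel_size=1)
        self.to_beta = nn.Conv2d(style_channels, feat_channels, kernel_size=1)

    def forward(self, feat, stylemap):
        # Resize the low-resolution stylemap to the feature resolution.
        style = F.interpolate(stylemap, size=feat.shape[-2:], mode="bilinear",
                              align_corners=False)
        gamma = self.to_gamma(style)  # (B, C, H, W): varies over space,
        beta = self.to_beta(style)    # so edits can stay spatially local
        return (1 + gamma) * self.norm(feat) + beta

# Usage: an 8x8 stylemap modulating a 32x32 feature map.
mod = SpatiallyVariantModulation(feat_channels=512, style_channels=64)
feat = torch.randn(2, 512, 32, 32)
stylemap = torch.randn(2, 64, 8, 8)
out = mod(feat, stylemap)  # (2, 512, 32, 32)
```

Because the modulation varies over space, replacing a region of one image's stylemap with another's affects only the corresponding region of the output, which is what enables the local editing described above.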
Related papers
- In-Domain GAN Inversion for Faithful Reconstruction and Editability [132.68255553099834]
We propose in-domain GAN inversion, which consists of a domain-guided encoder and domain-regularized optimization to regularize the inverted code in the native latent space of the pre-trained GAN model.
We make comprehensive analyses on the effects of the encoder structure, the starting inversion point, as well as the inversion parameter space, and observe the trade-off between the reconstruction quality and the editing property.
arXiv Detail & Related papers (2023-09-25T08:42:06Z)
- Gradient Adjusting Networks for Domain Inversion [82.72289618025084]
StyleGAN2 was demonstrated to be a powerful image generation engine that supports semantic editing.
We present a per-image optimization method that tunes a StyleGAN2 generator by applying a local edit to the generator's weights.
Our experiments show a sizable gap in performance over the current state of the art in this very active domain.
arXiv Detail & Related papers (2023-02-22T14:47:57Z)
- Spatial Steerability of GANs via Self-Supervision from Discriminator [123.27117057804732]
We propose a self-supervised approach to improve the spatial steerability of GANs without searching for steerable directions in the latent space.
Specifically, we design randomly sampled Gaussian heatmaps to be encoded into the intermediate layers of generative models as spatial inductive bias.
During inference, users can interact with the spatial heatmaps in an intuitive manner, enabling them to edit the output image by adjusting the scene layout, moving, or removing objects.
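As a rough illustration of the heatmap idea (a sketch under assumed shapes and names, not the paper's code), one can render a randomly sampled Gaussian heatmap and add its encoding to an intermediate feature map, so that moving the heatmap center moves the corresponding content:

```python
import torch
import torch.nn as nn

def gaussian_heatmap(height, width, center, sigma):
    """Render a 2D Gaussian bump centered at (y, x) in pixel coordinates."""
    ys = torch.arange(height, dtype=torch.float32).view(-1, 1)
    xs = torch.arange(width, dtype=torch.float32).view(1, -1)
    dist2 = (ys - center[0]) ** 2 + (xs - center[1]) ** 2
    return torch.exp(-dist2 / (2 * sigma ** 2))  # (H, W)

# Hypothetical 1x1-conv encoder lifting the heatmap to the feature width.
encode = nn.Conv2d(1, 512, kernel_size=1)

feat = torch.randn(1, 512, 16, 16)            # an intermediate generator feature
center = torch.randint(0, 16, (2,)).tolist()  # randomly sampled location (y, x)
hm = gaussian_heatmap(16, 16, center, sigma=2.0).view(1, 1, 16, 16)
feat = feat + encode(hm)  # spatial inductive bias; moving `center` moves content
```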
arXiv Detail & Related papers (2023-01-20T07:36:29Z)
- Mask-Guided Discovery of Semantic Manifolds in Generative Models [0.0]
StyleGAN2 generates images of human faces from random vectors in a lower-dimensional latent space.
The model behaves as a black box, providing neither control over its output nor insight into the structures it has learned from the data.
We present a method to explore the manifold of changes of spatially localized regions of the face.
arXiv Detail & Related papers (2021-05-15T18:06:38Z)
- StyleCLIP: Text-Driven Manipulation of StyleGAN Imagery [71.1862388442953]
We develop a text-based interface for StyleGAN image manipulation.
We first introduce an optimization scheme that utilizes a CLIP-based loss to modify an input latent vector in response to a user-provided text prompt.
Next, we describe a latent mapper that infers a text-guided latent manipulation step for a given input image, allowing faster and more stable text-based manipulation.
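A minimal sketch of such a CLIP-guided optimization loop follows, using OpenAI's clip package. The generator G and initial latent w_init are stand-ins (StyleCLIP operates on a pretrained StyleGAN), and CLIP's exact input preprocessing is simplified:

```python
import torch
import torch.nn.functional as F
import clip  # OpenAI's CLIP: pip install git+https://github.com/openai/CLIP.git

device = "cuda" if torch.cuda.is_available() else "cpu"
model, _ = clip.load("ViT-B/32", device=device)
model = model.float()  # avoid fp16/fp32 mismatch when optimizing on GPU

# Stand-in generator and inverted latent; in practice G is a pretrained
# StyleGAN and w_init comes from inverting the input image.
G = torch.nn.Sequential(
    torch.nn.Linear(512, 3 * 64 * 64), torch.nn.Tanh(),
    torch.nn.Unflatten(1, (3, 64, 64)),
).to(device)
w_init = torch.randn(1, 512, device=device)

text = clip.tokenize(["a face with blue eyes"]).to(device)
with torch.no_grad():
    text_feat = F.normalize(model.encode_text(text), dim=-1)

w = w_init.clone().requires_grad_(True)
opt = torch.optim.Adam([w], lr=0.01)

for _ in range(200):
    img = G(w)                                    # image in [-1, 1]
    img224 = F.interpolate((img + 1) / 2, size=224, mode="bilinear")
    # (CLIP's exact channel-wise normalization is omitted for brevity.)
    img_feat = F.normalize(model.encode_image(img224), dim=-1)
    clip_loss = 1 - (img_feat * text_feat).sum()  # cosine distance to text
    latent_loss = ((w - w_init) ** 2).mean()      # stay close to the inversion
    loss = clip_loss + 0.5 * latent_loss
    opt.zero_grad()
    loss.backward()
    opt.step()
```

The latent-distance term is one common way to keep the edited image close to the original; the weighting of 0.5 here is an arbitrary placeholder.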
arXiv Detail & Related papers (2021-03-31T17:51:25Z)
- Navigating the GAN Parameter Space for Semantic Image Editing [35.622710993417456]
Generative Adversarial Networks (GANs) are an indispensable tool for visual editing.
In this paper, we significantly expand the range of visual effects achievable with the state-of-the-art models, like StyleGAN2.
arXiv Detail & Related papers (2020-11-27T15:38:56Z)
- Swapping Autoencoder for Deep Image Manipulation [94.33114146172606]
We propose the Swapping Autoencoder, a deep model designed specifically for image manipulation.
The key idea is to encode an image with two independent components and enforce that any swapped combination maps to a realistic image.
Experiments on multiple datasets show that our model produces better results and is substantially more efficient compared to recent generative models.
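A toy sketch of the swapping constraint (hypothetical tiny networks, not the paper's architecture): an encoder factors an image into a spatial structure code and a global texture code, and a decoder must map any swapped pair back to an image.

```python
import torch
import torch.nn as nn

class TinyEncoder(nn.Module):
    """Factor an image into a spatial structure code and a global texture code."""
    def __init__(self):
        super().__init__()
        self.conv = nn.Conv2d(3, 8, kernel_size=4, stride=4)  # (B, 8, H/4, W/4)
        self.pool = nn.AdaptiveAvgPool2d(1)
        self.fc = nn.Linear(8, 16)                            # (B, 16)

    def forward(self, x):
        structure = self.conv(x)
        texture = self.fc(self.pool(structure).flatten(1))
        return structure, texture

class TinyDecoder(nn.Module):
    """Decode any (structure, texture) pair; texture modulates the channels."""
    def __init__(self):
        super().__init__()
        self.mod = nn.Linear(16, 8)
        self.up = nn.ConvTranspose2d(8, 3, kernel_size=4, stride=4)

    def forward(self, structure, texture):
        gain = self.mod(texture).view(-1, 8, 1, 1)
        return torch.tanh(self.up(structure * (1 + gain)))

enc, dec = TinyEncoder(), TinyDecoder()
x1, x2 = torch.randn(1, 3, 64, 64), torch.randn(1, 3, 64, 64)
s1, t1 = enc(x1)
s2, t2 = enc(x2)
hybrid = dec(s1, t2)  # structure of x1 with the texture of x2
# Training adds reconstruction and adversarial losses so that every swapped
# combination decodes to a realistic image.
```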
arXiv Detail & Related papers (2020-07-01T17:59:57Z)
- In-Domain GAN Inversion for Real Image Editing [56.924323432048304]
A common practice for applying a trained GAN generator to a real image is to first invert the image back to a latent code.
Existing inversion methods typically focus on reconstructing the target image by pixel values yet fail to land the inverted code in the semantic domain of the original latent space.
We propose an in-domain GAN inversion approach, which faithfully reconstructs the input image and ensures the inverted code to be semantically meaningful for editing.
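The recipe sketched below mirrors that two-part description under simplifying assumptions (placeholder G and E, perceptual losses omitted): an encoder provides an in-domain starting code, and optimization then refines it while an encoder-based regularizer keeps it semantically meaningful.

```python
import torch

def invert(G, E, x, steps=200, lam=2.0, lr=0.01):
    """Invert image x: G maps latent -> image, E maps image -> latent."""
    z = E(x).detach().clone().requires_grad_(True)  # step 1: encoder init
    opt = torch.optim.Adam([z], lr=lr)
    for _ in range(steps):                          # step 2: refinement
        x_hat = G(z)
        rec = ((x_hat - x) ** 2).mean()             # pixel reconstruction
        dom = ((E(x_hat) - z) ** 2).mean()          # keep z where the encoder
        loss = rec + lam * dom                      # would place G(z)
        opt.zero_grad()
        loss.backward()
        opt.step()
    return z.detach()

# Toy usage with stand-in networks (real G/E would be a GAN and its encoder):
G = torch.nn.Sequential(torch.nn.Linear(8, 12), torch.nn.Tanh())
E = torch.nn.Linear(12, 8)
x = torch.randn(1, 12)
z = invert(G, E, x)
```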
arXiv Detail & Related papers (2020-03-31T18:20:18Z)