Navigating the GAN Parameter Space for Semantic Image Editing
- URL: http://arxiv.org/abs/2011.13786v3
- Date: Wed, 21 Apr 2021 12:45:11 GMT
- Title: Navigating the GAN Parameter Space for Semantic Image Editing
- Authors: Anton Cherepkov, Andrey Voynov, Artem Babenko
- Abstract summary: Generative Adversarial Networks (GANs) are an indispensable tool for visual editing.
In this paper, we significantly expand the range of visual effects achievable with the state-of-the-art models, like StyleGAN2.
- Score: 35.622710993417456
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Generative Adversarial Networks (GANs) are currently an indispensable tool
for visual editing, being a standard component of image-to-image translation
and image restoration pipelines. Furthermore, GANs are especially useful for
controllable generation since their latent spaces contain a wide range of
interpretable directions, well suited for semantic editing operations. By
gradually changing latent codes along these directions, one can produce
impressive visual effects, unattainable without GANs.
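As an illustration of the latent-code editing described above, a minimal sketch in PyTorch (G, w, and direction are placeholders for a pre-trained generator, a latent code, and a discovered interpretable direction; this is not any specific model's API):

```python
import torch

# Minimal sketch of latent-direction editing. G, w, and direction are
# placeholders for a pre-trained GAN generator, a latent code, and a
# discovered interpretable direction; not the released API of any model.
def edit_along_direction(G, w, direction, alphas=(-6.0, -3.0, 0.0, 3.0, 6.0)):
    """Gradually shift a latent code along one interpretable direction."""
    direction = direction / direction.norm()   # unit-norm so alpha sets the step size
    frames = []
    with torch.no_grad():
        for alpha in alphas:
            w_edit = w + alpha * direction     # move the code in latent space
            frames.append(G(w_edit))           # decode the shifted code to an image
    return frames
```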
In this paper, we significantly expand the range of visual effects achievable
with the state-of-the-art models, like StyleGAN2. In contrast to existing
works, which mostly operate by latent codes, we discover interpretable
directions in the space of the generator parameters. By several simple methods,
we explore this space and demonstrate that it also contains a plethora of
interpretable directions, which are an excellent source of non-trivial semantic
manipulations. The discovered manipulations cannot be achieved by transforming
the latent codes and can be used to edit both synthetic and real images. We
release our code and models and hope they will serve as a handy tool for
further efforts on GAN-based image editing.
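By contrast, editing in the space of generator parameters shifts the weights of the network itself while the latent code is left untouched. A hedged sketch of that idea, with illustrative names only (the authors' released code implements the actual procedure):

```python
import copy
import torch

# Hedged sketch of a weight-space edit: the direction lives in the parameter
# space of one generator layer rather than in the latent space. Layer and
# variable names are illustrative, not the authors' implementation.
def edit_generator_weights(G, layer_name, weight_direction, alpha):
    """Return a copy of G whose chosen layer is shifted along a direction."""
    G_edit = copy.deepcopy(G)                  # keep the original model untouched
    weight = dict(G_edit.named_parameters())[layer_name]
    with torch.no_grad():
        weight.add_(alpha * weight_direction.view_as(weight))
    return G_edit

# The same latent code rendered by G and by G_edit then differs only by the
# semantic manipulation encoded in the weight-space direction.
```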
Related papers
- In-Domain GAN Inversion for Faithful Reconstruction and Editability [132.68255553099834]
We propose in-domain GAN inversion, which consists of a domain-guided encoder and a domain-regularized optimizer to regularize the inverted code in the native latent space of the pre-trained GAN model.
We make comprehensive analyses on the effects of the encoder structure, the starting inversion point, as well as the inversion parameter space, and observe the trade-off between the reconstruction quality and the editing property.
arXiv Detail & Related papers (2023-09-25T08:42:06Z)
- Spatial Steerability of GANs via Self-Supervision from Discriminator [123.27117057804732]
We propose a self-supervised approach to improve the spatial steerability of GANs without searching for steerable directions in the latent space.
Specifically, we design randomly sampled Gaussian heatmaps to be encoded into the intermediate layers of generative models as spatial inductive bias.
During inference, users can interact with the spatial heatmaps in an intuitive manner, enabling them to edit the output image by adjusting the scene layout, moving, or removing objects.
arXiv Detail & Related papers (2023-01-20T07:36:29Z)
- Towards Counterfactual Image Manipulation via CLIP [106.94502632502194]
Existing methods can achieve realistic editing of different visual attributes such as age and gender of facial images.
We investigate this problem in a text-driven manner with Contrastive Language-Image Pre-training (CLIP).
We design a novel contrastive loss that exploits predefined CLIP-space directions to guide the editing toward desired directions from different perspectives.
arXiv Detail & Related papers (2022-07-06T17:02:25Z)
- Decorating Your Own Bedroom: Locally Controlling Image Generation with Generative Adversarial Networks [15.253043666814413]
We propose an effective approach, termed as LoGAN, to support local editing of the output image.
We are able to seamlessly remove, insert, shift, and rotate the individual objects inside a room.
Our method can completely clear out a room and then refurnish it with customized furniture and styles.
arXiv Detail & Related papers (2021-05-18T01:31:49Z)
- StyleMapGAN: Exploiting Spatial Dimensions of Latent in GAN for Real-time Image Editing [19.495153059077367]
Generative adversarial networks (GANs) synthesize realistic images from random latent vectors.
Editing real images with GANs suffers from either i) time-consuming optimization for projecting real images to the latent vectors, or ii) inaccurate embedding through an encoder.
We propose StyleMapGAN: the intermediate latent space has spatial dimensions, and a spatially variant modulation replaces AdaIN.
arXiv Detail & Related papers (2021-04-30T04:43:24Z)
- Linear Semantics in Generative Adversarial Networks [26.123252503846942]
We aim to better understand the semantic representation of GANs, and enable semantic control in GAN's generation process.
We find that a well-trained GAN encodes image semantics in its internal feature maps in a surprisingly simple way.
We propose two few-shot image editing approaches, namely Semantic-Conditional Sampling and Semantic Image Editing.
arXiv Detail & Related papers (2021-04-01T14:18:48Z)
- Controllable Image Synthesis via SegVAE [89.04391680233493]
A semantic map is a commonly used intermediate representation for conditional image generation.
In this work, we specifically target generating semantic maps given a label set of desired categories.
The proposed framework, SegVAE, synthesizes semantic maps in an iterative manner using a conditional variational autoencoder.
arXiv Detail & Related papers (2020-07-16T15:18:53Z)
- In-Domain GAN Inversion for Real Image Editing [56.924323432048304]
A common practice of feeding a real image to a trained GAN generator is to invert it back to a latent code.
Existing inversion methods typically focus on reconstructing the target image by pixel values yet fail to land the inverted code in the semantic domain of the original latent space.
We propose an in-domain GAN inversion approach, which faithfully reconstructs the input image and ensures the inverted code to be semantically meaningful for editing.
arXiv Detail & Related papers (2020-03-31T18:20:18Z)
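The inversion papers above, like the real-image editing mentioned in the main abstract, rely on first projecting a real image back to a latent code. A minimal optimization-based inversion sketch, assuming only a differentiable generator G and using placeholder names (not any specific paper's method):

```python
import torch
import torch.nn.functional as F

# Minimal optimization-based GAN inversion sketch (placeholder names, not any
# specific paper's method): fit a latent code so the generator reproduces a
# real image, after which the code can be edited.
def invert_image(G, target, w_init, steps=500, lr=0.01):
    w = w_init.clone().requires_grad_(True)
    optimizer = torch.optim.Adam([w], lr=lr)
    for _ in range(steps):
        optimizer.zero_grad()
        loss = F.mse_loss(G(w), target)   # pixel-space reconstruction term
        # In-domain methods add regularizers here so that w stays in the
        # semantically meaningful region of the latent space.
        loss.backward()
        optimizer.step()
    return w.detach()
```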
This list is automatically generated from the titles and abstracts of the papers on this site.