Discovering Class-Specific GAN Controls for Semantic Image Synthesis
- URL: http://arxiv.org/abs/2212.01455v1
- Date: Fri, 2 Dec 2022 21:39:26 GMT
- Title: Discovering Class-Specific GAN Controls for Semantic Image Synthesis
- Authors: Edgar Schönfeld, Julio Borges, Vadim Sushko, Bernt Schiele, Anna Khoreva
- Abstract summary: We propose a novel method for finding spatially disentangled class-specific directions in the latent space of pretrained SIS models.
We show that the latent directions found by our method can effectively control the local appearance of semantic classes.
- Score: 73.91655061467988
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Prior work has extensively studied the latent space structure of GANs for
unconditional image synthesis, enabling global editing of generated images by
the unsupervised discovery of interpretable latent directions. However, the
discovery of latent directions for conditional GANs for semantic image
synthesis (SIS) has remained unexplored. In this work, we specifically focus on
addressing this gap. We propose a novel optimization method for finding
spatially disentangled class-specific directions in the latent space of
pretrained SIS models. We show that the latent directions found by our method
can effectively control the local appearance of semantic classes, e.g.,
changing their internal structure, texture or color independently from each
other. Visual inspection and quantitative evaluation of the discovered GAN
controls on various datasets demonstrate that our method discovers a diverse
set of unique and semantically meaningful latent directions for class-specific
edits.
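The abstract does not include code, but a minimal, hypothetical sketch can make the editing operation concrete. Assuming a pretrained SIS generator that takes a semantic label map together with one latent code per semantic class (a common SIS design), and a set of discovered unit-norm directions per class, a class-specific edit would shift only that class's latent code. The names `generator`, `directions`, and the per-class latent layout below are illustrative placeholders, not the paper's actual interface:

```python
import torch

def edit_class_appearance(generator, label_map, z, directions, cls, dir_idx, strength):
    """Shift only one class's latent code along a discovered direction.

    label_map:  (1, num_classes, H, W) one-hot semantic layout
    z:          (1, num_classes, latent_dim) one latent code per semantic class
    directions: dict mapping class index -> (num_dirs, latent_dim) unit vectors
    """
    z_edit = z.clone()
    z_edit[:, cls] = z[:, cls] + strength * directions[cls][dir_idx]
    with torch.no_grad():
        original = generator(label_map, z)      # unedited image
        edited = generator(label_map, z_edit)   # only class `cls` should change
    return original, edited
```

Varying `strength` traces out the edit (e.g., changing the texture of a single class such as "road") while all other classes stay fixed, which is the spatial disentanglement the abstract refers to.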
Related papers
- Decoding Diffusion: A Scalable Framework for Unsupervised Analysis of Latent Space Biases and Representations Using Natural Language Prompts [68.48103545146127]
This paper proposes a novel framework for unsupervised exploration of diffusion latent spaces.
We directly leverage natural language prompts and image captions to map latent directions.
Our method provides a more scalable and interpretable understanding of the semantic knowledge encoded within diffusion models.
arXiv Detail & Related papers (2024-10-25T21:44:51Z) - Enabling Local Editing in Diffusion Models by Joint and Individual Component Analysis [18.755311950243737]
The latent space of Diffusion Models (DMs) is not as well understood as that of Generative Adversarial Networks (GANs).
Recent research has focused on unsupervised semantic discovery in the latent space of DMs.
We introduce an unsupervised method to factorize the latent semantics learned by the denoising network of pre-trained DMs.
arXiv Detail & Related papers (2024-08-29T18:21:50Z) - Discovering Interpretable Directions in the Semantic Latent Space of Diffusion Models [21.173910627285338]
Denoising Diffusion Models (DDMs) have emerged as a strong competitor to Generative Adversarial Networks (GANs).
In this paper, we explore the properties of h-space and propose several novel methods for finding meaningful semantic directions within it.
Our approaches are applicable without requiring architectural modifications, text-based guidance, CLIP-based optimization, or model fine-tuning.
arXiv Detail & Related papers (2023-03-20T12:59:32Z) - GSMFlow: Generation Shifts Mitigating Flow for Generalized Zero-Shot
Learning [55.79997930181418]
Generalized Zero-Shot Learning aims to recognize images from both the seen and unseen classes by transferring semantic knowledge from seen to unseen classes.
A promising solution is to take advantage of generative models to hallucinate realistic unseen samples based on the knowledge learned from the seen classes.
We propose a novel flow-based generative framework that consists of multiple conditional affine coupling layers for learning unseen data generation.
arXiv Detail & Related papers (2022-07-05T04:04:37Z) - Unsupervised Discovery of Disentangled Manifolds in GANs [74.24771216154105]
- Unsupervised Discovery of Disentangled Manifolds in GANs [74.24771216154105]
An interpretable generation process is beneficial to various image editing applications.
We propose a framework to discover interpretable directions in the latent space given arbitrary pre-trained generative adversarial networks.
arXiv Detail & Related papers (2020-11-24T02:18:08Z) - Closed-Form Factorization of Latent Semantics in GANs [65.42778970898534]
A rich set of interpretable dimensions has been shown to emerge in the latent space of Generative Adversarial Networks (GANs) trained for synthesizing images.
In this work, we examine the internal representation learned by GANs to reveal the underlying variation factors in an unsupervised manner.
We propose a closed-form factorization algorithm for latent semantic discovery by directly decomposing the pre-trained weights.
arXiv Detail & Related papers (2020-07-13T18:05:36Z) - Interpreting the Latent Space of GANs via Correlation Analysis for
- Interpreting the Latent Space of GANs via Correlation Analysis for Controllable Concept Manipulation [9.207806788490057]
Generative adversarial nets (GANs) have been successfully applied in many fields like image generation, inpainting, super-resolution and drug discovery.
This paper proposes a method for interpreting the latent space of GANs by analyzing the correlation between latent variables and the corresponding semantic contents in generated images.
arXiv Detail & Related papers (2020-05-23T03:50:27Z) - InterFaceGAN: Interpreting the Disentangled Face Representation Learned
- InterFaceGAN: Interpreting the Disentangled Face Representation Learned by GANs [73.27299786083424]
We propose a framework called InterFaceGAN to interpret the disentangled face representation learned by state-of-the-art GAN models.
We first find that GANs learn various semantics in some linear subspaces of the latent space.
We then conduct a detailed study on the correlation between different semantics and manage to better disentangle them via subspace projection.
arXiv Detail & Related papers (2020-05-18T18:01:22Z) - GANSpace: Discovering Interpretable GAN Controls [24.428247009562895]
- GANSpace: Discovering Interpretable GAN Controls [24.428247009562895]
This paper describes a technique to analyze Generative Adversarial Networks (GANs) and create interpretable controls for image synthesis.
We identify important latent directions based on Principal Components Analysis (PCA) applied either in latent space or feature space.
We show that a large number of interpretable controls can be defined by layer-wise perturbation along the principal directions.
arXiv Detail & Related papers (2020-04-06T10:41:44Z)