Closed-Form Factorization of Latent Semantics in GANs
- URL: http://arxiv.org/abs/2007.06600v4
- Date: Sat, 3 Apr 2021 13:30:22 GMT
- Title: Closed-Form Factorization of Latent Semantics in GANs
- Authors: Yujun Shen, Bolei Zhou
- Abstract summary: A rich set of interpretable dimensions has been shown to emerge in the latent space of the Generative Adversarial Networks (GANs) trained for synthesizing images.
In this work, we examine the internal representation learned by GANs to reveal the underlying variation factors in an unsupervised manner.
We propose a closed-form factorization algorithm for latent semantic discovery by directly decomposing the pre-trained weights.
- Score: 65.42778970898534
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: A rich set of interpretable dimensions has been shown to emerge in the latent
space of the Generative Adversarial Networks (GANs) trained for synthesizing
images. In order to identify such latent dimensions for image editing, previous
methods typically annotate a collection of synthesized samples and train linear
classifiers in the latent space. However, they require a clear definition of
the target attribute as well as the corresponding manual annotations, limiting
their applications in practice. In this work, we examine the internal
representation learned by GANs to reveal the underlying variation factors in an
unsupervised manner. In particular, we take a closer look into the generation
mechanism of GANs and further propose a closed-form factorization algorithm for
latent semantic discovery by directly decomposing the pre-trained weights. With
a lightning-fast implementation, our approach is capable of not only finding
semantically meaningful dimensions comparably to the state-of-the-art
supervised methods, but also resulting in far more versatile concepts across
multiple GAN models trained on a wide range of datasets.
Related papers
- Decoding Diffusion: A Scalable Framework for Unsupervised Analysis of Latent Space Biases and Representations Using Natural Language Prompts [68.48103545146127]
This paper proposes a novel framework for unsupervised exploration of diffusion latent spaces.
We directly leverage natural language prompts and image captions to map latent directions.
Our method provides a more scalable and interpretable understanding of the semantic knowledge encoded within diffusion models.
arXiv Detail & Related papers (2024-10-25T21:44:51Z) - Diffusion Features to Bridge Domain Gap for Semantic Segmentation [2.8616666231199424]
This paper investigates the approach that leverages the sampling and fusion techniques to harness the features of diffusion models efficiently.
By leveraging the strength of text-to-image generation capability, we introduce a new training framework designed to implicitly learn posterior knowledge from it.
arXiv Detail & Related papers (2024-06-02T15:33:46Z) - Exploring Compositional Visual Generation with Latent Classifier
Guidance [19.48538300223431]
We train latent diffusion models and auxiliary latent classifiers to facilitate non-linear navigation of latent representation generation.
We show that such conditional generation achieved by latent classifier guidance provably maximizes a lower bound of the conditional log probability during training.
We show that this paradigm based on latent classifier guidance is agnostic to pre-trained generative models, and present competitive results for both image generation and sequential manipulation of real and synthetic images.
arXiv Detail & Related papers (2023-04-25T03:02:58Z) - Discovering Interpretable Directions in the Semantic Latent Space of Diffusion Models [21.173910627285338]
Denoising Diffusion Models (DDMs) have emerged as a strong competitor to Generative Adversarial Networks (GANs)
In this paper, we explore the properties of h-space and propose several novel methods for finding meaningful semantic directions within it.
Our approaches are applicable without requiring architectural modifications, text-based guidance, CLIP-based optimization, or model fine-tuning.
arXiv Detail & Related papers (2023-03-20T12:59:32Z) - Discovering Class-Specific GAN Controls for Semantic Image Synthesis [73.91655061467988]
We propose a novel method for finding spatially disentangled class-specific directions in the latent space of pretrained SIS models.
We show that the latent directions found by our method can effectively control the local appearance of semantic classes.
arXiv Detail & Related papers (2022-12-02T21:39:26Z) - GSMFlow: Generation Shifts Mitigating Flow for Generalized Zero-Shot
Learning [55.79997930181418]
Generalized Zero-Shot Learning aims to recognize images from both the seen and unseen classes by transferring semantic knowledge from seen to unseen classes.
It is a promising solution to take the advantage of generative models to hallucinate realistic unseen samples based on the knowledge learned from the seen classes.
We propose a novel flow-based generative framework that consists of multiple conditional affine coupling layers for learning unseen data generation.
arXiv Detail & Related papers (2022-07-05T04:04:37Z) - Region-Based Semantic Factorization in GANs [67.90498535507106]
We present a highly efficient algorithm to factorize the latent semantics learned by Generative Adversarial Networks (GANs) concerning an arbitrary image region.
Through an appropriately defined generalized Rayleigh quotient, we solve such a problem without any annotations or training.
Experimental results on various state-of-the-art GAN models demonstrate the effectiveness of our approach.
arXiv Detail & Related papers (2022-02-19T17:46:02Z) - Unsupervised Discovery of Disentangled Manifolds in GANs [74.24771216154105]
Interpretable generation process is beneficial to various image editing applications.
We propose a framework to discover interpretable directions in the latent space given arbitrary pre-trained generative adversarial networks.
arXiv Detail & Related papers (2020-11-24T02:18:08Z) - Network Bending: Expressive Manipulation of Deep Generative Models [0.2062593640149624]
We introduce a new framework for manipulating and interacting with deep generative models that we call network bending.
We show how it allows for the direct manipulation of semantically meaningful aspects of the generative process as well as allowing for a broad range of expressive outcomes.
arXiv Detail & Related papers (2020-05-25T21:48:45Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.