Unsupervised Discovery of Interpretable Directions in the GAN Latent Space
- URL: http://arxiv.org/abs/2002.03754v3
- Date: Wed, 24 Jun 2020 12:12:14 GMT
- Title: Unsupervised Discovery of Interpretable Directions in the GAN Latent Space
- Authors: Andrey Voynov, Artem Babenko
- Abstract summary: Latent spaces of GAN models often have semantically meaningful directions.
We introduce an unsupervised method to identify interpretable directions in the latent space of a pretrained GAN model.
We show how to exploit this finding to achieve competitive performance for weakly-supervised saliency detection.
- Score: 39.54530450932134
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The latent spaces of GAN models often have semantically meaningful
directions. Moving in these directions corresponds to human-interpretable image
transformations, such as zooming or recoloring, enabling a more controllable
generation process. However, the discovery of such directions is currently
performed in a supervised manner, requiring human labels, pretrained models, or
some form of self-supervision. These requirements severely restrict the range of
directions existing approaches can discover. In this paper, we introduce an
unsupervised method to identify interpretable directions in the latent space of
a pretrained GAN model. By a simple model-agnostic procedure, we find
directions corresponding to sensible semantic manipulations without any form of
(self-)supervision. Furthermore, we reveal several non-trivial findings, which
would be difficult to obtain by existing methods, e.g., a direction
corresponding to background removal. As an immediate practical benefit of our
work, we show how to exploit this finding to achieve competitive performance
for weakly-supervised saliency detection.
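The "simple model-agnostic procedure" works by jointly training a matrix of candidate directions and a "reconstructor" network that, given a generated image and its latent-shifted version, must predict which direction was applied and by how much; only directions that induce distinct, consistent image transformations are easy to recover, so the matrix converges toward interpretable ones. Below is a minimal PyTorch sketch of that idea; the generator `G`, the reconstructor architecture, the shift range, and the loss weight are simplified placeholders, not the authors' reference implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

latent_dim, num_directions = 128, 64

# Columns of this linear map are the candidate directions A (learned).
directions = nn.Linear(num_directions, latent_dim, bias=False)

class Reconstructor(nn.Module):
    """Predicts which direction index k and shift magnitude eps produced
    the pair (G(z), G(z + A(eps * e_k))). Architecture here is illustrative."""
    def __init__(self, num_directions):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(6, 32, 4, stride=2, padding=1), nn.ReLU(),  # 6 = two RGB images stacked
            nn.Conv2d(32, 64, 4, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
        )
        self.logits = nn.Linear(64, num_directions)  # direction classifier
        self.shift = nn.Linear(64, 1)                # magnitude regressor

    def forward(self, img, shifted_img):
        h = self.features(torch.cat([img, shifted_img], dim=1))
        return self.logits(h), self.shift(h).squeeze(1)

R = Reconstructor(num_directions)
# G is assumed to be a pretrained, frozen generator mapping (B, latent_dim) -> RGB images.
# Its parameters are NOT optimized, but gradients flow through it into `directions`.
opt = torch.optim.Adam(list(directions.parameters()) + list(R.parameters()), lr=1e-4)

def training_step(G, batch_size=32):
    z = torch.randn(batch_size, latent_dim)
    k = torch.randint(num_directions, (batch_size,))      # random direction index
    eps = torch.empty(batch_size).uniform_(-6.0, 6.0)     # random shift magnitude (range is illustrative)
    one_hot = F.one_hot(k, num_directions).float() * eps.unsqueeze(1)
    shift = directions(one_hot)                           # A(eps * e_k)
    logits, eps_hat = R(G(z), G(z + shift))
    # Classification term makes directions distinguishable; regression term
    # makes their effect consistent across magnitudes. Weight 0.25 is illustrative.
    loss = F.cross_entropy(logits, k) + 0.25 * F.l1_loss(eps_hat, eps)
    opt.zero_grad(); loss.backward(); opt.step()
    return loss.item()
```

Once trained, traversing a learned column, i.e., decoding z + eps * A[:, k] for varying eps, reproduces the human-interpretable transformations the abstract describes, including the background-removal direction it exploits for weakly-supervised saliency detection.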
Related papers
- Decoding Diffusion: A Scalable Framework for Unsupervised Analysis of Latent Space Biases and Representations Using Natural Language Prompts [68.48103545146127]
This paper proposes a novel framework for unsupervised exploration of diffusion latent spaces.
We directly leverage natural language prompts and image captions to map latent directions.
Our method provides a more scalable and interpretable understanding of the semantic knowledge encoded within diffusion models.
arXiv Detail & Related papers (2024-10-25T21:44:51Z)
- Unsupervised Discovery of Interpretable Directions in h-space of Pre-trained Diffusion Models [63.1637853118899]
We propose the first unsupervised and learning-based method to identify interpretable directions in h-space of pre-trained diffusion models.
We employ a shift control module that works on h-space of pre-trained diffusion models to manipulate a sample into a shifted version of itself.
By jointly optimizing them, the model will spontaneously discover disentangled and interpretable directions.
arXiv Detail & Related papers (2023-10-15T18:44:30Z)
- Discovering Interpretable Directions in the Semantic Latent Space of Diffusion Models [21.173910627285338]
Denoising Diffusion Models (DDMs) have emerged as a strong competitor to Generative Adversarial Networks (GANs).
In this paper, we explore the properties of h-space and propose several novel methods for finding meaningful semantic directions within it.
Our approaches are applicable without requiring architectural modifications, text-based guidance, CLIP-based optimization, or model fine-tuning.
arXiv Detail & Related papers (2023-03-20T12:59:32Z)
- Discovering Class-Specific GAN Controls for Semantic Image Synthesis [73.91655061467988]
We propose a novel method for finding spatially disentangled class-specific directions in the latent space of pretrained SIS models.
We show that the latent directions found by our method can effectively control the local appearance of semantic classes.
arXiv Detail & Related papers (2022-12-02T21:39:26Z)
- Fantastic Style Channels and Where to Find Them: A Submodular Framework for Discovering Diverse Directions in GANs [0.0]
StyleGAN2 has enabled various image generation and manipulation tasks due to its rich and disentangled latent spaces.
We design a novel submodular framework that finds the most representative and diverse subset of directions in the latent space of StyleGAN2.
Our framework promotes diversity by using the notion of clusters and can be efficiently solved with a greedy optimization scheme.
arXiv Detail & Related papers (2022-03-16T10:35:41Z)
- LARGE: Latent-Based Regression through GAN Semantics [42.50535188836529]
We propose a novel method for solving regression tasks using few-shot or weak supervision.
We show that our method can be applied across a wide range of domains, leverage multiple latent direction discovery frameworks, and achieve state-of-the-art results.
arXiv Detail & Related papers (2021-07-22T17:55:35Z)
- LatentCLR: A Contrastive Learning Approach for Unsupervised Discovery of Interpretable Directions [0.02294014185517203]
We propose a contrastive-learning-based approach for discovering semantic directions in the latent space of pretrained GANs.
Our approach finds semantically meaningful dimensions compatible with state-of-the-art methods.
arXiv Detail & Related papers (2021-04-02T00:11:22Z)
- Unsupervised Discovery of Disentangled Manifolds in GANs [74.24771216154105]
An interpretable generation process benefits various image editing applications.
We propose a framework to discover interpretable directions in the latent space given arbitrary pre-trained generative adversarial networks.
arXiv Detail & Related papers (2020-11-24T02:18:08Z)
- Manifolds for Unsupervised Visual Anomaly Detection [79.22051549519989]
Unsupervised learning methods that need not encounter anomalies during training would be immensely useful.
We develop a novel hyperspherical Variational Auto-Encoder (VAE) via stereographic projections with a gyroplane layer.
We present state-of-the-art results on visual anomaly benchmarks in precision manufacturing and inspection, demonstrating real-world utility in industrial AI scenarios.
arXiv Detail & Related papers (2020-06-19T20:41:58Z)