Fantastic Style Channels and Where to Find Them: A Submodular Framework
for Discovering Diverse Directions in GANs
- URL: http://arxiv.org/abs/2203.08516v1
- Date: Wed, 16 Mar 2022 10:35:41 GMT
- Title: Fantastic Style Channels and Where to Find Them: A Submodular Framework
for Discovering Diverse Directions in GANs
- Authors: Enis Simsar and Umut Kocasari and Ezgi Gülperi Er and Pinar Yanardag
- Abstract summary: StyleGAN2 has enabled various image generation and manipulation tasks due to its rich and disentangled latent spaces.
We design a novel submodular framework that finds the most representative and diverse subset of directions in the latent space of StyleGAN2.
Our framework promotes diversity by using the notion of clusters and can be efficiently solved with a greedy optimization scheme.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The discovery of interpretable directions in the latent spaces of pre-trained
GAN models has recently become a popular topic. In particular, StyleGAN2 has
enabled various image generation and manipulation tasks due to its rich and
disentangled latent spaces. The discovery of such directions is typically done
either in a supervised manner, which requires annotated data for each desired
manipulation, or in an unsupervised manner, which requires manual effort to
identify the directions. As a result, existing work typically finds only a
handful of directions in which controllable edits can be made. In this study,
we design a novel submodular framework that finds the most representative and
diverse subset of directions in the latent space of StyleGAN2. Our approach
takes advantage of the latent space of channel-wise style parameters, so-called
stylespace, in which we cluster channels that perform similar manipulations
into groups. Our framework promotes diversity by using the notion of clusters
and can be efficiently solved with a greedy optimization scheme. We evaluate
our framework with qualitative and quantitative experiments and show that our
method finds more diverse and disentangled directions. Our project page can be
found at http://catlab-team.github.io/fantasticstyles.
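The core recipe described in the abstract, greedily maximizing a monotone submodular objective whose concavity saturates gains within a cluster, can be sketched as follows. This is a minimal illustration, not the authors' implementation: the per-direction quality score and the square-root saturation are hypothetical choices standing in for whatever objective the paper uses.

```python
import numpy as np

def diverse_subset(features, clusters, k):
    """Greedily pick k directions maximizing a monotone submodular
    objective f(S) = sum_c sqrt(sum_{i in S, cluster(i)=c} q_i).
    The square root saturates per-cluster gains, so successive greedy
    steps prefer spreading the selection across different clusters."""
    # Hypothetical quality score: feature-vector norm stands in for
    # whatever per-direction score a real pipeline would compute.
    q = np.linalg.norm(features, axis=1)
    selected = []
    cluster_mass = {}  # accumulated quality per cluster so far
    for _ in range(k):
        best_i, best_gain = None, -np.inf
        for i in range(len(features)):
            if i in selected:
                continue
            m = cluster_mass.get(clusters[i], 0.0)
            gain = np.sqrt(m + q[i]) - np.sqrt(m)  # marginal gain
            if gain > best_gain:
                best_i, best_gain = i, gain
        selected.append(best_i)
        c = clusters[best_i]
        cluster_mass[c] = cluster_mass.get(c, 0.0) + q[best_i]
    return selected
```

Because the objective is monotone submodular (a sum of concave functions of modular terms), this greedy loop enjoys the classic (1 - 1/e) approximation guarantee, which is presumably what makes the greedy scheme in the paper efficient and principled.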
Related papers
- Image Captioning via Dynamic Path Customization [100.15412641586525]
We propose a novel Dynamic Transformer Network (DTNet) for image captioning, which dynamically assigns customized paths to different samples, leading to discriminative yet accurate captions.
To validate the effectiveness of our proposed DTNet, we conduct extensive experiments on the MS-COCO dataset and achieve new state-of-the-art performance.
arXiv Detail & Related papers (2024-06-01T07:23:21Z)
- Discovering Interpretable Directions in the Semantic Latent Space of Diffusion Models [21.173910627285338]
Denoising Diffusion Models (DDMs) have emerged as a strong competitor to Generative Adversarial Networks (GANs).
In this paper, we explore the properties of h-space and propose several novel methods for finding meaningful semantic directions within it.
Our approaches are applicable without requiring architectural modifications, text-based guidance, CLIP-based optimization, or model fine-tuning.
arXiv Detail & Related papers (2023-03-20T12:59:32Z)
- Spatial Steerability of GANs via Self-Supervision from Discriminator [123.27117057804732]
We propose a self-supervised approach to improve the spatial steerability of GANs without searching for steerable directions in the latent space.
Specifically, we design randomly sampled Gaussian heatmaps to be encoded into the intermediate layers of generative models as spatial inductive bias.
During inference, users can interact with the spatial heatmaps in an intuitive manner, enabling them to edit the output image by adjusting the scene layout, moving, or removing objects.
arXiv Detail & Related papers (2023-01-20T07:36:29Z)
- ContraFeat: Contrasting Deep Features for Semantic Discovery [102.4163768995288]
StyleGAN has shown strong potential for disentangled semantic control.
Existing semantic discovery methods on StyleGAN rely on manual selection of modified latent layers to obtain satisfactory manipulation results.
We propose a model that automates this process and achieves state-of-the-art semantic discovery performance.
arXiv Detail & Related papers (2022-12-14T15:22:13Z)
- Discovering Class-Specific GAN Controls for Semantic Image Synthesis [73.91655061467988]
We propose a novel method for finding spatially disentangled class-specific directions in the latent space of pretrained SIS models.
We show that the latent directions found by our method can effectively control the local appearance of semantic classes.
arXiv Detail & Related papers (2022-12-02T21:39:26Z)
- Exploring Gradient-based Multi-directional Controls in GANs [19.950198707910587]
We propose a novel approach that discovers nonlinear controls, which enables multi-directional manipulation as well as effective disentanglement.
Our approach is able to gain fine-grained controls over a diverse set of bi-directional and multi-directional attributes, and we showcase its ability to achieve disentanglement significantly better than state-of-the-art methods.
arXiv Detail & Related papers (2022-09-01T19:10:26Z)
- Attribute-specific Control Units in StyleGAN for Fine-grained Image Manipulation [57.99007520795998]
We discover attribute-specific control units, which consist of multiple channels of feature maps and modulation styles.
Specifically, we collaboratively manipulate the modulation style channels and feature maps in control units to obtain the semantic and spatial disentangled controls.
We move the modulation style along a specific sparse direction vector and replace the filter-wise styles used to compute the feature maps to manipulate these control units.
arXiv Detail & Related papers (2021-11-25T10:42:10Z)
- LatentCLR: A Contrastive Learning Approach for Unsupervised Discovery of Interpretable Directions [0.02294014185517203]
We propose a contrastive-learning-based approach for discovering semantic directions in the latent space of pretrained GANs.
Our approach finds semantically meaningful dimensions compatible with state-of-the-art methods.
arXiv Detail & Related papers (2021-04-02T00:11:22Z)
- Unsupervised Discovery of Disentangled Manifolds in GANs [74.24771216154105]
An interpretable generation process is beneficial to various image editing applications.
We propose a framework to discover interpretable directions in the latent space given arbitrary pre-trained generative adversarial networks.
arXiv Detail & Related papers (2020-11-24T02:18:08Z)
- Unsupervised Discovery of Interpretable Directions in the GAN Latent Space [39.54530450932134]
Latent spaces of GAN models often have semantically meaningful directions.
We introduce an unsupervised method to identify interpretable directions in the latent space of a pretrained GAN model.
We show how to exploit this finding to achieve competitive performance for weakly-supervised saliency detection.
arXiv Detail & Related papers (2020-02-10T13:57:14Z)
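The papers above share one editing primitive: shift a latent code along a discovered direction and decode again. A minimal sketch of that operation, using a toy stand-in for a pretrained generator (the `toy_generator` function is purely hypothetical, not any real model's API):

```python
import numpy as np

def toy_generator(w):
    # Hypothetical stand-in for a pretrained generator: any function
    # mapping a latent code to an "image"; here a fixed linear map
    # followed by tanh, just to make the sketch runnable.
    return np.tanh(w @ np.ones((8, 4)))

def edit(w, direction, alpha):
    """Apply a latent-space edit: move w by alpha along the unit
    direction, then decode. alpha sets edit strength and sign."""
    d = direction / np.linalg.norm(direction)
    return toy_generator(w + alpha * d)

w = np.zeros(8)          # latent code
d = np.ones(8)           # a discovered direction (hypothetical)
img_plus = edit(w, d, +3.0)   # edit in the positive direction
img_minus = edit(w, d, -3.0)  # edit in the negative direction
```

What distinguishes the methods listed here is not this primitive but how the direction `d` is found: supervision, self-supervision, contrastive objectives, or, as in the paper on this page, submodular selection over stylespace channels.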
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.