PandA: Unsupervised Learning of Parts and Appearances in the Feature
Maps of GANs
- URL: http://arxiv.org/abs/2206.00048v1
- Date: Tue, 31 May 2022 18:28:39 GMT
- Title: PandA: Unsupervised Learning of Parts and Appearances in the Feature
Maps of GANs
- Authors: James Oldfield, Christos Tzelepis, Yannis Panagakis, Mihalis A.
Nicolaou, Ioannis Patras
- Abstract summary: We present an architecture-agnostic approach that jointly discovers factors representing spatial parts and their appearances in an entirely unsupervised fashion.
Our method is far more efficient in terms of training time and, most importantly, provides much more accurate localized control.
- Score: 34.145110544546114
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Recent advances in the understanding of Generative Adversarial Networks
(GANs) have led to remarkable progress in visual editing and synthesis tasks,
capitalizing on the rich semantics that are embedded in the latent spaces of
pre-trained GANs. However, existing methods are often tailored to specific GAN
architectures and either are limited to discovering global semantic directions
that do not facilitate localized control or require some form of supervision
through manually provided regions or segmentation masks. In this light, we
present an architecture-agnostic approach that jointly discovers factors
representing spatial parts and their appearances in an entirely unsupervised
fashion. These factors are obtained by applying a semi-nonnegative tensor
factorization on the feature maps, which in turn enables context-aware local
image editing with pixel-level control. In addition, we show that the
discovered appearance factors correspond to saliency maps that localize
concepts of interest, without using any labels. Experiments on a wide range of
GAN architectures and datasets show that, in comparison to the state of the
art, our method is far more efficient in terms of training time and, most
importantly, provides much more accurate localized control. Our code is
available at: https://github.com/james-oldfield/PandA.
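The factorization idea in the abstract can be illustrated with a minimal sketch. The code below is not the authors' algorithm: it assumes a single feature map flattened to a channels-by-pixels matrix and applies ordinary semi-nonnegative matrix factorization with the multiplicative updates of Ding et al., whereas the paper describes a tensor factorization. The function name `semi_nmf`, the random stand-in feature map `Z`, and all sizes are hypothetical choices made for illustration only.

```python
import numpy as np


def semi_nmf(X, k, n_iter=200, eps=1e-9, seed=0):
    """Approximate X (C x S) as F @ G.T with G >= 0 and F unconstrained.

    Simplified matrix semi-NMF (Ding et al. style updates), used here only
    as a stand-in for the tensor factorization mentioned in the abstract.
    """
    rng = np.random.default_rng(seed)
    G = rng.random((X.shape[1], k)) + 0.1  # strictly positive start

    def pos(A):
        return (np.abs(A) + A) / 2  # elementwise positive part

    def neg(A):
        return (np.abs(A) - A) / 2  # elementwise negative part (as >= 0)

    for _ in range(n_iter):
        # Least-squares update of the unconstrained factor.
        F = X @ G @ np.linalg.pinv(G.T @ G)
        XtF, FtF = X.T @ F, F.T @ F
        # Multiplicative update keeps G nonnegative.
        num = pos(XtF) + G @ neg(FtF)
        den = neg(XtF) + G @ pos(FtF)
        G *= np.sqrt((num + eps) / (den + eps))

    # Final least-squares refit of F for the last G.
    F = X @ G @ np.linalg.pinv(G.T @ G)
    return F, G


# Hypothetical stand-in for one GAN feature map: C channels on an H x W grid.
C, H, W, k = 16, 8, 8, 4
Z = np.random.default_rng(1).standard_normal((C, H, W))
X = Z.reshape(C, H * W)

F, G = semi_nmf(X, k)          # F: channel ("appearance") factors, G: spatial factors
parts = G.T.reshape(k, H, W)   # each nonnegative map can be read as a soft spatial part
```

In this simplified reading, the nonnegative columns of `G` play the role of spatial parts and the unconstrained columns of `F` play the role of appearance directions in channel space. Real feature maps from a pre-trained generator would replace the random `Z`.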
Related papers
- DIAL: Dense Image-text ALignment for Weakly Supervised Semantic Segmentation [8.422110274212503]
Weakly supervised semantic segmentation approaches typically rely on class activation maps (CAMs) for initial seed generation.
We introduce DALNet, which leverages text embeddings to enhance the comprehensive understanding and precise localization of objects across different levels of granularity.
In particular, our approach allows for a more efficient end-to-end process as a single-stage method.
arXiv Detail & Related papers (2024-09-24T06:51:49Z)
- Enabling Local Editing in Diffusion Models by Joint and Individual Component Analysis [18.755311950243737]
The latent space of Diffusion Models (DMs) is not as well understood as that of Generative Adversarial Networks (GANs).
Recent research has focused on unsupervised semantic discovery in the latent space of DMs.
We introduce an unsupervised method to factorize the latent semantics learned by the denoising network of pre-trained DMs.
arXiv Detail & Related papers (2024-08-29T18:21:50Z)
- Weakly-supervised deepfake localization in diffusion-generated images [4.548755617115687]
We propose a weakly-supervised localization problem based on the Xception network as the backbone architecture.
We show that the best performing detection method (based on local scores) is less sensitive to the looser supervision than to the mismatch in terms of dataset or generator.
arXiv Detail & Related papers (2023-11-08T10:27:36Z)
- LAW-Diffusion: Complex Scene Generation by Diffusion with Layouts [107.11267074981905]
We propose a semantically controllable layout-AWare diffusion model, termed LAW-Diffusion.
We show that LAW-Diffusion yields state-of-the-art generative performance, especially with coherent object relations.
arXiv Detail & Related papers (2023-08-13T08:06:18Z)
- Weakly-supervised Contrastive Learning for Unsupervised Object Discovery [52.696041556640516]
Unsupervised object discovery is promising due to its ability to discover objects in a generic manner.
We design a semantic-guided self-supervised learning model to extract high-level semantic features from images.
We introduce Principal Component Analysis (PCA) to localize object regions.
arXiv Detail & Related papers (2023-07-07T04:03:48Z)
- Discovering Class-Specific GAN Controls for Semantic Image Synthesis [73.91655061467988]
We propose a novel method for finding spatially disentangled class-specific directions in the latent space of pretrained SIS models.
We show that the latent directions found by our method can effectively control the local appearance of semantic classes.
arXiv Detail & Related papers (2022-12-02T21:39:26Z)
- Fine-Grained Object Classification via Self-Supervised Pose Alignment [42.55938966190932]
We learn a novel graph based object representation to reveal a global configuration of local parts for self-supervised pose alignment across classes.
We evaluate our method on three popular fine-grained object classification benchmarks, consistently achieving state-of-the-art performance.
arXiv Detail & Related papers (2022-03-30T01:46:19Z)
- Region-Based Semantic Factorization in GANs [67.90498535507106]
We present a highly efficient algorithm to factorize the latent semantics learned by Generative Adversarial Networks (GANs) concerning an arbitrary image region.
Through an appropriately defined generalized Rayleigh quotient, we solve such a problem without any annotations or training.
Experimental results on various state-of-the-art GAN models demonstrate the effectiveness of our approach.
arXiv Detail & Related papers (2022-02-19T17:46:02Z)
- A Unified Architecture of Semantic Segmentation and Hierarchical
Generative Adversarial Networks for Expression Manipulation [52.911307452212256]
We develop a unified architecture of semantic segmentation and hierarchical GANs.
A unique advantage of our framework is that, on the forward pass, the semantic segmentation network conditions the generative model.
We evaluate our method on two challenging facial expression translation benchmarks, AffectNet and RaFD, and a semantic segmentation benchmark, CelebAMask-HQ.
arXiv Detail & Related papers (2021-12-08T22:06:31Z)
- Closed-Form Factorization of Latent Semantics in GANs [65.42778970898534]
A rich set of interpretable dimensions has been shown to emerge in the latent space of Generative Adversarial Networks (GANs) trained for synthesizing images.
In this work, we examine the internal representation learned by GANs to reveal the underlying variation factors in an unsupervised manner.
We propose a closed-form factorization algorithm for latent semantic discovery by directly decomposing the pre-trained weights.
arXiv Detail & Related papers (2020-07-13T18:05:36Z)