Cluster-guided Image Synthesis with Unconditional Models
- URL: http://arxiv.org/abs/2112.12911v1
- Date: Fri, 24 Dec 2021 02:18:34 GMT
- Title: Cluster-guided Image Synthesis with Unconditional Models
- Authors: Markos Georgopoulos, James Oldfield, Grigorios G Chrysos, Yannis
Panagakis
- Abstract summary: This work focuses on controllable image generation by leveraging GANs that are well-trained in an unsupervised fashion.
By conditioning on the cluster assignments, the proposed method is able to control the semantic class of the generated image.
We showcase the efficacy of our approach on faces (CelebA-HQ and FFHQ), animals (ImageNet) and objects (LSUN) using different pre-trained generative models.
- Score: 41.89334167530054
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Generative Adversarial Networks (GANs) are the driving force behind the
state-of-the-art in image generation. Despite their ability to synthesize
high-resolution photo-realistic images, generating content with on-demand
conditioning of different granularity remains a challenge. This challenge is
usually tackled by annotating massive datasets with the attributes of interest,
a laborious task that is not always a viable option. Therefore, it is vital to
introduce control into the generation process of unsupervised generative
models. In this work, we focus on controllable image generation by leveraging
GANs that are well-trained in an unsupervised fashion. To this end, we discover
that the representation space of intermediate layers of the generator forms a
number of clusters that separate the data according to semantically meaningful
attributes (e.g., hair color and pose). By conditioning on the cluster
assignments, the proposed method is able to control the semantic class of the
generated image. Our approach enables sampling from each cluster by Implicit
Maximum Likelihood Estimation (IMLE). We showcase the efficacy of our approach
on faces (CelebA-HQ and FFHQ), animals (ImageNet) and objects (LSUN) using
different pre-trained generative models. The results highlight the ability of
our approach to condition image generation on attributes like gender, pose and
hair style on faces, as well as a variety of features on different object
classes.
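
To make the two-step pipeline in the abstract concrete, the sketch below clusters the intermediate features of a pretrained, unconditional generator and then fits a per-cluster latent sampler with Implicit Maximum Likelihood Estimation (IMLE). This is a minimal illustration under stated assumptions, not the paper's implementation: the split of the generator into hypothetical `early`/`late` stages, the mapper architecture, and all hyperparameters are placeholders.

```python
# Minimal sketch of cluster-guided sampling from a pretrained, unconditional GAN.
# Assumes a hypothetical `generator` whose forward pass can be split into
# `generator.early(z)` (intermediate features) and `generator.late(h)` (image);
# these names are illustrative, not the paper's API.
import torch
import torch.nn as nn
from sklearn.cluster import KMeans

def collect_clusters(generator, n_samples=10000, n_clusters=10, z_dim=512):
    """Cluster intermediate-layer features of randomly sampled latents."""
    with torch.no_grad():
        z = torch.randn(n_samples, z_dim)
        feats = generator.early(z).flatten(1)       # (N, D) intermediate features
    km = KMeans(n_clusters=n_clusters, n_init=10).fit(feats.cpu().numpy())
    return z, torch.as_tensor(km.labels_)

def train_imle_sampler(z_cluster, z_dim=512, noise_dim=64, steps=2000, m=16):
    """Fit a noise -> latent mapper by IMLE: every latent assigned to the
    cluster must have a nearby candidate among m generated latents."""
    mapper = nn.Sequential(nn.Linear(noise_dim, 256), nn.ReLU(),
                           nn.Linear(256, z_dim))
    opt = torch.optim.Adam(mapper.parameters(), lr=1e-4)
    for _ in range(steps):
        candidates = mapper(torch.randn(m, noise_dim))   # m candidate latents
        dists = torch.cdist(z_cluster, candidates)       # (|cluster|, m)
        loss = dists.min(dim=1).values.pow(2).mean()     # nearest-candidate IMLE loss
        opt.zero_grad(); loss.backward(); opt.step()
    return mapper

# Usage: train a sampler for cluster 3, then synthesize new members of it.
# z_all, labels = collect_clusters(generator)
# sampler = train_imle_sampler(z_all[labels == 3])
# images = generator.late(generator.early(sampler(torch.randn(8, 64))))
```

The nearest-candidate loss is what makes this IMLE rather than adversarial training: every latent assigned to the cluster must be approximated by some candidate, so the sampler covers the whole cluster instead of collapsing to a single mode.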
Related papers
- Attack Deterministic Conditional Image Generative Models for Diverse and
Controllable Generation [17.035117118768945]
We propose a plug-in projected gradient descent (PGD)-like method for diverse and controllable image generation.
The key idea is to attack a pre-trained deterministic generative model by adding a small perturbation to its input condition (a minimal sketch of this idea appears after this list).
Our work opens the door to applying adversarial attacks to low-level vision tasks.
arXiv Detail & Related papers (2024-03-13T06:57:23Z)
- Active Generation for Image Classification [45.93535669217115]
We propose to address the efficiency of image generation by focusing on the specific needs and characteristics of the model.
Following a central tenet of active learning, our method, named ActGen, takes a training-aware approach to image generation.
arXiv Detail & Related papers (2024-03-11T08:45:31Z)
- Unlocking Pre-trained Image Backbones for Semantic Image Synthesis [29.688029979801577]
We propose a new class of GAN discriminators for semantic image synthesis that enables the generation of highly realistic images.
Our model, which we dub DP-SIMS, achieves state-of-the-art results in terms of image quality and consistency with the input label maps on ADE-20K, COCO-Stuff, and Cityscapes.
arXiv Detail & Related papers (2023-12-20T09:39:19Z)
- Conditioning Diffusion Models via Attributes and Semantic Masks for Face Generation [1.104121146441257]
Deep generative models have shown impressive results in generating realistic images of faces.
GANs can generate high-quality, high-fidelity images when conditioned on semantic masks, but they still lack the ability to diversify their outputs.
We propose a multi-conditioning approach for diffusion models via cross-attention exploiting both attributes and semantic masks to generate high-quality and controllable face images.
arXiv Detail & Related papers (2023-06-01T17:16:37Z)
- InvGAN: Invertible GANs [88.58338626299837]
InvGAN, short for Invertible GAN, successfully embeds real images into the latent space of a high-quality generative model.
This allows us to perform image inpainting, merging, and online data augmentation.
arXiv Detail & Related papers (2021-12-08T21:39:00Z)
- Controllable and Compositional Generation with Latent-Space Energy-Based Models [60.87740144816278]
Controllable generation is one of the key requirements for successful adoption of deep generative models in real-world applications.
In this work, we use energy-based models (EBMs) to handle compositional generation over a set of attributes.
By composing energy functions with logical operators, this work is the first to achieve such compositionality in generating photo-realistic images at 1024x1024 resolution (see the energy-composition sketch after this list).
arXiv Detail & Related papers (2021-10-21T03:31:45Z)
- Collaging Class-specific GANs for Semantic Image Synthesis [68.87294033259417]
We propose a new approach for high resolution semantic image synthesis.
It consists of one base image generator and multiple class-specific generators.
Experiments show that our approach can generate high-quality images at high resolution.
arXiv Detail & Related papers (2021-10-08T17:46:56Z)
- Unsupervised Discovery of Disentangled Manifolds in GANs [74.24771216154105]
An interpretable generation process is beneficial to various image editing applications.
We propose a framework to discover interpretable directions in the latent space given arbitrary pre-trained generative adversarial networks.
arXiv Detail & Related papers (2020-11-24T02:18:08Z)
- Generative Hierarchical Features from Synthesizing Images [65.66756821069124]
We show that learning to synthesize images can bring remarkable hierarchical visual features that are generalizable across a wide range of applications.
The visual feature produced by our encoder, termed Generative Hierarchical Feature (GH-Feat), has strong transferability to both generative and discriminative tasks.
arXiv Detail & Related papers (2020-07-20T18:04:14Z)
- Network Bending: Expressive Manipulation of Deep Generative Models [0.2062593640149624]
We introduce a new framework for manipulating and interacting with deep generative models that we call network bending.
We show how it enables direct manipulation of semantically meaningful aspects of the generative process, as well as a broad range of expressive outcomes.
arXiv Detail & Related papers (2020-05-25T21:48:45Z)
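
The "Attack Deterministic Conditional Image Generative Models" entry above casts diverse generation as a PGD-style attack on the input condition. Below is a minimal sketch of that idea; the `model` callable, the diversity objective (distance from the unperturbed output), and the step sizes are illustrative assumptions, not the paper's exact formulation.

```python
import torch

def diverse_sample(model, condition, steps=10, step_size=0.01, eps=0.03):
    """Perturb the input condition of a deterministic generative model with
    PGD-like steps that maximize distance from the unperturbed output, while
    projecting the perturbation into an eps-ball so the result still
    respects the original condition."""
    with torch.no_grad():
        reference = model(condition)                  # unperturbed output
    # Random start inside the eps-ball so repeated calls give distinct outputs.
    delta = ((torch.rand_like(condition) * 2 - 1) * eps).requires_grad_(True)
    for _ in range(steps):
        out = model(condition + delta)
        loss = (out - reference).pow(2).mean()        # diversity objective
        loss.backward()
        with torch.no_grad():
            delta += step_size * delta.grad.sign()    # signed gradient ascent
            delta.clamp_(-eps, eps)                   # project into the eps-ball
            delta.grad.zero_()
    return model(condition + delta).detach()
```

Because each call starts from a fresh random perturbation, the same deterministic model produces a different plausible output per call, which is where the diversity comes from.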
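The "Controllable and Compositional Generation with Latent-Space Energy-Based Models" entry composes attribute energy functions with logical operators. The sketch below shows common soft-logic choices for AND/OR/NOT over energies and an unadjusted Langevin sampler in latent space; the per-attribute energies (e.g., `e_blond`, `e_smiling`) are hypothetical stand-ins, and these operator forms are standard EBM conventions rather than necessarily the paper's.

```python
import torch

# Hypothetical per-attribute energies: low E(z) where the attribute holds.
def e_and(e1, e2):        # conjunction: both attributes must hold
    return lambda z: e1(z) + e2(z)

def e_or(e1, e2):         # disjunction: soft-min over the two energies
    return lambda z: -torch.logsumexp(torch.stack([-e1(z), -e2(z)]), dim=0)

def e_not(e, alpha=0.5):  # negation: reward high energy of e
    return lambda z: -alpha * e(z)

def langevin_sample(energy, z0, steps=100, step_size=0.01):
    """Draw a latent from the composed energy with unadjusted Langevin dynamics."""
    z = z0.clone().requires_grad_(True)
    for _ in range(steps):
        grad = torch.autograd.grad(energy(z).sum(), z)[0]
        with torch.no_grad():
            z += -0.5 * step_size * grad + (step_size ** 0.5) * torch.randn_like(z)
    return z.detach()

# Usage: z = langevin_sample(e_and(e_blond, e_smiling), torch.randn(8, 512))
# then decode z with the pretrained generator.
```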
This list is automatically generated from the titles and abstracts of the papers in this site.