Linear Semantics in Generative Adversarial Networks
- URL: http://arxiv.org/abs/2104.00487v1
- Date: Thu, 1 Apr 2021 14:18:48 GMT
- Title: Linear Semantics in Generative Adversarial Networks
- Authors: Jianjin Xu, Changxi Zheng
- Abstract summary: We aim to better understand the semantic representation of GANs, and enable semantic control in GAN's generation process.
We find that a well-trained GAN encodes image semantics in its internal feature maps in a surprisingly simple way.
We propose two few-shot image editing approaches, namely Semantic-Conditional Sampling and Semantic Image Editing.
- Score: 26.123252503846942
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Generative Adversarial Networks (GANs) are able to generate high-quality
images, but it remains difficult to explicitly specify the semantics of
synthesized images. In this work, we aim to better understand the semantic
representation of GANs, and thereby enable semantic control in GAN's generation
process. Interestingly, we find that a well-trained GAN encodes image semantics
in its internal feature maps in a surprisingly simple way: a linear
transformation of feature maps suffices to extract the generated image
semantics. To verify this simplicity, we conduct extensive experiments on
various GANs and datasets; and thanks to this simplicity, we are able to learn
a semantic segmentation model for a trained GAN from a small number (e.g., 8)
of labeled images. Last but not least, leveraging our findings, we propose two
few-shot image editing approaches, namely Semantic-Conditional Sampling and
Semantic Image Editing. Given a trained GAN and as few as eight semantic
annotations, the user is able to generate diverse images subject to a
user-provided semantic layout, and control the synthesized image semantics. We
have made the code publicly available.
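The paper's central claim is concrete enough to sketch in code. The snippet below is a minimal PyTorch illustration of the idea, not the authors' released implementation: a 1x1 convolution (a per-pixel linear transformation) maps a GAN's internal feature maps to segmentation logits, and a few annotated samples suffice to fit it. The `generator`, its `return_features=True` keyword, its `z_dim` attribute, and the `latents`/`masks` tensors are all assumed placeholders.

```python
# Minimal sketch of a linear semantic extractor, assuming a pretrained GAN
# whose forward pass can also return intermediate feature maps (hypothetical
# API). An illustration of the paper's claim, not its released code.
import torch
import torch.nn as nn
import torch.nn.functional as F

class LinearSemanticExtractor(nn.Module):
    """Per-pixel linear classifier over upsampled GAN feature maps.

    A 1x1 convolution with no nonlinearity is exactly a linear transformation
    applied independently at every spatial location; summing per-layer logits
    keeps the whole map linear in the concatenated features.
    """
    def __init__(self, feature_dims, n_classes, out_size):
        super().__init__()
        self.out_size = out_size  # (H, W) of the generated image
        self.heads = nn.ModuleList(
            nn.Conv2d(c, n_classes, kernel_size=1) for c in feature_dims
        )

    def forward(self, feature_maps):
        logits = 0
        for head, feat in zip(self.heads, feature_maps):
            logits = logits + F.interpolate(
                head(feat), size=self.out_size,
                mode="bilinear", align_corners=False)
        return logits  # (N, n_classes, H, W)

def train_few_shot(generator, extractor, latents, masks, n_steps=200, lr=1e-3):
    """Fit the extractor on a handful (e.g., 8) of annotated generations.

    `masks` holds per-pixel integer class labels of shape (N, H, W).
    """
    opt = torch.optim.Adam(extractor.parameters(), lr=lr)
    for _ in range(n_steps):
        with torch.no_grad():
            _, feats = generator(latents, return_features=True)  # assumed API
        loss = F.cross_entropy(extractor(feats), masks)
        opt.zero_grad()
        loss.backward()
        opt.step()
    return extractor

def semantic_conditional_sample(generator, extractor, target_mask, n_trials=256):
    """Hedged sketch of Semantic-Conditional Sampling: draw latents and keep
    the sample whose extracted semantics best match the user layout. (The
    paper's sampler is more sophisticated; this only conveys the idea.)"""
    best_z, best_score = None, -1.0
    for _ in range(n_trials):
        z = torch.randn(1, generator.z_dim)  # z_dim is an assumed attribute
        with torch.no_grad():
            _, feats = generator(z, return_features=True)
            pred = extractor(feats).argmax(dim=1)  # (1, H, W) label map
        score = (pred == target_mask).float().mean().item()
        if score > best_score:
            best_z, best_score = z, score
    return best_z
```

Semantic Image Editing works similarly at a high level: with a linear extractor in hand, latents or feature maps can be adjusted until the extracted semantics agree with a user-edited mask.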
Related papers
- SemFlow: Binding Semantic Segmentation and Image Synthesis via Rectified Flow [94.90853153808987]
Semantic segmentation and semantic image synthesis are representative tasks in visual perception and generation.
We propose a unified framework (SemFlow) and model them as a pair of reverse problems.
Experiments show that our SemFlow achieves competitive results on semantic segmentation and semantic image synthesis tasks.
arXiv Detail & Related papers (2024-05-30T17:34:40Z)
- Unlocking Pre-trained Image Backbones for Semantic Image Synthesis [29.688029979801577]
We propose a new class of GAN discriminators for semantic image synthesis that generates highly realistic images.
Our model, which we dub DP-SIMS, achieves state-of-the-art results in terms of image quality and consistency with the input label maps on ADE-20K, COCO-Stuff, and Cityscapes.
arXiv Detail & Related papers (2023-12-20T09:39:19Z)
- Extracting Semantic Knowledge from GANs with Unsupervised Learning [65.32631025780631]
Generative Adversarial Networks (GANs) encode semantics in feature maps in a linearly separable form.
We propose a novel clustering algorithm, named KLiSH, which leverages this linear separability to cluster a GAN's features (see the clustering sketch after this list).
KLiSH succeeds in extracting fine-grained semantics of GANs trained on datasets of various objects.
arXiv Detail & Related papers (2022-11-30T03:18:16Z)
- RepMix: Representation Mixing for Robust Attribution of Synthesized Images [15.698564265127432]
We present a solution capable of matching images invariant to their semantic content.
We then propose RepMix, our GAN fingerprinting technique based on representation mixing and a novel loss.
We show that our approach improves significantly over existing GAN fingerprinting works in both semantic generalization and robustness.
arXiv Detail & Related papers (2022-07-05T14:14:06Z)
- Semantic Image Synthesis via Diffusion Models [159.4285444680301]
Denoising Diffusion Probabilistic Models (DDPMs) have achieved remarkable success in various image generation tasks.
Recent work on semantic image synthesis mainly follows the de facto standard of Generative Adversarial Nets (GANs).
arXiv Detail & Related papers (2022-06-30T18:31:51Z)
- FlexIT: Towards Flexible Semantic Image Translation [59.09398209706869]
We propose FlexIT, a novel method which can take any input image and a user-defined text instruction for editing.
First, FlexIT combines the input image and text into a single target point in the CLIP multimodal embedding space.
We iteratively transform the input image toward the target point, ensuring coherence and quality with a variety of novel regularization terms.
arXiv Detail & Related papers (2022-03-09T13:34:38Z)
- Navigating the GAN Parameter Space for Semantic Image Editing [35.622710993417456]
Generative Adversarial Networks (GANs) are an indispensable tool for visual editing.
In this paper, we significantly expand the range of visual effects achievable with state-of-the-art models such as StyleGAN2.
arXiv Detail & Related papers (2020-11-27T15:38:56Z)
- Controllable Image Synthesis via SegVAE [89.04391680233493]
A semantic map is a commonly used intermediate representation for conditional image generation.
In this work, we specifically target generating semantic maps given a label-set consisting of the desired categories.
The proposed framework, SegVAE, synthesizes semantic maps in an iterative manner using a conditional variational autoencoder.
arXiv Detail & Related papers (2020-07-16T15:18:53Z)
- Edge Guided GANs with Contrastive Learning for Semantic Image Synthesis [194.1452124186117]
We propose a novel ECGAN for the challenging semantic image synthesis task.
Our ECGAN achieves significantly better results than state-of-the-art methods.
arXiv Detail & Related papers (2020-03-31T01:23:21Z)
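As referenced in the KLiSH entry above, the following sketch illustrates, under loose assumptions and without reproducing KLiSH itself, why linearly separable features make unsupervised semantic discovery tractable: even vanilla k-means on per-pixel feature vectors tends to group pixels into coherent semantic regions. Here `features` is a hypothetical (N, C, H, W) tensor of upsampled GAN feature maps.

```python
# Hedged illustration (not KLiSH): plain k-means on per-pixel GAN features
# already yields semantically coherent clusters when features are linearly
# separable by semantic class.
import torch

def kmeans_pixels(features, k=8, n_iters=20):
    """Cluster every pixel's feature vector with vanilla k-means."""
    n, c, h, w = features.shape
    x = features.permute(0, 2, 3, 1).reshape(-1, c)      # (N*H*W, C)
    centers = x[torch.randperm(x.shape[0])[:k]].clone()  # random init
    for _ in range(n_iters):
        assign = torch.cdist(x, centers).argmin(dim=1)   # nearest center
        for j in range(k):
            members = x[assign == j]
            if members.numel() > 0:
                centers[j] = members.mean(dim=0)
    return assign.reshape(n, h, w)                       # pseudo semantic maps
```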
This list is automatically generated from the titles and abstracts of the papers on this site.