Disentangled Image Generation Through Structured Noise Injection
- URL: http://arxiv.org/abs/2004.12411v2
- Date: Tue, 5 May 2020 15:41:16 GMT
- Title: Disentangled Image Generation Through Structured Noise Injection
- Authors: Yazeed Alharbi, Peter Wonka
- Abstract summary: We show that disentanglement in the first layer of the generator network leads to disentanglement in the generated image.
We achieve spatial disentanglement, scale-space disentanglement, and disentanglement of the foreground object from the background style.
This empirically leads to better disentanglement scores than state-of-the-art methods on the FFHQ dataset.
- Score: 48.956122902434444
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We explore different design choices for injecting noise into generative
adversarial networks (GANs) with the goal of disentangling the latent space.
Instead of traditional approaches, we propose feeding multiple noise codes
through separate fully-connected layers. The aim is to restrict
the influence of each noise code to specific parts of the generated image. We
show that disentanglement in the first layer of the generator network leads to
disentanglement in the generated image. Through a grid-based structure, we
achieve several aspects of disentanglement without complicating the network
architecture and without requiring labels. We achieve spatial disentanglement,
scale-space disentanglement, and disentanglement of the foreground object from
the background style, allowing fine-grained control over the generated images.
Examples include changing facial expressions in face images, changing beak
length in bird images, and changing car dimensions in car images. This
empirically leads to better disentanglement scores than state-of-the-art
methods on the FFHQ dataset.
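The mechanism from the abstract can be sketched in code. The following is a minimal, hedged illustration of the core idea only — one independent noise code per grid cell, each mapped through its own fully-connected layer into the generator's first feature map — where the class name, layer sizes, and grid resolution are illustrative assumptions rather than the paper's configuration:

```python
import torch
import torch.nn as nn

class StructuredNoiseInput(nn.Module):
    """One independent noise code per grid cell, each mapped through its own
    fully-connected layer so it only influences one spatial region of the
    generator's first layer (illustrative sizes, not the paper's setup)."""

    def __init__(self, grid_size=4, code_dim=32, cell_channels=64):
        super().__init__()
        self.grid_size = grid_size
        self.cell_channels = cell_channels
        # A separate FC mapping for every cell of the grid.
        self.cell_fcs = nn.ModuleList(
            nn.Linear(code_dim, cell_channels)
            for _ in range(grid_size * grid_size))

    def forward(self, codes):
        # codes: (batch, grid_size * grid_size, code_dim), one code per cell.
        cells = [fc(codes[:, i]) for i, fc in enumerate(self.cell_fcs)]
        grid = torch.stack(cells, dim=1)               # (B, H*W, C)
        b = grid.shape[0]
        # Rearrange into the spatially structured first feature map.
        return grid.view(b, self.grid_size, self.grid_size,
                         self.cell_channels).permute(0, 3, 1, 2)

# Example: 16 independent codes produce a 4x4 feature grid; changing one
# code should mainly affect the corresponding region of the output image.
codes = torch.randn(2, 16, 32)
first_layer = StructuredNoiseInput()(codes)        # shape (2, 64, 4, 4)
```

Editing a single code then only perturbs the corresponding cell of the first layer, which, per the abstract, is what propagates to spatial disentanglement in the generated image.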
Related papers
- Compressive Sensing with Tensorized Autoencoder [22.89029876274012]
In many cases, different images in a collection are articulated versions of one another.
In this paper, our goal is to recover images without access to the ground-truth (clean) images, using the articulations as a structural prior on the data.
We propose to learn an autoencoder with tensor ring factorization on the embedding space to impose structural constraints on the data.
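As a rough, hedged illustration of what a tensor-ring constraint on an embedding space can look like (the tensor layout, ranks, and dimensions below are assumptions for exposition, not the paper's setup):

```python
import torch

# Tensor-ring (TR) factorization of a 3-way embedding tensor, e.g.
# (images x embedding dim x articulations):
#   T[i, j, k] = trace(G1[:, i, :] @ G2[:, j, :] @ G3[:, k, :]).
# Ranks and dimensions are illustrative assumptions.
n_images, embed_dim, n_artics, r = 8, 16, 4, 3
G1 = torch.randn(r, n_images, r, requires_grad=True)
G2 = torch.randn(r, embed_dim, r, requires_grad=True)
G3 = torch.randn(r, n_artics, r, requires_grad=True)

def tensor_ring(g1, g2, g3):
    # Contracting the ring of ranks (a, b, c) rebuilds the full tensor.
    return torch.einsum('aib,bjc,cka->ijk', g1, g2, g3)

embeddings = tensor_ring(G1, G2, G3)   # (8, 16, 4), constrained to low TR rank
# During training, the encoder's outputs would be tied to this structured
# tensor, which is how the articulation prior constrains the embedding space.
```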
arXiv Detail & Related papers (2023-03-10T22:59:09Z)
- Dual Pyramid Generative Adversarial Networks for Semantic Image Synthesis [94.76988562653845]
The goal of semantic image synthesis is to generate photo-realistic images from semantic label maps.
Current state-of-the-art approaches, however, still struggle to generate realistic objects in images at various scales.
We propose a Dual Pyramid Generative Adversarial Network (DP-GAN) that learns the conditioning of spatially-adaptive normalization blocks at all scales jointly.
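The snippet does not spell out the block design; a minimal sketch of a spatially-adaptive normalization block conditioned on a semantic label map — the general mechanism DP-GAN learns jointly at every scale — might look as follows, with all channel counts chosen for illustration:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SpatiallyAdaptiveNorm(nn.Module):
    """Normalize features, then re-modulate them with per-pixel scale and bias
    predicted from the semantic label map (illustrative channel sizes)."""

    def __init__(self, feat_channels, label_channels, hidden=64):
        super().__init__()
        self.norm = nn.InstanceNorm2d(feat_channels, affine=False)
        self.shared = nn.Sequential(
            nn.Conv2d(label_channels, hidden, 3, padding=1), nn.ReLU())
        self.to_gamma = nn.Conv2d(hidden, feat_channels, 3, padding=1)
        self.to_beta = nn.Conv2d(hidden, feat_channels, 3, padding=1)

    def forward(self, feat, label_map):
        # Resize the label map to the current resolution ("all scales").
        label_map = F.interpolate(label_map, size=feat.shape[2:], mode='nearest')
        h = self.shared(label_map)
        return self.norm(feat) * (1 + self.to_gamma(h)) + self.to_beta(h)

# Example: condition 128-channel features on a 35-class label map.
feat = torch.randn(1, 128, 32, 32)
labels = torch.randn(1, 35, 256, 256)
out = SpatiallyAdaptiveNorm(128, 35)(feat, labels)   # (1, 128, 32, 32)
```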
arXiv Detail & Related papers (2022-10-08T18:45:44Z)
- BlobGAN: Spatially Disentangled Scene Representations [67.60387150586375]
We propose an unsupervised, mid-level representation for a generative model of scenes.
The representation is mid-level in that it is neither per-pixel nor per-image; rather, scenes are modeled as a collection of spatial, depth-ordered "blobs" of features.
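A hedged sketch of how such depth-ordered blobs could be splatted onto a feature grid — the blob shape, the opacity model, and the compositing convention are all simplifying assumptions, not BlobGAN's actual parameterization:

```python
import torch

def splat_blobs(centers, scales, features, grid_size=16):
    """Splat depth-ordered blobs onto a feature grid.
    centers: (N, 2) in [0, 1]; scales: (N,); features: (N, C).
    Blob 0 is treated as frontmost (an illustrative convention)."""
    ys, xs = torch.meshgrid(torch.linspace(0, 1, grid_size),
                            torch.linspace(0, 1, grid_size), indexing='ij')
    coords = torch.stack([xs, ys], dim=-1)              # (H, W, 2)
    grid = torch.zeros(features.shape[1], grid_size, grid_size)
    transmittance = torch.ones(grid_size, grid_size)    # light not yet blocked
    for n in range(centers.shape[0]):                   # front to back
        dist2 = ((coords - centers[n]) ** 2).sum(-1)
        opacity = torch.exp(-dist2 / (2 * scales[n] ** 2))   # soft round blob
        grid += features[n][:, None, None] * (transmittance * opacity)
        transmittance = transmittance * (1 - opacity)
    return grid

# Example: three blobs with random positions and 8-dim features.
feature_grid = splat_blobs(torch.rand(3, 2), torch.full((3,), 0.15),
                           torch.randn(3, 8))           # (8, 16, 16)
```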
arXiv Detail & Related papers (2022-05-05T17:59:55Z)
- Alias-Free Generative Adversarial Networks [48.09216521763342]
Typical generative adversarial networks depend on absolute pixel coordinates in an unhealthy manner.
We trace the root cause to careless signal processing that causes aliasing in the generator network.
Our results pave the way for generative models better suited for video and animation.
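As a very loose, hedged illustration of alias-aware processing (the paper designs its filters far more carefully; the crude upsample, nonlinearity, downsample chain below only conveys the intuition that pointwise nonlinearities create high frequencies that must be removed):

```python
import torch
import torch.nn.functional as F

def alias_aware_relu(x, up=2):
    """Apply a pointwise nonlinearity on a temporarily upsampled signal so the
    high frequencies it introduces can be (crudely) filtered away before
    returning to the original resolution. A simplification for intuition only."""
    x = F.interpolate(x, scale_factor=up, mode='bilinear', align_corners=False)
    x = F.relu(x)
    # Bilinear downsampling stands in here for a proper low-pass filter.
    return F.interpolate(x, scale_factor=1 / up, mode='bilinear',
                         align_corners=False)

feat = torch.randn(1, 8, 32, 32)
out = alias_aware_relu(feat)     # (1, 8, 32, 32)
```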
arXiv Detail & Related papers (2021-06-23T14:20:01Z)
- Ensembling with Deep Generative Views [72.70801582346344]
Generative models can synthesize "views" of artificial images that mimic real-world variations, such as changes in color or pose.
Here, we investigate whether such views can be applied to real images to benefit downstream analysis tasks such as image classification.
We use StyleGAN2 as the source of generative augmentations and investigate this setup on classification tasks involving facial attributes, cat faces, and cars.
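A hedged sketch of the ensembling step — the generator and classifier below are placeholders standing in for a pretrained StyleGAN2 generator and an attribute classifier, and the Gaussian latent jitter is only an illustrative way to produce views:

```python
import torch

def ensemble_over_views(generator, classifier, latent, n_views=8, sigma=0.1):
    """Average classifier predictions over GAN-generated "views" of one image.
    `generator` and `classifier` are placeholders; the Gaussian jitter of the
    projected latent is only an illustrative way to produce views."""
    logits = []
    for _ in range(n_views):
        view = generator(latent + sigma * torch.randn_like(latent))
        logits.append(classifier(view))
    return torch.stack(logits).mean(dim=0)        # ensembled prediction

# Dummy stand-ins just to show the call pattern; real code would load the
# pretrained generator and classifier here.
fake_gen = lambda z: z.view(1, 1, 16, 16)
fake_clf = lambda img: img.mean(dim=(2, 3))
pred = ensemble_over_views(fake_gen, fake_clf, torch.randn(1, 256))
```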
arXiv Detail & Related papers (2021-04-29T17:58:35Z)
- LatentKeypointGAN: Controlling Images via Latent Keypoints [23.670795505376336]
We introduce LatentKeypointGAN, a two-stage GAN trained end-to-end on the classical GAN objective.
LatentKeypointGAN provides an interpretable latent space that can be used to re-arrange the generated images.
In addition, the explicit generation of keypoints and matching images enables a new, GAN-based method for unsupervised keypoint detection.
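One way to picture the interface between the two stages is the conversion of generated keypoint locations into spatial heatmaps that condition the image generator; the sketch below is an illustrative assumption about that step, not LatentKeypointGAN's actual architecture:

```python
import torch
import torch.nn as nn

class KeypointsToHeatmaps(nn.Module):
    """Turn generated keypoint locations into spatial heatmaps that a second,
    image-generating stage can consume. Grid size and bandwidth are
    illustrative assumptions."""

    def __init__(self, grid_size=32, sigma=0.05):
        super().__init__()
        ys, xs = torch.meshgrid(torch.linspace(0, 1, grid_size),
                                torch.linspace(0, 1, grid_size), indexing='ij')
        self.register_buffer('coords', torch.stack([xs, ys], dim=-1))
        self.sigma = sigma

    def forward(self, keypoints):
        # keypoints: (batch, K, 2) in [0, 1]  ->  heatmaps: (batch, K, H, W)
        diff = self.coords[None, None] - keypoints[:, :, None, None, :]
        return torch.exp(-(diff ** 2).sum(-1) / (2 * self.sigma ** 2))

# Moving a keypoint re-arranges its heatmap, and hence the generated image,
# which is the kind of re-arrangement the latent space exposes.
heatmaps = KeypointsToHeatmaps()(torch.rand(2, 10, 2))   # (2, 10, 32, 32)
```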
arXiv Detail & Related papers (2021-03-29T17:59:10Z)
- Deep-Masking Generative Network: A Unified Framework for Background Restoration from Superimposed Images [36.7646332887842]
We present the Deep-Masking Generative Network (DMGN), which is a unified framework for background restoration from superimposed images.
A coarse background image and a noise image are first generated in parallel, then the noise image is further leveraged to refine the background image.
Our experiments show that our DMGN consistently outperforms state-of-the-art methods specifically designed for each single task.
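A minimal, hedged sketch of the coarse-then-refine strategy described above (layer sizes and the exact refinement pathway are assumptions, not the DMGN architecture):

```python
import torch
import torch.nn as nn

class CoarseThenRefine(nn.Module):
    """Two parallel heads predict a coarse background and a noise image; a
    refinement head then uses the noise estimate to clean up the background.
    Layer sizes are illustrative, not the DMGN architecture."""

    def __init__(self, channels=3, hidden=32):
        super().__init__()
        self.backbone = nn.Sequential(
            nn.Conv2d(channels, hidden, 3, padding=1), nn.ReLU())
        self.bg_head = nn.Conv2d(hidden, channels, 3, padding=1)
        self.noise_head = nn.Conv2d(hidden, channels, 3, padding=1)
        self.refine = nn.Sequential(
            nn.Conv2d(channels * 2, hidden, 3, padding=1), nn.ReLU(),
            nn.Conv2d(hidden, channels, 3, padding=1))

    def forward(self, superimposed):
        h = self.backbone(superimposed)
        coarse_bg = self.bg_head(h)        # coarse background estimate
        noise = self.noise_head(h)         # estimate of the superimposed layer
        # The noise estimate guides the final background refinement.
        return coarse_bg + self.refine(torch.cat([coarse_bg, noise], dim=1))

restored = CoarseThenRefine()(torch.randn(1, 3, 64, 64))   # (1, 3, 64, 64)
```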
arXiv Detail & Related papers (2020-10-09T01:47:52Z)
- Reconstructing the Noise Manifold for Image Denoising [56.562855317536396]
We introduce the idea of a cGAN which explicitly leverages structure in the image noise space.
By learning directly a low-dimensional manifold of the image noise, the generator promotes removing from the noisy image only the information that spans this manifold.
Based on our experiments, our model substantially outperforms existing state-of-the-art architectures.
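A hedged sketch of that idea — a generator maps a low-dimensional code (plus the noisy image) to a noise estimate that is subtracted out, so only content explainable by the learned noise manifold is removed; all sizes here are illustrative:

```python
import torch
import torch.nn as nn

class NoiseManifoldGenerator(nn.Module):
    """Predict the noise component of a noisy image from a low-dimensional
    code, so only content explainable by that learned noise manifold is
    subtracted out. All sizes are illustrative assumptions."""

    def __init__(self, channels=3, code_dim=16, hidden=32, size=64):
        super().__init__()
        self.size = size
        self.from_code = nn.Linear(code_dim, size * size)
        self.net = nn.Sequential(
            nn.Conv2d(channels + 1, hidden, 3, padding=1), nn.ReLU(),
            nn.Conv2d(hidden, channels, 3, padding=1))

    def forward(self, noisy, code):
        code_map = self.from_code(code).view(-1, 1, self.size, self.size)
        noise = self.net(torch.cat([noisy, code_map], dim=1))  # on the manifold
        return noisy - noise                                   # denoised image

denoised = NoiseManifoldGenerator()(torch.randn(2, 3, 64, 64),
                                    torch.randn(2, 16))
```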
arXiv Detail & Related papers (2020-02-11T00:31:31Z)
- OneGAN: Simultaneous Unsupervised Learning of Conditional Image Generation, Foreground Segmentation, and Fine-Grained Clustering [100.32273175423146]
We present a method for simultaneously learning, in an unsupervised manner, a conditional image generator, foreground extraction and segmentation, and object removal and background completion.
The method combines a Generative Adversarial Network and a Variational Auto-Encoder, with multiple encoders, generators and discriminators, and benefits from solving all tasks at once.
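The glue between those tasks can be pictured as a simple mask-based composition; the helper below is purely illustrative and not OneGAN's actual formulation:

```python
import torch

def compose_scene(foreground, mask, background):
    """Blend a generated foreground, its segmentation mask, and a completed
    background into one image (purely illustrative composition)."""
    return mask * foreground + (1 - mask) * background

image = compose_scene(torch.rand(1, 3, 64, 64), torch.rand(1, 1, 64, 64),
                      torch.rand(1, 3, 64, 64))          # (1, 3, 64, 64)
```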
arXiv Detail & Related papers (2019-12-31T18:15:58Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of the information it provides and is not responsible for any consequences of its use.