Rethinking conditional GAN training: An approach using geometrically
structured latent manifolds
- URL: http://arxiv.org/abs/2011.13055v3
- Date: Wed, 2 Jun 2021 11:50:46 GMT
- Title: Rethinking conditional GAN training: An approach using geometrically
structured latent manifolds
- Authors: Sameera Ramasinghe, Moshiur Farazi, Salman Khan, Nick Barnes, Stephen
Gould
- Abstract summary: Conditional GANs (cGAN) suffer from critical drawbacks such as the lack of diversity in generated outputs.
We propose a novel training mechanism that increases both the diversity and the visual quality of a vanilla cGAN.
- Score: 58.07468272236356
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Conditional GANs (cGAN), in their rudimentary form, suffer from critical
drawbacks such as the lack of diversity in generated outputs and distortion
between the latent and output manifolds. Although efforts have been made to
improve results, they can suffer from unpleasant side-effects such as the
topology mismatch between latent and output spaces. In contrast, we tackle this
problem from a geometrical perspective and propose a novel training mechanism
that increases both the diversity and the visual quality of a vanilla cGAN, by
systematically encouraging a bi-Lipschitz mapping between the latent and the
output manifolds. We validate the efficacy of our solution on a baseline cGAN
(i.e., Pix2Pix) which lacks diversity, and show that by only modifying its
training mechanism (i.e., with our proposed Pix2Pix-Geo), one can achieve more
diverse and realistic outputs on a broad set of image-to-image translation
tasks. Code is available at https://github.com/samgregoost/Rethinking-CGANs.
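A map G between metric spaces is bi-Lipschitz with constant K if K^-1 * d(z1, z2) <= d(G(z1), G(z2)) <= K * d(z1, z2), i.e., distances are preserved up to constant factors in both directions, which rules out both mode collapse (outputs too close) and distortion (outputs too far apart). As a minimal numerical sketch of this property, not the paper's actual training objective, one can probe the distance ratio for random latent pairs:

```python
import math
import random

def l2(u, v):
    """Euclidean distance between two vectors given as lists."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def bilipschitz_ratios(g, z_pairs):
    """||g(z1) - g(z2)|| / ||z1 - z2|| for each latent pair.

    A bi-Lipschitz map with constant K keeps every ratio in [1/K, K]."""
    return [l2(g(z1), g(z2)) / l2(z1, z2) for z1, z2 in z_pairs]

# Toy check: the scaling map g(z) = 2z is bi-Lipschitz with K = 2,
# so every ratio is exactly 2.
g = lambda z: [2.0 * x for x in z]
random.seed(0)
pairs = [([random.gauss(0, 1) for _ in range(4)],
          [random.gauss(0, 1) for _ in range(4)]) for _ in range(8)]
ratios = bilipschitz_ratios(g, pairs)
print(all(abs(r - 2.0) < 1e-9 for r in ratios))  # True
```

In actual cGAN training the ratio would be estimated on minibatches of generator outputs and penalized when it drifts outside a chosen band; the helper names above are illustrative only.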
Related papers
- A Contrastive Variational Graph Auto-Encoder for Node Clustering [10.52321770126932]
State-of-the-art clustering methods have numerous challenges.
Existing VGAEs do not account for the discrepancy between the inference and generative models.
Our solution has two mechanisms to control the trade-off between Feature Randomness and Feature Drift.
arXiv Detail & Related papers (2023-12-28T05:07:57Z)
- Compressing Image-to-Image Translation GANs Using Local Density Structures on Their Learned Manifold [69.33930972652594]
Generative Adversarial Networks (GANs) have shown remarkable success in modeling complex data distributions for image-to-image translation.
Existing GAN compression methods mainly rely on knowledge distillation or convolutional classifiers' pruning techniques.
We propose a new approach by explicitly encouraging the pruned model to preserve the density structure of the original parameter-heavy model on its learned manifold.
Our experiments on image translation GAN models, Pix2Pix and CycleGAN, with various benchmark datasets and architectures demonstrate our method's effectiveness.
arXiv Detail & Related papers (2023-12-22T15:43:12Z)
- A Bayesian Non-parametric Approach to Generative Models: Integrating Variational Autoencoder and Generative Adversarial Networks using Wasserstein and Maximum Mean Discrepancy [2.966338139852619]
Generative adversarial networks (GANs) and variational autoencoders (VAEs) are two of the most prominent and widely studied generative models.
We employ a Bayesian non-parametric (BNP) approach to merge GANs and VAEs.
By fusing the discriminative power of GANs with the reconstruction capabilities of VAEs, our novel model achieves superior performance in various generative tasks.
arXiv Detail & Related papers (2023-08-27T08:58:31Z)
- A Closer Look at Few-shot Image Generation [38.83570296616384]
When transferring pretrained GANs to small target datasets, the generator tends to replicate the training samples.
Several methods have been proposed to address this few-shot image generation problem, but there is a lack of effort to analyze them under a unified framework.
We propose a framework to analyze existing methods during the adaptation.
A second contribution proposes applying mutual information (MI) to retain the source domain's rich multi-level diversity information in the target-domain generator.
arXiv Detail & Related papers (2022-05-08T07:46:26Z)
- Maximum Spatial Perturbation Consistency for Unpaired Image-to-Image Translation [56.44946660061753]
This paper proposes a universal regularization technique called maximum spatial perturbation consistency (MSPC).
MSPC enforces a spatial perturbation function (T) and the translation operator (G) to be commutative (i.e., TG = GT).
Our method outperforms the state-of-the-art methods on most I2I benchmarks.
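The commutativity constraint TG = GT can be turned into a per-image consistency penalty that is zero exactly when applying the perturbation before or after translation gives the same result. A minimal sketch of that idea, using a horizontal flip as a stand-in perturbation T and toy pixel maps as stand-in translators G (all names are illustrative, not the paper's implementation):

```python
def hflip(img):
    """Spatial perturbation T: horizontally flip a 2D image (list of rows)."""
    return [row[::-1] for row in img]

def mspc_penalty(g, t, img):
    """Mean absolute difference between T(G(x)) and G(T(x)).

    Zero iff T and G commute on this input."""
    a, b = t(g(img)), g(t(img))
    diffs = [abs(p - q) for ra, rb in zip(a, b) for p, q in zip(ra, rb)]
    return sum(diffs) / len(diffs)

x = [[0, 1, 2],
     [3, 4, 5]]

# A purely pixel-wise translator (e.g. a brightness shift) commutes
# with flipping, so the penalty vanishes ...
g_pointwise = lambda img: [[2 * p + 1 for p in row] for row in img]
print(mspc_penalty(g_pointwise, hflip, x))  # 0.0

# ... while a translator that depends on pixel position does not.
g_positional = lambda img: [[p + c for c, p in enumerate(row)] for row in img]
print(mspc_penalty(g_positional, hflip, x) > 0)  # True
```

In training, such a penalty would be added to the translation loss so the generator learns to treat spatial perturbations consistently.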
arXiv Detail & Related papers (2022-03-23T19:59:04Z)
- Heterogeneous Face Frontalization via Domain Agnostic Learning [74.86585699909459]
We propose a domain agnostic learning-based generative adversarial network (DAL-GAN) which can synthesize frontal views in the visible domain from thermal faces with pose variations.
DAL-GAN consists of a generator with an auxiliary classifier and two discriminators which capture both local and global texture discriminations for better synthesis.
arXiv Detail & Related papers (2021-07-17T20:41:41Z)
- Hierarchical Modes Exploring in Generative Adversarial Networks [14.557204104822215]
In conditional Generative Adversarial Networks (cGANs), when two different initial noises are paired with the same conditional information, minor modes are likely to collapse into large modes.
We propose a hierarchical mode exploring method to alleviate mode collapse in cGANs by introducing a diversity measurement into the objective function as the regularization term.
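Diversity regularizers of this kind typically compare how much the output varies relative to how much the noise varies for a fixed condition; a small ratio signals that distinct noises have collapsed onto the same mode. A hedged sketch of the general idea (the helper names are illustrative, not the paper's exact measurement):

```python
import math

def l2(u, v):
    """Euclidean distance between two vectors given as lists."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def diversity_score(g, c, z1, z2):
    """||G(c, z1) - G(c, z2)|| / ||z1 - z2||.

    Small values mean two different noises produced (nearly) the same
    output under condition c, i.e. mode collapse."""
    return l2(g(c, z1), g(c, z2)) / l2(z1, z2)

# A generator that ignores its noise input has collapsed completely,
# while one that passes the noise through preserves diversity.
g_collapsed = lambda c, z: [c] * len(z)
g_diverse = lambda c, z: [c + x for x in z]
z1, z2 = [0.0, 1.0], [1.0, 3.0]
print(diversity_score(g_collapsed, 0.5, z1, z2))  # 0.0
print(diversity_score(g_diverse, 0.5, z1, z2))    # 1.0
```

As a regularization term, the score (or a variant of it) would be maximized alongside the adversarial objective to keep minor modes from merging into large ones.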
arXiv Detail & Related papers (2020-03-05T10:43:50Z)
- When Relation Networks meet GANs: Relation GANs with Triplet Loss [110.7572918636599]
Training stability remains a lingering concern for generative adversarial networks (GANs).
In this paper, we explore a relation network architecture for the discriminator and design a triplet loss that yields better generalization and stability.
Experiments on benchmark datasets show that the proposed relation discriminator and new loss provide significant improvements on various vision tasks.
arXiv Detail & Related papers (2020-02-24T11:35:28Z)
- Image Fine-grained Inpainting [89.17316318927621]
We present a one-stage model that utilizes dense combinations of dilated convolutions to obtain larger and more effective receptive fields.
To better train this efficient generator, in addition to the frequently-used VGG feature matching loss, we design a novel self-guided regression loss.
We also employ a discriminator with local and global branches to ensure local-global contents consistency.
arXiv Detail & Related papers (2020-02-07T03:45:25Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences of its use.