Hierarchical Modes Exploring in Generative Adversarial Networks
- URL: http://arxiv.org/abs/2003.08752v1
- Date: Thu, 5 Mar 2020 10:43:50 GMT
- Title: Hierarchical Modes Exploring in Generative Adversarial Networks
- Authors: Mengxiao Hu, Jinlong Li, Maolin Hu, Tao Hu
- Abstract summary: In conditional Generative Adversarial Networks (cGANs), when two different initial noises are paired with the same conditional information, minor modes are likely to collapse into large modes.
We propose a hierarchical mode exploring method to alleviate mode collapse in cGANs by introducing a diversity measurement into the objective function as a regularization term.
- Score: 14.557204104822215
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In conditional Generative Adversarial Networks (cGANs), when two different
initial noises are concatenated with the same conditional information, the
distance between their outputs is relatively small, which makes minor modes
likely to collapse into large modes. To prevent this, we propose a
hierarchical mode exploring method to alleviate mode collapse in cGANs by
introducing a diversity measurement into the objective function as a
regularization term. We also introduce the Expected Ratios of Expansion (ERE)
into the regularization term; by minimizing the sum of differences between the
real change of distance and the ERE, we can control the diversity of generated
images w.r.t. specific-level features. We validate the proposed algorithm on
four conditional image synthesis tasks: categorical generation, paired and
unpaired image translation, and text-to-image generation. Both qualitative
and quantitative results show that the proposed method is effective in
alleviating the mode collapse problem in cGANs and can control the diversity
of output images w.r.t. specific-level features.
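The abstract does not give the exact formula, but a minimal sketch of how such an ERE-based regularizer could look in PyTorch is shown below. The function name, the per-level feature lists, and the treatment of ERE values as fixed per-level hyperparameters are all assumptions made for illustration, not the authors' released implementation.

```python
import torch
import torch.nn.functional as F

def hierarchical_diversity_loss(feats_z1, feats_z2, z1, z2, ere):
    """Hypothetical sketch of the regularizer described in the abstract.

    feats_z1 / feats_z2: lists of intermediate generator features (one
    tensor per level) produced from two noises z1, z2 paired with the
    same conditional input.
    ere: Expected Ratio of Expansion per level, i.e. how much the
    feature distance is expected to grow relative to the noise distance.
    """
    d_z = F.l1_loss(z1, z2, reduction="mean")  # distance between the two noises
    loss = torch.zeros((), device=z1.device)
    for f1, f2, r in zip(feats_z1, feats_z2, ere):
        d_f = F.l1_loss(f1, f2, reduction="mean")  # feature distance at this level
        # Penalize the gap between the realized expansion ratio and the
        # expected one; summing over levels controls diversity w.r.t.
        # specific-level features.
        loss = loss + torch.abs(d_f / (d_z + 1e-8) - r)
    return loss

# Sketch of use: the generator would minimize
#   adversarial_loss + lambda_reg * hierarchical_diversity_loss(...)
```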
Related papers
- A Simple Approach to Unifying Diffusion-based Conditional Generation [63.389616350290595]
We introduce a simple, unified framework to handle diverse conditional generation tasks.
Our approach enables versatile capabilities via different inference-time sampling schemes.
Our model supports additional capabilities like non-spatially aligned and coarse conditioning.
arXiv Detail & Related papers (2024-10-15T09:41:43Z)
- Few-Shot Image Generation by Conditional Relaxing Diffusion Inversion [37.18537753482751]
Conditional Diffusion Relaxing Inversion (CRDI) is designed to enhance distribution diversity in synthetic image generation.
CRDI does not rely on fine-tuning based on only a few samples.
It focuses on reconstructing each target image instance and expanding diversity through few-shot learning.
arXiv Detail & Related papers (2024-07-09T21:58:26Z)
- Improving Denoising Diffusion Probabilistic Models via Exploiting Shared Representations [5.517338199249029]
SR-DDPM is a class of generative models that produce high-quality images by reversing a noisy diffusion process.
By exploiting the similarity between diverse data distributions, our method can scale to multiple tasks without compromising the image quality.
We evaluate our method on standard image datasets and show that it outperforms both unconditional and conditional DDPM in terms of FID and SSIM metrics.
arXiv Detail & Related papers (2023-11-27T22:30:26Z)
- Mutual-Guided Dynamic Network for Image Fusion [51.615598671899335]
We propose a novel mutual-guided dynamic network (MGDN) for image fusion, which allows for effective information utilization across different locations and inputs.
Experimental results on five benchmark datasets demonstrate that our proposed method outperforms existing methods on four image fusion tasks.
arXiv Detail & Related papers (2023-08-24T03:50:37Z)
- Contrast-augmented Diffusion Model with Fine-grained Sequence Alignment for Markup-to-Image Generation [15.411325887412413]
This paper proposes a novel model named "Contrast-augmented Diffusion Model with Fine-grained Sequence Alignment" (FSA-CDM).
FSA-CDM introduces contrastive positive/negative samples into the diffusion model to boost performance for markup-to-image generation.
Experiments are conducted on four benchmark datasets from different domains.
arXiv Detail & Related papers (2023-08-02T13:43:03Z)
- Semantic Image Synthesis via Diffusion Models [159.4285444680301]
Denoising Diffusion Probabilistic Models (DDPMs) have achieved remarkable success in various image generation tasks.
Recent work on semantic image synthesis mainly follows the de facto standard of Generative Adversarial Nets (GANs).
arXiv Detail & Related papers (2022-06-30T18:31:51Z)
- Large Scale Image Completion via Co-Modulated Generative Adversarial Networks [18.312552957727828]
We propose a generic new approach that bridges the gap between image-conditional and recent unconditional generative architectures.
Also, due to the lack of good quantitative metrics for image completion, we propose the new Paired/Unpaired Inception Discriminative Score (P-IDS/U-IDS).
Experiments demonstrate superior performance in terms of both quality and diversity over state-of-the-art methods in free-form image completion and easy generalization to image-to-image translation.
arXiv Detail & Related papers (2021-03-18T17:59:11Z)
- Diverse Semantic Image Synthesis via Probability Distribution Modeling [103.88931623488088]
We propose a novel diverse semantic image synthesis framework.
Our method can achieve superior diversity and comparable quality compared to state-of-the-art methods.
arXiv Detail & Related papers (2021-03-11T18:59:25Z)
- Rethinking conditional GAN training: An approach using geometrically structured latent manifolds [58.07468272236356]
Conditional GANs (cGAN) suffer from critical drawbacks such as the lack of diversity in generated outputs.
We propose a novel training mechanism that increases both the diversity and the visual quality of a vanilla cGAN.
arXiv Detail & Related papers (2020-11-25T22:54:11Z)
- GANs with Variational Entropy Regularizers: Applications in Mitigating the Mode-Collapse Issue [95.23775347605923]
Building on the success of deep learning, Generative Adversarial Networks (GANs) provide a modern approach to learn a probability distribution from observed samples.
GANs often suffer from the mode collapse issue where the generator fails to capture all existing modes of the input distribution.
We take an information-theoretic approach and maximize a variational lower bound on the entropy of the generated samples to increase their diversity (see the sketch after this list).
arXiv Detail & Related papers (2020-09-24T19:34:37Z)
- Multimodal Image-to-Image Translation via Mutual Information Estimation and Maximization [16.54980086211836]
Multimodal image-to-image translation (I2IT) aims to learn a conditional distribution that explores multiple possible images in the target domain given an input image in the source domain.
Conditional generative adversarial networks (cGANs) are often adopted for modeling such a conditional distribution.
We propose a method that explicitly estimates and maximizes the mutual information between the latent code and the output image in cGANs.
arXiv Detail & Related papers (2020-08-08T14:09:23Z)
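As flagged in the variational-entropy entry above, one standard way to realize "maximizing a variational lower bound on the entropy of the generated samples" is an InfoGAN-style auxiliary network that reconstructs the latent code from the generated output; making the latent code recoverable discourages many noises from collapsing onto the same sample, and the same variational machinery underlies the mutual-information maximization of the last entry. The sketch below, including the `NoisePosterior` network and the unit-variance Gaussian posterior, is an assumed illustration, not either paper's actual implementation.

```python
import torch
import torch.nn as nn

class NoisePosterior(nn.Module):
    """Hypothetical auxiliary network q(z|x): predicts the mean of a
    unit-variance Gaussian over the latent code from a generated sample."""

    def __init__(self, x_dim: int, z_dim: int):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(x_dim, 256),
            nn.ReLU(),
            nn.Linear(256, z_dim),
        )

    def forward(self, x_flat: torch.Tensor) -> torch.Tensor:
        return self.net(x_flat)

def variational_diversity_bonus(q_net: NoisePosterior, x_fake: torch.Tensor,
                                z: torch.Tensor) -> torch.Tensor:
    """E[log q(z | x_fake)] up to additive constants, assuming a
    unit-variance Gaussian posterior. Maximizing this term pushes the
    generator to keep distinct noises distinguishable in its outputs."""
    z_hat = q_net(x_fake.flatten(1))
    log_q = -0.5 * ((z - z_hat) ** 2).sum(dim=1)  # log N(z; z_hat, I) + const
    return log_q.mean()

# Sketch of use: the generator would minimize
#   adv_loss - lambda_h * variational_diversity_bonus(q_net, G(z, c), z)
```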