Tackling the Generative Learning Trilemma with Denoising Diffusion GANs
- URL: http://arxiv.org/abs/2112.07804v1
- Date: Wed, 15 Dec 2021 00:09:38 GMT
- Title: Tackling the Generative Learning Trilemma with Denoising Diffusion GANs
- Authors: Zhisheng Xiao, Karsten Kreis, Arash Vahdat
- Abstract summary: Deep generative models often struggle with simultaneously addressing high sample quality, mode coverage, and fast sampling.
We call this challenge the generative learning trilemma, as existing models often trade some of these requirements for others.
We introduce denoising diffusion generative adversarial networks (denoising diffusion GANs) that model each denoising step using a multimodal conditional GAN.
- Score: 20.969702008187838
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: A wide variety of deep generative models has been developed in the past
decade. Yet, these models often struggle with simultaneously addressing three
key requirements: high sample quality, mode coverage, and fast sampling. We
call the challenge imposed by these requirements the generative learning
trilemma, as existing models often trade some of them for others.
Particularly, denoising diffusion models have shown impressive sample quality
and diversity, but their expensive sampling does not yet allow them to be
applied in many real-world applications. In this paper, we argue that slow
sampling in these models is fundamentally attributed to the Gaussian assumption
in the denoising step which is justified only for small step sizes. To enable
denoising with large steps, and hence, to reduce the total number of denoising
steps, we propose to model the denoising distribution using a complex
multimodal distribution. We introduce denoising diffusion generative
adversarial networks (denoising diffusion GANs) that model each denoising step
using a multimodal conditional GAN. Through extensive evaluations, we show that
denoising diffusion GANs obtain sample quality and diversity competitive with
original diffusion models while being 2000$\times$ faster on the CIFAR-10
dataset. Compared to traditional GANs, our model exhibits better mode coverage
and sample diversity. To the best of our knowledge, denoising diffusion GAN is
the first model that reduces sampling cost in diffusion models to an extent
that allows them to be applied to real-world applications inexpensively.
Project page and code: https://nvlabs.github.io/denoising-diffusion-gan
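As a concrete illustration of the idea above, the following is a minimal PyTorch sketch of the reverse process the abstract describes: a conditional GAN generator proposes a clean image x_0 from the noisy input x_t and a latent z (the latent is what makes each large denoising step multimodal), and x_{t-1} is then drawn from the standard Gaussian posterior q(x_{t-1} | x_t, x_0). The `generator` interface, the latent dimension, and the schedule handling are assumptions made for illustration, not the paper's actual API; the official implementation is linked above.

```python
import torch

def ddgan_denoise_step(generator, x_t, t, alphas_cumprod, betas, z_dim=100):
    """One large denoising step: a GAN-predicted x_0 plus standard posterior sampling.

    alphas_cumprod, betas: 1-D tensors holding the (few-step) noise schedule.
    """
    # The latent z is what lets the learned denoising distribution be multimodal.
    z = torch.randn(x_t.shape[0], z_dim, device=x_t.device)
    x0_pred = generator(x_t, z, t)  # hypothetical interface: predicts a clean image

    a_t = alphas_cumprod[t]
    a_prev = alphas_cumprod[t - 1] if t > 0 else torch.ones_like(a_t)
    beta_t = betas[t]

    # Gaussian posterior q(x_{t-1} | x_t, x_0) of the forward diffusion process.
    mean = (torch.sqrt(a_prev) * beta_t / (1 - a_t)) * x0_pred \
         + (torch.sqrt(1 - beta_t) * (1 - a_prev) / (1 - a_t)) * x_t
    var = beta_t * (1 - a_prev) / (1 - a_t)
    noise = torch.randn_like(x_t) if t > 0 else torch.zeros_like(x_t)
    return mean + torch.sqrt(var) * noise


def ddgan_sample(generator, shape, alphas_cumprod, betas, num_steps=4):
    """Full reverse chain; the point is that a handful of large steps suffice."""
    x = torch.randn(shape)  # start from pure noise x_T
    for t in reversed(range(num_steps)):
        x = ddgan_denoise_step(generator, x, t, alphas_cumprod, betas)
    return x
```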
Related papers
- Discrete Copula Diffusion [44.96934660818884]
We identify a fundamental limitation that prevents discrete diffusion models from achieving strong performance with fewer steps.
We introduce a general approach to supplement the missing dependency information by incorporating another deep generative model, termed the copula model.
Our method does not require fine-tuning either the diffusion model or the copula model, yet it enables high-quality sample generation with significantly fewer denoising steps.
arXiv Detail & Related papers (2024-10-02T18:51:38Z)
- Blue noise for diffusion models [50.99852321110366]
We introduce a novel and general class of diffusion models taking correlated noise within and across images into account.
Our framework allows introducing correlation across images within a single mini-batch to improve gradient flow.
We perform both qualitative and quantitative evaluations on a variety of datasets using our method.
arXiv Detail & Related papers (2024-02-07T14:59:25Z)
- Soft Mixture Denoising: Beyond the Expressive Bottleneck of Diffusion Models [76.46246743508651]
We show that current diffusion models actually have an expressive bottleneck in backward denoising.
We introduce soft mixture denoising (SMD), an expressive and efficient model for backward denoising.
arXiv Detail & Related papers (2023-09-25T12:03:32Z)
- Semi-Implicit Denoising Diffusion Models (SIDDMs) [50.30163684539586]
Existing models such as Denoising Diffusion Probabilistic Models (DDPM) deliver high-quality, diverse samples but are slowed by an inherently high number of iterative steps.
We introduce a novel approach that tackles the problem by matching implicit and explicit factors.
We demonstrate that our proposed method obtains comparable generative performance to diffusion-based models and vastly superior results to models with a small number of sampling steps.
arXiv Detail & Related papers (2023-06-21T18:49:22Z)
- Real-World Denoising via Diffusion Model [14.722529440511446]
Real-world image denoising aims to recover clean images from noisy images captured in natural environments.
Diffusion models have achieved very promising results in the field of image generation, outperforming previous generative models.
This paper proposes a novel general denoising diffusion model that can be used for real-world image denoising.
arXiv Detail & Related papers (2023-05-08T04:48:03Z)
- Fast Inference in Denoising Diffusion Models via MMD Finetuning [23.779985842891705]
We present MMD-DDM, a novel method for fast sampling of diffusion models.
Our approach is based on the idea of using the Maximum Mean Discrepancy (MMD) to finetune the learned distribution with a given budget of timesteps (a generic MMD estimator is sketched after this list).
Our findings show that the proposed method is able to produce high-quality samples in a fraction of the time required by widely-used diffusion models.
arXiv Detail & Related papers (2023-01-19T09:48:07Z)
- On Distillation of Guided Diffusion Models [94.95228078141626]
We propose an approach to distilling classifier-free guided diffusion models into models that are fast to sample from.
For standard diffusion models trained in pixel space, our approach is able to generate images visually comparable to those of the original model.
For diffusion models trained in latent space (e.g., Stable Diffusion), our approach is able to generate high-fidelity images using as few as 1 to 4 denoising steps.
arXiv Detail & Related papers (2022-10-06T18:03:56Z)
- Diffusion Models in Vision: A Survey [80.82832715884597]
A diffusion model is a deep generative model that is based on two stages, a forward diffusion stage and a reverse diffusion stage.
Diffusion models are widely appreciated for the quality and diversity of the generated samples, despite their known computational burdens.
arXiv Detail & Related papers (2022-09-10T22:00:30Z)
- Dynamic Dual-Output Diffusion Models [100.32273175423146]
Iterative denoising-based generation has been shown to be comparable in quality to other classes of generative models.
A major drawback of this method is that it requires hundreds of iterations to produce a competitive result.
Recent works have proposed solutions that allow for faster generation with fewer iterations, but the image quality gradually deteriorates.
arXiv Detail & Related papers (2022-03-08T11:20:40Z)
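For context on the MMD Finetuning entry above: the Maximum Mean Discrepancy is a kernel two-sample statistic, and a small, generic estimator can be sketched as below. This is a biased RBF-kernel estimate for illustration only; the function names, the bandwidth handling, and the use of raw samples rather than feature embeddings are assumptions, not the paper's implementation.

```python
import torch

def rbf_kernel(x, y, bandwidth=1.0):
    """Gram matrix k(x_i, y_j) = exp(-||x_i - y_j||^2 / (2 * bandwidth^2))."""
    sq_dist = torch.cdist(x, y) ** 2  # pairwise squared Euclidean distances
    return torch.exp(-sq_dist / (2.0 * bandwidth ** 2))

def mmd2(real, fake, bandwidth=1.0):
    """Biased (V-statistic) estimate of squared MMD between two sample batches.

    real: (n, d) and fake: (m, d) batches of samples or feature embeddings.
    """
    k_rr = rbf_kernel(real, real, bandwidth).mean()
    k_ff = rbf_kernel(fake, fake, bandwidth).mean()
    k_rf = rbf_kernel(real, fake, bandwidth).mean()
    return k_rr + k_ff - 2.0 * k_rf

# Finetuning idea: draw `fake` with a fixed small budget of denoising steps, then
# backpropagate mmd2(real, fake) into the diffusion model's parameters.
```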
This list is automatically generated from the titles and abstracts of the papers on this site.