Diffusion Models With Learned Adaptive Noise
- URL: http://arxiv.org/abs/2312.13236v3
- Date: Sun, 10 Nov 2024 20:02:06 GMT
- Title: Diffusion Models With Learned Adaptive Noise
- Authors: Subham Sekhar Sahoo, Aaron Gokaslan, Chris De Sa, Volodymyr Kuleshov,
- Abstract summary: In this paper, we explore whether the diffusion process can be learned from data.
A widely held assumption is that the ELBO is invariant to the noise process.
We propose MULAN, a learned diffusion process that applies noise at different rates across an image.
- Score: 12.530583016267768
- License:
- Abstract: Diffusion models have gained traction as powerful algorithms for synthesizing high-quality images. Central to these algorithms is the diffusion process, a set of equations which maps data to noise in a way that can significantly affect performance. In this paper, we explore whether the diffusion process can be learned from data. Our work is grounded in Bayesian inference and seeks to improve log-likelihood estimation by casting the learned diffusion process as an approximate variational posterior that yields a tighter lower bound (ELBO) on the likelihood. A widely held assumption is that the ELBO is invariant to the noise process: our work dispels this assumption and proposes multivariate learned adaptive noise (MULAN), a learned diffusion process that applies noise at different rates across an image. Specifically, our method relies on a multivariate noise schedule that is a function of the data to ensure that the ELBO is no longer invariant to the choice of the noise schedule as in previous works. Empirically, MULAN sets a new state-of-the-art in density estimation on CIFAR-10 and ImageNet and reduces the number of training steps by 50%. We provide the code, along with a blog post and video tutorial on the project page: https://s-sahoo.com/MuLAN
Related papers
- Diffusion Priors for Variational Likelihood Estimation and Image Denoising [10.548018200066858]
We propose adaptive likelihood estimation and MAP inference during the reverse diffusion process to tackle real-world noise.
Experiments and analyses on diverse real-world datasets demonstrate the effectiveness of our method.
arXiv Detail & Related papers (2024-10-23T02:52:53Z) - Immiscible Diffusion: Accelerating Diffusion Training with Noise Assignment [56.609042046176555]
suboptimal noise-data mapping leads to slow training of diffusion models.
Drawing inspiration from the immiscibility phenomenon in physics, we propose Immiscible Diffusion.
Our approach is remarkably simple, requiring only one line of code to restrict the diffuse-able area for each image.
arXiv Detail & Related papers (2024-06-18T06:20:42Z) - Blue noise for diffusion models [50.99852321110366]
We introduce a novel and general class of diffusion models taking correlated noise within and across images into account.
Our framework allows introducing correlation across images within a single mini-batch to improve gradient flow.
We perform both qualitative and quantitative evaluations on a variety of datasets using our method.
arXiv Detail & Related papers (2024-02-07T14:59:25Z) - Denoising Diffusion Bridge Models [54.87947768074036]
Diffusion models are powerful generative models that map noise to data using processes.
For many applications such as image editing, the model input comes from a distribution that is not random noise.
In our work, we propose Denoising Diffusion Bridge Models (DDBMs)
arXiv Detail & Related papers (2023-09-29T03:24:24Z) - VideoFusion: Decomposed Diffusion Models for High-Quality Video
Generation [88.49030739715701]
This work presents a decomposed diffusion process via resolving the per-frame noise into a base noise that is shared among all frames and a residual noise that varies along the time axis.
Experiments on various datasets confirm that our approach, termed as VideoFusion, surpasses both GAN-based and diffusion-based alternatives in high-quality video generation.
arXiv Detail & Related papers (2023-03-15T02:16:39Z) - Denoising Diffusion Gamma Models [91.22679787578438]
We introduce the Denoising Diffusion Gamma Model (DDGM) and show that noise from Gamma distribution provides improved results for image and speech generation.
Our approach preserves the ability to efficiently sample state in the training diffusion process while using Gamma noise.
arXiv Detail & Related papers (2021-10-10T10:46:31Z) - Non Gaussian Denoising Diffusion Models [91.22679787578438]
We show that noise from Gamma distribution provides improved results for image and speech generation.
We also show that using a mixture of Gaussian noise variables in the diffusion process improves the performance over a diffusion process that is based on a single distribution.
arXiv Detail & Related papers (2021-06-14T16:42:43Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.