PriorGrad: Improving Conditional Denoising Diffusion Models with
Data-Driven Adaptive Prior
- URL: http://arxiv.org/abs/2106.06406v1
- Date: Fri, 11 Jun 2021 14:04:03 GMT
- Title: PriorGrad: Improving Conditional Denoising Diffusion Models with
Data-Driven Adaptive Prior
- Authors: Sang-gil Lee, Heeseung Kim, Chaehun Shin, Xu Tan, Chang Liu, Qi Meng,
Tao Qin, Wei Chen, Sungroh Yoon, Tie-Yan Liu
- Abstract summary: We propose PriorGrad to improve the efficiency of the conditional diffusion model.
We show that PriorGrad achieves faster convergence, leading to data and parameter efficiency and improved quality.
- Score: 103.00403682863427
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Denoising diffusion probabilistic models have recently been proposed
to generate high-quality samples by estimating the gradient of the data density.
The framework assumes that the prior noise follows a standard Gaussian
distribution, whereas the corresponding data distribution may be more
complicated than a standard Gaussian; this discrepancy between the data and the
prior can make denoising the prior noise into a data sample inefficient. In
this paper, we propose PriorGrad to improve
the efficiency of the conditional diffusion model (for example, a vocoder using
a mel-spectrogram as the condition) by applying an adaptive prior derived from
the data statistics based on the conditional information. We formulate the
training and sampling procedures of PriorGrad and demonstrate the advantages of
an adaptive prior through a theoretical analysis. Focusing on the audio domain,
we consider the recently proposed diffusion-based audio generative models based
on both the spectral and time domains, and show that PriorGrad achieves faster
convergence, leading to data and parameter efficiency and improved quality,
thereby demonstrating the effectiveness of a data-driven adaptive prior.
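For intuition, here is a minimal PyTorch sketch of the idea in the abstract: the standard prior N(0, I) is replaced with a diagonal Gaussian N(0, diag(var)) whose variance is computed from the conditioner (here, a mel-spectrogram). The function names, the exact variance statistic (normalized frame energy with a floor), and the model interface are illustrative assumptions, not the paper's exact recipe.

```python
import torch

def adaptive_prior_variance(mel, hop_length=256, floor=0.1):
    """Hypothetical data-driven prior statistic: per-sample noise variance
    derived from mel-spectrogram frame energies. mel: (B, n_mels, T_frames)."""
    energy = mel.exp().mean(dim=1)                      # (B, T_frames) frame energy
    energy = energy / energy.amax(dim=1, keepdim=True)  # normalize to (0, 1]
    var = energy.clamp(min=floor)                       # keep the prior non-degenerate
    return var.repeat_interleave(hop_length, dim=1)     # upsample to (B, T_samples)

def priorgrad_training_loss(model, x0, mel, alpha_bar):
    """One diffusion training step with the adaptive prior N(0, diag(var))
    in place of N(0, I); `model` predicts the injected noise."""
    var = adaptive_prior_variance(mel)
    t = torch.randint(0, len(alpha_bar), (x0.size(0),), device=x0.device)
    a = alpha_bar[t].unsqueeze(1)                       # (B, 1) noise schedule term
    eps = torch.randn_like(x0) * var.sqrt()             # eps ~ N(0, diag(var))
    x_t = a.sqrt() * x0 + (1 - a).sqrt() * eps
    eps_hat = model(x_t, mel, t)
    return ((eps - eps_hat) ** 2 / var).mean()          # Sigma^{-1}-weighted L2
```

Sampling would mirror training: the reverse process is initialized from the same conditioner-dependent N(0, diag(var)), so the starting noise already tracks the target's energy envelope, which is where the claimed convergence speedup comes from.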
Related papers
- On the Relation Between Linear Diffusion and Power Iteration [42.158089783398616]
We study the generation process as a "correlation machine".
We show that low frequencies emerge earlier in the generation process, where the denoising basis vectors are more aligned to the true data with a rate depending on their eigenvalues.
This model allows us to show that the linear diffusion model converges in mean to the leading eigenvector of the underlying data, similarly to the prevalent power iteration method (see the sketch after this list).
arXiv Detail & Related papers (2024-10-16T07:33:12Z)
- FIND: Fine-tuning Initial Noise Distribution with Policy Optimization for
Diffusion Models [10.969811500333755]
We introduce a Fine-tuning Initial Noise Distribution (FIND) framework with policy optimization.
Our method is 10 times faster than the SOTA approach.
arXiv Detail & Related papers (2024-07-28T10:07:55Z)
- Learning with Noisy Foundation Models [95.50968225050012]
This paper is the first work to comprehensively understand and analyze the nature of noise in pre-training datasets.
We propose a tuning method (NMTune) that applies affine transformations to the feature space, mitigating the malignant effects of noise and improving generalization.
arXiv Detail & Related papers (2024-03-11T16:22:41Z)
- Diffusion Models with Deterministic Normalizing Flow Priors [23.212848643552395]
We propose DiNof (Diffusion with Normalizing flow priors), a technique that makes use of normalizing flows and diffusion models.
Experiments on standard image generation datasets demonstrate the advantage of the proposed method over existing approaches.
arXiv Detail & Related papers (2023-09-03T21:26:56Z)
- Optimizing the Noise in Self-Supervised Learning: from Importance Sampling
to Noise-Contrastive Estimation [80.07065346699005]
It is widely assumed that the optimal noise distribution should be made equal to the data distribution, as in Generative Adversarial Networks (GANs).
We turn to Noise-Contrastive Estimation which grounds this self-supervised task as an estimation problem of an energy-based model of the data.
We soberly conclude that the optimal noise may be hard to sample from, and the gain in efficiency can be modest compared to choosing the noise distribution equal to the data's.
arXiv Detail & Related papers (2023-01-23T19:57:58Z)
- From Denoising Diffusions to Denoising Markov Models [38.33676858989955]
Denoising diffusions are state-of-the-art generative models exhibiting remarkable empirical performance.
We propose a unifying framework generalising this approach to a wide class of spaces and leading to an original extension of score matching.
arXiv Detail & Related papers (2022-11-07T14:34:27Z)
- The Optimal Noise in Noise-Contrastive Learning Is Not What You Think [80.07065346699005]
We show that deviating from the common assumption that the optimal noise should equal the data distribution can actually lead to better statistical estimators.
In particular, the optimal noise distribution is different from the data's and even from a different family.
arXiv Detail & Related papers (2022-03-02T13:59:20Z)
- Estimating High Order Gradients of the Data Distribution by Denoising [81.24581325617552]
The first-order derivative of a data density can be estimated efficiently by denoising score matching.
We propose a method to directly estimate high order derivatives (scores) of a data density from samples.
arXiv Detail & Related papers (2021-11-08T18:59:23Z)
- Bayesian Imaging With Data-Driven Priors Encoded by Neural Networks:
Theory, Methods, and Algorithms [2.266704469122763]
This paper proposes a new methodology for performing Bayesian inference in imaging inverse problems where the prior knowledge is available in the form of training data.
We establish the existence and well-posedness of the associated posterior moments under easily verifiable conditions.
A model accuracy analysis suggests that the Bayesian probabilities reported by the data-driven models are also remarkably accurate under a frequentist definition.
arXiv Detail & Related papers (2021-03-18T11:34:08Z)
- Learning Energy-Based Models by Diffusion Recovery Likelihood [61.069760183331745]
We present a diffusion recovery likelihood method to tractably learn and sample from a sequence of energy-based models.
After training, synthesized images can be generated by a sampling process initialized from a Gaussian white noise distribution.
On unconditional CIFAR-10, our method achieves an FID of 9.58 and an inception score of 8.30, superior to the majority of GANs.
arXiv Detail & Related papers (2020-12-15T07:09:02Z)
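As a concrete companion to the "On the Relation Between Linear Diffusion and Power Iteration" entry above, here is a minimal, self-contained power-iteration sketch (the standard textbook algorithm, not that paper's code): repeated multiplication by a matrix drives a random vector toward the leading eigenvector, the limit that the entry attributes, in mean, to linear diffusion models.

```python
import numpy as np

def power_iteration(A, n_steps=100, seed=0):
    """Standard power iteration: repeatedly apply A and renormalize;
    the iterate converges to the eigenvector of the largest eigenvalue
    (in absolute value), provided the start is not orthogonal to it."""
    rng = np.random.default_rng(seed)
    v = rng.standard_normal(A.shape[0])
    for _ in range(n_steps):
        v = A @ v
        v /= np.linalg.norm(v)
    return v

# Toy check against an exact eigendecomposition.
A = np.array([[2.0, 1.0], [1.0, 3.0]])
v = power_iteration(A)
w, V = np.linalg.eigh(A)                    # eigenvalues in ascending order
assert abs(abs(v @ V[:, -1]) - 1.0) < 1e-6  # aligned with the leading eigenvector
```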
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences.