Improved Noise Schedule for Diffusion Training
- URL: http://arxiv.org/abs/2407.03297v2
- Date: Wed, 27 Nov 2024 15:10:12 GMT
- Title: Improved Noise Schedule for Diffusion Training
- Authors: Tiankai Hang, Shuyang Gu, Xin Geng, Baining Guo
- Abstract summary: We propose a novel approach to design the noise schedule for enhancing the training of diffusion models. We empirically demonstrate the superiority of our noise schedule over the standard cosine schedule.
- Score: 51.849746576387375
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Diffusion models have emerged as the de facto choice for generating high-quality visual signals across various domains. However, training a single model to predict noise across various levels poses significant challenges, necessitating numerous iterations and incurring significant computational costs. Various approaches, such as loss weighting strategy design and architectural refinements, have been introduced to expedite convergence and improve model performance. In this study, we propose a novel approach to design the noise schedule for enhancing the training of diffusion models. Our key insight is that the importance sampling of the logarithm of the Signal-to-Noise ratio ($\log \text{SNR}$), theoretically equivalent to a modified noise schedule, is particularly beneficial for training efficiency when increasing the sample frequency around $\log \text{SNR}=0$. This strategic sampling allows the model to focus on the critical transition point between signal dominance and noise dominance, potentially leading to more robust and accurate predictions. We empirically demonstrate the superiority of our noise schedule over the standard cosine schedule. Furthermore, we highlight the advantages of our noise schedule design on the ImageNet benchmark, showing that the designed schedule consistently benefits different prediction targets. Our findings contribute to the ongoing efforts to optimize diffusion models, potentially paving the way for more efficient and effective training paradigms in the field of generative AI.
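The key mechanism above, drawing $\log \text{SNR}$ values more often near zero instead of sampling timesteps uniformly, can be made concrete with a short sketch. The snippet below is a minimal PyTorch-style illustration rather than the paper's exact recipe: it assumes a Laplace-like density over $\log \text{SNR}$, and the helper names (`sample_logsnr`, `logsnr_to_alpha_sigma`, `corrupt`) and default hyperparameters are hypothetical choices for demonstration only.

```python
import torch

def sample_logsnr(batch_size: int, loc: float = 0.0, scale: float = 1.0) -> torch.Tensor:
    """Draw log-SNR values concentrated around log SNR = 0.

    Illustrative choice: a Laplace(loc, scale) density via inverse-CDF sampling.
    Sampling log SNR non-uniformly is equivalent, by the importance-sampling
    argument in the abstract, to training under a modified noise schedule.
    """
    u = torch.rand(batch_size) - 0.5                        # Uniform(-0.5, 0.5)
    return loc - scale * torch.sign(u) * torch.log1p(-2.0 * u.abs())

def logsnr_to_alpha_sigma(logsnr: torch.Tensor):
    """Variance-preserving map: alpha^2 = sigmoid(log SNR), sigma^2 = sigmoid(-log SNR)."""
    return torch.sigmoid(logsnr).sqrt(), torch.sigmoid(-logsnr).sqrt()

def corrupt(x0: torch.Tensor):
    """One forward (noising) step for a training batch: x_t = alpha * x0 + sigma * eps."""
    logsnr = sample_logsnr(x0.shape[0])
    alpha, sigma = logsnr_to_alpha_sigma(logsnr)
    eps = torch.randn_like(x0)
    shape = (-1,) + (1,) * (x0.dim() - 1)                   # broadcast over non-batch dims
    return alpha.view(shape) * x0 + sigma.view(shape) * eps, eps, logsnr
```

Concentrating the sampled $\log \text{SNR}$ values around zero puts more training signal on the transition region between signal dominance and noise dominance; swapping the Laplace draw for the density implied by the standard cosine schedule would correspond to the baseline the abstract compares against.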
Related papers
- HADL Framework for Noise Resilient Long-Term Time Series Forecasting [0.7810572107832383]
Long-term time series forecasting is critical in domains such as finance, economics, and energy.
The impact of temporal noise in extended lookback windows remains underexplored, often degrading model performance and computational efficiency.
We propose a novel framework that addresses these challenges by integrating the Discrete Wavelet Transform (DWT) and Discrete Cosine Transform (DCT).
Our approach demonstrates competitive robustness to noisy input, significantly reduces computational complexity, and achieves competitive or state-of-the-art forecasting performance across diverse benchmark datasets.
arXiv Detail & Related papers (2025-02-14T21:41:42Z) - Unveiling the Power of Noise Priors: Enhancing Diffusion Models for Mobile Traffic Prediction [11.091373697136047]
Noise shapes mobile traffic predictions, exhibiting distinct and consistent patterns.
We propose NPDiff, a framework that decomposes noise into prior and residual components.
NPDiff can seamlessly integrate with various diffusion-based prediction models, delivering predictions that are effective, efficient, and robust.
arXiv Detail & Related papers (2025-01-23T16:13:08Z) - Constant Rate Schedule: Constant-Rate Distributional Change for Efficient Training and Sampling in Diffusion Models [16.863038973001483]
We propose a noise schedule that ensures a constant rate of change in the probability distribution of diffused data throughout the diffusion process.
The functional form of the noise schedule is automatically determined and tailored to each dataset and type of diffusion model.
arXiv Detail & Related papers (2024-11-19T03:02:39Z) - Inference-Time Alignment of Diffusion Models with Direct Noise Optimization [45.77751895345154]
We propose a novel alignment approach, named Direct Noise Optimization (DNO), that optimizes the injected noise during the sampling process of diffusion models.
By design, DNO operates at inference-time, and thus is tuning-free and prompt-agnostic, with the alignment occurring in an online fashion during generation.
We conduct extensive experiments on several important reward functions and demonstrate that the proposed DNO approach can achieve state-of-the-art reward scores within a reasonable time budget for generation.
arXiv Detail & Related papers (2024-05-29T08:39:39Z) - Learning with Noisy Foundation Models [95.50968225050012]
This paper is the first work to comprehensively understand and analyze the nature of noise in pre-training datasets.
We propose a tuning method (NMTune) that applies an affine transformation to the feature space to mitigate the malignant effect of noise and improve generalization.
arXiv Detail & Related papers (2024-03-11T16:22:41Z) - Blue noise for diffusion models [50.99852321110366]
We introduce a novel and general class of diffusion models taking correlated noise within and across images into account.
Our framework allows introducing correlation across images within a single mini-batch to improve gradient flow.
We perform both qualitative and quantitative evaluations on a variety of datasets using our method.
arXiv Detail & Related papers (2024-02-07T14:59:25Z) - Not All Steps are Equal: Efficient Generation with Progressive Diffusion Models [62.155612146799314]
We propose a novel two-stage training strategy termed Step-Adaptive Training.
In the initial stage, a base denoising model is trained to encompass all timesteps.
In the second stage, we partition the timesteps into distinct groups, fine-tuning the model within each group to achieve specialized denoising capabilities.
arXiv Detail & Related papers (2023-12-20T03:32:58Z) - DiffSED: Sound Event Detection with Denoising Diffusion [70.18051526555512]
We reformulate the SED problem by taking a generative learning perspective.
Specifically, we aim to generate sound temporal boundaries from noisy proposals in a denoising diffusion process.
During training, our model learns to reverse the noising process by converting noisy latent queries to their ground-truth versions.
arXiv Detail & Related papers (2023-08-14T17:29:41Z) - Conditional Denoising Diffusion for Sequential Recommendation [62.127862728308045]
Two prominent generative models, Generative Adversarial Networks (GANs) and Variational AutoEncoders (VAEs), have well-known limitations: GANs suffer from unstable optimization, while VAEs are prone to posterior collapse and over-smoothed generations.
We present a conditional denoising diffusion model, which includes a sequence encoder, a cross-attentive denoising decoder, and a step-wise diffuser.
arXiv Detail & Related papers (2023-04-22T15:32:59Z) - On the Importance of Noise Scheduling for Diffusion Models [8.360383061862844]
We study the effect of noise scheduling strategies for denoising diffusion generative models.
The resulting simple recipe yields state-of-the-art pixel-based diffusion models for high-resolution images on ImageNet.
arXiv Detail & Related papers (2023-01-26T07:37:22Z) - Self-Adapting Noise-Contrastive Estimation for Energy-Based Models [0.0]
Training energy-based models with noise-contrastive estimation (NCE) is theoretically feasible but practically challenging.
Previous works have explored modelling the noise distribution as a separate generative model, and then concurrently training this noise model with the EBM.
This thesis proposes a self-adapting NCE algorithm which uses static instances of the EBM along its training trajectory as the noise distribution.
arXiv Detail & Related papers (2022-11-03T15:17:43Z) - How Much is Enough? A Study on Diffusion Times in Score-based Generative
Models [76.76860707897413]
Current best practice advocates for a large T to ensure that the forward dynamics brings the diffusion sufficiently close to a known and simple noise distribution.
We show how an auxiliary model can be used to bridge the gap between the ideal and the simulated forward dynamics, followed by a standard reverse diffusion process.
arXiv Detail & Related papers (2022-06-10T15:09:46Z) - Perception Prioritized Training of Diffusion Models [34.674477039333475]
We show that restoring data corrupted with certain noise levels offers a proper pretext for the model to learn rich visual concepts.
We propose to prioritize such noise levels over other levels during training, by redesigning the weighting scheme of the objective function.
arXiv Detail & Related papers (2022-04-01T06:22:23Z) - The Optimal Noise in Noise-Contrastive Learning Is Not What You Think [80.07065346699005]
We show that deviating from this assumption can actually lead to better statistical estimators.
In particular, the optimal noise distribution is different from the data's and even from a different family.
arXiv Detail & Related papers (2022-03-02T13:59:20Z) - A Study on Speech Enhancement Based on Diffusion Probabilistic Model [63.38586161802788]
We propose DiffuSE, a diffusion probabilistic model-based speech enhancement model that aims to recover clean speech signals from noisy signals.
The experimental results show that DiffuSE yields performance that is comparable to related audio generative models on the standardized Voice Bank corpus task.
arXiv Detail & Related papers (2021-07-25T19:23:18Z)