Debias the Training of Diffusion Models
- URL: http://arxiv.org/abs/2310.08442v2
- Date: Sat, 4 Nov 2023 02:32:44 GMT
- Title: Debias the Training of Diffusion Models
- Authors: Hu Yu, Li Shen, Jie Huang, Man Zhou, Hongsheng Li, Feng Zhao
- Abstract summary: We provide theoretical evidence that the prevailing practice of using a constant loss weight strategy in diffusion models leads to biased estimation during the training phase.
We propose an elegant and effective weighting strategy grounded in the theoretically unbiased principle.
These analyses are expected to advance our understanding and demystify the inner workings of diffusion models.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Diffusion models have demonstrated compelling generation quality by
optimizing the variational lower bound through a simple denoising score
matching loss. In this paper, we provide theoretical evidence that the
prevailing practice of using a constant loss weight strategy in diffusion
models leads to biased estimation during the training phase. Simply optimizing
the denoising network to predict Gaussian noise with constant weighting may
hinder precise estimations of original images. To address the issue, we propose
an elegant and effective weighting strategy grounded in the theoretically
unbiased principle. Moreover, we conduct a comprehensive and systematic
exploration to dissect the inherent bias problem arising from the
constant-weighting loss, examining its existence, impact, and underlying causes.
These analyses are expected to advance our understanding and demystify the
inner workings of diffusion models. Through empirical evaluation, we
demonstrate that our proposed debiased estimation method significantly enhances
sample quality without relying on complex techniques, and is more efficient
than the baseline method in both training and sampling.
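The abstract does not spell out the paper's debiased weights, so the sketch below is only an illustration of the idea it criticizes and amends: it contrasts the constant-weight denoising score matching loss with a hypothetical SNR-dependent reweighting on a toy noise schedule (the schedule, the bounded SNR weight, and all shapes are assumptions, not the paper's actual scheme).

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy linear noise schedule: alpha_bar_t decays from ~1 toward 0.
T = 1000
betas = np.linspace(1e-4, 0.02, T)
alpha_bars = np.cumprod(1.0 - betas)

def dsm_loss(eps_pred, eps, t, weighting="constant"):
    """Denoising score matching loss with a per-timestep weight.

    weighting="constant" is the prevailing practice the paper argues is
    biased; "snr" is an illustrative, hypothetical reweighting -- the
    paper's actual debiased weights are derived in the full text.
    """
    per_sample = np.mean((eps_pred - eps) ** 2, axis=-1)
    if weighting == "constant":
        w = np.ones_like(per_sample)
    elif weighting == "snr":
        # Weight each sample by a bounded function of its timestep's
        # signal-to-noise ratio instead of a constant.
        snr = alpha_bars[t] / (1.0 - alpha_bars[t])
        w = snr / (snr + 1.0)
    else:
        raise ValueError(weighting)
    return float(np.mean(w * per_sample))

# Dummy batch: true noise and an imperfect prediction of it.
eps = rng.standard_normal((8, 16))
eps_pred = eps + 0.1 * rng.standard_normal((8, 16))
t = rng.integers(0, T, size=8)

print(dsm_loss(eps_pred, eps, t, "constant"))
print(dsm_loss(eps_pred, eps, t, "snr"))
```

Because the illustrative weight is strictly below one, the reweighted loss is always smaller than the constant-weight loss on the same batch; the point of such schemes is not the scale but how the relative emphasis across timesteps changes.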
Related papers
- PiRD: Physics-informed Residual Diffusion for Flow Field Reconstruction [5.06136344261226]
CNN-based methods for data fidelity enhancement rely on low-fidelity data patterns and distributions during the training phase.
Our proposed model - Physics-informed Residual Diffusion - demonstrates the capability to elevate data quality from standard low-fidelity inputs.
Experimental results have shown that our approach can effectively reconstruct high-quality outcomes for two-dimensional turbulent flows without requiring retraining.
arXiv Detail & Related papers (2024-04-12T11:45:51Z)
- Data Attribution for Diffusion Models: Timestep-induced Bias in Influence Estimation [58.20016784231991]
Diffusion models operate over a sequence of timesteps instead of instantaneous input-output relationships in previous contexts.
We present Diffusion-TracIn, which incorporates these temporal dynamics, and observe that samples' loss gradient norms are highly dependent on the timestep.
We introduce Diffusion-ReTrac as a re-normalized adaptation that enables the retrieval of training samples more targeted to the test sample of interest.
arXiv Detail & Related papers (2024-01-17T07:58:18Z)
- A PAC-Bayesian Perspective on the Interpolating Information Criterion [54.548058449535155]
We show how a PAC-Bayes bound is obtained for a general class of models, characterizing factors which influence performance in the interpolating regime.
We quantify how the test error of overparameterized models achieving effectively zero training error depends on the quality of the implicit regularization imposed by, e.g., the combination of model and parameter-initialization scheme.
arXiv Detail & Related papers (2023-11-13T01:48:08Z)
- Observation-Guided Diffusion Probabilistic Models [41.749374023639156]
We propose a novel diffusion-based image generation method called the observation-guided diffusion probabilistic model (OGDM).
Our approach reestablishes the training objective by integrating the guidance of the observation process with the Markov chain.
We demonstrate the effectiveness of our training algorithm using diverse inference techniques on strong diffusion model baselines.
arXiv Detail & Related papers (2023-10-06T06:29:06Z)
- The Surprising Effectiveness of Diffusion Models for Optical Flow and Monocular Depth Estimation [42.48819460873482]
Denoising diffusion probabilistic models have transformed image generation with their impressive fidelity and diversity.
We show that they also excel in estimating optical flow and monocular depth, surprisingly, without task-specific architectures and loss functions.
arXiv Detail & Related papers (2023-06-02T21:26:20Z)
- DiffusionAD: Norm-guided One-step Denoising Diffusion for Anomaly Detection [89.49600182243306]
We reformulate the reconstruction process using a diffusion model into a noise-to-norm paradigm.
We propose a rapid one-step denoising paradigm, significantly faster than the traditional iterative denoising in diffusion models.
The segmentation sub-network predicts pixel-level anomaly scores using the input image and its anomaly-free restoration.
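In DiffusionAD the segmentation sub-network that scores pixels is learned; as a minimal stand-in for the idea of scoring each pixel against its anomaly-free restoration, the toy sketch below uses a plain per-pixel absolute difference (the images, the restoration, and the scoring rule are all illustrative assumptions, not the paper's architecture).

```python
import numpy as np

def pixel_anomaly_scores(image, restoration):
    """Toy pixel-level anomaly score: deviation of each pixel of the
    input from its (assumed) anomaly-free restoration. DiffusionAD's
    actual segmentation sub-network is learned; this absolute
    difference is only an illustrative stand-in."""
    return np.abs(image - restoration)

image = np.zeros((4, 4))
image[1, 2] = 1.0               # a single anomalous pixel
restoration = np.zeros((4, 4))  # ideal anomaly-free reconstruction

scores = pixel_anomaly_scores(image, restoration)
print(scores.max(), np.unravel_index(scores.argmax(), scores.shape))
```

Under this toy rule, the anomalous pixel receives the maximal score and every normal pixel scores zero, which is the behavior a learned segmentation head approximates.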
arXiv Detail & Related papers (2023-03-15T16:14:06Z)
- How Much is Enough? A Study on Diffusion Times in Score-based Generative Models [76.76860707897413]
Current best practice advocates for a large T to ensure that the forward dynamics brings the diffusion sufficiently close to a known and simple noise distribution.
We show how an auxiliary model can be used to bridge the gap between the ideal and the simulated forward dynamics, followed by a standard reverse diffusion process.
arXiv Detail & Related papers (2022-06-10T15:09:46Z)
- Deblurring via Stochastic Refinement [85.42730934561101]
We present an alternative framework for blind deblurring based on conditional diffusion models.
Our method is competitive in terms of distortion metrics such as PSNR.
arXiv Detail & Related papers (2021-12-05T04:36:09Z)
- Towards Robust Waveform-Based Acoustic Models [41.82019240477273]
We propose an approach for learning robust acoustic models in adverse environments, characterized by a significant mismatch between training and test conditions.
Our approach is an instance of vicinal risk minimization, which aims to improve risk estimates during training by replacing the delta functions that define the empirical density over the input space with an approximation of the marginal population density in the vicinity of the training samples.
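One classic instance of vicinal risk minimization replaces each training point's delta function with a Gaussian density centered at that point, so the model trains on perturbed copies drawn from each sample's vicinity. The sketch below shows that Gaussian variant only as an illustration (the sigma, the number of vicinal draws, and the data are assumptions; the paper's acoustic-model formulation may differ).

```python
import numpy as np

rng = np.random.default_rng(0)

def gaussian_vicinal_batch(x, sigma=0.1, k=4):
    """Gaussian vicinal risk minimization, sketched: instead of the
    empirical delta at each training point, draw k perturbed copies
    from a Gaussian vicinity around it (sigma and k are free
    assumptions for this illustration)."""
    n, d = x.shape
    noise = sigma * rng.standard_normal((n, k, d))
    return (x[:, None, :] + noise).reshape(n * k, d)

x = rng.standard_normal((8, 3))    # 8 training samples, 3 features
vx = gaussian_vicinal_batch(x)     # 32 vicinal samples
print(vx.shape)                    # (32, 3)
```

Training on such vicinal draws smooths the empirical risk toward the marginal population density around the training data, which is what makes the resulting models more robust to train/test mismatch.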
arXiv Detail & Related papers (2021-10-16T18:21:34Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.