An Analytical Theory of Spectral Bias in the Learning Dynamics of Diffusion Models
- URL: http://arxiv.org/abs/2503.03206v2
- Date: Thu, 14 Aug 2025 21:54:26 GMT
- Title: An Analytical Theory of Spectral Bias in the Learning Dynamics of Diffusion Models
- Authors: Binxu Wang, Cengiz Pehlevan
- Abstract summary: We develop an analytical framework for understanding how the generated distribution evolves during diffusion model training. We integrate the resulting probability-flow ODE, yielding analytic expressions for the generated distribution.
- Score: 29.972063833424215
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We develop an analytical framework for understanding how the generated distribution evolves during diffusion model training. Leveraging a Gaussian-equivalence principle, we solve the full-batch gradient-flow dynamics of linear and convolutional denoisers and integrate the resulting probability-flow ODE, yielding analytic expressions for the generated distribution. The theory exposes a universal inverse-variance spectral law: the time for an eigen- or Fourier mode to match its target variance scales as $\tau\propto\lambda^{-1}$, so high-variance (coarse) structure is mastered orders of magnitude sooner than low-variance (fine) detail. Extending the analysis to deep linear networks and circulant full-width convolutions shows that weight sharing merely multiplies learning rates, accelerating but not eliminating the bias, whereas local convolution introduces a qualitatively different bias. Experiments on Gaussian and natural-image datasets confirm that the spectral law persists in deep MLP-based UNets. Convolutional U-Nets, however, display rapid, near-simultaneous emergence of many modes, implicating local convolution in reshaping learning dynamics. These results underscore how data covariance governs the order and speed with which diffusion models learn, and they call for deeper investigation of the unique inductive biases introduced by local convolution.
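To see the inverse-variance law in a minimal setting, the sketch below runs full-batch gradient descent (a discretized gradient flow) on a per-mode linear denoiser trained on Gaussian data with diagonal covariance, and records when each mode's output variance reaches its target. This is an illustrative sketch, not the authors' code: the eigenvalues in `lambdas`, the noise level `sigma2`, the step size `dt`, and the 90% threshold are all assumed for demonstration.

```python
# Hypothetical sketch of the inverse-variance spectral law: for a diagonal
# linear denoiser w minimizing E||w(x + sigma*eps) - x||^2 per mode, gradient
# flow converges at rate ~ (lambda + sigma^2), so the time for a mode to reach
# its target variance scales like 1/lambda when lambda >> sigma^2.
import numpy as np

lambdas = np.array([4.0, 1.0, 0.25, 0.0625])  # assumed target mode variances (coarse -> fine)
sigma2 = 1e-3                                 # assumed denoising noise level
dt = 1e-3                                     # gradient-flow time step
w = np.zeros_like(lambdas)                    # per-mode denoiser weights, zero-initialized
t_hit = np.full(len(lambdas), np.nan)         # time each mode first reaches 90% of target

for step in range(400_000):
    # Exact full-batch gradient of the per-mode denoising loss
    grad = 2.0 * (w * (lambdas + sigma2) - lambdas)
    w -= dt * grad
    # Output variance of mode i under the current denoiser, w_i^2 * (lambda_i + sigma^2),
    # used here as a proxy for the generated mode variance
    gen_var = w**2 * (lambdas + sigma2)
    newly_hit = np.isnan(t_hit) & (gen_var >= 0.9 * lambdas)
    t_hit[newly_hit] = step * dt
    if not np.isnan(t_hit).any():
        break

print("tau per mode :", t_hit)
print("tau * lambda :", t_hit * lambdas)  # roughly constant => tau ~ 1/lambda
```

In this toy setting the products tau * lambda come out nearly equal across modes, matching the abstract's claim that high-variance (coarse) modes reach their target variance much earlier than low-variance (fine) ones.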
Related papers
- Rethinking Diffusion Models with Symmetries through Canonicalization with Applications to Molecular Graph Generation [56.361076943802594]
CanonFlow achieves state-of-the-art performance on the challenging GEOM-DRUG dataset, and the advantage remains large in few-step generation.
arXiv Detail & Related papers (2026-02-16T18:58:55Z) - Diffusion Model's Generalization Can Be Characterized by Inductive Biases toward a Data-Dependent Ridge Manifold [19.059115911590776]
We explicitly characterize what diffusion models generate by proposing a log-density ridge manifold. We show how the generated data relate to this manifold as the inference dynamics progress. A more detailed understanding of training dynamics will lead to more accurate quantification of the generative inductive bias.
arXiv Detail & Related papers (2026-02-05T18:55:03Z) - A Random Matrix Theory Perspective on the Consistency of Diffusion Models [31.63433424187031]
Diffusion models trained on different subsets of a dataset often produce strikingly similar outputs when given the same noise seed. We develop a random matrix theory (RMT) framework that quantifies how finite training data shape the expectation and variance of the learned denoiser and sampling map. We validate its predictions on UNet and DiT architectures in their non-memorization regime.
arXiv Detail & Related papers (2026-02-02T23:30:28Z) - An Elementary Approach to Scheduling in Generative Diffusion Models [55.171367482496755]
An elementary approach to characterizing the impact of noise scheduling and time discretization in generative diffusion models is developed. Experiments across different datasets and pretrained models demonstrate that the time discretization strategy selected by our approach consistently outperforms baseline and search-based strategies.
arXiv Detail & Related papers (2026-01-20T05:06:26Z) - EquiReg: Equivariance Regularized Diffusion for Inverse Problems [67.01847869495558]
We propose EquiReg diffusion, a framework for regularizing posterior sampling in diffusion-based inverse problem solvers. When applied to a variety of solvers, EquiReg outperforms state-of-the-art diffusion models in both linear and nonlinear image restoration tasks.
arXiv Detail & Related papers (2025-05-29T01:25:43Z) - Kernel-Smoothed Scores for Denoising Diffusion: A Bias-Variance Study [3.265950484493743]
Diffusion models can be prone to memorization. Regularization on the score has the same effect as increasing the size of the training dataset. This perspective highlights two regularization mechanisms taking place in denoising diffusions.
arXiv Detail & Related papers (2025-05-28T20:22:18Z) - Overcoming Dimensional Factorization Limits in Discrete Diffusion Models through Quantum Joint Distribution Learning [79.65014491424151]
We propose a quantum Discrete Denoising Diffusion Probabilistic Model (QD3PM). It enables joint probability learning through diffusion and denoising in exponentially large Hilbert spaces. This paper establishes a new theoretical paradigm in generative models by leveraging the quantum advantage in joint distribution learning.
arXiv Detail & Related papers (2025-05-08T11:48:21Z) - Generalization through variance: how noise shapes inductive biases in diffusion models [0.0]
We develop a mathematical theory that partly explains the 'generalization through variance' phenomenon.
We find that the distributions diffusion models effectively learn to sample from resemble their training distributions.
We also characterize how this inductive bias interacts with feature-related inductive biases.
arXiv Detail & Related papers (2025-04-16T23:41:10Z) - Critical Iterative Denoising: A Discrete Generative Model Applied to Graphs [52.50288418639075]
We propose a novel framework called Iterative Denoising, which simplifies discrete diffusion and circumvents the issue by assuming conditional independence across time.
Our empirical evaluations demonstrate that the proposed method significantly outperforms existing discrete diffusion baselines in graph generation tasks.
arXiv Detail & Related papers (2025-03-27T15:08:58Z) - Understanding Generalizability of Diffusion Models Requires Rethinking the Hidden Gaussian Structure [8.320632531909682]
We study the generalizability of diffusion models by looking into the hidden properties of the learned score functions. As diffusion models transition from memorization to generalization, their corresponding nonlinear diffusion denoisers exhibit increasing linearity.
arXiv Detail & Related papers (2024-10-31T15:57:04Z) - On the Wasserstein Convergence and Straightness of Rectified Flow [54.580605276017096]
Rectified Flow (RF) is a generative model that aims to learn straight flow trajectories from noise to data. We provide a theoretical analysis of the Wasserstein distance between the sampling distribution of RF and the target distribution. We present general conditions guaranteeing uniqueness and straightness of 1-RF, which is in line with previous empirical findings.
arXiv Detail & Related papers (2024-10-19T02:36:11Z) - On the Relation Between Linear Diffusion and Power Iteration [42.158089783398616]
We study the generation process as a "correlation machine".
We show that low frequencies emerge earlier in the generation process, where the denoising basis vectors are more aligned with the true data, at a rate depending on their eigenvalues.
This model allows us to show that the linear diffusion model converges in mean to the leading eigenvector of the underlying data, similarly to the prevalent power iteration method.
arXiv Detail & Related papers (2024-10-16T07:33:12Z) - How Discrete and Continuous Diffusion Meet: Comprehensive Analysis of Discrete Diffusion Models via a Stochastic Integral Framework [11.71206628091551]
We propose a comprehensive framework for the error analysis of discrete diffusion models based on Lévy-type integrals.
Our framework unifies and strengthens the current theoretical results on discrete diffusion models.
arXiv Detail & Related papers (2024-10-04T16:59:29Z) - Theoretical Insights for Diffusion Guidance: A Case Study for Gaussian Mixture Models [59.331993845831946]
Diffusion models benefit from the instillation of task-specific information into the score function to steer sample generation towards desired properties.
This paper provides the first theoretical study towards understanding the influence of guidance on diffusion models in the context of Gaussian mixture models.
arXiv Detail & Related papers (2024-03-03T23:15:48Z) - On Error Propagation of Diffusion Models [77.91480554418048]
We develop a theoretical framework to mathematically formulate error propagation in the architecture of DMs.
We apply the cumulative error as a regularization term to reduce error propagation.
Our proposed regularization reduces error propagation, significantly improves vanilla DMs, and outperforms previous baselines.
arXiv Detail & Related papers (2023-08-09T15:31:17Z) - A Geometric Perspective on Diffusion Models [57.27857591493788]
We inspect the ODE-based sampling of a popular variance-exploding SDE.
We establish a theoretical relationship between the optimal ODE-based sampling and the classic mean-shift (mode-seeking) algorithm.
arXiv Detail & Related papers (2023-05-31T15:33:16Z) - A Variational Perspective on Solving Inverse Problems with Diffusion Models [101.831766524264]
Inverse tasks can be formulated as inferring a posterior distribution over data.
This is however challenging in diffusion models since the nonlinear and iterative nature of the diffusion process renders the posterior intractable.
We propose a variational approach that by design seeks to approximate the true posterior distribution.
arXiv Detail & Related papers (2023-05-07T23:00:47Z) - Diffusion Models are Minimax Optimal Distribution Estimators [49.47503258639454]
We provide the first rigorous analysis on approximation and generalization abilities of diffusion modeling.
We show that when the true density function belongs to the Besov space and the empirical score matching loss is properly minimized, the generated data distribution achieves the nearly minimax optimal estimation rates.
arXiv Detail & Related papers (2023-03-03T11:31:55Z) - Information-Theoretic Diffusion [18.356162596599436]
Denoising diffusion models have spurred significant gains in density modeling and image generation.
We introduce a new mathematical foundation for diffusion models inspired by classic results in information theory.
arXiv Detail & Related papers (2023-02-07T23:03:07Z) - How Much is Enough? A Study on Diffusion Times in Score-based Generative
Models [76.76860707897413]
Current best practice advocates for a large T to ensure that the forward dynamics brings the diffusion sufficiently close to a known and simple noise distribution.
We show how an auxiliary model can be used to bridge the gap between the ideal and the simulated forward dynamics, followed by a standard reverse diffusion process.
arXiv Detail & Related papers (2022-06-10T15:09:46Z) - Diffusion-GAN: Training GANs with Diffusion [135.24433011977874]
Generative adversarial networks (GANs) are challenging to train stably.
We propose Diffusion-GAN, a novel GAN framework that leverages a forward diffusion chain to generate instance noise.
We show that Diffusion-GAN can produce more realistic images with higher stability and data efficiency than state-of-the-art GANs.
arXiv Detail & Related papers (2022-06-05T20:45:01Z)