Generalization Dynamics of Linear Diffusion Models
- URL: http://arxiv.org/abs/2505.24769v1
- Date: Fri, 30 May 2025 16:31:58 GMT
- Title: Generalization Dynamics of Linear Diffusion Models
- Authors: Claudia Merger, Sebastian Goldt
- Abstract summary: We analytically study the memorisation-to-generalisation transition in a simple model using linear denoisers. Our work clarifies how sample complexity governs generalisation in a simple model of diffusion-based generative models.
- Score: 8.107431208836426
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Diffusion models trained on finite datasets with $N$ samples from a target distribution exhibit a transition from memorisation, where the model reproduces training examples, to generalisation, where it produces novel samples that reflect the underlying data distribution. Understanding this transition is key to characterising the sample efficiency and reliability of generative models, but our theoretical understanding of this transition is incomplete. Here, we analytically study the memorisation-to-generalisation transition in a simple model using linear denoisers, which allow explicit computation of test errors, sampling distributions, and Kullback-Leibler divergences between samples and target distribution. Using these measures, we predict that this transition occurs roughly when $N \asymp d$, the dimension of the inputs. When $N$ is smaller than the dimension of the inputs $d$, so that only a fraction of relevant directions of variation are present in the training data, we demonstrate how both regularization and early stopping help to prevent overfitting. For $N > d$, we find that the sampling distributions of linear diffusion models approach their optimum (measured by the Kullback-Leibler divergence) linearly with $d/N$, independent of the specifics of the data distribution. Our work clarifies how sample complexity governs generalisation in a simple model of diffusion-based generative models and provides insight into the training dynamics of linear denoisers.
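The abstract's remark that for $N < d$ only a fraction of the directions of variation are present in the training data can be checked numerically. The following is a minimal sketch, not the authors' code: it assumes i.i.d. standard Gaussian data, and the helper names `matrix_rank` and `spanned_directions` are hypothetical. It counts the directions spanned by $N$ samples in $d$ dimensions, which equals the rank of the uncentred empirical covariance.

```python
import random


def matrix_rank(rows, tol=1e-9):
    """Rank of a matrix (list of rows) via Gaussian elimination with partial pivoting."""
    m = [list(r) for r in rows]
    n_rows, n_cols = len(m), len(m[0])
    rank = 0
    for col in range(n_cols):
        if rank == n_rows:
            break
        # Pick the remaining row with the largest entry in this column.
        pivot = max(range(rank, n_rows), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < tol:
            continue  # no usable pivot: this column adds no new direction
        m[rank], m[pivot] = m[pivot], m[rank]
        # Eliminate the column from all rows below the pivot row.
        for r in range(rank + 1, n_rows):
            factor = m[r][col] / m[rank][col]
            for c in range(col, n_cols):
                m[r][c] -= factor * m[rank][c]
        rank += 1
    return rank


def spanned_directions(n_samples, dim, seed=0):
    """Number of directions spanned by n_samples Gaussian vectors in dim dimensions.

    Assumption for illustration: i.i.d. standard Gaussian data, not the paper's setup.
    """
    rng = random.Random(seed)
    data = [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(n_samples)]
    return matrix_rank(data)


if __name__ == "__main__":
    d = 8
    for n in (2, 4, 8, 16):
        print(f"N={n:2d}, d={d}: training data spans {spanned_directions(n, d)} directions")
```

For such data the count is $\min(N, d)$ with probability one, so a model fitted to fewer than $d$ samples has no information about the remaining $d - N$ directions.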
Related papers
- Resolving Memorization in Empirical Diffusion Model for Manifold Data in High-Dimensional Spaces [5.716752583983991]
When the data distribution consists of n points, empirical diffusion models tend to reproduce existing data points.
This work shows that the memorization issue can be solved simply by applying an inertia update at the end of the empirical diffusion simulation.
We demonstrate that the distribution of samples from this model approximates the true data distribution on a $C^2$ manifold of dimension $d$, within a Wasserstein-1 distance of order $O(n^{-\frac{2}{d+4}})$.
arXiv Detail & Related papers (2025-05-05T09:40:41Z)
- Non-Normal Diffusion Models [3.5534933448684134]
Diffusion models generate samples by incrementally reversing a process that turns data into noise.
We show that when the step size goes to zero, the reversed process is invariant to the distribution of these increments.
We demonstrate the effectiveness of these models on density estimation and generative modeling tasks on standard image datasets.
arXiv Detail & Related papers (2024-12-10T21:31:12Z)
- A solvable generative model with a linear, one-step denoiser [0.0]
We develop an analytically tractable single-step diffusion model based on a linear denoiser.
We show that the monotonic fall phase of Kullback-Leibler divergence begins when the training dataset size reaches the dimension of the data points.
arXiv Detail & Related papers (2024-11-26T19:00:01Z)
- Understanding Generalizability of Diffusion Models Requires Rethinking the Hidden Gaussian Structure [8.320632531909682]
We study the generalizability of diffusion models by looking into the hidden properties of the learned score functions.
As diffusion models transition from memorization to generalization, their corresponding nonlinear diffusion denoisers exhibit increasing linearity.
arXiv Detail & Related papers (2024-10-31T15:57:04Z)
- Constrained Diffusion Models via Dual Training [80.03953599062365]
Diffusion processes are prone to generating samples that reflect biases in a training dataset.
We develop constrained diffusion models by imposing diffusion constraints based on desired distributions.
We show that our constrained diffusion models generate new data from a mixture data distribution that achieves the optimal trade-off among objective and constraints.
arXiv Detail & Related papers (2024-08-27T14:25:42Z)
- Informed Correctors for Discrete Diffusion Models [31.814439169033616]
We propose a predictor-corrector sampling scheme where the corrector is informed by the diffusion model to more reliably counter the accumulating approximation errors.
On tokenized ImageNet 256x256, this approach consistently produces superior samples with fewer steps, achieving improved FID scores for discrete diffusion models.
arXiv Detail & Related papers (2024-07-30T23:29:29Z)
- Amortizing intractable inference in diffusion models for vision, language, and control [89.65631572949702]
This paper studies amortized sampling of the posterior over data, $\mathbf{x}\sim p^{\rm post}(\mathbf{x})\propto p(\mathbf{x})r(\mathbf{x})$, in a model that consists of a diffusion generative model prior $p(\mathbf{x})$ and a black-box constraint or function $r(\mathbf{x})$.
We prove the correctness of a data-free learning objective, relative trajectory balance, for training a diffusion model that samples from
arXiv Detail & Related papers (2024-05-31T16:18:46Z)
- Theoretical Insights for Diffusion Guidance: A Case Study for Gaussian Mixture Models [59.331993845831946]
Diffusion models benefit from instillation of task-specific information into the score function to steer the sample generation towards desired properties.
This paper provides the first theoretical study towards understanding the influence of guidance on diffusion models in the context of Gaussian mixture models.
arXiv Detail & Related papers (2024-03-03T23:15:48Z)
- Lecture Notes in Probabilistic Diffusion Models [0.5361320134021585]
Diffusion models are loosely modelled based on non-equilibrium thermodynamics.
The diffusion model learns the data manifold to which the original and thus the reconstructed data samples belong.
Diffusion models have -- unlike variational autoencoder and flow models -- latent variables with the same dimensionality as the original data.
arXiv Detail & Related papers (2023-12-16T09:36:54Z)
- Diffusion models for probabilistic programming [56.47577824219207]
Diffusion Model Variational Inference (DMVI) is a novel method for automated approximate inference in probabilistic programming languages (PPLs)
DMVI is easy to implement, allows hassle-free inference in PPLs without the drawbacks of, e.g., variational inference using normalizing flows, and does not make any constraints on the underlying neural network model.
arXiv Detail & Related papers (2023-11-01T12:17:05Z)
- Generative Diffusion From An Action Principle [0.0]
We show that score matching can be derived from an action principle, like the ones commonly used in physics.
We use this insight to demonstrate the connection between different classes of diffusion models.
arXiv Detail & Related papers (2023-10-06T18:00:00Z)
- On Error Propagation of Diffusion Models [77.91480554418048]
We develop a theoretical framework to mathematically formulate error propagation in the architecture of DMs.
We apply the cumulative error as a regularization term to reduce error propagation.
Our proposed regularization reduces error propagation, significantly improves vanilla DMs, and outperforms previous baselines.
arXiv Detail & Related papers (2023-08-09T15:31:17Z)
- Towards Faster Non-Asymptotic Convergence for Diffusion-Based Generative Models [49.81937966106691]
We develop a suite of non-asymptotic theory towards understanding the data generation process of diffusion models.
In contrast to prior works, our theory is developed based on an elementary yet versatile non-asymptotic approach.
arXiv Detail & Related papers (2023-06-15T16:30:08Z)
- Reflected Diffusion Models [93.26107023470979]
We present Reflected Diffusion Models, which reverse a reflected differential equation evolving on the support of the data.
Our approach learns the score function through a generalized score matching loss and extends key components of standard diffusion models.
arXiv Detail & Related papers (2023-04-10T17:54:38Z)
This list is automatically generated from the titles and abstracts of the papers in this site.