Diffusion Model's Generalization Can Be Characterized by Inductive Biases toward a Data-Dependent Ridge Manifold
- URL: http://arxiv.org/abs/2602.06021v1
- Date: Thu, 05 Feb 2026 18:55:03 GMT
- Title: Diffusion Model's Generalization Can Be Characterized by Inductive Biases toward a Data-Dependent Ridge Manifold
- Authors: Ye He, Yitong Qiu, Molei Tao
- Abstract summary: We explicitly characterize what a diffusion model generates by proposing a log-density ridge manifold. We show how the generated data relate to this manifold as the inference dynamics progress. A more detailed understanding of training dynamics will lead to more accurate quantification of the generation inductive bias.
- Score: 19.059115911590776
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: When a diffusion model is not memorizing the training data set, how exactly does it generalize? A quantitative understanding of the distribution it generates would benefit, for example, an assessment of the model's performance for downstream applications. We thus explicitly characterize what a diffusion model generates by proposing a log-density ridge manifold and quantifying how the generated data relate to this manifold as the inference dynamics progress. More precisely, inference undergoes a reach-align-slide process centered around the ridge manifold: trajectories first reach a neighborhood of the manifold, then align as they are pushed toward or away from the manifold in normal directions, and finally slide along the manifold in tangent directions. Within the scope of this general behavior, different training errors lead to different normal and tangent motions, which can be quantified, and these detailed motions characterize when inter-mode generations emerge. A more detailed understanding of training dynamics leads to a more accurate quantification of the generation inductive bias; as an example, we consider a random feature model, for which we can explicitly illustrate how a diffusion model's inductive biases originate as a composition of architectural bias and training accuracy, and how they evolve with the inference dynamics. Experiments on synthetic multimodal distributions and MNIST latent diffusion support the predicted directional effects, in both low and high dimensions.
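To make the reach-align-slide picture concrete, here is a minimal sketch, assuming a toy two-mode 2D Gaussian mixture with a constant-beta VP schedule (our construction, not the paper's experimental setup). The score of the noised mixture is available in closed form, so the probability-flow ODE can be integrated directly, and we can read off the coordinates normal and tangent to the ridge, which for this symmetric mixture lies along the x-axis between the modes.

```python
import numpy as np

# Toy setup (our assumption, not the paper's experiments): a two-mode 2D
# Gaussian mixture with means on the x-axis, noised by a constant-beta VP SDE.
means = np.array([[-2.0, 0.0], [2.0, 0.0]])
s2 = 0.1      # component variance of the data distribution
beta = 1.0    # forward SDE: dx = -0.5*beta*x dt + sqrt(beta) dW

def score(x, t):
    """Exact score of the noised mixture at time t under the VP forward process."""
    a = np.exp(-0.5 * beta * t)          # signal scale alpha_t
    var = s2 * a**2 + (1.0 - a**2)       # per-component variance after noising
    diffs = x[None, :] - a * means       # offsets to the scaled means, shape (K, 2)
    logw = -0.5 * np.sum(diffs**2, axis=1) / var
    w = np.exp(logw - logw.max())
    w /= w.sum()
    return -(w[:, None] * diffs).sum(axis=0) / var

def pf_ode_trajectory(T=5.0, steps=1000, seed=0):
    """Integrate the probability-flow ODE from t=T down to t=0 with Euler steps."""
    x = np.random.default_rng(seed).normal(size=2)  # start near the N(0, I) prior
    dt = T / steps
    traj = [x.copy()]
    for i in range(steps):
        t = T - i * dt
        v = -0.5 * beta * x - 0.5 * beta * score(x, t)  # PF-ODE velocity
        x = x - v * dt                   # step backward in time
        traj.append(x.copy())
    return np.array(traj)

traj = pf_ode_trajectory()
# For this symmetric mixture, the log-density ridge between the modes lies along
# the x-axis: |y| is the normal coordinate and x the tangent one. |y| collapses
# early (reach/align) while x keeps drifting toward a mode (slide).
print("normal |y|: start %.3f -> end %.3f" % (abs(traj[0, 1]), abs(traj[-1, 1])))
print("tangent x : start %.3f -> end %.3f" % (traj[0, 0], traj[-1, 0]))
```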
Related papers
- The Principles of Diffusion Models [81.12042238390075]
Diffusion modeling starts by defining a forward process that gradually corrupts data into noise. The goal is to learn a reverse process that transforms noise back into data while recovering the same intermediates. The score-based view, rooted in energy-based modeling, learns the gradient of the evolving data distribution. The flow-based view, related to normalizing flows, treats generation as following a smooth path that moves samples from noise to data. (The forward corruption and the regression it trains are sketched after this entry.)
arXiv Detail & Related papers (2025-10-24T02:29:02Z)
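As a minimal illustration of the forward/score-matching structure described in this entry: the sketch below (our toy setup, assuming 1D Gaussian data at a single fixed noise level and a linear noise-prediction model) shows how denoising score matching reduces to regressing the injected noise on the corrupted sample.

```python
import numpy as np

rng = np.random.default_rng(0)

# Forward (noising) process at one fixed time t: x_t = a*x0 + sigma*eps.
# The scales a, sigma are assumed VP-schedule values with a^2 + sigma^2 = 1.
a, sigma = 0.6, 0.8
x0 = rng.normal(size=100_000)        # toy data: x0 ~ N(0, 1)
eps = rng.normal(size=100_000)       # injected noise
xt = a * x0 + sigma * eps

# Denoising score matching at this level reduces to noise prediction:
# regress eps on x_t. With a linear model eps_hat = c*x_t, least squares
# gives c in closed form.
c = np.dot(xt, eps) / np.dot(xt, xt)

# For Gaussian data the exact score of p_t = N(0, a^2 + sigma^2) is
# -x/(a^2 + sigma^2), and score = -eps_hat/sigma, so the optimal
# coefficient is sigma/(a^2 + sigma^2).
print("fitted c:", c, "optimal c:", sigma / (a**2 + sigma**2))
```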
- Overcoming Dimensional Factorization Limits in Discrete Diffusion Models through Quantum Joint Distribution Learning [79.65014491424151]
We propose a quantum Discrete Denoising Diffusion Probabilistic Model (QD3PM). It enables joint probability learning through diffusion and denoising in exponentially large Hilbert spaces. This paper establishes a new theoretical paradigm in generative models by leveraging the quantum advantage in joint distribution learning.
arXiv Detail & Related papers (2025-05-08T11:48:21Z)
- Generalization through variance: how noise shapes inductive biases in diffusion models [0.0]
We develop a mathematical theory that partly explains the 'generalization through variance' phenomenon. We find that the distributions diffusion models effectively learn to sample from resemble their training distributions. We also characterize how this inductive bias interacts with feature-related inductive biases.
arXiv Detail & Related papers (2025-04-16T23:41:10Z)
- An Analytical Theory of Spectral Bias in the Learning Dynamics of Diffusion Models [29.972063833424215]
We develop an analytical framework for understanding how the generated distribution evolves during diffusion model training. We integrate the resulting probability-flow ODE, yielding analytic expressions for the generated distribution (a closed-form Gaussian special case is sketched below).
arXiv Detail & Related papers (2025-03-05T05:50:38Z)
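To see why analytic expressions for the generated distribution can be within reach, consider the Gaussian special case (a sketch under our own assumptions, not the paper's framework): the noised marginals stay Gaussian, the probability-flow ODE becomes linear, and its closed-form solution x(0) = x(T)·sqrt(v(0)/v(T)) can be checked against direct numerical integration.

```python
import numpy as np

# 1D Gaussian data x0 ~ N(0, s2): the noised marginal stays Gaussian, so the
# probability-flow ODE is linear and integrable in closed form. We verify the
# analytic map against Euler integration. (The constant-beta schedule and this
# toy setup are our assumptions.)
s2, beta, T, steps = 0.25, 1.0, 5.0, 20_000

def v(t):
    """Variance of the noised marginal at time t under the VP forward process."""
    a2 = np.exp(-beta * t)               # alpha_t^2
    return s2 * a2 + (1.0 - a2)

x = 1.3                                  # a sample drawn at time T
dt = T / steps
xt = x
for i in range(steps):
    t = T - i * dt
    vel = -0.5 * beta * xt * (1.0 - 1.0 / v(t))  # PF-ODE velocity for Gaussian p_t
    xt = xt - vel * dt                   # backward-in-time Euler step

print("numerical x(0):", xt)
print("analytic  x(0):", x * np.sqrt(v(0.0) / v(T)))
```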
- Continuous Diffusion Model for Language Modeling [64.7425225935854]
Existing continuous diffusion models for discrete data underperform compared to discrete methods. We propose a continuous diffusion model for language modeling that incorporates the geometry of the underlying categorical distribution. Our method outperforms existing discrete diffusion models and approaches the performance of autoregressive models.
arXiv Detail & Related papers (2025-02-17T08:54:29Z)
- Understanding Generalizability of Diffusion Models Requires Rethinking the Hidden Gaussian Structure [8.320632531909682]
We study the generalizability of diffusion models by looking into the hidden properties of the learned score functions. As diffusion models transition from memorization to generalization, their corresponding nonlinear diffusion denoisers exhibit increasing linearity (the two ideal denoisers are contrasted in the sketch below).
arXiv Detail & Related papers (2024-10-31T15:57:04Z)
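The memorization-to-generalization transition in this entry can be pictured by contrasting two ideal denoisers for the same training set; the sketch below is our illustrative construction (1D data, one fixed noise level), not the paper's measurement procedure.

```python
import numpy as np

rng = np.random.default_rng(0)

# Two ideal denoisers for the same 1D training set (a toy stand-in): the
# empirical-optimal denoiser, which perfect memorization implements, is a
# softmax average over training points and is nonlinear in x_t, while the
# Gaussian-approximation denoiser is a single linear map.
train = rng.normal(size=50)          # training data, x0 ~ N(0, 1)
a, sigma = 0.7, np.sqrt(1 - 0.7**2)  # assumed VP scales at a fixed noise level

def denoise_empirical(xt):
    """E[x0 | x_t] under the empirical training distribution (nonlinear)."""
    logw = -0.5 * (xt - a * train) ** 2 / sigma**2
    w = np.exp(logw - logw.max())
    w /= w.sum()
    return np.sum(w * train)

def denoise_gaussian(xt):
    """E[x0 | x_t] if the data were exactly N(0, 1): linear in x_t."""
    return a * 1.0 / (a**2 * 1.0 + sigma**2) * xt

for xt in (-2.0, -0.5, 0.5, 2.0):
    print(xt, denoise_empirical(xt), denoise_gaussian(xt))
# A generalizing model's denoiser behaves closer to the linear (Gaussian) one;
# a memorizing model's denoiser tracks the nonlinear empirical average.
```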
- Theoretical Insights for Diffusion Guidance: A Case Study for Gaussian Mixture Models [59.331993845831946]
Diffusion models benefit from the instillation of task-specific information into the score function to steer the sample generation towards desired properties. This paper provides the first theoretical study of the influence of guidance on diffusion models in the context of Gaussian mixture models (the guided-score construction is sketched below).
arXiv Detail & Related papers (2024-03-03T23:15:48Z)
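A hedged sketch of the object such analyses start from: the guided score on a Gaussian mixture, combining conditional and unconditional scores with a guidance weight w (classifier-free-style guidance; the 1D mixture and the weight values are our assumptions).

```python
import numpy as np

# Classifier-free-style guidance on a 1D two-component Gaussian mixture
# (our illustrative construction; the paper studies guidance theoretically,
# this only shows the guided score such analyses start from).
means, var = np.array([-2.0, 2.0]), 1.0

def score_uncond(x):
    """Score of the full mixture p(x) = 0.5*N(-2,1) + 0.5*N(2,1)."""
    logw = -0.5 * (x - means) ** 2 / var
    w = np.exp(logw - logw.max())
    w /= w.sum()
    return np.sum(w * (means - x)) / var

def score_cond(x, k):
    """Score of the class-conditional component N(means[k], var)."""
    return (means[k] - x) / var

def score_guided(x, k, w):
    """Guided score: uncond + w * (cond - uncond); w=0 uncond, w=1 cond."""
    return score_uncond(x) + w * (score_cond(x, k) - score_uncond(x))

for w in (0.0, 1.0, 3.0):
    print("w =", w, "guided score at x=0:", score_guided(0.0, k=1, w=w))
# Larger w tilts the drift more strongly toward the conditioned mode at +2.
```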
- On the Generalization Properties of Diffusion Models [31.067038651873126]
This work embarks on a comprehensive theoretical exploration of the generalization attributes of diffusion models. We establish theoretical estimates of the generalization gap that evolves in tandem with the training dynamics of score-based diffusion models. We extend our quantitative analysis to a data-dependent scenario, wherein target distributions are portrayed as a succession of densities.
arXiv Detail & Related papers (2023-11-03T09:20:20Z)
- Generative Modeling on Manifolds Through Mixture of Riemannian Diffusion Processes [57.396578974401734]
We introduce a principled framework for building a generative diffusion process on general manifolds. Instead of following the denoising approach of previous diffusion models, we construct a diffusion process using a mixture of bridge processes. We develop a geometric understanding of the mixture process, deriving the drift as a weighted mean of tangent directions to the data points (illustrated on the sphere in the sketch below).
arXiv Detail & Related papers (2023-10-11T06:04:40Z)
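The "drift as a weighted mean of tangent directions" statement can be illustrated on the unit sphere. The sketch below is our construction: the Riemannian log map gives the tangent direction from the current point to each data point, and a heuristic (assumed) softmax weighting averages them into a drift; the paper's actual bridge-process weights differ.

```python
import numpy as np

def log_map(x, y):
    """Riemannian log map on S^2: tangent vector at x pointing toward y."""
    c = np.clip(np.dot(x, y), -1.0, 1.0)
    theta = np.arccos(c)
    u = y - c * x                        # component of y orthogonal to x
    n = np.linalg.norm(u)
    return np.zeros(3) if n < 1e-12 else theta * u / n

def drift(x, data, tau=0.5):
    """Weighted mean of tangent directions from x to each data point."""
    logs = np.array([log_map(x, d) for d in data])
    dists = np.linalg.norm(logs, axis=1)
    w = np.exp(-dists**2 / tau)          # heuristic weights, our assumption
    w /= w.sum()
    return (w[:, None] * logs).sum(axis=0)  # a tangent vector at x

rng = np.random.default_rng(0)
data = rng.normal(size=(5, 3))
data /= np.linalg.norm(data, axis=1, keepdims=True)  # points on the sphere
x = np.array([0.0, 0.0, 1.0])
v = drift(x, data)
print("drift (tangent at x):", v, " <v, x> ~ 0:", np.dot(v, x))
```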
- Diff-Instruct: A Universal Approach for Transferring Knowledge From Pre-trained Diffusion Models [77.83923746319498]
We propose a framework called Diff-Instruct to instruct the training of arbitrary generative models.
We show that Diff-Instruct results in state-of-the-art single-step diffusion-based models.
Experiments on refining GAN models show that Diff-Instruct can consistently improve the pre-trained generators of GAN models.
arXiv Detail & Related papers (2023-05-29T04:22:57Z)