When Diffusion Models Memorize: Inductive Biases in Probability Flow of Minimum-Norm Shallow Neural Nets
- URL: http://arxiv.org/abs/2506.19031v1
- Date: Mon, 23 Jun 2025 18:38:55 GMT
- Title: When Diffusion Models Memorize: Inductive Biases in Probability Flow of Minimum-Norm Shallow Neural Nets
- Authors: Chen Zeno, Hila Manor, Greg Ongie, Nir Weinberger, Tomer Michaeli, Daniel Soudry
- Abstract summary: A key question is when probability flow converges to training samples or to more general points on the data manifold. We analyze this by studying the probability flow of shallow ReLU neural network denoisers trained with minimal $\ell^2$ norm.
- Score: 47.818753335400714
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: While diffusion models generate high-quality images via probability flow, the theoretical understanding of this process remains incomplete. A key question is when probability flow converges to training samples or more general points on the data manifold. We analyze this by studying the probability flow of shallow ReLU neural network denoisers trained with minimal $\ell^2$ norm. For intuition, we introduce a simpler score flow and show that for orthogonal datasets, both flows follow similar trajectories, converging to a training point or a sum of training points. However, early stopping by the diffusion time scheduler allows probability flow to reach more general manifold points. This reflects the tendency of diffusion models to both memorize training samples and generate novel points that combine aspects of multiple samples, motivating our study of such behavior in simplified settings. We extend these results to obtuse simplex data and, through simulations in the orthogonal case, confirm that probability flow converges to a training point, a sum of training points, or a manifold point. Moreover, memorization decreases when the number of training samples grows, as fewer samples accumulate near training points.
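To make the setting concrete, here is a minimal numerical sketch of a variance-exploding probability-flow ODE driven by the exact score of the Gaussian-smoothed empirical distribution of a small orthogonal dataset. Note that the paper analyzes shallow ReLU denoisers trained with minimal $\ell^2$ norm, not this closed-form empirical score, so the snippet (all names, the noise schedule, and the toy data are illustrative assumptions) only demonstrates the memorization behavior and the role of the time scheduler, not the paper's construction.

```python
import numpy as np

def empirical_score(x, data, sigma):
    """Score of the Gaussian-smoothed empirical distribution
    p_sigma(x) = (1/n) * sum_i N(x; x_i, sigma^2 I)."""
    d2 = np.sum((data - x) ** 2, axis=1)           # squared distance to each training point
    w = np.exp(-(d2 - d2.min()) / (2 * sigma**2))  # numerically stable softmax weights
    w /= w.sum()
    return (w @ (data - x)) / sigma**2             # sum_i w_i (x_i - x) / sigma^2

def probability_flow(x0, data, sigmas):
    """Euler integration of the VE probability-flow ODE,
    dx = -0.5 * d(sigma^2) * score(x, sigma), over a decreasing noise schedule."""
    x = x0.copy()
    for s_hi, s_lo in zip(sigmas[:-1], sigmas[1:]):
        x = x + 0.5 * (s_hi**2 - s_lo**2) * empirical_score(x, data, s_hi)
    return x

# Toy "orthogonal dataset": scaled standard-basis vectors in R^3 (an assumption for illustration).
data = 3.0 * np.eye(3)
rng = np.random.default_rng(0)
x_init = 5.0 * rng.standard_normal(3)

full = np.geomspace(5.0, 1e-3, 400)   # full schedule: trajectory ends near a training point
early = np.geomspace(5.0, 1.0, 400)   # early-stopped schedule: trajectory stops at a blended point
print(probability_flow(x_init, data, full))
print(probability_flow(x_init, data, early))
```

With the full schedule the trajectory collapses essentially onto one training point (memorization); truncating the schedule at a larger final noise level, i.e. stopping the time scheduler early, leaves a point that blends several training samples, matching the qualitative behavior the abstract describes.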
Related papers
- Progressive Inference-Time Annealing of Diffusion Models for Sampling from Boltzmann Densities [85.83359661628575]
We propose Progressive Inference-Time Annealing (PITA) to learn diffusion-based samplers. PITA combines two complementary techniques: annealing of the Boltzmann distribution and diffusion smoothing. It enables equilibrium sampling of N-body particle systems, Alanine Dipeptide, and tripeptides in Cartesian coordinates.
arXiv Detail & Related papers (2025-06-19T17:14:22Z) - Resolving Memorization in Empirical Diffusion Model for Manifold Data in High-Dimensional Spaces [5.716752583983991]
When the data distribution consists of $n$ points, empirical diffusion models tend to reproduce existing data points. This work shows that the memorization issue can be solved simply by applying an inertia update at the end of the empirical diffusion simulation. We demonstrate that the distribution of samples from this model approximates the true data distribution on a $C^2$ manifold of dimension $d$, within a Wasserstein-1 distance of order $O(n^{-\frac{2}{d+4}})$.
arXiv Detail & Related papers (2025-05-05T09:40:41Z) - Neural Flow Samplers with Shortcut Models [19.81513273510523]
Continuous flow-based neural samplers offer a promising approach to generating samples from unnormalized densities. We introduce an improved estimator for these challenging quantities, employing a velocity-driven Sequential Monte Carlo method. Our proposed Neural Flow Shortcut Sampler empirically outperforms existing flow-based neural samplers on both synthetic datasets and complex n-body system targets.
arXiv Detail & Related papers (2025-02-11T07:55:41Z) - No Trick, No Treat: Pursuits and Challenges Towards Simulation-free Training of Neural Samplers [41.867855070932706]
We consider the sampling problem, where the aim is to draw samples from a distribution whose density is known only up to a normalization constant. Recent breakthroughs in generative modeling for approximating high-dimensional data distributions have sparked significant interest in developing neural network-based methods for this challenging problem. We propose an elegant modification to previous methods, which allows simulation-free training with the help of a time-dependent normalizing flow.
arXiv Detail & Related papers (2025-02-10T17:13:11Z) - A solvable generative model with a linear, one-step denoiser [0.0]
We develop an analytically tractable single-step diffusion model based on a linear denoiser. We show that the monotonic fall phase of Kullback-Leibler divergence begins when the training dataset size reaches the dimension of the data points.
arXiv Detail & Related papers (2024-11-26T19:00:01Z) - Amortizing intractable inference in diffusion models for vision, language, and control [89.65631572949702]
This paper studies amortized sampling of the posterior over data, $\mathbf{x}\sim p^{\rm post}(\mathbf{x})\propto p(\mathbf{x})r(\mathbf{x})$, in a model that consists of a diffusion generative model prior $p(\mathbf{x})$ and a black-box constraint or function $r(\mathbf{x})$. We prove the correctness of a data-free learning objective, relative trajectory balance, for training a diffusion model that samples from this posterior.
arXiv Detail & Related papers (2024-05-31T16:18:46Z) - Data Attribution for Diffusion Models: Timestep-induced Bias in Influence Estimation [53.27596811146316]
Diffusion models operate over a sequence of timesteps rather than through the instantaneous input-output relationships assumed in earlier influence-estimation settings.
We present Diffusion-TracIn, which incorporates these temporal dynamics, and observe that samples' loss gradient norms are highly dependent on the timestep.
We introduce Diffusion-ReTrac as a re-normalized adaptation that enables retrieval of training samples more targeted to the test sample of interest (a generic TracIn-style influence sketch is given after this list).
arXiv Detail & Related papers (2024-01-17T07:58:18Z) - Diffusion Generative Flow Samplers: Improving learning signals through partial trajectory optimization [87.21285093582446]
Diffusion Generative Flow Samplers (DGFS) is a sampling-based framework where the learning process can be tractably broken down into short partial trajectory segments.
Our method takes inspiration from the theory developed for generative flow networks (GFlowNets).
arXiv Detail & Related papers (2023-10-04T09:39:05Z) - Likelihood-Free Inference with Generative Neural Networks via Scoring Rule Minimization [0.0]
Inference methods yield posterior approximations for simulator models with intractable likelihood.
Many works trained neural networks to approximate either the intractable likelihood or the posterior directly.
Here, we propose to approximate the posterior with generative networks trained by Scoring Rule minimization.
arXiv Detail & Related papers (2022-05-31T13:32:55Z) - Unrolling Particles: Unsupervised Learning of Sampling Distributions [102.72972137287728]
Particle filtering is used to compute good nonlinear estimates of complex systems.
We show in simulations that the resulting particle filter yields good estimates in a wide range of scenarios.
arXiv Detail & Related papers (2021-10-06T16:58:34Z)
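For background on the Diffusion-TracIn / Diffusion-ReTrac entry above, the following is a minimal sketch of a TracIn-style influence score, written from the standard formulation of Pruthi et al. (2020) with the denoising loss averaged over diffusion timesteps; the timestep-induced bias analysis and the re-normalization used in Diffusion-ReTrac are that paper's own contributions and are not reproduced here.

```latex
% TracIn-style influence of training sample z on test sample z', using
% checkpoints \theta_k and learning rates \eta_k saved during training,
% with the DDPM denoising loss averaged over timesteps t and noise eps.
\[
  \mathrm{Infl}(z, z') \;\approx\; \sum_{k} \eta_k
  \Big\langle
    \mathbb{E}_{t}\big[\nabla_\theta \mathcal{L}_t(\theta_k; z)\big],\,
    \mathbb{E}_{t}\big[\nabla_\theta \mathcal{L}_t(\theta_k; z')\big]
  \Big\rangle,
  \qquad
  \mathcal{L}_t(\theta; z) \;=\;
  \mathbb{E}_{\epsilon}\big\|\epsilon_\theta\big(\sqrt{\bar\alpha_t}\, z + \sqrt{1-\bar\alpha_t}\,\epsilon,\; t\big) - \epsilon\big\|^2 .
\]
```

The abstract's observation that the gradient norm of $\mathcal{L}_t$ varies strongly with $t$ is what makes a naive average over timesteps biased toward certain samples, which motivates the re-normalized adaptation, Diffusion-ReTrac.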
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information it provides and is not responsible for any consequences of its use.