A precise asymptotic analysis of learning diffusion models: theory and insights
- URL: http://arxiv.org/abs/2501.03937v1
- Date: Tue, 07 Jan 2025 16:56:40 GMT
- Title: A precise asymptotic analysis of learning diffusion models: theory and insights
- Authors: Hugo Cui, Cengiz Pehlevan, Yue M. Lu
- Abstract summary: We consider the problem of learning a flow or diffusion-based generative model parametrized by a two-layer auto-encoder.
We derive a tight characterization of low-dimensional projections of the distribution of samples generated by the learned model.
- Score: 37.30894159200853
- License:
- Abstract: In this manuscript, we consider the problem of learning a flow or diffusion-based generative model parametrized by a two-layer auto-encoder, trained with online stochastic gradient descent, on a high-dimensional target density with an underlying low-dimensional manifold structure. We derive a tight asymptotic characterization of low-dimensional projections of the distribution of samples generated by the learned model, ascertaining in particular its dependence on the number of training samples. Building on this analysis, we discuss how mode collapse can arise, and lead to model collapse when the generative model is re-trained on generated synthetic data.
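As a rough illustration of the setting described in the abstract, the sketch below trains a two-layer auto-encoder denoiser with online stochastic gradient descent (one fresh sample per step) on data concentrated near a low-dimensional subspace, then generates samples and projects them onto that subspace. The architecture, the two-mode target, and the crude fixed-point generation loop are illustrative assumptions, not the authors' exact construction or sampler.

```python
import numpy as np

rng = np.random.default_rng(0)

d, k = 50, 2           # ambient dimension, latent (manifold) dimension
n_steps = 20_000       # online SGD: one fresh training sample per step
lr = 0.01
sigma = 0.5            # noise level of the corrupted input

# Illustrative target: a two-mode Gaussian mixture supported near a k-dimensional subspace.
U = np.linalg.qr(rng.standard_normal((d, k)))[0]     # orthonormal basis of the subspace
means = np.array([[3.0, 0.0], [-3.0, 0.0]])          # two modes, in latent coordinates

def sample_target(n):
    z = means[rng.integers(2, size=n)] + 0.3 * rng.standard_normal((n, k))
    return z @ U.T                                   # embed into R^d

# Two-layer auto-encoder denoiser: x0_hat = W2 @ tanh(W1 @ x_noisy).
h = 16
W1 = rng.standard_normal((h, d)) / np.sqrt(d)
W2 = rng.standard_normal((d, h)) / np.sqrt(h)

for _ in range(n_steps):
    x0 = sample_target(1)[0]
    xt = x0 + sigma * rng.standard_normal(d)         # corrupted input
    a = np.tanh(W1 @ xt)
    err = W2 @ a - x0                                # residual of the squared denoising loss
    gW2 = np.outer(err, a)
    gW1 = np.outer((W2.T @ err) * (1 - a**2), xt)
    W2 -= lr * gW2
    W1 -= lr * gW1

# Crude generation: iterate the learned denoiser from pure noise (a fixed-point iteration,
# not the paper's flow/diffusion sampler), then inspect low-dimensional projections.
x = rng.standard_normal((500, d))
for _ in range(30):
    x = np.tanh(x @ W1.T) @ W2.T
proj = x @ U
print("projection of generated samples: mean", proj.mean(axis=0), "std", proj.std(axis=0))
```

Projecting the generated samples onto the latent subspace in the last lines mirrors the paper's focus on low-dimensional projections of the generated distribution and its dependence on the number of training steps.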
Related papers
- Elucidating Flow Matching ODE Dynamics with Respect to Data Geometries [10.947094609205765]
Diffusion-based generative models have become the standard for image generation. ODE-based samplers and flow matching models improve efficiency over diffusion models by reducing the number of sampling steps through learned vector fields.
We advance the theory of flow matching models through a comprehensive analysis of sample trajectories, centered on the denoiser that drives ODE dynamics.
Our analysis reveals how trajectories evolve from capturing global data features to local structures, providing the geometric characterization of per-sample behavior in flow matching models.
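To make the "denoiser drives the ODE dynamics" statement concrete, here is a small self-contained example (an illustrative single-Gaussian target, not taken from the paper) where the optimal denoiser is available in closed form and the flow matching velocity under a linear interpolation path is v(x, t) = (m(x, t) - x) / (1 - t):

```python
import numpy as np

rng = np.random.default_rng(1)

# Illustrative target: an isotropic Gaussian N(mu, s^2 I). For the linear interpolation path
# x_t = (1 - t) * noise + t * x1, the optimal denoiser m(x, t) = E[x1 | x_t = x] is available
# in closed form, and the flow matching velocity can be written through it.
d = 2
mu = np.array([2.0, -1.0])
s = 0.5

def denoiser(x, t):
    # posterior mean of the clean sample x1 given the interpolated state x_t = x
    var_t = (1 - t) ** 2 + (t * s) ** 2
    return mu + (t * s**2 / var_t) * (x - t * mu)

def velocity(x, t):
    # the denoiser drives the ODE dynamics: v(x, t) = (m(x, t) - x) / (1 - t)
    return (denoiser(x, t) - x) / (1.0 - t)

# Euler integration of dx/dt = v(x, t) from noise (t = 0) towards data (t -> 1).
n, steps = 2000, 200
dt = 1.0 / steps
x = rng.standard_normal((n, d))
for i in range(steps - 1):          # stop one step early to avoid the 1/(1 - t) singularity
    x = x + dt * velocity(x, i * dt)

print("sample mean:", x.mean(axis=0), " target mean:", mu)
print("sample std :", x.std(axis=0), " target std :", s)
```

Stopping one Euler step before t = 1 avoids the 1/(1 - t) singularity of this parametrization of the velocity.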
arXiv Detail & Related papers (2024-12-25T01:17:15Z)
- Provable Statistical Rates for Consistency Diffusion Models [87.28777947976573]
Despite their state-of-the-art performance, diffusion models are known for slow sample generation due to the large number of steps involved.
This paper contributes towards the first statistical theory for consistency models, formulating their training as a distribution discrepancy minimization problem.
arXiv Detail & Related papers (2024-06-23T20:34:18Z)
- Latent diffusion models for parameterization and data assimilation of facies-based geomodels [0.0]
Diffusion models are trained to generate new geological realizations from input fields characterized by random noise.
Latent diffusion models are shown to provide realizations that are visually consistent with samples from geomodeling software.
arXiv Detail & Related papers (2024-06-21T01:32:03Z)
- Towards Theoretical Understandings of Self-Consuming Generative Models [56.84592466204185]
This paper tackles the emerging challenge of training generative models within a self-consuming loop.
We construct a theoretical framework to rigorously evaluate how this training procedure impacts the data distributions learned by future models.
We present results for kernel density estimation, delivering nuanced insights such as the impact of mixed data training on error propagation.
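A minimal sketch of such a self-consuming loop with kernel density estimation is below; the bandwidth, the mixing fraction of real data, and the 1-d N(0, 1) target are illustrative choices, not the paper's setup.

```python
import numpy as np

rng = np.random.default_rng(2)

# Hypothetical self-consuming loop: each generation fits a Gaussian KDE to the previous
# generation's samples and draws new synthetic data from it; a fraction of real data is
# mixed back in. Bandwidth, mixing fraction, and the 1-d N(0, 1) target are illustrative.
n, bandwidth, generations = 2000, 0.1, 10
mix_real = 0.2
real = rng.standard_normal(n)

def kde_sample(data, n_out, h):
    # sampling from a Gaussian KDE: pick a training point at random, add N(0, h^2) noise
    centers = rng.choice(data, size=n_out, replace=True)
    return centers + h * rng.standard_normal(n_out)

data = real.copy()
for g in range(generations):
    synthetic = kde_sample(data, n, bandwidth)
    n_keep_real = int(mix_real * n)
    data = np.concatenate([rng.choice(real, n_keep_real, replace=False),
                           synthetic[: n - n_keep_real]])
    print(f"generation {g + 1}: mean = {data.mean():+.3f}, std = {data.std():.3f}")
```

Watching how the reported mean and standard deviation drift across generations, and how that drift changes with mix_real, gives a hands-on view of the error propagation under mixed-data training discussed above.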
arXiv Detail & Related papers (2024-02-19T02:08:09Z)
- A Geometric Perspective on Diffusion Models [57.27857591493788]
We inspect the ODE-based sampling of a popular variance-exploding SDE.
We establish a theoretical relationship between the optimal ODE-based sampling and the classic mean-shift (mode-seeking) algorithm.
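For reference, the mean-shift side of that relationship is the following kernel-weighted fixed-point iteration (toy two-cluster data; not code from the paper):

```python
import numpy as np

rng = np.random.default_rng(3)

# Classic mean-shift (mode-seeking) iteration with a Gaussian kernel on toy two-cluster data.
data = np.concatenate([rng.normal(-2.0, 0.3, (100, 2)),
                       rng.normal(2.0, 0.3, (100, 2))])
h = 0.5                                   # kernel bandwidth

def mean_shift_step(x, data, h):
    # move x to the kernel-weighted average of the data points around it
    w = np.exp(-np.sum((data - x) ** 2, axis=1) / (2 * h**2))
    return (w[:, None] * data).sum(axis=0) / w.sum()

x = np.array([0.5, 0.5])                  # query point
for _ in range(50):
    x = mean_shift_step(x, data, h)
print("mean-shift converges near a mode:", x)
```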
arXiv Detail & Related papers (2023-05-31T15:33:16Z)
- Capturing dynamical correlations using implicit neural representations [85.66456606776552]
We develop an artificial intelligence framework which combines a neural network trained to mimic simulated data from a model Hamiltonian with automatic differentiation to recover unknown parameters from experimental data.
In doing so, we illustrate the ability to build and train a differentiable model only once, which then can be applied in real-time to multi-dimensional scattering data.
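A stripped-down analogue of that recover-parameters-by-differentiation workflow is sketched below; a closed-form sinusoidal forward model stands in for the neural-network surrogate, and the gradient is written out by hand where the paper would use automatic differentiation. All names and values are illustrative.

```python
import numpy as np

rng = np.random.default_rng(4)

# Hypothetical stand-in: fit the parameters of a differentiable forward model to noisy
# "experimental" measurements by gradient descent on a squared-error loss.
x = np.linspace(0.0, 4.0, 200)
amp_true, freq_true = 1.5, 2.0
observed = amp_true * np.sin(freq_true * x) + 0.05 * rng.standard_normal(x.size)

amp, freq = 1.0, 1.8                      # initial parameter guesses
lr = 0.02
for _ in range(3000):
    pred = amp * np.sin(freq * x)
    r = pred - observed                   # residuals of the forward model
    g_amp = np.mean(2 * r * np.sin(freq * x))
    g_freq = np.mean(2 * r * amp * x * np.cos(freq * x))
    amp, freq = amp - lr * g_amp, freq - lr * g_freq
print(f"recovered amp = {amp:.3f} (true {amp_true}), freq = {freq:.3f} (true {freq_true})")
```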
arXiv Detail & Related papers (2023-04-08T07:55:36Z)
- Score Approximation, Estimation and Distribution Recovery of Diffusion Models on Low-Dimensional Data [68.62134204367668]
This paper studies score approximation, estimation, and distribution recovery of diffusion models, when data are supported on an unknown low-dimensional linear subspace.
We show that with a properly chosen neural network architecture, the score function can be both accurately approximated and efficiently estimated.
The generated distribution based on the estimated score function captures the data geometric structures and converges to a close vicinity of the data distribution.
arXiv Detail & Related papers (2023-02-14T17:02:35Z)
- Diagnosing and Fixing Manifold Overfitting in Deep Generative Models [11.82509693248749]
Likelihood-based, or explicit, deep generative models use neural networks to construct flexible high-dimensional densities.
We show that observed data lies on a low-dimensional manifold embedded in high-dimensional ambient space.
We propose a class of two-step procedures consisting of a dimensionality reduction step followed by maximum-likelihood density estimation.
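A bare-bones version of such a two-step procedure might look as follows, with PCA standing in for the dimensionality reduction step and a full-covariance Gaussian as the maximum-likelihood density model in latent space (both are illustrative stand-ins, not the paper's models):

```python
import numpy as np

rng = np.random.default_rng(5)

# Synthetic data lying near a k-dimensional linear manifold in R^d.
d, k, n = 20, 2, 1000
A = rng.standard_normal((d, k))
data = rng.standard_normal((n, k)) @ A.T + 0.01 * rng.standard_normal((n, d))

# Step 1: dimensionality reduction (PCA via SVD).
mean = data.mean(axis=0)
U, S, Vt = np.linalg.svd(data - mean, full_matrices=False)
components = Vt[:k]                              # top-k principal directions
latent = (data - mean) @ components.T            # k-dimensional codes

# Step 2: maximum-likelihood density estimation in latent space (full-covariance Gaussian).
mu = latent.mean(axis=0)
cov = np.cov(latent, rowvar=False)

# Generate: sample the latent density, then decode back to the ambient space.
z = rng.multivariate_normal(mu, cov, size=500)
samples = z @ components + mean
print("relative reconstruction error of training data:",
      np.linalg.norm(data - (latent @ components + mean)) / np.linalg.norm(data))
```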
arXiv Detail & Related papers (2022-04-14T18:00:03Z)
- A likelihood approach to nonparametric estimation of a singular distribution using deep generative models [4.329951775163721]
We investigate a likelihood approach to nonparametric estimation of a singular distribution using deep generative models.
We prove that a novel and effective solution exists by perturbing the data with instance noise.
We also characterize the class of distributions that can be efficiently estimated via deep generative models.
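A toy numerical illustration of the instance-noise idea, not taken from the paper: data supported exactly on a line in R^2 has a rank-deficient covariance, so a Gaussian maximum-likelihood fit has no density, while a small Gaussian perturbation of each observation makes the likelihood well defined (the noise level below is an arbitrary illustrative choice).

```python
import numpy as np

rng = np.random.default_rng(6)

# Data supported exactly on the line y = 2x: a singular distribution in R^2.
n = 1000
t = rng.standard_normal(n)
data = np.stack([t, 2.0 * t], axis=1)

sigma = 0.1                                        # instance-noise level (illustrative)
perturbed = data + sigma * rng.standard_normal(data.shape)

for name, x in [("raw", data), ("perturbed", perturbed)]:
    cov = np.cov(x, rowvar=False)
    sign, logdet = np.linalg.slogdet(cov)          # singular covariance -> log-det of -inf
    print(f"{name:9s} covariance: det sign = {sign:+.0f}, log|det| = {logdet:.2f}")
```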
arXiv Detail & Related papers (2021-05-09T23:13:58Z)
- Improving the Reconstruction of Disentangled Representation Learners via Multi-Stage Modeling [54.94763543386523]
Current autoencoder-based disentangled representation learning methods achieve disentanglement by penalizing the (aggregate) posterior to encourage statistical independence of the latent factors.
We present a novel multi-stage modeling approach where the disentangled factors are first learned using a penalty-based disentangled representation learning method.
Then, the low-quality reconstruction is improved with another deep generative model that is trained to model the missing correlated latent variables.
arXiv Detail & Related papers (2020-10-25T18:51:15Z)