Dynamical Regimes of Diffusion Models
- URL: http://arxiv.org/abs/2402.18491v1
- Date: Wed, 28 Feb 2024 17:19:26 GMT
- Title: Dynamical Regimes of Diffusion Models
- Authors: Giulio Biroli, Tony Bonnaire, Valentin de Bortoli, Marc Mézard
- Abstract summary: We study generative diffusion models in the regime where the dimension of space and the number of data are large.
Our analysis reveals three distinct dynamical regimes during the backward generative diffusion process.
The dependence of the collapse time on the dimension and number of data provides a thorough characterization of the curse of dimensionality for diffusion models.
- Score: 14.797301819675454
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Using statistical physics methods, we study generative diffusion models in
the regime where the dimension of space and the number of data are large, and
the score function has been trained optimally. Our analysis reveals three
distinct dynamical regimes during the backward generative diffusion process.
The generative dynamics, starting from pure noise, encounters first a
'speciation' transition where the gross structure of data is unraveled, through
a mechanism similar to symmetry breaking in phase transitions. It is followed
at later time by a 'collapse' transition where the trajectories of the dynamics
become attracted to one of the memorized data points, through a mechanism which
is similar to the condensation in a glass phase. For any dataset, the
speciation time can be found from a spectral analysis of the correlation
matrix, and the collapse time can be found from the estimation of an 'excess
entropy' in the data. The dependence of the collapse time on the dimension and
number of data provides a thorough characterization of the curse of
dimensionality for diffusion models. Analytical solutions for simple models
like high-dimensional Gaussian mixtures substantiate these findings and provide
a theoretical framework, while extensions to more complex scenarios and
numerical validations with real datasets confirm the theoretical predictions.
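The abstract says the speciation time can be found from a spectral analysis of the correlation matrix. The Python/NumPy sketch below illustrates one way such an analysis might look. It rests on assumptions the abstract does not state: a forward Ornstein-Uhlenbeck process dx = -x dt + sqrt(2) dW, and the closed form t_S = (1/2) log(Lambda), where Lambda is the largest eigenvalue of the empirical covariance matrix. The function names and the two-cluster toy data are hypothetical and serve only as an illustration.

```python
import numpy as np


def largest_covariance_eigenvalue(data):
    """Return the largest eigenvalue of the empirical covariance of data (n, d)."""
    centered = data - data.mean(axis=0)
    cov = centered.T @ centered / data.shape[0]
    # eigvalsh returns the eigenvalues of a symmetric matrix in ascending order.
    return float(np.linalg.eigvalsh(cov)[-1])


def speciation_time_estimate(data):
    """Estimate the speciation time from the top covariance eigenvalue.

    ASSUMPTION (not stated in the abstract): t_S = 0.5 * log(Lambda) for the
    forward process dx = -x dt + sqrt(2) dW.
    """
    return 0.5 * np.log(largest_covariance_eigenvalue(data))


def make_two_cluster_data(n, d, sigma=1.0, seed=0):
    """Hypothetical toy data: two Gaussian clusters centred at +m and -m."""
    rng = np.random.default_rng(seed)
    m = np.ones(d)  # cluster centre with squared norm d
    signs = rng.choice([-1.0, 1.0], size=n)
    return signs[:, None] * m + sigma * rng.standard_normal((n, d))


data = make_two_cluster_data(n=2000, d=64)
t_s = speciation_time_estimate(data)
print(f"estimated speciation time: {t_s:.3f}")
```

For this toy mixture the top eigenvalue is close to d + sigma^2, so under the assumed formula the estimate grows like (1/2) log(d), that is, logarithmically with the dimension.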
Related papers
- Convergence of Score-Based Discrete Diffusion Models: A Discrete-Time Analysis [56.442307356162864]
We study the theoretical aspects of score-based discrete diffusion models under the Continuous Time Markov Chain (CTMC) framework.
We introduce a discrete-time sampling algorithm in the general state space $[S]^d$ that utilizes score estimators at predefined time points.
Our convergence analysis employs a Girsanov-based method and establishes key properties of the discrete score function.
arXiv Detail & Related papers (2024-10-03T09:07:13Z)
- Latent diffusion models for parameterization and data assimilation of facies-based geomodels [0.0]
Diffusion models are trained to generate new geological realizations from input fields characterized by random noise.
Latent diffusion models are shown to provide realizations that are visually consistent with samples from geomodeling software.
arXiv Detail & Related papers (2024-06-21T01:32:03Z)
- Diffusion posterior sampling for simulation-based inference in tall data settings [53.17563688225137]
Simulation-based inference (SBI) is capable of approximating the posterior distribution that relates input parameters to a given observation.
In this work, we consider a tall data extension in which multiple observations are available to better infer the parameters of the model.
We compare our method to recently proposed competing approaches on various numerical experiments and demonstrate its superiority in terms of numerical stability and computational cost.
arXiv Detail & Related papers (2024-04-11T09:23:36Z)
- Synthetic location trajectory generation using categorical diffusion models [50.809683239937584]
Diffusion probabilistic models (DPMs) have rapidly evolved to be one of the predominant generative models for the simulation of synthetic data.
We propose using DPMs for the generation of synthetic individual location trajectories (ILTs) which are sequences of variables representing physical locations visited by individuals.
arXiv Detail & Related papers (2024-02-19T15:57:39Z)
- Towards Theoretical Understandings of Self-Consuming Generative Models [56.84592466204185]
This paper tackles the emerging challenge of training generative models within a self-consuming loop.
We construct a theoretical framework to rigorously evaluate how this training procedure impacts the data distributions learned by future models.
We present results for kernel density estimation, delivering nuanced insights such as the impact of mixed data training on error propagation.
arXiv Detail & Related papers (2024-02-19T02:08:09Z)
- Capturing dynamical correlations using implicit neural representations [85.66456606776552]
We develop an artificial intelligence framework which combines a neural network trained to mimic simulated data from a model Hamiltonian with automatic differentiation to recover unknown parameters from experimental data.
In doing so, we illustrate the ability to build and train a differentiable model only once, which then can be applied in real-time to multi-dimensional scattering data.
arXiv Detail & Related papers (2023-04-08T07:55:36Z)
- Data-driven reduced-order modelling for blood flow simulations with geometry-informed snapshots [0.0]
A data-driven surrogate model is proposed for the efficient prediction of blood flow simulations on similar but distinct domains.
A non-intrusive reduced-order model for geometrical parameters is constructed using proper orthogonal decomposition.
A radial basis function interpolator is trained for predicting the reduced coefficients of the reduced-order model.
arXiv Detail & Related papers (2023-02-21T21:18:17Z)
- Score Approximation, Estimation and Distribution Recovery of Diffusion Models on Low-Dimensional Data [68.62134204367668]
This paper studies score approximation, estimation, and distribution recovery of diffusion models, when data are supported on an unknown low-dimensional linear subspace.
We show that with a properly chosen neural network architecture, the score function can be both accurately approximated and efficiently estimated.
The generated distribution based on the estimated score function captures the data geometric structures and converges to a close vicinity of the data distribution.
arXiv Detail & Related papers (2023-02-14T17:02:35Z)
- Factorized Fusion Shrinkage for Dynamic Relational Data [16.531262817315696]
We consider a factorized fusion shrinkage model in which all decomposed factors are dynamically shrunk towards group-wise fusion structures.
The proposed priors enjoy many favorable properties in comparison and clustering of the estimated dynamic latent factors.
We present a structured mean-field variational inference framework that balances optimal posterior inference with computational scalability.
arXiv Detail & Related papers (2022-09-30T21:03:40Z)
- Stochastic embeddings of dynamical phenomena through variational autoencoders [1.7205106391379026]
We use a recognition network to increase the observed space dimensionality during the reconstruction of the phase space.
Our validation shows that this approach not only recovers a state space that resembles the original one, but is also able to synthesize new time series.
arXiv Detail & Related papers (2020-10-13T10:10:24Z)
This list is automatically generated from the titles and abstracts of the papers on this site.