Cascade of phase transitions in the training of Energy-based models
- URL: http://arxiv.org/abs/2405.14689v2
- Date: Wed, 29 May 2024 08:18:03 GMT
- Title: Cascade of phase transitions in the training of Energy-based models
- Authors: Dimitrios Bachtis, Giulio Biroli, Aurélien Decelle, Beatriz Seoane
- Abstract summary: We investigate the feature encoding process in a prototypical energy-based generative model, the Bernoulli-Bernoulli RBM.
Our study tracks the evolution of the model's weight matrix through its singular value decomposition.
We validate our theoretical results by training the Bernoulli-Bernoulli RBM on real data sets.
- Score: 9.945465034701288
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In this paper, we investigate the feature encoding process in a prototypical energy-based generative model, the Restricted Boltzmann Machine (RBM). We start with an analytical investigation using simplified architectures and data structures, and end with a numerical analysis of real training runs on real datasets. Our study tracks the evolution of the model's weight matrix through its singular value decomposition, revealing a series of phase transitions associated with the progressive learning of the principal modes of the empirical probability distribution. The model first learns the center of mass of the modes and then progressively resolves each mode through a cascade of phase transitions. We first describe this process in a controlled setup that allows an analytical treatment of the training dynamics. We then validate our theoretical results by training the Bernoulli-Bernoulli RBM on real data sets. By using data sets of increasing dimension, we show that learning indeed leads to sharp phase transitions in the high-dimensional limit. Moreover, we propose and test a mean-field finite-size scaling hypothesis, which shows that the first phase transition is in the same universality class as the one we studied analytically, reminiscent of the mean-field paramagnetic-to-ferromagnetic phase transition.
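The weight-spectrum analysis described in the abstract can be illustrated with a minimal sketch: a toy Bernoulli-Bernoulli RBM trained with CD-1 on synthetic two-mode binary data, tracking the singular values of the weight matrix after each epoch. All sizes, hyperparameters, and the data generator here are illustrative choices, not the paper's actual setup.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy binary data clustered around two "modes" (a hypothetical
# stand-in for the real datasets used in the paper).
n_vis, n_hid, n_samples = 20, 10, 500
centers = rng.integers(0, 2, size=(2, n_vis))
data = centers[rng.integers(0, 2, n_samples)].astype(float)
flip = rng.random(data.shape) < 0.05          # 5% label noise
data = np.where(flip, 1.0 - data, data)

sigmoid = lambda x: 1.0 / (1.0 + np.exp(-x))

W = 0.01 * rng.standard_normal((n_vis, n_hid))
a = np.zeros(n_vis)   # visible bias
b = np.zeros(n_hid)   # hidden bias
lr = 0.05

top_singular_values = []
for epoch in range(200):
    # CD-1: one Gibbs step starting from the data.
    ph = sigmoid(data @ W + b)
    h = (rng.random(ph.shape) < ph).astype(float)
    pv = sigmoid(h @ W.T + a)
    ph2 = sigmoid(pv @ W + b)
    W += lr * ((data.T @ ph) - (pv.T @ ph2)) / n_samples
    a += lr * (data - pv).mean(axis=0)
    b += lr * (ph - ph2).mean(axis=0)
    # Track the spectrum of W: modes are encoded as singular
    # values successively detach from the initial random bulk.
    s = np.linalg.svd(W, compute_uv=False)
    top_singular_values.append(s[:3].copy())
```

Plotting `top_singular_values` against the epoch index gives the kind of staggered growth curves the paper associates with the cascade of transitions.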
Related papers
- Dynamical Regimes of Diffusion Models [14.797301819675454]
We study generative diffusion models in the regime where the dimension of space and the number of data are large.
Our analysis reveals three distinct dynamical regimes during the backward generative diffusion process.
The dependence of the collapse time on the dimension and number of data provides a thorough characterization of the curse of dimensionality for diffusion models.
arXiv Detail & Related papers (2024-02-28T17:19:26Z) - Towards Theoretical Understandings of Self-Consuming Generative Models [56.84592466204185]
This paper tackles the emerging challenge of training generative models within a self-consuming loop.
We construct a theoretical framework to rigorously evaluate how this training procedure impacts the data distributions learned by future models.
We present results for kernel density estimation, delivering nuanced insights such as the impact of mixed data training on error propagation.
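The kernel-density setting lends itself to a compact numerical sketch. The toy loop below is purely illustrative (it runs a fully self-consuming loop, whereas the paper also analyzes mixing synthetic with real data): each generation fits a Gaussian KDE to samples drawn from the previous generation's model, and the kernel bandwidth injects extra variance at every step, so the estimation error propagates.

```python
import numpy as np

rng = np.random.default_rng(0)
n, bandwidth, generations = 2000, 0.3, 10

# Generation 0: "real" data from a standard normal.
samples = rng.standard_normal(n)
variances = [samples.var()]
for _ in range(generations):
    # Sampling from a Gaussian KDE = resample a data point,
    # then add kernel noise of scale `bandwidth`.
    idx = rng.integers(0, n, size=n)
    samples = samples[idx] + bandwidth * rng.standard_normal(n)
    variances.append(samples.var())
```

Each generation inflates the variance by roughly `bandwidth**2`, a simple instance of the error propagation the paper quantifies.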
arXiv Detail & Related papers (2024-02-19T02:08:09Z) - Explaining the Machine Learning Solution of the Ising Model [0.0]
This work shows how the ML solution can be explained for the ferromagnetic Ising model, the main target of several machine learning (ML) studies in statistical physics.
By using a neural network (NN) without hidden layers (the simplest possible) and informed by the symmetry of the Hamiltonian, an explanation is provided for the strategy used in finding the supervised learning solution.
These results pave the way to a physics-informed explainable generalized framework, enabling the extraction of physical laws and principles from the parameters of the models.
arXiv Detail & Related papers (2024-02-18T20:47:33Z) - Machine Learning for the identification of phase-transitions in interacting agent-based systems: a Desai-Zwanzig example [0.0]
We propose a data-driven framework that pinpoints phase transitions for an agent-based model in its mean-field limit.
To this end, we use the manifold learning algorithm Diffusion Maps to identify a parsimonious set of data-driven latent variables.
We then utilize a deep learning framework to obtain a conformal reparametrization of the data-driven coordinates.
arXiv Detail & Related papers (2023-10-29T15:07:08Z) - Generative Modeling with Phase Stochastic Bridges [49.4474628881673]
Diffusion models (DMs) represent state-of-the-art generative models for continuous inputs.
We introduce a novel generative modeling framework grounded in phase space dynamics.
Our framework demonstrates the capability to generate realistic data points at an early stage of dynamics propagation.
arXiv Detail & Related papers (2023-10-11T18:38:28Z) - From Stability to Chaos: Analyzing Gradient Descent Dynamics in Quadratic Regression [14.521929085104441]
We investigate the dynamics of gradient descent using large-order constant step-sizes in the context of quadratic regression models.
We delineate five distinct training phases: (1) monotonic, (2) catapult, (3) periodic, (4) chaotic, and (5) divergent.
In particular, we observe that performing an ergodic trajectory averaging stabilizes the test error in non-monotonic (and non-divergent) phases.
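On a purely linear (convex quadratic) loss only monotonic, oscillatory, and divergent behavior appears; the catapult and chaotic phases require the non-linearity of the quadratic regression model studied in the paper. The sketch below (illustrative step sizes, 1-D loss) only shows the classical stability edge at `eta * h = 2`, which is the boundary the richer phase diagram is organized around.

```python
import numpy as np

# Gradient descent on L(w) = 0.5 * h * w**2 iterates
# w <- (1 - eta * h) * w: monotonic for eta*h < 1, oscillatory
# but convergent for 1 < eta*h < 2, divergent for eta*h > 2.
def gd_trajectory(eta, h=1.0, w0=1.0, steps=50):
    w = w0
    traj = [w]
    for _ in range(steps):
        w -= eta * h * w
        traj.append(w)
    return np.array(traj)

stable = gd_trajectory(eta=0.5)   # monotonic decay
oscill = gd_trajectory(eta=1.5)   # sign-flipping but convergent
diverg = gd_trajectory(eta=2.5)   # divergent
```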
arXiv Detail & Related papers (2023-10-02T22:59:17Z) - Capturing dynamical correlations using implicit neural representations [85.66456606776552]
We develop an artificial intelligence framework which combines a neural network trained to mimic simulated data from a model Hamiltonian with automatic differentiation to recover unknown parameters from experimental data.
In doing so, we illustrate the ability to build and train a differentiable model only once, which then can be applied in real-time to multi-dimensional scattering data.
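A stripped-down analogue of this fitting loop is sketched below. The forward model `y = exp(-J * x)` is a hypothetical stand-in for the model-Hamiltonian simulator (and its neural surrogate), and the hand-written derivative stands in for automatic differentiation; the point is only the pattern of recovering an unknown parameter by gradient descent through a differentiable model.

```python
import numpy as np

rng = np.random.default_rng(1)
x = np.linspace(0.0, 3.0, 50)

def simulate(J):
    # Hypothetical stand-in for the differentiable simulator.
    return np.exp(-J * x)

J_true = 1.0
y_obs = simulate(J_true) + 0.01 * rng.standard_normal(x.size)  # "experimental" data

def loss_and_grad(J):
    r = simulate(J) - y_obs
    dy_dJ = -x * np.exp(-J * x)            # what autodiff would supply
    return (r ** 2).mean(), 2.0 * (r * dy_dJ).mean()

J = 0.2                                    # initial guess
for _ in range(2000):
    _, g = loss_and_grad(J)
    J -= 0.5 * g                           # plain gradient descent
```

Once the surrogate is trained, exactly this loop can be rerun against new measurements without re-fitting the model, which is the "train once, apply in real time" point above.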
arXiv Detail & Related papers (2023-04-08T07:55:36Z) - Leveraging Global Parameters for Flow-based Neural Posterior Estimation [90.21090932619695]
Inferring the parameters of a model based on experimental observations is central to the scientific method.
A particularly challenging setting is when the model is strongly indeterminate, i.e., when distinct sets of parameters yield identical observations.
We present a method for cracking such indeterminacy by exploiting additional information conveyed by an auxiliary set of observations sharing global parameters.
arXiv Detail & Related papers (2021-02-12T12:23:13Z) - Unsupervised machine learning of topological phase transitions from experimental data [52.77024349608834]
We apply unsupervised machine learning techniques to experimental data from ultracold atoms.
We obtain the topological phase diagram of the Haldane model in a completely unbiased fashion.
Our work provides a benchmark for unsupervised detection of new exotic phases in complex many-body systems.
arXiv Detail & Related papers (2021-01-14T16:38:21Z) - Stochastic embeddings of dynamical phenomena through variational autoencoders [1.7205106391379026]
We use a recognition network to increase the observed space dimensionality during the reconstruction of the phase space.
Our validation shows that this approach not only recovers a state space that resembles the original one, but also synthesizes new time series.
arXiv Detail & Related papers (2020-10-13T10:10:24Z) - Kernel and Rich Regimes in Overparametrized Models [69.40899443842443]
We show that gradient descent on overparametrized multilayer networks can induce rich implicit biases that are not RKHS norms.
We also demonstrate this transition empirically for more complex matrix factorization models and multilayer non-linear networks.
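The kernel-to-rich transition can be illustrated on the classic diagonal linear network `beta = u*u - v*v`, where the initialization scale `alpha` interpolates between a min-l2 ("kernel") implicit bias and a sparsity-seeking ("rich") one. The sizes, learning rates, and step counts below are illustrative choices, not the paper's experiments.

```python
import numpy as np

rng = np.random.default_rng(0)
n, d = 5, 10
X = rng.standard_normal((n, d))
beta_star = np.zeros(d)
beta_star[0] = 1.0                       # sparse ground truth
y = X @ beta_star                        # underdetermined: d > n

def train_diag_net(alpha, lr, steps):
    # Overparametrized reparametrization beta = u*u - v*v,
    # initialized at scale alpha; plain gradient descent on the
    # squared loss, chain rule applied through u and v.
    u = alpha * np.ones(d)
    v = alpha * np.ones(d)
    for _ in range(steps):
        beta = u * u - v * v
        g = X.T @ (X @ beta - y) / n     # d(loss)/d(beta)
        u -= lr * 2.0 * u * g
        v += lr * 2.0 * v * g
    return u * u - v * v

rich = train_diag_net(alpha=1e-2, lr=2e-2, steps=50_000)     # ~min-l1, sparse
kernel = train_diag_net(alpha=10.0, lr=2e-4, steps=100_000)  # ~min-l2
min_l2 = X.T @ np.linalg.solve(X @ X.T, y)                   # kernel-regime prediction
```

Both runs interpolate the data, but the small-initialization solution concentrates on the true sparse support while the large-initialization one tracks the minimum-l2 interpolant, i.e. an implicit bias that is not an RKHS norm.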
arXiv Detail & Related papers (2020-02-20T15:43:02Z)
This list is automatically generated from the titles and abstracts of the papers in this site.