Learning a Restricted Boltzmann Machine using biased Monte Carlo
sampling
- URL: http://arxiv.org/abs/2206.01310v1
- Date: Thu, 2 Jun 2022 21:29:01 GMT
- Authors: Nicolas Béreux, Aurélien Decelle, Cyril Furtlehner, Beatriz Seoane
- Abstract summary: We show that sampling the equilibrium distribution via Markov Chain Monte Carlo can be dramatically accelerated using biased sampling techniques.
We also show that this sampling technique can be exploited to improve the computation of the log-likelihood gradient during training.
- Score: 0.6554326244334867
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Restricted Boltzmann Machines are simple and powerful generative models
capable of encoding any complex dataset. Despite all their advantages, in
practice, trainings are often unstable, and it is hard to assess their quality
because dynamics are hampered by extremely slow time-dependencies. This
situation becomes critical when dealing with low-dimensional clustered
datasets, where the time needed to sample ergodically the trained models
becomes computationally prohibitive. In this work, we show that this divergence
of Monte Carlo mixing times is related to a phase coexistence phenomenon,
similar to that encountered in Physics in the vicinity of a first order phase
transition. We show that sampling the equilibrium distribution via Markov Chain
Monte Carlo can be dramatically accelerated using biased sampling techniques,
in particular, the Tethered Monte Carlo method (TMC). This sampling technique
solves efficiently the problem of evaluating the quality of a given trained
model and the generation of new samples in reasonable times. In addition, we
show that this sampling technique can be exploited to improve the computation
of the log-likelihood gradient during the training too, which produces dramatic
improvements when training RBMs with artificial clustered datasets. When
dealing with real low-dimensional datasets, this new training procedure fits
RBM models with significantly faster relaxational dynamics than those obtained
with standard PCD recipes. We also show that TMC sampling can be used to
recover the free-energy profile of the RBM, which turns out to be extremely useful
to compute the probability distribution of a given model and to improve the
generation of new decorrelated samples on slow PCD trained models.
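For orientation, the standard sampler that the abstract contrasts with biased sampling can be sketched as follows. This is a minimal illustration of plain alternating (block) Gibbs sampling in a binary RBM, not the Tethered Monte Carlo method of the paper. It assumes binary {0, 1} units and an energy of the form E(v, h) = -a·v - b·h - vᵀWh; the names `gibbs_chain`, `sample_hidden`, `sample_visible`, `W`, `a`, and `b` are hypothetical and chosen only for this example.

```python
import math
import random


def sigmoid(x):
    # Numerically stable logistic function.
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def sample_hidden(v, W, b, rng):
    # W[i][j] couples visible unit i to hidden unit j (assumed convention).
    # Hidden units are conditionally independent given the visible layer.
    return [
        int(rng.random() < sigmoid(b[j] + sum(v[i] * W[i][j] for i in range(len(v)))))
        for j in range(len(b))
    ]


def sample_visible(h, W, a, rng):
    # Visible units are conditionally independent given the hidden layer.
    return [
        int(rng.random() < sigmoid(a[i] + sum(W[i][j] * h[j] for j in range(len(h)))))
        for i in range(len(a))
    ]


def gibbs_chain(W, a, b, n_steps, seed=0):
    # Start from a random visible configuration and alternate h|v and v|h.
    rng = random.Random(seed)
    v = [rng.randint(0, 1) for _ in a]
    for _ in range(n_steps):
        h = sample_hidden(v, W, b, rng)
        v = sample_visible(h, W, a, rng)
    return v
```

When the model distribution has well-separated modes, as in the clustered datasets the abstract describes, a chain like this can stay in one mode for a very long time. That slow mixing is the problem the biased sampling approach is meant to address.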
Related papers
- Fast training and sampling of Restricted Boltzmann Machines [4.785158987724452]
We build upon recent theoretical advancements in RBM training, to significantly reduce the computational cost of training.
We propose a pre-training phase that encodes the principal components into a low-rank RBM through a convex optimization process.
We exploit the continuous and smooth nature of the parameter annealing trajectory to achieve reliable and computationally efficient log-likelihood estimations.
arXiv Detail & Related papers (2024-05-24T09:23:43Z)
- Online Variational Sequential Monte Carlo [49.97673761305336]
We build upon the variational sequential Monte Carlo (VSMC) method, which provides computationally efficient and accurate model parameter estimation and Bayesian latent-state inference.
Online VSMC is capable of performing efficiently, entirely on-the-fly, both parameter estimation and particle proposal adaptation.
arXiv Detail & Related papers (2023-12-19T21:45:38Z)
- Micro-Macro Consistency in Multiscale Modeling: Score-Based Model Assisted Sampling of Fast/Slow Dynamical Systems [0.0]
In the study of physics-based multi-time-scale dynamical systems, techniques have been developed for enhancing sampling.
In the field of Machine Learning, a generic goal of generative models is to sample from a target density, after training on empirical samples from this density.
In this work, we show that SGMs can be used in such a coupling framework to improve sampling in multiscale dynamical systems.
arXiv Detail & Related papers (2023-12-10T00:46:37Z)
- Learning Energy-Based Prior Model with Diffusion-Amortized MCMC [89.95629196907082]
Common practice of learning latent space EBMs with non-convergent short-run MCMC for prior and posterior sampling is hindering the model from further progress.
We introduce a simple but effective diffusion-based amortization method for long-run MCMC sampling and develop a novel learning algorithm for the latent space EBM based on it.
arXiv Detail & Related papers (2023-10-05T00:23:34Z)
- Balanced Training of Energy-Based Models with Adaptive Flow Sampling [13.951904929884618]
Energy-based models (EBMs) are versatile density estimation models that directly parameterize an unnormalized log density.
We propose a new maximum likelihood training algorithm for EBMs that uses a different type of generative model, normalizing flows (NF).
Our method fits an NF to an EBM during training so that an NF-assisted sampling scheme provides an accurate gradient for the EBMs at all times.
arXiv Detail & Related papers (2023-06-01T13:58:06Z)
- Stabilizing Machine Learning Prediction of Dynamics: Noise and Noise-inspired Regularization [58.720142291102135]
Recent work has shown that machine learning (ML) models can be trained to accurately forecast the dynamics of chaotic dynamical systems.
In the absence of mitigating techniques, this technique can result in artificially rapid error growth, leading to inaccurate predictions and/or climate instability.
We introduce Linearized Multi-Noise Training (LMNT), a regularization technique that deterministically approximates the effect of many small, independent noise realizations added to the model input during training.
arXiv Detail & Related papers (2022-11-09T23:40:52Z)
- Low-variance estimation in the Plackett-Luce model via quasi-Monte Carlo sampling [58.14878401145309]
We develop a novel approach to producing more sample-efficient estimators of expectations in the PL model.
We illustrate our findings both theoretically and empirically using real-world recommendation data from Amazon Music and the Yahoo learning-to-rank challenge.
arXiv Detail & Related papers (2022-05-12T11:15:47Z)
- Distributionally Robust Models with Parametric Likelihood Ratios [123.05074253513935]
Three simple ideas allow us to train models with DRO using a broader class of parametric likelihood ratios.
We find that models trained with the resulting parametric adversaries are consistently more robust to subpopulation shifts when compared to other DRO approaches.
arXiv Detail & Related papers (2022-04-13T12:43:12Z)
- Learning Energy-Based Model with Variational Auto-Encoder as Amortized Sampler [35.80109055748496]
Training energy-based models (EBMs) by maximum likelihood requires Markov chain Monte Carlo sampling.
We learn a variational auto-encoder (VAE) to initialize the finite-step MCMC, such as Langevin dynamics that is derived from the energy function.
With these amortized MCMC samples, the EBM can be trained by maximum likelihood, which follows an "analysis by synthesis" scheme.
We call this joint training algorithm the variational MCMC teaching, in which the VAE chases the EBM toward data distribution.
arXiv Detail & Related papers (2020-12-29T20:46:40Z)
- No MCMC for me: Amortized sampling for fast and stable training of energy-based models [62.1234885852552]
Energy-Based Models (EBMs) present a flexible and appealing way to represent uncertainty.
We present a simple method for training EBMs at scale using an entropy-regularized generator to amortize the MCMC sampling.
Next, we apply our estimator to the recently proposed Joint Energy Model (JEM), where we match the original performance with faster and stable training.
arXiv Detail & Related papers (2020-10-08T19:17:20Z)