Learning a Restricted Boltzmann Machine using biased Monte Carlo sampling
- URL: http://arxiv.org/abs/2206.01310v1
- Date: Thu, 2 Jun 2022 21:29:01 GMT
- Title: Learning a Restricted Boltzmann Machine using biased Monte Carlo sampling
- Authors: Nicolas Béreux, Aurélien Decelle, Cyril Furtlehner, Beatriz Seoane
- Abstract summary: We show that sampling the equilibrium distribution via Markov Chain Monte Carlo can be dramatically accelerated using biased sampling techniques.
We also show that this sampling technique can be exploited to improve the computation of the log-likelihood gradient during training.
- Score: 0.6554326244334867
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Restricted Boltzmann Machines are simple and powerful generative models
capable of encoding any complex dataset. Despite all their advantages, in
practice training is often unstable, and it is hard to assess model quality
because the dynamics are hampered by extremely slow time dependencies. This
situation becomes critical when dealing with low-dimensional clustered
datasets, where the time needed to ergodically sample the trained models
becomes computationally prohibitive. In this work, we show that this divergence
of Monte Carlo mixing times is related to a phase coexistence phenomenon,
similar to that encountered in Physics in the vicinity of a first order phase
transition. We show that sampling the equilibrium distribution via Markov Chain
Monte Carlo can be dramatically accelerated using biased sampling techniques,
in particular, the Tethered Monte Carlo method (TMC). This sampling technique
efficiently solves the problem of evaluating the quality of a given trained
model and of generating new samples in a reasonable time. In addition, we
show that this sampling technique can be exploited to improve the computation
of the log-likelihood gradient during training as well, which produces dramatic
improvements when training RBMs on artificial clustered datasets. When
dealing with real low-dimensional datasets, this new training procedure fits
RBM models with significantly faster relaxational dynamics than those obtained
with standard PCD recipes. We also show that TMC sampling can be used to
recover the free-energy profile of the RBM, which turns out to be extremely
useful for computing the probability distribution of a given model and for
improving the generation of new decorrelated samples from slowly mixing
PCD-trained models.
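The abstract describes biasing the Monte Carlo dynamics along an order parameter and integrating the measured restoring force to recover a free-energy profile. The sketch below illustrates that general idea on a toy RBM; it is an umbrella-sampling-style illustration rather than the paper's exact Tethered Monte Carlo scheme, and the toy model, the choice of the mean visible magnetization as order parameter, and the tether strength kappa are assumptions made only for this example.

```python
# Hedged sketch: tethered/umbrella-style biased sampling of an RBM's visible
# magnetization and reconstruction of a free-energy profile F(m_hat).
# The RBM is a tiny random toy model; it is NOT the paper's implementation.
import numpy as np

rng = np.random.default_rng(0)
n_v, n_h = 20, 10                              # toy sizes
W = 0.1 * rng.standard_normal((n_v, n_h))
b_v = np.zeros(n_v)                            # visible biases
b_h = np.zeros(n_h)                            # hidden biases

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def biased_sweep(v, m_hat, kappa):
    """One sweep of the chain under E(v, h) + kappa/2 * (m(v) - m_hat)**2."""
    # Hidden conditionals stay factorized: the tether only involves v.
    h = (rng.random(n_h) < sigmoid(W.T @ v + b_h)).astype(float)
    # Visible units: single-site Metropolis under the biased energy,
    # because the quadratic tether couples the visible units to each other.
    m = v.mean()
    for i in rng.permutation(n_v):
        v_new = 1.0 - v[i]
        dm = (v_new - v[i]) / n_v
        d_energy = -(v_new - v[i]) * (W[i] @ h + b_v[i])
        d_bias = 0.5 * kappa * ((m + dm - m_hat) ** 2 - (m - m_hat) ** 2)
        if rng.random() < np.exp(min(0.0, -(d_energy + d_bias))):
            v[i], m = v_new, m + dm
    return v, m

# Sweep the tether value over a grid, measure the mean restoring force
# kappa * (m_hat - <m>), and integrate it to reconstruct F(m_hat).
kappa, n_sweeps = 200.0, 500
m_grid = np.linspace(0.05, 0.95, 19)
forces = []
v = (rng.random(n_v) < 0.5).astype(float)
for m_hat in m_grid:
    samples = []
    for t in range(n_sweeps):
        v, m = biased_sweep(v, m_hat, kappa)
        if t >= n_sweeps // 5:                 # drop a short burn-in
            samples.append(m)
    forces.append(kappa * (m_hat - np.mean(samples)))
forces = np.array(forces)
F = np.concatenate([[0.0], np.cumsum(forces[:-1] * np.diff(m_grid))])
print(np.round(F - F.min(), 3))                # profile up to a constant
```

On a trained RBM, with the order parameter chosen along a relevant data direction, the same loop would be expected to expose the coexisting free-energy minima that make unbiased Gibbs sampling mix so slowly.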
Related papers
- Fast, accurate training and sampling of Restricted Boltzmann Machines [4.785158987724452]
We present an innovative method in which the principal directions of the dataset are integrated into a low-rank RBM.
This approach enables efficient sampling of the equilibrium measure via a static Monte Carlo process.
Our results show that this strategy successfully trains RBMs to capture the full diversity of data in datasets where previous methods fail.
arXiv Detail & Related papers (2024-05-24T09:23:43Z)
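A minimal sketch of the low-rank construction described in the entry above, assuming, purely for illustration, that "integrating the principal directions" means constraining the RBM weight matrix to the span of the top-k principal directions of the data. The data, sizes, and variable names are invented, and the static Monte Carlo sampler itself is not reproduced.

```python
# Hedged sketch: a low-rank RBM whose weights lie in the span of the top-k
# principal directions of the dataset.  Illustrative assumptions only.
import numpy as np

rng = np.random.default_rng(0)
X = (rng.random((1000, 50)) < 0.3).astype(float)     # stand-in binary dataset
n_v, n_h, k = X.shape[1], 16, 4

# Top-k principal directions of the centered data.
Xc = X - X.mean(axis=0)
U_k = np.linalg.svd(Xc, full_matrices=False)[2][:k].T   # shape (n_v, k)

# Low-rank weights: W = U_k @ A, so the energy depends on a visible
# configuration only through the k projections m = U_k.T @ v, which is what
# makes sampling the equilibrium measure a k-dimensional problem.
A = 0.01 * rng.standard_normal((k, n_h))
W = U_k @ A                                             # rank <= k by construction

v = X[0]
m = U_k.T @ v                                           # k-dimensional summary
assert np.allclose(m @ A, v @ W)                        # hidden field from m alone
```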
- Online Variational Sequential Monte Carlo [49.97673761305336]
We build upon the variational sequential Monte Carlo (VSMC) method, which provides computationally efficient and accurate model parameter estimation and Bayesian latent-state inference.
Online VSMC is capable of performing efficiently, entirely on-the-fly, both parameter estimation and particle proposal adaptation.
arXiv Detail & Related papers (2023-12-19T21:45:38Z)
- Micro-Macro Consistency in Multiscale Modeling: Score-Based Model Assisted Sampling of Fast/Slow Dynamical Systems [0.0]
In the study of physics-based multi-time-scale dynamical systems, techniques have been developed for enhancing sampling.
In the field of Machine Learning, a generic goal of generative models is to sample from a target density, after training on empirical samples from this density.
In this work, we show that score-based generative models (SGMs) can be used in such a coupling framework to improve sampling in multiscale dynamical systems.
arXiv Detail & Related papers (2023-12-10T00:46:37Z)
- Learning Energy-Based Prior Model with Diffusion-Amortized MCMC [89.95629196907082]
The common practice of learning latent-space EBMs with non-convergent short-run MCMC for prior and posterior sampling hinders the model from further progress.
We introduce a simple but effective diffusion-based amortization method for long-run MCMC sampling and develop a novel learning algorithm for the latent space EBM based on it.
arXiv Detail & Related papers (2023-10-05T00:23:34Z)
- Balanced Training of Energy-Based Models with Adaptive Flow Sampling [13.951904929884618]
Energy-based models (EBMs) are versatile density estimation models that directly parameterize an unnormalized log density.
We propose a new maximum likelihood training algorithm for EBMs that uses a different type of generative model: normalizing flows (NFs).
Our method fits an NF to an EBM during training so that an NF-assisted sampling scheme provides an accurate gradient for the EBMs at all times.
arXiv Detail & Related papers (2023-06-01T13:58:06Z)
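A hedged PyTorch sketch of the alternating scheme summarized in the entry above: the EBM's maximum-likelihood gradient uses negatives drawn from a normalizing flow, while the flow is simultaneously fit to the EBM through a reverse-KL objective. The toy 2D data, the diagonal affine "flow" and all hyperparameters are assumptions for illustration, not the paper's architecture.

```python
# Hedged sketch: EBM maximum-likelihood training with a flow as the sampler.
import torch

torch.manual_seed(0)
data = torch.randn(2048, 2) * 0.5 + torch.tensor([2.0, -1.0])   # toy target

energy = torch.nn.Sequential(torch.nn.Linear(2, 64), torch.nn.Tanh(),
                             torch.nn.Linear(64, 1))
# Minimal "flow": x = mu + exp(log_sigma) * z, with z ~ N(0, I).
mu = torch.zeros(2, requires_grad=True)
log_sigma = torch.zeros(2, requires_grad=True)

opt_e = torch.optim.Adam(energy.parameters(), lr=1e-3)
opt_f = torch.optim.Adam([mu, log_sigma], lr=1e-2)

for step in range(2000):
    # EBM step: minimize E_data[E(x)] - E_flow[E(x)] (maximum likelihood).
    x_pos = data[torch.randint(len(data), (256,))]
    with torch.no_grad():
        x_neg = mu + log_sigma.exp() * torch.randn(256, 2)
    loss_e = energy(x_pos).mean() - energy(x_neg).mean()
    opt_e.zero_grad()
    loss_e.backward()
    opt_e.step()

    # Flow step: minimize reverse KL(flow || EBM) up to a constant,
    # i.e. E_z[E(x(z))] - log|det J|, with log|det J| = sum(log_sigma).
    z = torch.randn(256, 2)
    x = mu + log_sigma.exp() * z
    loss_f = energy(x).mean() - log_sigma.sum()
    opt_f.zero_grad()
    loss_f.backward()
    opt_f.step()

# Rough check: the flow statistics should drift toward the data statistics.
print(mu.detach(), log_sigma.exp().detach())
```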
- Stabilizing Machine Learning Prediction of Dynamics: Noise and Noise-inspired Regularization [58.720142291102135]
Recent work has shown that machine learning (ML) models can be trained to accurately forecast the dynamics of chaotic dynamical systems.
In the absence of mitigating techniques, this can result in artificially rapid error growth, leading to inaccurate predictions and/or climate instability.
We introduce Linearized Multi-Noise Training (LMNT), a regularization technique that deterministically approximates the effect of many small, independent noise realizations added to the model input during training.
arXiv Detail & Related papers (2022-11-09T23:40:52Z)
- Low-variance estimation in the Plackett-Luce model via quasi-Monte Carlo sampling [58.14878401145309]
We develop a novel approach to producing more sample-efficient estimators of expectations in the PL model.
We illustrate our findings both theoretically and empirically using real-world recommendation data from Amazon Music and the Yahoo learning-to-rank challenge.
arXiv Detail & Related papers (2022-05-12T11:15:47Z)
- Distributionally Robust Models with Parametric Likelihood Ratios [123.05074253513935]
Three simple ideas allow us to train models with DRO using a broader class of parametric likelihood ratios.
We find that models trained with the resulting parametric adversaries are consistently more robust to subpopulation shifts when compared to other DRO approaches.
arXiv Detail & Related papers (2022-04-13T12:43:12Z)
- Learning Energy-Based Model with Variational Auto-Encoder as Amortized Sampler [35.80109055748496]
Training energy-based models (EBMs) by maximum likelihood requires Markov chain Monte Carlo sampling.
We learn a variational auto-encoder (VAE) to initialize the finite-step MCMC, such as Langevin dynamics derived from the energy function.
With these amortized MCMC samples, the EBM can be trained by maximum likelihood, which follows an "analysis by synthesis" scheme.
We call this joint training algorithm variational MCMC teaching, in which the VAE chases the EBM toward the data distribution.
arXiv Detail & Related papers (2020-12-29T20:46:40Z)
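A hedged PyTorch sketch of the amortized sampling step summarized in the entry above: a VAE decoder proposes an initial sample, and a few Langevin steps driven by the EBM energy refine it before it is used as a negative example. The tiny networks and hyperparameters are assumptions; joint training of the VAE and the EBM is omitted.

```python
# Hedged sketch: decoder-initialized, finite-step Langevin sampling for an EBM.
import torch

torch.manual_seed(0)
d_z, d_x = 8, 2
decoder = torch.nn.Sequential(torch.nn.Linear(d_z, 64), torch.nn.ReLU(),
                              torch.nn.Linear(64, d_x))
energy = torch.nn.Sequential(torch.nn.Linear(d_x, 64), torch.nn.ReLU(),
                             torch.nn.Linear(64, 1))

def amortized_langevin(n, steps=20, step_size=0.01):
    """Decode z ~ N(0, I) to initialize x, then run finite-step Langevin on E."""
    with torch.no_grad():
        x = decoder(torch.randn(n, d_z))
    x = x.requires_grad_(True)
    for _ in range(steps):
        grad = torch.autograd.grad(energy(x).sum(), x)[0]
        with torch.no_grad():
            # Unadjusted Langevin update: x <- x - (eta/2) dE/dx + sqrt(eta) * noise.
            x = x - 0.5 * step_size * grad + step_size ** 0.5 * torch.randn_like(x)
        x = x.requires_grad_(True)
    return x.detach()

# These samples would serve as the negative phase of the EBM's
# maximum-likelihood gradient: E_data[dE/dtheta] - E_samples[dE/dtheta].
x_neg = amortized_langevin(128)
print(x_neg.shape)
```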
- No MCMC for me: Amortized sampling for fast and stable training of energy-based models [62.1234885852552]
Energy-Based Models (EBMs) present a flexible and appealing way to represent uncertainty.
We present a simple method for training EBMs at scale using an entropy-regularized generator to amortize the MCMC sampling.
Next, we apply our estimator to the recently proposed Joint Energy Model (JEM), where we match the original performance with faster and more stable training.
arXiv Detail & Related papers (2020-10-08T19:17:20Z)
This list is automatically generated from the titles and abstracts of the papers on this site.