Explaining the effects of non-convergent sampling in the training of
Energy-Based Models
- URL: http://arxiv.org/abs/2301.09428v2
- Date: Wed, 31 May 2023 15:38:32 GMT
- Title: Explaining the effects of non-convergent sampling in the training of
Energy-Based Models
- Authors: Elisabeth Agoritsas, Giovanni Catania, Aurélien Decelle, Beatriz Seoane
- Abstract summary: We quantify the impact of using non-convergent Markov chains to train Energy-Based models.
We show that EBMs trained with non-persistent short runs to estimate the gradient can perfectly reproduce a set of empirical statistics.
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: In this paper, we quantify the impact of using non-convergent Markov chains
to train Energy-Based models (EBMs). In particular, we show analytically that
EBMs trained with non-persistent short runs to estimate the gradient can
perfectly reproduce a set of empirical statistics of the data, not at the level
of the equilibrium measure, but through a precise dynamical process. Our
results provide a first-principles explanation for the observations of recent
works proposing the strategy of using short runs starting from random initial
conditions as an efficient way to generate high-quality samples in EBMs, and
lay the groundwork for using EBMs as diffusion models. After explaining this
effect in generic EBMs, we analyze two solvable models in which the effect of
the non-convergent sampling in the trained parameters can be described in
detail. Finally, we test these predictions numerically on a ConvNet EBM and a
Boltzmann machine.
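To make the training scheme analyzed in the abstract concrete, here is a minimal sketch of maximum-likelihood EBM training with non-persistent short-run MCMC: the negative-phase samples come from a few Langevin steps started from fresh random initial conditions at every update, exactly the regime whose effects the paper quantifies. The toy dataset, network, and hyperparameters (chain length T, step size) are illustrative choices, not the paper's settings.

```python
import torch
import torch.nn as nn

def sample_toy_data(n):
    # stand-in dataset: a 2D Gaussian mixture
    centers = torch.tensor([[2.0, 0.0], [-2.0, 0.0]])
    return centers[torch.randint(0, 2, (n,))] + 0.3 * torch.randn(n, 2)

energy = nn.Sequential(nn.Linear(2, 64), nn.SiLU(),
                       nn.Linear(64, 64), nn.SiLU(),
                       nn.Linear(64, 1))
opt = torch.optim.Adam(energy.parameters(), lr=1e-3)

def short_run_samples(n, T=20, eps=0.01):
    # non-persistent chains: fresh random initial conditions at every call
    x = torch.randn(n, 2)
    for _ in range(T):
        x = x.detach().requires_grad_(True)
        g = torch.autograd.grad(energy(x).sum(), x)[0]
        x = x - eps * g + (2 * eps) ** 0.5 * torch.randn_like(x)  # Langevin step
    return x.detach()

for step in range(1000):
    pos = sample_toy_data(256)
    neg = short_run_samples(256)          # short run estimates the negative phase
    # maximum-likelihood gradient: lower energy on data, raise it on model samples
    loss = energy(pos).mean() - energy(neg).mean()
    opt.zero_grad()
    loss.backward()
    opt.step()
```

At generation time the same short run from random noise is used, which is why, as the abstract argues, the trained model reproduces the data statistics through a precise dynamical process rather than through its equilibrium measure.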
Related papers
- Iterated Denoising Energy Matching for Sampling from Boltzmann Densities [109.23137009609519]
Iterated Denoising Energy Matching (iDEM) alternates between (I) sampling regions of high model density from a diffusion-based sampler and (II) using these samples in our matching objective.
We show that the proposed approach achieves state-of-the-art performance on all metrics and trains $2$-$5\times$ faster.
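The summary is terse, so the following hedged sketch shows the self-normalized Monte Carlo score estimator that, to the best of my reading, underlies the denoising energy matching objective: the score of $\exp(-E)$ convolved with Gaussian noise of scale sigma_t is a softmax-weighted average over K Gaussian perturbations. The toy energy and constants are invented; the network regressed onto this estimator and the alternation with the diffusion-based sampler are omitted.

```python
import numpy as np

def energy(x):
    # toy double-well energy, stand-in for a Boltzmann target exp(-E)
    return 0.25 * (x[..., 0] ** 2 - 4.0) ** 2 + 0.5 * x[..., 1] ** 2

def smoothed_score(x_t, sigma_t, K=1000, seed=0):
    """MC estimate of grad_x log (exp(-E) * N(0, sigma_t^2 I)) at x_t."""
    rng = np.random.default_rng(seed)
    x0 = x_t + sigma_t * rng.standard_normal((K,) + x_t.shape)  # x0 ~ N(x_t, sigma_t^2)
    logw = -energy(x0)
    w = np.exp(logw - logw.max())
    w /= w.sum()                                                # softmax weights
    # self-normalized estimator: E_w[(x0 - x_t)] / sigma_t^2
    return (w[:, None] * (x0 - x_t)).sum(axis=0) / sigma_t ** 2

print(smoothed_score(np.array([0.5, -1.0]), sigma_t=0.5))
```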
arXiv Detail & Related papers (2024-02-09T01:11:23Z)
- STANLEY: Stochastic Gradient Anisotropic Langevin Dynamics for Learning Energy-Based Models [41.031470884141775]
We present an end-to-end learning algorithm for Energy-Based models (EBMs).
In this paper, we propose a novel high-dimensional sampling method based on an anisotropic stepsize and a gradient-informed covariance matrix.
Our resulting method, namely STANLEY, is an optimization algorithm for training Energy-Based models via our newly introduced MCMC method.
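As a rough illustration of the ingredients named above (not STANLEY itself), here is unadjusted Langevin dynamics with an anisotropic stepsize supplied by a diagonal, gradient-informed preconditioner. The RMSProp-style running second moment is my stand-in for the paper's covariance construction, and the correction terms an exact preconditioned sampler would need are neglected since the preconditioner varies slowly.

```python
import numpy as np

def grad_energy(x):
    # toy anisotropic quadratic energy: stiff in dim 0, flat in dim 1
    return x * np.array([10.0, 0.1])

def anisotropic_langevin(x, steps=500, eps=1e-2, beta=0.9, delta=1e-5, seed=0):
    rng = np.random.default_rng(seed)
    v = np.ones_like(x)                      # running second moment of the gradient
    for _ in range(steps):
        g = grad_energy(x)
        v = beta * v + (1 - beta) * g ** 2
        C = 1.0 / (np.sqrt(v) + delta)       # diagonal, gradient-informed preconditioner
        # preconditioned Langevin step: x <- x - (eps/2) C g + sqrt(eps C) xi
        x = x - 0.5 * eps * C * g + np.sqrt(eps * C) * rng.standard_normal(x.shape)
    return x

print(anisotropic_langevin(np.array([3.0, 3.0])))
```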
arXiv Detail & Related papers (2023-10-19T11:55:16Z)
- Learning Energy-Based Prior Model with Diffusion-Amortized MCMC [89.95629196907082]
The common practice of learning latent-space EBMs with non-convergent short-run MCMC for prior and posterior sampling hinders these models from making further progress.
We introduce a simple but effective diffusion-based amortization method for long-run MCMC sampling and develop a novel learning algorithm for the latent space EBM based on it.
arXiv Detail & Related papers (2023-10-05T00:23:34Z)
- Learning Energy-Based Models by Cooperative Diffusion Recovery Likelihood [64.95663299945171]
Training energy-based models (EBMs) on high-dimensional data can be both challenging and time-consuming.
There exists a noticeable gap in sample quality between EBMs and other generative frameworks like GANs and diffusion models.
We propose cooperative diffusion recovery likelihood (CDRL), an effective approach to tractably learn and sample from a series of EBMs.
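For context on the "recovery likelihood" ingredient in the title: instead of sampling the unconditional model, one samples the easier conditional $p(x \mid \tilde{x}) \propto \exp(-E(x) - \|x - \tilde{x}\|^2 / (2\sigma^2))$ of a clean point given its noisy version. The sketch below shows this conditional Langevin step on a toy energy; CDRL's cooperative training and diffusion components are not reproduced, and all constants are illustrative.

```python
import numpy as np

def grad_energy(x):
    return x                                   # toy energy E(x) = |x|^2 / 2

def sample_recovery(x_tilde, sigma, steps=50, eps=1e-3, seed=0):
    """Langevin sampling of p(x | x_tilde) ~ exp(-E(x) - |x - x_tilde|^2 / (2 sigma^2))."""
    rng = np.random.default_rng(seed)
    x = x_tilde.copy()                         # initialize at the noisy observation
    for _ in range(steps):
        # gradient of the conditional ("recovery") energy
        g = grad_energy(x) + (x - x_tilde) / sigma ** 2
        x = x - 0.5 * eps * g + np.sqrt(eps) * rng.standard_normal(x.shape)
    return x

sigma = 0.3
x_clean = np.random.default_rng(1).standard_normal(2)
x_tilde = x_clean + sigma * np.random.default_rng(2).standard_normal(2)
print(sample_recovery(x_tilde, sigma))
```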
arXiv Detail & Related papers (2023-09-10T22:05:24Z)
- Balanced Training of Energy-Based Models with Adaptive Flow Sampling [13.951904929884618]
Energy-based models (EBMs) are versatile density estimation models that directly parameterize an unnormalized log density.
We propose a new maximum likelihood training algorithm for EBMs that uses a different type of generative model: normalizing flows (NFs).
Our method fits an NF to an EBM during training so that an NF-assisted sampling scheme provides an accurate gradient for the EBM at all times.
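One generic way a tractable flow density can assist EBM sampling (a sketch of the idea, not the paper's exact scheme) is independence Metropolis-Hastings: propose from the flow, then accept using the exact ratio of target and proposal densities. Below, a fixed standard Gaussian stands in for the fitted NF; the adaptive fitting of the flow during training is omitted.

```python
import numpy as np

rng = np.random.default_rng(0)

def energy(x):
    return 0.5 * np.sum((x - 1.0) ** 2, axis=-1)   # toy target ~ exp(-E), mean (1, 1)

def log_q(x):
    # proposal log-density, up to a constant (cancels in the ratio);
    # a fitted normalizing flow's exact log-density would go here
    return -0.5 * np.sum(x ** 2, axis=-1)

def sample_q(n):
    return rng.standard_normal((n, 2))

def flow_assisted_mh(steps=1000):
    x = sample_q(1)[0]
    samples = []
    for _ in range(steps):
        y = sample_q(1)[0]                          # independence proposal from the "flow"
        log_a = (-energy(y) + energy(x)) + (log_q(x) - log_q(y))
        if np.log(rng.random()) < log_a:            # Metropolis-Hastings correction
            x = y
        samples.append(x)
    return np.array(samples)

print(flow_assisted_mh().mean(axis=0))              # should be near (1, 1)
```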
arXiv Detail & Related papers (2023-06-01T13:58:06Z)
- Efficient Training of Energy-Based Models Using Jarzynski Equality [13.636994997309307]
Energy-based models (EBMs) are generative models inspired by statistical physics.
Computing the gradient of their log-likelihood with respect to the model parameters requires sampling from the model distribution.
Here we show how results from nonequilibrium thermodynamics based on the Jarzynski equality can be used to perform this computation efficiently.
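A minimal sketch of the Jarzynski / annealed-importance-sampling identity this builds on: annealing samples from a tractable base toward $\exp(-E)$ while accumulating work-like log-weights yields weights whose exponential average estimates the partition-function ratio. Toy Gaussians are used so the exact answer (0.25) is known; the paper's interleaving of this computation with parameter updates is omitted, and the unadjusted Langevin step introduces a small discretization bias that a Metropolis adjustment would remove.

```python
import numpy as np

rng = np.random.default_rng(0)

def E0(x): return 0.5 * np.sum(x ** 2, axis=-1)           # base: N(0, I)
def E1(x): return 0.5 * np.sum(x ** 2, axis=-1) / 0.25    # target: N(0, 0.25 I)

def grad_interp(x, beta):
    # gradient of the interpolated energy (1 - beta) E0 + beta E1
    return (1 - beta) * x + beta * x / 0.25

def jarzynski_log_weights(n=2000, K=100, eps=1e-2):
    x = rng.standard_normal((n, 2))                        # exact samples from the base
    logw = np.zeros(n)
    betas = np.linspace(0.0, 1.0, K + 1)
    for b_prev, b in zip(betas[:-1], betas[1:]):
        logw += -(b - b_prev) * (E1(x) - E0(x))            # Jarzynski work increment
        # one Langevin step targeting the beta-interpolated density
        x = x - 0.5 * eps * grad_interp(x, b) + np.sqrt(eps) * rng.standard_normal(x.shape)
    return logw

# E[exp(logw)] estimates Z1/Z0; for these 2D Gaussians Z1/Z0 = 0.25 exactly
print(np.exp(jarzynski_log_weights()).mean())
```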
arXiv Detail & Related papers (2023-05-30T21:07:52Z)
- Particle Dynamics for Learning EBMs [83.59335980576637]
Energy-based modeling is a promising approach to unsupervised learning, which yields many downstream applications from a single model.
The main difficulty in learning energy-based models with "contrastive approaches" is the generation of samples from the current energy function at each iteration.
This paper proposes an alternative approach to obtaining these samples that avoids crude MCMC sampling from the current model.
arXiv Detail & Related papers (2021-11-26T23:41:07Z)
- Prediction of liquid fuel properties using machine learning models with Gaussian processes and probabilistic conditional generative learning [56.67751936864119]
The present work aims to construct cheap-to-compute machine learning (ML) models to act as closure equations for predicting the physical properties of alternative fuels.
These models can be trained on databases from molecular dynamics (MD) simulations and/or experimental measurements in a multi-fidelity data-fusion approach.
The results show that ML models can accurately predict fuel properties over a wide range of pressure and temperature conditions.
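For a flavor of the surrogate modeling described, here is a minimal Gaussian-process regression sketch with scikit-learn on synthetic (pressure, temperature) → property data; the property function, input ranges, and kernel are invented for illustration and are not from the paper.

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, WhiteKernel

rng = np.random.default_rng(0)
# hypothetical training set: (pressure [MPa], temperature [K]) -> viscosity-like property
X = rng.uniform([0.1, 300.0], [10.0, 700.0], size=(80, 2))
y = 1e-3 * np.exp(500.0 / X[:, 1]) * (1 + 0.02 * X[:, 0]) + 1e-5 * rng.standard_normal(80)

# anisotropic RBF (separate length scales for pressure and temperature) + learned noise
kernel = RBF(length_scale=[1.0, 50.0]) + WhiteKernel(1e-5)
gp = GaussianProcessRegressor(kernel=kernel, normalize_y=True).fit(X, y)

# GP surrogates return a predictive mean and an uncertainty estimate
mean, std = gp.predict(np.array([[5.0, 450.0]]), return_std=True)
print(mean, std)
```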
arXiv Detail & Related papers (2021-10-18T14:43:50Z)
- No MCMC for me: Amortized sampling for fast and stable training of energy-based models [62.1234885852552]
Energy-Based Models (EBMs) present a flexible and appealing way to represent uncertainty.
We present a simple method for training EBMs at scale using an entropy-regularized generator to amortize the MCMC sampling.
Next, we apply our estimator to the recently proposed Joint Energy Model (JEM), where we match the original performance with faster and more stable training.
arXiv Detail & Related papers (2020-10-08T19:17:20Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences of its use.