No MCMC for me: Amortized sampling for fast and stable training of
energy-based models
- URL: http://arxiv.org/abs/2010.04230v3
- Date: Sun, 6 Jun 2021 20:40:14 GMT
- Title: No MCMC for me: Amortized sampling for fast and stable training of
energy-based models
- Authors: Will Grathwohl, Jacob Kelly, Milad Hashemi, Mohammad Norouzi, Kevin
Swersky, David Duvenaud
- Abstract summary: Energy-Based Models (EBMs) present a flexible and appealing way to represent uncertainty.
We present a simple method for training EBMs at scale using an entropy-regularized generator to amortize the MCMC sampling.
Next, we apply our estimator to the recently proposed Joint Energy Model (JEM), where we match the original performance with faster and more stable training.
- Score: 62.1234885852552
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Energy-Based Models (EBMs) present a flexible and appealing way to represent
uncertainty. Despite recent advances, training EBMs on high-dimensional data
remains a challenging problem as the state-of-the-art approaches are costly,
unstable, and require considerable tuning and domain expertise to apply
successfully. In this work, we present a simple method for training EBMs at
scale which uses an entropy-regularized generator to amortize the MCMC sampling
typically used in EBM training. We improve upon prior MCMC-based entropy
regularization methods with a fast variational approximation. We demonstrate
the effectiveness of our approach by using it to train tractable likelihood
models. Next, we apply our estimator to the recently proposed Joint Energy
Model (JEM), where we match the original performance with faster and more stable
training. This allows us to extend JEM models to semi-supervised classification
on tabular data from a variety of continuous domains.
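To make the method concrete, the following is a minimal, hypothetical PyTorch sketch of the training loop the abstract describes: the EBM is updated with a contrastive objective that uses generator samples in place of MCMC samples, while the generator is trained to produce low-energy, high-entropy samples. Every name here (EnergyNet, Generator, nn_entropy_estimate) is an illustrative assumption, and the nearest-neighbour entropy surrogate is a crude stand-in for the paper's fast variational entropy approximation, not the authors' released code.

```python
# Hypothetical sketch of amortized EBM training (not the authors' code).
import torch
import torch.nn as nn

class EnergyNet(nn.Module):
    """Scalar energy E_theta(x); here a small MLP for 2-D toy data."""
    def __init__(self, dim=2, hidden=128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(dim, hidden), nn.SiLU(),
            nn.Linear(hidden, hidden), nn.SiLU(),
            nn.Linear(hidden, 1))

    def forward(self, x):
        return self.net(x).squeeze(-1)

class Generator(nn.Module):
    """Amortized sampler x = g_phi(z), replacing the MCMC inner loop."""
    def __init__(self, z_dim=2, dim=2, hidden=128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(z_dim, hidden), nn.SiLU(),
            nn.Linear(hidden, hidden), nn.SiLU(),
            nn.Linear(hidden, dim))

    def forward(self, z):
        return self.net(z)

def nn_entropy_estimate(x):
    """Nearest-neighbour entropy surrogate: a crude stand-in for the
    paper's fast variational entropy approximation."""
    d = torch.cdist(x, x) + 1e9 * torch.eye(len(x), device=x.device)
    return d.min(dim=1).values.clamp_min(1e-6).log().mean()

energy, gen = EnergyNet(), Generator()
opt_e = torch.optim.Adam(energy.parameters(), lr=1e-4)
opt_g = torch.optim.Adam(gen.parameters(), lr=1e-4)

for step in range(1000):
    x_data = torch.randn(256, 2)          # placeholder for a real batch
    # EBM update: push energy down on data, up on generator samples,
    # i.e. the usual contrastive gradient with the generator standing
    # in for MCMC samples from the model.
    x_fake = gen(torch.randn(256, 2)).detach()
    loss_e = energy(x_data).mean() - energy(x_fake).mean()
    opt_e.zero_grad(); loss_e.backward(); opt_e.step()
    # Generator update: seek low energy while an entropy bonus keeps
    # the samples from collapsing onto a single mode.
    x_g = gen(torch.randn(256, 2))
    loss_g = energy(x_g).mean() - nn_entropy_estimate(x_g)
    opt_g.zero_grad(); loss_g.backward(); opt_g.step()
```

In a real setting the data batch would come from a dataset loader, and for JEM-style training the energy would be derived from classifier logits as the negative logsumexp over classes; the structure of the two alternating updates stays the same.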
Related papers
- Improving Adversarial Energy-Based Model via Diffusion Process [25.023967485839155]
Adversarial EBMs introduce a generator to form a minimax training game.
Inspired by diffusion-based models, we embed EBMs into each denoising step to split a long generation process into several smaller steps.
Our experiments show significant improvement in generation compared to existing adversarial EBMs.
arXiv Detail & Related papers (2024-03-04T01:33:53Z)
- STANLEY: Stochastic Gradient Anisotropic Langevin Dynamics for Learning Energy-Based Models [41.031470884141775]
We present an end-to-end learning algorithm for Energy-Based Models (EBMs).
In this paper, we propose a novel high-dimensional sampling method based on an anisotropic stepsize and a gradient-informed covariance matrix.
Our resulting method, STANLEY, is an optimization algorithm for training Energy-Based Models via this newly introduced MCMC method.
arXiv Detail & Related papers (2023-10-19T11:55:16Z)
- Learning Energy-Based Prior Model with Diffusion-Amortized MCMC [89.95629196907082]
The common practice of learning latent-space EBMs with non-convergent short-run MCMC for prior and posterior sampling hinders further progress.
We introduce a simple but effective diffusion-based amortization method for long-run MCMC sampling and develop a novel learning algorithm for the latent space EBM based on it.
arXiv Detail & Related papers (2023-10-05T00:23:34Z)
- Learning Energy-Based Models by Cooperative Diffusion Recovery Likelihood [64.95663299945171]
Training energy-based models (EBMs) on high-dimensional data can be both challenging and time-consuming.
There exists a noticeable gap in sample quality between EBMs and other generative frameworks like GANs and diffusion models.
We propose cooperative diffusion recovery likelihood (CDRL), an effective approach to tractably learn and sample from a series of EBMs.
arXiv Detail & Related papers (2023-09-10T22:05:24Z)
- Balanced Training of Energy-Based Models with Adaptive Flow Sampling [13.951904929884618]
Energy-based models (EBMs) are versatile density estimation models that directly parameterize an unnormalized log density.
We propose a new maximum likelihood training algorithm for EBMs that uses a different type of generative model: normalizing flows (NFs).
Our method fits an NF to the EBM during training so that an NF-assisted sampling scheme provides an accurate gradient for the EBM at all times.
arXiv Detail & Related papers (2023-06-01T13:58:06Z)
- Guiding Energy-based Models via Contrastive Latent Variables [81.68492940158436]
An energy-based model (EBM) is a popular generative framework that offers both explicit density and architectural flexibility.
There often exists a large gap between EBMs and other generative frameworks like GANs in terms of generation quality.
We propose a novel and effective framework for improving EBMs via contrastive representation learning.
arXiv Detail & Related papers (2023-03-06T10:50:25Z)
- Your Autoregressive Generative Model Can be Better If You Treat It as an Energy-Based One [83.5162421521224]
We propose a unique method termed E-ARM for training autoregressive generative models.
E-ARM takes advantage of a well-designed energy-based learning objective.
We show that E-ARM can be trained efficiently and is capable of alleviating the exposure bias problem.
arXiv Detail & Related papers (2022-06-26T10:58:41Z)
- How to Train Your Energy-Based Models [19.65375049263317]
Energy-Based Models (EBMs) specify probability density or mass functions up to an unknown normalizing constant.
This tutorial is targeted at an audience with basic understanding of generative models who want to apply EBMs or start a research project in this direction.
arXiv Detail & Related papers (2021-01-09T04:51:31Z)
- MCMC Should Mix: Learning Energy-Based Model with Neural Transport Latent Space MCMC [110.02001052791353]
Learning an energy-based model (EBM) requires MCMC sampling from the learned model as an inner loop of the learning algorithm.
We show that the model has a particularly simple form in the space of the latent variables of the backbone model.
arXiv Detail & Related papers (2020-06-12T01:25:51Z)
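For contrast with the amortized approach of the headline paper, several of the entries above (STANLEY, the diffusion-amortized prior model, MCMC Should Mix) center on a short-run Langevin inner loop. The following is a minimal sketch of that loop with illustrative, not paper-specified, step size and step count, assuming `energy` returns one scalar per sample:

```python
# Minimal sketch of the short-run Langevin MCMC inner loop used by
# several of the papers above (step size and step count are
# illustrative assumptions, not values from any particular paper).
import torch

def short_run_langevin(energy, x_init, n_steps=20, step_size=0.01):
    """Approximately sample x ~ p(x) ∝ exp(-E(x)) via noisy gradient
    descent: x <- x - (eps/2) * grad E(x) + sqrt(eps) * noise."""
    x = x_init.clone().detach().requires_grad_(True)
    for _ in range(n_steps):
        grad = torch.autograd.grad(energy(x).sum(), x)[0]
        x = (x.detach()
             - 0.5 * step_size * grad
             + step_size ** 0.5 * torch.randn_like(x)
             ).requires_grad_(True)
    return x.detach()
```

In MCMC-based EBM training this loop typically runs from noise or a replay buffer before every parameter update, which is precisely the per-step cost that an amortized generator avoids.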