MCMC Should Mix: Learning Energy-Based Model with Neural Transport Latent Space MCMC
- URL: http://arxiv.org/abs/2006.06897v2
- Date: Wed, 16 Mar 2022 07:53:07 GMT
- Title: MCMC Should Mix: Learning Energy-Based Model with Neural Transport Latent Space MCMC
- Authors: Erik Nijkamp, Ruiqi Gao, Pavel Sountsov, Srinivas Vasudevan, Bo Pang, Song-Chun Zhu, Ying Nian Wu
- Abstract summary: Learning an energy-based model (EBM) requires MCMC sampling of the learned model as an inner loop of the learning algorithm.
We show that the model has a particularly simple form in the space of the latent variables of the backbone model.
- Score: 110.02001052791353
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Learning an energy-based model (EBM) requires MCMC sampling of the learned model
as an inner loop of the learning algorithm. However, MCMC sampling of EBMs in
high-dimensional data space generally does not mix, because the energy
function, which is usually parametrized by a deep network, is highly
multi-modal in the data space. This is a serious handicap for both the theory and
practice of EBMs. In this paper, we propose to learn an EBM with a flow-based
model (or, in general, a latent variable model) serving as a backbone, so that
the EBM is a correction or an exponential tilting of the flow-based model. We
show that the model has a particularly simple form in the space of the latent
variables of the backbone model, and MCMC sampling of the EBM in the latent
space mixes well and traverses modes in the data space. This enables proper
sampling and learning of EBMs.
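In symbols, a rough reconstruction of the construction above (the correction energy f_θ, the flow generator g_α, and the Gaussian base density p_0 are notational assumptions, not taken from this summary): the EBM is an exponential tilting of the flow-based backbone, and pulling the tilt back through the generator leaves a latent density whose base factor is the unimodal Gaussian prior.

```latex
% EBM as an exponential tilting of a flow-based backbone q_alpha
% (notation is an assumption; see the lead-in above)
p_\theta(x) = \frac{1}{Z(\theta)}\, e^{f_\theta(x)}\, q_\alpha(x),
\qquad x = g_\alpha(z), \; z \sim p_0 = \mathcal{N}(0, I).

% Change of variables through the invertible generator g_alpha:
% the Jacobian of the flow cancels against q_alpha, leaving a tilted Gaussian,
% which is why MCMC (e.g., Langevin dynamics) on z can mix across modes.
p_\theta(z) \propto e^{f_\theta(g_\alpha(z))}\, p_0(z).
```

Samples in data space are then obtained by running the chain on z and mapping the resulting states through g_α.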
Related papers
- Learning Energy-based Model via Dual-MCMC Teaching [5.31573596283377]
Learning the energy-based model (EBM) can be achieved using maximum likelihood estimation (MLE).
This paper studies the fundamental learning problem of the energy-based model (EBM).
arXiv Detail & Related papers (2023-12-05T03:39:54Z)
- Learning Energy-Based Prior Model with Diffusion-Amortized MCMC [89.95629196907082]
The common practice of learning latent-space EBMs with non-convergent short-run MCMC for prior and posterior sampling hinders the model from further progress.
We introduce a simple but effective diffusion-based amortization method for long-run MCMC sampling and develop a novel learning algorithm for the latent space EBM based on it.
arXiv Detail & Related papers (2023-10-05T00:23:34Z)
- Guiding Energy-based Models via Contrastive Latent Variables [81.68492940158436]
An energy-based model (EBM) is a popular generative framework that offers both explicit density and architectural flexibility.
There often exists a large gap between EBMs and other generative frameworks like GANs in terms of generation quality.
We propose a novel and effective framework for improving EBMs via contrastive representation learning.
arXiv Detail & Related papers (2023-03-06T10:50:25Z)
- Mitigating Out-of-Distribution Data Density Overestimation in Energy-Based Models [54.06799491319278]
Deep energy-based models (EBMs) are receiving increasing attention due to their ability to learn complex distributions.
To train deep EBMs, maximum likelihood estimation (MLE) with short-run Langevin Monte Carlo (LMC) is often used.
We investigate why the MLE with short-run LMC can converge to EBMs with wrong density estimates.
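For context, the gradient that MLE with short-run LMC approximates is the standard EBM likelihood gradient; writing the (negative) energy as f_θ, so that p_θ(x) ∝ exp(f_θ(x)) (a notational assumption), it reads:

```latex
% Log-likelihood gradient of an EBM p_theta(x) ∝ exp(f_theta(x)).
% The second expectation is estimated with samples from short-run
% Langevin Monte Carlo; its bias is what can produce the wrong density
% estimates investigated in the paper above.
\nabla_\theta \, \mathbb{E}_{x \sim p_{\mathrm{data}}}\!\left[\log p_\theta(x)\right]
= \mathbb{E}_{x \sim p_{\mathrm{data}}}\!\left[\nabla_\theta f_\theta(x)\right]
- \mathbb{E}_{x \sim p_\theta}\!\left[\nabla_\theta f_\theta(x)\right].
```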
arXiv Detail & Related papers (2022-05-30T02:49:17Z)
- Learning Energy-Based Model with Variational Auto-Encoder as Amortized Sampler [35.80109055748496]
Training energy-based models (EBMs) by maximum likelihood requires Markov chain Monte Carlo sampling.
We learn a variational auto-encoder (VAE) to initialize the finite-step MCMC, such as Langevin dynamics derived from the energy function.
With these amortized MCMC samples, the EBM can be trained by maximum likelihood, which follows an "analysis by synthesis" scheme.
We call this joint training algorithm variational MCMC teaching, in which the VAE chases the EBM toward the data distribution.
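A minimal sketch of such finite-step, generator-initialized Langevin sampling, assuming PyTorch and the convention p(x) proportional to exp(-E(x)); the function name, step size, and step count are illustrative placeholders rather than the paper's actual settings:

```python
import torch

def short_run_langevin(energy, x_init, n_steps=20, step_size=0.01):
    """Finite-step Langevin dynamics derived from an energy function.

    energy : callable returning per-sample energies E(x), with p(x) proportional to exp(-E(x))
    x_init : chain initialization, e.g. samples from an amortized sampler (a VAE decoder)
    """
    x = x_init.clone().detach().requires_grad_(True)
    for _ in range(n_steps):
        # Descend the energy (ascend the log-density) and inject Gaussian noise.
        grad = torch.autograd.grad(energy(x).sum(), x)[0]
        x = x - 0.5 * step_size**2 * grad + step_size * torch.randn_like(x)
        x = x.detach().requires_grad_(True)
    return x.detach()

# Hypothetical usage: initialize the chain from decoder samples instead of noise.
# z = torch.randn(batch_size, latent_dim)
# x0 = vae_decoder(z)                      # amortized initialization
# x_samples = short_run_langevin(ebm_energy, x0)
```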
arXiv Detail & Related papers (2020-12-29T20:46:40Z)
- No MCMC for me: Amortized sampling for fast and stable training of energy-based models [62.1234885852552]
Energy-Based Models (EBMs) present a flexible and appealing way to represent uncertainty.
We present a simple method for training EBMs at scale using an entropy-regularized generator to amortize the MCMC sampling.
Next, we apply our estimator to the recently proposed Joint Energy Model (JEM), where we match the original performance with faster and more stable training.
arXiv Detail & Related papers (2020-10-08T19:17:20Z)
- Learning Latent Space Energy-Based Prior Model [118.86447805707094]
We learn an energy-based model (EBM) in the latent space of a generator model.
We show that the learned model exhibits strong performance in terms of image and text generation and anomaly detection.
arXiv Detail & Related papers (2020-06-15T08:11:58Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented (including all content) and is not responsible for any consequences of its use.