Guiding Energy-based Models via Contrastive Latent Variables
- URL: http://arxiv.org/abs/2303.03023v1
- Date: Mon, 6 Mar 2023 10:50:25 GMT
- Title: Guiding Energy-based Models via Contrastive Latent Variables
- Authors: Hankook Lee, Jongheon Jeong, Sejun Park, Jinwoo Shin
- Abstract summary: An energy-based model (EBM) is a popular generative framework that offers both explicit density and architectural flexibility.
There often exists a large gap between EBMs and other generative frameworks like GANs in terms of generation quality.
We propose a novel and effective framework for improving EBMs via contrastive representation learning.
- Score: 81.68492940158436
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: An energy-based model (EBM) is a popular generative framework that offers
both explicit density and architectural flexibility, but training one is
difficult since the process is often unstable and time-consuming. In recent years,
various training techniques have been developed, e.g., better divergence
measures or stabilization in MCMC sampling, but there often exists a large gap
between EBMs and other generative frameworks like GANs in terms of generation
quality. In this paper, we propose a novel and effective framework for
improving EBMs via contrastive representation learning (CRL). To be specific,
we consider representations learned by contrastive methods as the true
underlying latent variable. This contrastive latent variable could guide EBMs
to understand the data structure better, so it can improve and accelerate EBM
training significantly. To enable the joint training of EBM and CRL, we also
design a new class of latent-variable EBMs for learning the joint density of
data and the contrastive latent variable. Our experimental results demonstrate
that our scheme achieves lower FID scores, compared to prior-art EBM methods
(e.g., additionally using variational autoencoders or diffusion techniques),
even with significantly faster and more memory-efficient training. We also show
conditional and compositional generation abilities of our latent-variable EBMs
as their additional benefits, even without explicit conditional training. The
code is available at https://github.com/hankook/CLEL.
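The abstract sketches the core idea: treat the representation z produced by a contrastive encoder as the latent variable of an image x and learn a joint energy E(x, z), so the latent guides the EBM toward the data structure. Below is a minimal sketch of that idea, assuming a toy MLP energy over 32x32 images and short-run Langevin sampling; the names ContrastiveEncoder, JointEnergy, and langevin_sample are hypothetical placeholders and this is not the released CLEL implementation.

# Minimal, illustrative sketch (not the authors' code) of a latent-variable EBM
# E_theta(x, z) guided by a contrastive latent z.
import torch
import torch.nn as nn

class ContrastiveEncoder(nn.Module):
    """Maps an image x to a latent z (in the paper's setting, trained with a contrastive loss)."""
    def __init__(self, dim_z=128):
        super().__init__()
        self.net = nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, 512),
                                 nn.ReLU(), nn.Linear(512, dim_z))

    def forward(self, x):
        return torch.nn.functional.normalize(self.net(x), dim=-1)

class JointEnergy(nn.Module):
    """Scalar energy E(x, z); low energy should mean (x, z) is a plausible pair."""
    def __init__(self, dim_z=128):
        super().__init__()
        self.x_net = nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, 512), nn.ReLU())
        self.head = nn.Linear(512 + dim_z, 1)

    def forward(self, x, z):
        return self.head(torch.cat([self.x_net(x), z], dim=-1)).squeeze(-1)

def langevin_sample(energy, x, z, steps=20, step_size=1e-2):
    """Short-run Langevin dynamics on x, keeping the latent z fixed."""
    x = x.clone().detach().requires_grad_(True)
    for _ in range(steps):
        e = energy(x, z).sum()
        grad, = torch.autograd.grad(e, x)
        x = (x - 0.5 * step_size * grad
             + step_size ** 0.5 * torch.randn_like(x)).detach().requires_grad_(True)
    return x.detach()

# One contrastive-divergence-style update: push energy down on real (x, z)
# pairs and up on sampled negatives.
encoder, energy = ContrastiveEncoder(), JointEnergy()
opt = torch.optim.Adam(list(encoder.parameters()) + list(energy.parameters()), lr=1e-4)
x_real = torch.rand(8, 3, 32, 32)         # stand-in for a data batch
z = encoder(x_real).detach()              # contrastive latent used as the "true" latent
x_fake = langevin_sample(energy, torch.rand_like(x_real), z)
loss = energy(x_real, z).mean() - energy(x_fake, z).mean()
opt.zero_grad(); loss.backward(); opt.step()

In the paper's joint-training setting the encoder would additionally be optimized with its own contrastive objective, so the latent both guides the EBM and keeps improving; the detached z above only illustrates the conditioning, not the full training scheme.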
Related papers
- Improving Adversarial Energy-Based Model via Diffusion Process [25.023967485839155]
Adversarial EBMs introduce a generator to form a minimax training game.
Inspired by diffusion-based models, we embed EBMs into each denoising step to split a long generation process into several smaller steps.
Our experiments show significant improvement in generation compared to existing adversarial EBMs.
arXiv Detail & Related papers (2024-03-04T01:33:53Z)
- Learning Energy-Based Prior Model with Diffusion-Amortized MCMC [89.95629196907082]
The common practice of learning latent space EBMs with non-convergent short-run MCMC for prior and posterior sampling hinders the model from making further progress.
We introduce a simple but effective diffusion-based amortization method for long-run MCMC sampling and develop a novel learning algorithm for the latent space EBM based on it.
arXiv Detail & Related papers (2023-10-05T00:23:34Z)
- Learning Energy-Based Models by Cooperative Diffusion Recovery Likelihood [64.95663299945171]
Training energy-based models (EBMs) on high-dimensional data can be both challenging and time-consuming.
There exists a noticeable gap in sample quality between EBMs and other generative frameworks like GANs and diffusion models.
We propose cooperative diffusion recovery likelihood (CDRL), an effective approach to tractably learn and sample from a series of EBMs.
arXiv Detail & Related papers (2023-09-10T22:05:24Z)
- Latent Diffusion Energy-Based Model for Interpretable Text Modeling [104.85356157724372]
We introduce a novel symbiosis between diffusion models and latent-space EBMs in a variational learning framework.
We develop a geometric clustering-based regularization jointly with the information bottleneck to further improve the quality of the learned latent space.
arXiv Detail & Related papers (2022-06-13T03:41:31Z)
- Bounds all around: training energy-based models with bidirectional bounds [26.507268387712145]
Energy-based models (EBMs) provide an elegant framework for density estimation, but they are notoriously difficult to train.
Recent work has established links to generative adversarial networks, where the EBM is trained through a minimax game with a variational value function.
We propose a bidirectional bound on the EBM log-likelihood, such that we maximize a lower bound and minimize an upper bound when solving the minimax game.
arXiv Detail & Related papers (2021-11-01T13:25:38Z)
- How to Train Your Energy-Based Models [19.65375049263317]
Energy-Based Models (EBMs) specify probability density or mass functions up to an unknown normalizing constant (the standard parameterization is sketched after this list).
This tutorial is targeted at an audience with basic understanding of generative models who want to apply EBMs or start a research project in this direction.
arXiv Detail & Related papers (2021-01-09T04:51:31Z)
- Energy-Based Models for Continual Learning [36.05297743063411]
We motivate Energy-Based Models (EBMs) as a promising model class for continual learning problems.
Our proposed version of EBMs for continual learning is simple and efficient, and it outperforms baseline methods by a large margin on several benchmarks.
arXiv Detail & Related papers (2020-11-24T17:08:13Z)
- No MCMC for me: Amortized sampling for fast and stable training of energy-based models [62.1234885852552]
Energy-Based Models (EBMs) present a flexible and appealing way to represent uncertainty.
We present a simple method for training EBMs at scale using an entropy-regularized generator to amortize the MCMC sampling.
We then apply our estimator to the recently proposed Joint Energy Model (JEM), where we match the original performance with faster and more stable training.
arXiv Detail & Related papers (2020-10-08T19:17:20Z)
- MCMC Should Mix: Learning Energy-Based Model with Neural Transport Latent Space MCMC [110.02001052791353]
Learning an energy-based model (EBM) requires MCMC sampling of the learned model as an inner loop of the learning algorithm.
We show that the model has a particularly simple form in the space of the latent variables of the backbone model.
arXiv Detail & Related papers (2020-06-12T01:25:51Z)
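As referenced in the tutorial entry above, the defining property of an EBM is that it specifies a density only up to a normalizing constant. For reference, a minimal statement of that standard parameterization (textbook notation, not taken from any specific paper listed here):

p_\theta(x) \;=\; \frac{\exp\big(-E_\theta(x)\big)}{Z_\theta},
\qquad
Z_\theta \;=\; \int \exp\big(-E_\theta(x)\big)\,dx .

Because Z_\theta is generally intractable, training typically relies on MCMC sampling, amortized samplers, or likelihood bounds, which is the common thread among the works listed above.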
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of the information listed here and is not responsible for any consequences of its use.