Particle Dynamics for Learning EBMs
- URL: http://arxiv.org/abs/2111.13772v1
- Date: Fri, 26 Nov 2021 23:41:07 GMT
- Title: Particle Dynamics for Learning EBMs
- Authors: Kirill Neklyudov, Priyank Jaini, Max Welling
- Abstract summary: Energy-based modeling is a promising approach to unsupervised learning, which yields many downstream applications from a single model.
The main difficulty in learning energy-based models with the "contrastive approaches" is the generation of samples from the current energy function at each iteration.
This paper proposes an alternative approach to getting these samples and avoiding crude MCMC sampling from the current model.
- Score: 83.59335980576637
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Energy-based modeling is a promising approach to unsupervised learning, which
yields many downstream applications from a single model. The main difficulty in
learning energy-based models with the "contrastive approaches" is the
generation of samples from the current energy function at each iteration. Many
advances have been made to accomplish this subroutine cheaply. Nevertheless,
all such sampling paradigms run MCMC targeting the current model, which
requires infinitely long chains to generate samples from the true energy
distribution and is problematic in practice. This paper proposes an alternative
approach to getting these samples and avoiding crude MCMC sampling from the
current model. We accomplish this by viewing the evolution of the modeling
distribution as (i) the evolution of the energy function, and (ii) the
evolution of the samples from this distribution along some vector field. We
subsequently derive this time-dependent vector field such that the particles
following this field are approximately distributed as the current density
model. Thereby we match the evolution of the particles with the evolution of
the energy function prescribed by the learning procedure. Importantly, unlike
Monte Carlo sampling, our method aims to match the current distribution in
finite time. Finally, we demonstrate its effectiveness empirically compared to
MCMC-based learning methods.
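To make the learning loop concrete, below is a minimal toy sketch (in PyTorch) of contrastive EBM training in which the negative samples are kept as a persistent particle set that is transported after every parameter update, instead of being regenerated with MCMC against the current model. The data distribution, network, step sizes, and in particular the transport rule (a few plain gradient steps on the updated energy) are all hypothetical stand-ins; the paper derives a specific time-dependent vector field for this transport, which is not reproduced here.

```python
# Toy sketch (not the paper's exact algorithm): contrastive EBM learning where
# the negative samples are persistent particles that are transported after each
# parameter update rather than regenerated with long MCMC chains.
import torch
import torch.nn as nn

torch.manual_seed(0)

def sample_data(n):
    # Hypothetical 2-D target: a two-component Gaussian mixture.
    centers = torch.tensor([[-2.0, 0.0], [2.0, 0.0]])
    return centers[torch.randint(0, 2, (n,))] + 0.3 * torch.randn(n, 2)

energy = nn.Sequential(nn.Linear(2, 64), nn.SiLU(),
                       nn.Linear(64, 64), nn.SiLU(),
                       nn.Linear(64, 1))
opt = torch.optim.Adam(energy.parameters(), lr=1e-3)

particles = torch.randn(512, 2)  # persistent negative particles

for step in range(1000):
    x_pos = sample_data(256)
    x_neg = particles[torch.randint(0, len(particles), (256,))]

    # Contrastive gradient: lower the energy of data, raise it on model samples.
    loss = energy(x_pos).mean() - energy(x_neg).mean()
    opt.zero_grad()
    loss.backward()
    opt.step()

    # Transport the particle set toward the *updated* energy landscape.
    # Crude stand-in for following the paper's derived vector field for a
    # finite time; no new Markov chain is started at any iteration.
    for _ in range(5):
        p = particles.clone().requires_grad_(True)
        grad = torch.autograd.grad(energy(p).sum(), p)[0]
        particles = (particles - 0.01 * grad).detach()
```

The structural point of the sketch is that the particle set evolves alongside the energy function, so sampling never restarts from scratch at any iteration.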
Related papers
- Conditional Synthesis of 3D Molecules with Time Correction Sampler [58.0834973489875]
Time-Aware Conditional Synthesis (TACS) is a novel approach to conditional generation with diffusion models.
It integrates adaptively controlled plug-and-play "online" guidance into a diffusion model, driving samples toward the desired properties.
arXiv Detail & Related papers (2024-11-01T12:59:25Z)
- Energy-Based Diffusion Language Models for Text Generation [126.23425882687195]
Energy-based Diffusion Language Model (EDLM) is an energy-based model operating at the full sequence level for each diffusion step.
Our framework offers a $1.3\times$ sampling speedup over existing diffusion models.
arXiv Detail & Related papers (2024-10-28T17:25:56Z)
- Iterated Denoising Energy Matching for Sampling from Boltzmann Densities [109.23137009609519]
Iterated Denoising Energy Matching (iDEM) alternates between (I) sampling regions of high model density from a diffusion-based sampler and (II) using these samples in its matching objective.
We show that the proposed approach achieves state-of-the-art performance on all metrics and trains $2{-}5\times$ faster.
arXiv Detail & Related papers (2024-02-09T01:11:23Z)
- Generalized Contrastive Divergence: Joint Training of Energy-Based Model and Diffusion Model through Inverse Reinforcement Learning [13.22531381403974]
Generalized Contrastive Divergence (GCD) is a novel objective function for training an energy-based model (EBM) and a sampler simultaneously.
We present preliminary yet promising results showing that joint training is beneficial for both the EBM and the diffusion model.
arXiv Detail & Related papers (2023-12-06T10:10:21Z)
- STANLEY: Stochastic Gradient Anisotropic Langevin Dynamics for Learning Energy-Based Models [41.031470884141775]
We present an end-to-end learning algorithm for energy-based models (EBMs) and propose a novel high-dimensional sampling method based on an anisotropic stepsize and a gradient-informed covariance matrix.
The resulting method, STANLEY, is an optimization algorithm for training EBMs via this newly introduced MCMC method.
arXiv Detail & Related papers (2023-10-19T11:55:16Z)
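The STANLEY summary above mentions an anisotropic stepsize and a gradient-informed covariance matrix but does not spell out their construction; the sketch below is therefore only an assumed stand-in that uses an RMSProp-style diagonal preconditioner inside a preconditioned (anisotropic) Langevin update.

```python
# Hedged sketch of anisotropic (preconditioned) Langevin dynamics.
# The diagonal preconditioner built from running squared gradients is an
# assumption, not STANLEY's exact covariance construction.
import torch

def anisotropic_langevin(energy, x, n_steps=100, eps=1e-2, beta=0.9, delta=1e-5):
    v = torch.zeros_like(x)                       # running squared-gradient estimate
    for _ in range(n_steps):
        x = x.detach().requires_grad_(True)
        grad = torch.autograd.grad(energy(x).sum(), x)[0]
        v = beta * v + (1 - beta) * grad ** 2
        c = 1.0 / (v.sqrt() + delta)              # per-coordinate (anisotropic) stepsize
        x = x - 0.5 * eps * c * grad + (eps * c).sqrt() * torch.randn_like(x)
    return x.detach()

# Usage on a toy quadratic energy (assumed example):
samples = anisotropic_langevin(lambda x: 0.5 * (x ** 2).sum(dim=1, keepdim=True),
                               torch.randn(128, 2))
```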
- Learning Energy-Based Prior Model with Diffusion-Amortized MCMC [89.95629196907082]
The common practice of learning latent space EBMs with non-convergent short-run MCMC for prior and posterior sampling hinders further progress.
We introduce a simple but effective diffusion-based amortization method for long-run MCMC sampling and develop a novel learning algorithm for the latent space EBM based on it.
arXiv Detail & Related papers (2023-10-05T00:23:34Z)
- MCMC-Correction of Score-Based Diffusion Models for Model Composition [2.682859657520006]
Diffusion models can be parameterised in terms of either a score or an energy function.
We propose keeping the score parameterisation and computing an acceptance probability inspired by energy-based models.
arXiv Detail & Related papers (2023-07-26T07:50:41Z)
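The entry above keeps the score parameterisation and derives an energy-style acceptance test from it. A standard way to do this, assumed in the sketch below, is to approximate the log-density difference between the current state and a proposal by a line integral of the score along the straight segment connecting them, and then use that estimate in a Metropolis-style accept/reject step; the estimator used in the cited paper may differ, and the proposal-density correction is omitted for brevity.

```python
# Hedged sketch: Metropolis-style correction using only a score function s(x).
# log p(x_new) - log p(x) is approximated by the trapezoidal line integral of
# s along x(t) = x + t * (x_new - x), t in [0, 1].
import torch

def score_log_ratio(score, x, x_new, n_pts=8):
    diff = x_new - x
    ts = torch.linspace(0.0, 1.0, n_pts)
    vals = torch.stack([(score(x + t * diff) * diff).sum(dim=-1) for t in ts])
    return torch.trapezoid(vals, dx=1.0 / (n_pts - 1), dim=0)

def mh_correct(score, x, step=0.05):
    x_new = x + step * score(x) + (2 * step) ** 0.5 * torch.randn_like(x)  # Langevin-style proposal
    log_ratio = score_log_ratio(score, x, x_new)
    # Note: the log proposal-density correction of a full MH test is omitted here.
    accept = torch.rand(x.shape[0]) < log_ratio.clamp(max=0.0).exp()
    return torch.where(accept.unsqueeze(-1), x_new, x)

# Usage with the analytic score of a standard Gaussian (a stand-in for a
# learned score network):
x = torch.randn(64, 2)
for _ in range(100):
    x = mh_correct(lambda y: -y, x)
```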
- Reduce, Reuse, Recycle: Compositional Generation with Energy-Based Diffusion Models and MCMC [102.64648158034568]
Diffusion models have quickly become the prevailing approach to generative modeling in many domains.
We propose an energy-based parameterization of diffusion models which enables the use of new compositional operators.
We find these samplers lead to notable improvements in compositional generation across a wide set of problems.
arXiv Detail & Related papers (2023-02-22T18:48:46Z)
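At the level of energies, the compositional operators mentioned above are simple: a product of two model densities corresponds to the sum of their energies, which can then be sampled with MCMC. The sketch below shows plain Langevin sampling from such a summed energy on toy quadratic "concepts"; the cited paper applies this kind of composition within each step of a diffusion model, which is omitted here.

```python
# Simplified sketch of energy composition: product of densities = sum of energies,
# sampled here with plain Langevin dynamics (toy, non-diffusion setting).
import torch

def langevin(energy, x, n_steps=200, eps=1e-2):
    for _ in range(n_steps):
        x = x.detach().requires_grad_(True)
        grad = torch.autograd.grad(energy(x).sum(), x)[0]
        x = x - 0.5 * eps * grad + eps ** 0.5 * torch.randn_like(x)
    return x.detach()

# Two hypothetical "concept" energies standing in for learned models.
energy_a = lambda x: ((x - torch.tensor([1.0, 0.0])) ** 2).sum(dim=-1)
energy_b = lambda x: ((x - torch.tensor([0.0, 1.0])) ** 2).sum(dim=-1)

# Product composition ("A AND B"): add the energies and sample the result.
samples = langevin(lambda x: energy_a(x) + energy_b(x), torch.randn(256, 2))
```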
- Learning Energy-Based Model with Variational Auto-Encoder as Amortized Sampler [35.80109055748496]
Training energy-based models (EBMs) by maximum likelihood requires Markov chain Monte Carlo sampling.
We learn a variational auto-encoder (VAE) to initialize finite-step MCMC, such as Langevin dynamics derived from the energy function.
With these amortized MCMC samples, the EBM can be trained by maximum likelihood, which follows an "analysis by synthesis" scheme.
We call this joint training algorithm variational MCMC teaching, in which the VAE chases the EBM toward the data distribution.
arXiv Detail & Related papers (2020-12-29T20:46:40Z)
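A minimal sketch of one training step in the "VAE as amortized sampler" scheme summarized above, with hypothetical toy architectures and hyperparameters: the VAE decoder proposes initial negatives, short-run Langevin dynamics on the EBM energy refines them, and the EBM then takes a contrastive maximum-likelihood gradient step. Fitting the VAE to the refined samples (the "teaching" half of the loop) is only indicated by a comment.

```python
# Hedged sketch of one EBM update with VAE-amortized MCMC initialization.
import torch
import torch.nn as nn

dim, z_dim = 2, 2
energy  = nn.Sequential(nn.Linear(dim, 64), nn.SiLU(), nn.Linear(64, 1))
decoder = nn.Sequential(nn.Linear(z_dim, 64), nn.SiLU(), nn.Linear(64, dim))  # VAE decoder (placeholder)
opt = torch.optim.Adam(energy.parameters(), lr=1e-3)

def short_run_langevin(x, n_steps=20, eps=1e-2):
    for _ in range(n_steps):
        x = x.detach().requires_grad_(True)
        grad = torch.autograd.grad(energy(x).sum(), x)[0]
        x = x - 0.5 * eps * grad + eps ** 0.5 * torch.randn_like(x)
    return x.detach()

def ebm_step(x_data):
    x_init = decoder(torch.randn(x_data.shape[0], z_dim)).detach()  # amortized initialization
    x_neg = short_run_langevin(x_init)                              # finite-step MCMC refinement
    loss = energy(x_data).mean() - energy(x_neg).mean()             # contrastive MLE gradient
    opt.zero_grad(); loss.backward(); opt.step()
    # In the full scheme, the VAE would now be fit to x_neg ("variational MCMC teaching").
    return x_neg

ebm_step(torch.randn(128, dim))  # toy data batch
```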
This list is automatically generated from the titles and abstracts of the papers on this site.