Energy Discrepancies: A Score-Independent Loss for Energy-Based Models
- URL: http://arxiv.org/abs/2307.06431v2
- Date: Mon, 27 Nov 2023 15:38:32 GMT
- Title: Energy Discrepancies: A Score-Independent Loss for Energy-Based Models
- Authors: Tobias Schröder, Zijing Ou, Jen Ning Lim, Yingzhen Li, Sebastian J. Vollmer, Andrew B. Duncan
- Abstract summary: We propose a novel loss function called Energy Discrepancy (ED) which does not rely on the computation of scores or expensive Markov chain Monte Carlo.
We show that ED approaches the explicit score matching and negative log-likelihood loss under different limits, effectively interpolating between both.
- Score: 20.250792836049882
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Energy-based models are a simple yet powerful class of probabilistic models,
but their widespread adoption has been limited by the computational burden of
training them. We propose a novel loss function called Energy Discrepancy (ED)
which does not rely on the computation of scores or expensive Markov chain
Monte Carlo. We show that ED approaches the explicit score matching and
negative log-likelihood loss under different limits, effectively interpolating
between both. Consequently, minimum ED estimation overcomes the problem of
nearsightedness encountered in score-based estimation methods, while also
enjoying theoretical guarantees. Through numerical experiments, we demonstrate
that ED learns low-dimensional data distributions faster and more accurately
than explicit score matching or contrastive divergence. For high-dimensional
image data, we describe how the manifold hypothesis puts limitations on our
approach and demonstrate the effectiveness of energy discrepancy by training
the energy-based model as a prior of a variational decoder model.
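To make the idea concrete, here is a minimal PyTorch sketch of a Monte Carlo estimator of an energy-discrepancy-style objective with a Gaussian perturbation kernel of variance t, m contrastive samples per data point, and a stabilization constant w inside the logarithm. The function name, default values, and exact stabilization term are illustrative assumptions, not the authors' reference implementation.

```python
import math

import torch


def energy_discrepancy_loss(energy_net, x, t=0.25, m=16, w=1.0):
    """Sketch of a Monte Carlo estimator of an energy-discrepancy-style loss.

    Assumptions (not the paper's reference implementation): Gaussian
    perturbation of variance t, m contrastive samples per data point,
    and a stabilization constant w added inside the logarithm.

    energy_net: callable mapping a (batch, dim) tensor to energies U(x) of shape (batch,).
    x:          data batch of shape (n, d).
    """
    n, d = x.shape
    # Perturb each data point: y_i = x_i + sqrt(t) * xi_i
    y = x + math.sqrt(t) * torch.randn_like(x)
    # Draw m contrastive samples around each perturbed point: x'_{ij} = y_i + sqrt(t) * xi'_{ij}
    neg = y.unsqueeze(1) + math.sqrt(t) * torch.randn(n, m, d, dtype=x.dtype, device=x.device)
    u_pos = energy_net(x)                                    # shape (n,)
    u_neg = energy_net(neg.reshape(n * m, d)).reshape(n, m)  # shape (n, m)
    # Per-sample loss: log( w/m + (1/m) * sum_j exp(U(x_i) - U(x'_{ij})) ),
    # computed stably via logsumexp over the m differences plus the constant log(w).
    diff = u_pos.unsqueeze(1) - u_neg                        # shape (n, m)
    stab = torch.full((n, 1), math.log(w), dtype=x.dtype, device=x.device)
    per_sample = torch.logsumexp(torch.cat([diff, stab], dim=1), dim=1) - math.log(m)
    return per_sample.mean()
```

Usage would follow the standard pattern, e.g. loss = energy_discrepancy_loss(net, batch); loss.backward(), assuming net is any torch.nn.Module that returns one scalar energy per input row; note that no MCMC sampling from the model appears anywhere in the estimator.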
Related papers
- Adaptive Resolution Inference (ARI): Energy-Efficient Machine Learning for Internet of Things [11.802983172874901]
The implementation of machine learning in Internet of Things devices poses significant operational challenges due to limited energy and computation resources.
We present adaptive resolution inference (ARI), a novel approach that makes it possible to evaluate new tradeoffs between energy dissipation and model performance.
arXiv Detail & Related papers (2024-08-26T16:00:26Z)
- A PAC-Bayesian Perspective on the Interpolating Information Criterion [54.548058449535155]
We show how a PAC-Bayes bound is obtained for a general class of models, characterizing factors which influence performance in the interpolating regime.
We quantify how the test error of overparameterized models that achieve effectively zero training error depends on the quality of the implicit regularization imposed by, e.g., the combination of model and parameter-initialization scheme.
arXiv Detail & Related papers (2023-11-13T01:48:08Z)
- Training Discrete Energy-Based Models with Energy Discrepancy [18.161764926125066]
Training energy-based models (EBMs) on discrete spaces is challenging because sampling over such spaces can be difficult.
We propose to train discrete EBMs with energy discrepancy (ED), a novel type of contrastive loss functional.
We demonstrate the relative performance of these losses on lattice Ising models, binary synthetic data, and discrete image data sets.
arXiv Detail & Related papers (2023-07-14T19:38:05Z)
- Efficient Training of Energy-Based Models Using Jarzynski Equality [13.636994997309307]
Energy-based models (EBMs) are generative models inspired by statistical physics.
Computing the gradient of the log-likelihood with respect to the model parameters requires sampling from the model distribution.
Here we show how results for nonequilibrium thermodynamics based on Jarzynski equality can be used to perform this computation efficiently.
arXiv Detail & Related papers (2023-05-30T21:07:52Z)
- End-To-End Latent Variational Diffusion Models for Inverse Problems in High Energy Physics [61.44793171735013]
We introduce a novel unified architecture, termed Latent Variational Diffusion models, which combines the latent learning of cutting-edge generative art approaches with an end-to-end variational framework.
Our unified approach achieves a distribution-free distance to the truth more than 20 times smaller than the non-latent state-of-the-art baseline.
arXiv Detail & Related papers (2023-05-17T17:43:10Z)
- How Much is Enough? A Study on Diffusion Times in Score-based Generative Models [76.76860707897413]
Current best practice advocates for a large diffusion time T to ensure that the forward dynamics brings the diffusion sufficiently close to a known and simple noise distribution.
We show how an auxiliary model can be used to bridge the gap between the ideal and the simulated forward dynamics, followed by a standard reverse diffusion process.
arXiv Detail & Related papers (2022-06-10T15:09:46Z)
- The Franz-Parisi Criterion and Computational Trade-offs in High Dimensional Statistics [73.1090889521313]
This paper aims to make a rigorous connection between the different low-degree and free-energy based approaches.
We define a free-energy based criterion for hardness and formally connect it to the well-established notion of low-degree hardness.
Our results provide both conceptual insights into the connections between different notions of hardness and concrete technical tools.
arXiv Detail & Related papers (2022-05-19T17:39:29Z)
- Pseudo-Spherical Contrastive Divergence [119.28384561517292]
We propose pseudo-spherical contrastive divergence (PS-CD) to generalize maximum likelihood learning of energy-based models.
PS-CD avoids the intractable partition function and provides a generalized family of learning objectives.
arXiv Detail & Related papers (2021-11-01T09:17:15Z)
- Targeted free energy estimation via learned mappings [66.20146549150475]
Free energy perturbation (FEP) was proposed by Zwanzig more than six decades ago as a method to estimate free energy differences.
FEP suffers from a severe limitation: the requirement of sufficient overlap between distributions.
One strategy to mitigate this problem, called Targeted Free Energy Perturbation, uses a high-dimensional mapping in configuration space to increase overlap.
arXiv Detail & Related papers (2020-02-12T11:10:00Z)
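As background for the last entry, the identities below recall Zwanzig's free energy perturbation estimator and its targeted variant, in which an invertible configuration-space map M is used to increase overlap. The notation (beta, U_A, U_B, M) follows standard statistical-mechanics conventions and is not taken verbatim from the paper.

```latex
\Delta F \;=\; F_B - F_A
\;=\; -\beta^{-1}\,\log \mathbb{E}_{x \sim p_A}\!\left[ e^{-\beta\,\left(U_B(x) - U_A(x)\right)} \right],
\qquad
\Phi(x) \;=\; U_B\!\big(M(x)\big) - U_A(x) - \beta^{-1}\,\log\bigl|\det \nabla M(x)\bigr|,
\qquad
\Delta F \;=\; -\beta^{-1}\,\log \mathbb{E}_{x \sim p_A}\!\left[ e^{-\beta\,\Phi(x)} \right].
```

The targeted identity holds for any invertible M; learning M so that the mapped distribution overlaps the target distribution reduces the variance of the exponential average, which is the idea behind the learned-mappings approach.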
This list is automatically generated from the titles and abstracts of the papers on this site.