On the Convergence of the ELBO to Entropy Sums
- URL: http://arxiv.org/abs/2209.03077v6
- Date: Tue, 24 Dec 2024 02:08:11 GMT
- Title: On the Convergence of the ELBO to Entropy Sums
- Authors: Jörg Lücke, Jan Warnken
- Abstract summary: We show that the variational lower bound is, at all stationary points of learning, equal to a sum of entropies.
Proving equality of the ELBO to entropy sums at stationary points is the main contribution of this work.
- Score: 3.345575993695074
- Abstract: The variational lower bound (a.k.a. ELBO or free energy) is the central objective for many established as well as many novel algorithms for unsupervised learning. Such algorithms usually increase the bound until parameters have converged to values close to a stationary point of the learning dynamics. Here we show that (for a very large class of generative models) the variational lower bound is, at all stationary points of learning, equal to a sum of entropies. Concretely, for standard generative models with one set of latents and one set of observed variables, the sum consists of three entropies: (A) the (average) entropy of the variational distributions, (B) the negative entropy of the model's prior distribution, and (C) the (expected) negative entropy of the observable distribution. The obtained result applies under realistic conditions, including finite numbers of data points, at any stationary point (including saddle points), and for any family of (well-behaved) variational distributions. The class of generative models for which we show the equality to entropy sums contains many standard as well as novel generative models, including standard (Gaussian) variational autoencoders. The prerequisites we use to show equality to entropy sums are relatively mild. Concretely, the distributions defining a given generative model have to be of the exponential family, and the model has to satisfy a parameterization criterion (which is usually fulfilled). Proving equality of the ELBO to entropy sums at stationary points (under the stated conditions) is the main contribution of this work.
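As a sketch of the claim, with H[.] denoting (differential) entropy and with notation that is assumed here rather than quoted from the paper (q_Φ for the variational distributions, p_Θ for the generative model, x^(1), ..., x^(N) for the data points), the identity at stationary points reads:

```latex
% Sketch of the entropy-sum identity at stationary points of learning:
\mathcal{F}(\Phi, \Theta)
  \;=\; \underbrace{\frac{1}{N}\sum_{n=1}^{N} \mathrm{H}\!\left[q_\Phi\big(z \mid x^{(n)}\big)\right]}_{\text{(A) average variational entropy}}
  \;-\; \underbrace{\mathrm{H}\!\left[p_\Theta(z)\right]}_{\text{(B) prior entropy}}
  \;-\; \underbrace{\frac{1}{N}\sum_{n=1}^{N} \mathbb{E}_{q_\Phi(z \mid x^{(n)})}\!\left[\mathrm{H}\!\left[p_\Theta(x \mid z)\right]\right]}_{\text{(C) expected observable entropy}}
```

The signs follow the abstract's enumeration: term (A) enters positively, while (B) and (C) enter as negative entropies.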
Related papers
- Generative Models with ELBOs Converging to Entropy Sums [12.962371546573229]
The evidence lower bound (ELBO) is one of the most central objectives for unsupervised learning.
We prove convergence to entropy sums for several generative models and model classes.
arXiv Detail & Related papers (2024-12-25T15:47:23Z)
- Learning Sparse Codes with Entropy-Based ELBOs [13.906627869457232]
We derive a solely entropy-based learning objective for the parameters of standard sparse coding.
Among the features of the novel variational objective: (A) unlike MAP approximations, it uses non-trivial posterior approximations for probabilistic inference.
arXiv Detail & Related papers (2023-11-03T13:03:41Z)
- Variational Microcanonical Estimator [0.0]
We propose a variational quantum algorithm for estimating microcanonical expectation values in models obeying the eigenstate thermalization hypothesis.
An ensemble of variational states is then used to estimate microcanonical averages of local operators.
arXiv Detail & Related papers (2023-01-10T18:53:24Z)
- Statistical Properties of the Entropy from Ordinal Patterns [55.551675080361335]
Knowing the joint distribution of the entropy-statistical complexity pair for a large class of time series models would allow statistical tests that are unavailable to date.
We characterize the distribution of the empirical Shannon entropy for any model under which the true normalized entropy is neither zero nor one.
We present a bilateral test that verifies whether there is enough evidence to reject the hypothesis that two signals produce ordinal patterns with the same Shannon entropy; a minimal sketch of the underlying ordinal-pattern entropy computation is given below.
arXiv Detail & Related papers (2022-09-15T23:55:58Z)
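Not the paper's code: a minimal sketch of the quantity these tests build on, the normalized Shannon entropy of ordinal (Bandt-Pompe) patterns of a time series. The function name and the embedding order d are illustrative assumptions.

```python
import itertools
import math

import numpy as np

def ordinal_pattern_entropy(x, d=3):
    """Normalized Shannon entropy of order-d ordinal patterns (Bandt-Pompe)."""
    x = np.asarray(x, dtype=float)
    counts = {p: 0 for p in itertools.permutations(range(d))}
    for i in range(len(x) - d + 1):
        # The ordinal pattern of a window is the argsort of its values.
        counts[tuple(int(j) for j in np.argsort(x[i:i + d]))] += 1
    total = sum(counts.values())
    probs = [c / total for c in counts.values() if c > 0]
    h = -sum(p * math.log(p) for p in probs)
    return h / math.log(math.factorial(d))  # normalize to [0, 1]

rng = np.random.default_rng(0)
print(ordinal_pattern_entropy(rng.normal(size=5000)))        # white noise: near 1
print(ordinal_pattern_entropy(np.sin(np.arange(5000) / 10))) # periodic: well below 1
```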
- Entropy Production and the Role of Correlations in Quantum Brownian Motion [77.34726150561087]
We perform a study on quantum entropy production, different kinds of correlations, and their interplay in the driven Caldeira-Leggett model of quantum Brownian motion.
arXiv Detail & Related papers (2021-08-05T13:11:05Z)
- Spectral clustering under degree heterogeneity: a case for the random walk Laplacian [83.79286663107845]
This paper shows that graph spectral embedding using the random walk Laplacian produces vector representations which are completely corrected for node degree.
In the special case of a degree-corrected block model, the embedding concentrates around K distinct points, one per community; a minimal embedding sketch follows below.
arXiv Detail & Related papers (2021-05-03T16:36:27Z)
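Not from the paper: a minimal sketch of the embedding the summary describes, using eigenvectors of the random walk Laplacian L_rw = I - D^{-1}A. Graph size, community structure, and the function name are illustrative assumptions.

```python
import numpy as np

def random_walk_embedding(A, k):
    """Embed nodes via the k smallest eigenpairs of L_rw = I - D^{-1} A.

    Works through the symmetric Laplacian (same spectrum; eigenvectors
    differ by a D^{-1/2} scaling), which keeps the eigensolver stable.
    """
    deg = np.maximum(A.sum(axis=1), 1e-12)       # guard against isolated nodes
    d_inv_sqrt = 1.0 / np.sqrt(deg)
    L_sym = np.eye(len(A)) - d_inv_sqrt[:, None] * A * d_inv_sqrt[None, :]
    _, vecs = np.linalg.eigh(L_sym)              # eigenvalues in ascending order
    return d_inv_sqrt[:, None] * vecs[:, :k]     # rows = node embeddings

# Toy two-community graph: embedding rows should cluster around two
# distinct points, one per community.
rng = np.random.default_rng(1)
z = np.repeat([0, 1], 50)
P = np.where(z[:, None] == z[None, :], 0.30, 0.05)
A = (rng.random((100, 100)) < P).astype(float)
A = np.triu(A, 1); A = A + A.T                  # symmetric, no self-loops
X = random_walk_embedding(A, k=2)
```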
- The ELBO of Variational Autoencoders Converges to a Sum of Three Entropies [16.119724102324934]
The central objective function of a variational autoencoder (VAE) is its variational lower bound (the ELBO).
Here we show that for standard (i.e., Gaussian) VAEs the ELBO converges to a value given by the sum of three entropies.
Our derived analytical results are exact and apply to small as well as intricate deep networks for encoder and decoder; a numerical sketch of the entropy sum follows below.
arXiv Detail & Related papers (2020-10-28T10:13:28Z)
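Not the authors' code: a toy numerical sketch of the claimed limit for a Gaussian VAE, where all three entropies have closed forms. The dimensions, the encoder variances, and the fixed decoder variance sigma2 are illustrative assumptions.

```python
import numpy as np

def gauss_entropy(var):
    """Entropy (in nats) of a diagonal Gaussian with variance vector `var`."""
    return 0.5 * np.sum(np.log(2.0 * np.pi * np.e * var))

# Toy Gaussian VAE: N data points, latent dim dz, observed dim dx.
N, dz, dx = 100, 2, 5
rng = np.random.default_rng(0)
enc_var = rng.uniform(0.1, 1.0, size=(N, dz))  # per-datapoint encoder variances
sigma2 = 0.25                                  # fixed decoder (observable) variance

# (A) average entropy of the variational distributions q_Phi(z | x^(n))
H_q = np.mean([gauss_entropy(v) for v in enc_var])
# (B) negative entropy of the prior p(z) = N(0, I)
H_prior = gauss_entropy(np.ones(dz))
# (C) expected negative entropy of p_Theta(x | z) = N(mu(z), sigma2 * I);
#     with fixed sigma2 this entropy does not depend on z, so the expectation is trivial
H_obs = gauss_entropy(np.full(dx, sigma2))

elbo_at_convergence = H_q - H_prior - H_obs    # the claimed limit of the ELBO
print(elbo_at_convergence)
```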
- GANs with Variational Entropy Regularizers: Applications in Mitigating the Mode-Collapse Issue [95.23775347605923]
Building on the success of deep learning, Generative Adversarial Networks (GANs) provide a modern approach to learning a probability distribution from observed samples.
GANs often suffer from the mode collapse issue where the generator fails to capture all existing modes of the input distribution.
We take an information-theoretic approach and maximize a variational lower bound on the entropy of the generated samples to increase their diversity.
arXiv Detail & Related papers (2020-09-24T19:34:37Z)
- Graph Gamma Process Generalized Linear Dynamical Systems [60.467040479276704]
We introduce graph gamma process (GGP) linear dynamical systems to model real multivariate time series.
For temporal pattern discovery, the latent representation under the model is used to decompose the time series into a parsimonious set of multivariate sub-sequences.
We use the generated random graph, whose number of nonzero-degree nodes is finite, to define both the sparsity pattern and dimension of the latent state transition matrix.
arXiv Detail & Related papers (2020-07-25T04:16:34Z)
- Eigenstate Entanglement Entropy in Random Quadratic Hamiltonians [0.0]
In integrable models, the volume-law coefficient depends on the subsystem fraction.
We show that the average entanglement entropy of eigenstates of the power-law random banded matrix model is close but not the same as the result for quadratic models.
arXiv Detail & Related papers (2020-06-19T18:01:15Z)
- Generalized Entropy Regularization or: There's Nothing Special about Label Smoothing [83.78668073898001]
We introduce a family of entropy regularizers, which includes label smoothing as a special case.
We find that variance in model performance can be explained largely by the resulting entropy of the model.
We advise the use of other entropy regularization methods in its place; a minimal sketch of two members of this regularizer family is given below.
arXiv Detail & Related papers (2020-05-02T12:46:28Z)
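Illustrative only, not the paper's formulation: two well-known members of such a regularizer family, label smoothing (cross-entropy against labels mixed with the uniform distribution) and the confidence penalty (cross-entropy minus a multiple of the model's predictive entropy). Function names and hyperparameters are assumptions.

```python
import numpy as np

def softmax(z):
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def smoothed_ce(logits, targets, eps=0.1):
    """Cross-entropy against (1 - eps) * one-hot + eps * uniform labels."""
    p = softmax(logits)
    nll = -np.log(p[np.arange(len(targets)), targets])
    uniform_term = -np.log(p).mean(axis=-1)     # E_uniform[-log p]
    return ((1 - eps) * nll + eps * uniform_term).mean()

def confidence_penalty_ce(logits, targets, beta=0.1):
    """Plain cross-entropy minus beta times the model's predictive entropy."""
    p = softmax(logits)
    nll = -np.log(p[np.arange(len(targets)), targets])
    entropy = -(p * np.log(p)).sum(axis=-1)
    return (nll - beta * entropy).mean()

rng = np.random.default_rng(0)
logits = rng.normal(size=(8, 10))
targets = rng.integers(0, 10, size=8)
print(smoothed_ce(logits, targets), confidence_penalty_ce(logits, targets))
```

Both losses push the model toward higher-entropy predictions; label smoothing is recovered as the special case where the penalty is the KL divergence from the uniform distribution.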
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information shown and is not responsible for any consequences arising from its use.