Learning variational autoencoders via MCMC speed measures
- URL: http://arxiv.org/abs/2308.13731v1
- Date: Sat, 26 Aug 2023 02:15:51 GMT
- Title: Learning variational autoencoders via MCMC speed measures
- Authors: Marcel Hirt, Vasileios Kreouzis, Petros Dellaportas
- Abstract summary: Variational autoencoders (VAEs) are popular likelihood-based generative models.
This work suggests an entropy-based adaptation for a short-run Metropolis-adjusted Langevin (MALA) or Hamiltonian Monte Carlo (HMC) chain.
Experiments show that this approach yields higher held-out log-likelihoods as well as improved generative metrics.
- Score: 7.688686113950604
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Variational autoencoders (VAEs) are popular likelihood-based generative
models which can be efficiently trained by maximizing an Evidence Lower Bound
(ELBO). There has been much progress in improving the expressiveness of the
variational distribution to obtain tighter variational bounds and increased
generative performance. Whilst previous work has leveraged Markov chain Monte
Carlo (MCMC) methods for the construction of variational densities,
gradient-based methods for adapting the proposal distributions for deep latent
variable models have received less attention. This work suggests an
entropy-based adaptation for a short-run Metropolis-adjusted Langevin (MALA) or
Hamiltonian Monte Carlo (HMC) chain while optimising a tighter variational
bound to the log-evidence. Experiments show that this approach yields higher
held-out log-likelihoods as well as improved generative metrics. Our implicit
variational density can adapt to complicated posterior geometries of latent
hierarchical representations arising in hierarchical VAEs.
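For readers unfamiliar with the sampler named in the abstract, the following is a minimal sketch of a single textbook MALA transition and a short-run chain. It is not the authors' method: the step size is fixed (no entropy-based adaptation), the target is a toy standard Gaussian rather than a VAE posterior, and the names `log_target`, `grad_log_target`, `mala_step`, and `run_chain` are hypothetical, chosen only for illustration.

```python
import numpy as np


def log_target(z):
    """Log-density (up to a constant) of a toy standard Gaussian target (assumption)."""
    return -0.5 * np.sum(z ** 2)


def grad_log_target(z):
    """Gradient of log_target with respect to z."""
    return -z


def mala_step(z, step_size, rng):
    """One Metropolis-adjusted Langevin transition with a fixed step size."""

    def proposal_mean(x):
        # Langevin drift: move half a step along the gradient of the log-target.
        return x + 0.5 * step_size * grad_log_target(x)

    def log_q(x_to, x_from):
        # Log-density (up to a constant) of the Gaussian proposal N(mean, step_size * I).
        diff = x_to - proposal_mean(x_from)
        return -np.sum(diff ** 2) / (2.0 * step_size)

    proposal = proposal_mean(z) + np.sqrt(step_size) * rng.standard_normal(z.shape)

    # Metropolis-Hastings log acceptance ratio for the asymmetric proposal.
    log_alpha = (log_target(proposal) + log_q(z, proposal)
                 - log_target(z) - log_q(proposal, z))
    if np.log(rng.uniform()) < log_alpha:
        return proposal, True
    return z, False


def run_chain(z0, n_steps, step_size, seed=0):
    """Run a short MALA chain and return the samples and the acceptance rate."""
    rng = np.random.default_rng(seed)
    z = np.asarray(z0, dtype=float)
    samples, accepted = [], 0
    for _ in range(n_steps):
        z, was_accepted = mala_step(z, step_size, rng)
        samples.append(z.copy())
        accepted += was_accepted
    return np.array(samples), accepted / n_steps
```

Calling `run_chain(np.zeros(2), n_steps=2000, step_size=0.5)` returns the chain and its empirical acceptance rate. An adaptive scheme such as the one the abstract describes would tune the proposal parameters by gradient steps rather than keep `step_size` fixed.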
Related papers
- MARS: Unleashing the Power of Variance Reduction for Training Large Models [56.47014540413659]
Adaptive gradient algorithms like Adam, AdamW, and their variants have been central to this type of training.
We propose a framework that reconciles preconditioned gradient optimization methods with variance reduction via a scaled momentum technique.
arXiv Detail & Related papers (2024-11-15T18:57:39Z)
- Variational Learning of Gaussian Process Latent Variable Models through Stochastic Gradient Annealed Importance Sampling [22.256068524699472]
In this work, we propose an Annealed Importance Sampling (AIS) approach to address these issues.
We combine the strengths of Sequential Monte Carlo samplers and VI to explore a wider range of posterior distributions and gradually approach the target distribution.
Experimental results on both toy and image datasets demonstrate that our method outperforms state-of-the-art methods in terms of tighter variational bounds, higher log-likelihoods, and more robust convergence.
arXiv Detail & Related papers (2024-08-13T08:09:05Z)
- Differentiating Metropolis-Hastings to Optimize Intractable Densities [51.16801956665228]
We develop an algorithm for automatic differentiation of Metropolis-Hastings samplers.
We apply gradient-based optimization to objectives expressed as expectations over intractable target densities.
arXiv Detail & Related papers (2023-06-13T17:56:02Z)
- Variational Laplace Autoencoders [53.08170674326728]
Variational autoencoders employ an amortized inference model to approximate the posterior of latent variables.
We present a novel approach that addresses the limited posterior expressiveness of the fully-factorized Gaussian assumption.
We also present a general framework named Variational Laplace Autoencoders (VLAEs) for training deep generative models.
arXiv Detail & Related papers (2022-11-30T18:59:27Z)
- Improving Covariance Conditioning of the SVD Meta-layer by Orthogonality [65.67315418971688]
Nearest Orthogonal Gradient (NOG) and Optimal Learning Rate (OLR) are proposed.
Experiments on visual recognition demonstrate that our methods can simultaneously improve the covariance conditioning and generalization.
arXiv Detail & Related papers (2022-07-05T15:39:29Z)
- Entropy-based adaptive Hamiltonian Monte Carlo [19.358300726820943]
Hamiltonian Monte Carlo (HMC) is a popular Markov Chain Monte Carlo (MCMC) algorithm to sample from an unnormalized probability distribution.
A leapfrog integrator is commonly used to implement HMC in practice, but its performance can be sensitive to the choice of mass matrix used.
We develop a gradient-based algorithm that allows for the adaptation of the mass matrix by encouraging the leapfrog integrator to have high acceptance rates.
arXiv Detail & Related papers (2021-10-27T17:52:55Z)
- Structured Stochastic Gradient MCMC [20.68905354115655]
We propose a new non-parametric variational approximation that makes no assumptions about the approximate posterior's functional form.
We obtain better predictive likelihoods and larger effective sample sizes than full SGMCMC.
arXiv Detail & Related papers (2021-07-19T17:18:10Z)
- Scalable Variational Gaussian Processes via Harmonic Kernel Decomposition [54.07797071198249]
We introduce a new scalable variational Gaussian process approximation which provides a high fidelity approximation while retaining general applicability.
We demonstrate that, on a range of regression and classification problems, our approach can exploit input space symmetries such as translations and reflections.
Notably, our approach achieves state-of-the-art results on CIFAR-10 among pure GP models.
arXiv Detail & Related papers (2021-06-10T18:17:57Z)
- MCMC-Interactive Variational Inference [56.58416764959414]
We propose MCMC-interactive variational inference (MIVI) to estimate the posterior in a time-constrained manner.
MIVI takes advantage of the complementary properties of variational inference and MCMC to encourage mutual improvement.
Experiments show that MIVI not only accurately approximates the posteriors but also facilitates designs of gradient MCMC and Gibbs sampling transitions.
arXiv Detail & Related papers (2020-10-02T17:43:20Z)
- Quasi-symplectic Langevin Variational Autoencoder [7.443843354775884]
The variational autoencoder (VAE) is a very popular and well-investigated generative model in neural learning research.
It must deal with the difficulty of building low-variance evidence lower bounds (ELBOs).
arXiv Detail & Related papers (2020-09-02T12:13:27Z)
- Sparse Gaussian Processes Revisited: Bayesian Approaches to Inducing-Variable Approximations [27.43948386608]
Variational inference techniques based on inducing variables provide an elegant framework for scalable estimation in Gaussian process (GP) models.
In this work we challenge the common wisdom that optimizing the inducing inputs in a variational framework yields optimal performance.
arXiv Detail & Related papers (2020-03-06T08:53:18Z)
This list is automatically generated from the titles and abstracts of the papers on this site.