Scaling Hamiltonian Monte Carlo Inference for Bayesian Neural Networks
with Symmetric Splitting
- URL: http://arxiv.org/abs/2010.06772v1
- Date: Wed, 14 Oct 2020 01:58:34 GMT
- Title: Scaling Hamiltonian Monte Carlo Inference for Bayesian Neural Networks
with Symmetric Splitting
- Authors: Adam D. Cobb, Brian Jalaian
- Abstract summary: Hamiltonian Monte Carlo (HMC) is a Markov chain Monte Carlo approach that exhibits favourable exploration properties in high-dimensional models such as neural networks.
We introduce a new symmetric integration scheme for split HMC that does not rely on stochastic gradients.
Our approach demonstrates HMC as a feasible option when considering inference schemes for large-scale machine learning problems.
- Score: 6.684193501969829
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Hamiltonian Monte Carlo (HMC) is a Markov chain Monte Carlo (MCMC) approach
that exhibits favourable exploration properties in high-dimensional models such
as neural networks. Unfortunately, HMC has limited use in large-data regimes
and little work has explored suitable approaches that aim to preserve the
entire Hamiltonian. In our work, we introduce a new symmetric integration
scheme for split HMC that does not rely on stochastic gradients. We show that
our new formulation is more efficient than previous approaches and is easy to
implement with a single GPU. As a result, we are able to perform full HMC over
common deep learning architectures using entire data sets. In addition, when we
compare with stochastic gradient MCMC, we show that our method achieves better
performance in both accuracy and uncertainty quantification. Our approach
demonstrates HMC as a feasible option when considering inference schemes for
large-scale machine learning problems.
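The idea the abstract describes (full-data HMC in which the potential is split over data subsets and integrated with a symmetric scheme) can be illustrated with a small sketch. This is not the authors' implementation. It is a generic palindromic composition of per-chunk leapfrog steps on an assumed toy model (1-D Bayesian linear regression with unit noise and a standard normal prior), and every name in it (`make_chunks`, `symmetric_split_step`, `hmc`) is hypothetical.

```python
import math
import random


def make_chunks(xs, ys, n_chunks):
    """Split the potential U(theta) into n_chunks parts U_m.

    Assumed toy model: y_i ~ N(theta * x_i, 1), theta ~ N(0, 1).
    U_m(theta) = 0.5 * sum_{i in chunk m} (theta*x_i - y_i)^2
                 + 0.5 * theta^2 / n_chunks,
    so the U_m sum to the negative log posterior (up to a constant).
    Returns a list of (potential, gradient) function pairs.
    """
    parts = []
    for m in range(n_chunks):
        cx, cy = xs[m::n_chunks], ys[m::n_chunks]

        def U(theta, cx=cx, cy=cy):
            sq = sum((theta * x - y) ** 2 for x, y in zip(cx, cy))
            return 0.5 * sq + 0.5 * theta ** 2 / n_chunks

        def grad(theta, cx=cx, cy=cy):
            g = sum(x * (theta * x - y) for x, y in zip(cx, cy))
            return g + theta / n_chunks

        parts.append((U, grad))
    return parts


def symmetric_split_step(theta, p, parts, eps):
    """One integrator step of size eps for H = sum_m [U_m(theta) + K(p)/M].

    Chunks are visited in the palindromic order 1..M, M..1, and each visit
    is a leapfrog step of size eps/2 for H_m = U_m + K/M with K = p^2/2.
    Only one chunk's gradient is needed at a time.
    """
    M = len(parts)
    h = eps / 2.0
    order = list(range(M)) + list(range(M - 1, -1, -1))
    for m in order:
        grad = parts[m][1]
        p = p - 0.5 * h * grad(theta)   # half kick with chunk m
        theta = theta + h * p / M       # drift under K/M
        p = p - 0.5 * h * grad(theta)   # half kick with chunk m
    return theta, p


def hamiltonian(theta, p, parts):
    """Full Hamiltonian: all chunk potentials plus kinetic energy."""
    return sum(U(theta) for U, _ in parts) + 0.5 * p * p


def hmc(theta0, parts, eps, n_steps, n_samples, rng):
    """HMC with a Metropolis correction on the full Hamiltonian."""
    theta, samples, accepted = float(theta0), [], 0
    for _ in range(n_samples):
        p0 = rng.gauss(0.0, 1.0)
        h0 = hamiltonian(theta, p0, parts)
        theta_new, p_new = theta, p0
        for _step in range(n_steps):
            theta_new, p_new = symmetric_split_step(theta_new, p_new, parts, eps)
        h1 = hamiltonian(theta_new, p_new, parts)
        # 1 - random() lies in (0, 1], which keeps the logarithm finite.
        if math.log(1.0 - rng.random()) < h0 - h1:
            theta, accepted = theta_new, accepted + 1
        samples.append(theta)
    return samples, accepted / n_samples
```

Because the composition is palindromic and each sub-step is a leapfrog step, the map is time-reversible and volume-preserving, which is what makes the accept/reject test on the full Hamiltonian valid. In a neural-network setting, `grad` would be replaced by a backward pass over one data subset, so the full data set never has to be held in accelerator memory at once.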
Related papers
- Online Variational Sequential Monte Carlo [49.97673761305336]
We build upon the variational sequential Monte Carlo (VSMC) method, which provides computationally efficient and accurate model parameter estimation and Bayesian latent-state inference.
Online VSMC is capable of performing efficiently, entirely on-the-fly, both parameter estimation and particle proposal adaptation.
arXiv Detail & Related papers (2023-12-19T21:45:38Z)
- Entropy-based adaptive Hamiltonian Monte Carlo [19.358300726820943]
Hamiltonian Monte Carlo (HMC) is a popular Markov Chain Monte Carlo (MCMC) algorithm to sample from an unnormalized probability distribution.
A leapfrog integrator is commonly used to implement HMC in practice, but its performance can be sensitive to the choice of mass matrix used.
We develop a gradient-based algorithm that allows for the adaptation of the mass matrix by encouraging the leapfrog integrator to have high acceptance rates.
arXiv Detail & Related papers (2021-10-27T17:52:55Z)
- DG-LMC: A Turn-key and Scalable Synchronous Distributed MCMC Algorithm [21.128416842467132]
We derive a user-friendly centralised distributed MCMC algorithm with provable scaling in high-dimensional settings.
We illustrate the relevance of the proposed methodology on both synthetic and real data experiments.
arXiv Detail & Related papers (2021-06-11T10:37:14Z)
- What Are Bayesian Neural Network Posteriors Really Like? [63.950151520585024]
We show that Bayesian neural networks sampled with Hamiltonian Monte Carlo can achieve significant performance gains over standard training and deep ensembles.
We also show that deep ensembles provide predictive distributions that are similarly close to HMC as standard SGLD, and closer than standard variational inference.
arXiv Detail & Related papers (2021-04-29T15:38:46Z)
- Sampling in Combinatorial Spaces with SurVAE Flow Augmented MCMC [83.48593305367523]
Hybrid Monte Carlo is a powerful Markov Chain Monte Carlo method for sampling from complex continuous distributions.
We introduce a new approach based on augmenting Monte Carlo methods with SurVAE Flows to sample from discrete distributions.
We demonstrate the efficacy of our algorithm on a range of examples from statistics, computational physics and machine learning, and observe improvements compared to alternative algorithms.
arXiv Detail & Related papers (2021-02-04T02:21:08Z)
- Scaling Hidden Markov Language Models [118.55908381553056]
This work revisits the challenge of scaling HMMs to language modeling datasets.
We propose methods for scaling HMMs to massive state spaces while maintaining efficient exact inference, a compact parameterization, and effective regularization.
arXiv Detail & Related papers (2020-11-09T18:51:55Z)
- An adaptive Hessian approximated stochastic gradient MCMC method [12.93317525451798]
We present an adaptive Hessian approximated stochastic gradient MCMC method to incorporate local geometric information while sampling from the posterior.
We adopt a magnitude-based weight pruning method to enforce the sparsity of the network.
arXiv Detail & Related papers (2020-10-03T16:22:15Z)
- Non-convex Learning via Replica Exchange Stochastic Gradient MCMC [25.47669573608621]
We propose an adaptive replica exchange SGMCMC (reSGMCMC) to automatically correct the bias and study the corresponding properties.
Empirically, we test the algorithm through extensive experiments on various setups and obtain the results.
arXiv Detail & Related papers (2020-08-12T15:02:59Z)
- MMCGAN: Generative Adversarial Network with Explicit Manifold Prior [78.58159882218378]
We propose to employ explicit manifold learning as a prior to alleviate mode collapse and stabilize GAN training.
Our experiments on both the toy data and real datasets show the effectiveness of MMCGAN in alleviating mode collapse, stabilizing training, and improving the quality of generated samples.
arXiv Detail & Related papers (2020-06-18T07:38:54Z)
- Improving Sampling Accuracy of Stochastic Gradient MCMC Methods via Non-uniform Subsampling of Gradients [54.90670513852325]
We propose a non-uniform subsampling scheme to improve the sampling accuracy.
The proposed exponentially weighted stochastic gradient (EWSG) is designed so that a non-uniform stochastic gradient MCMC method mimics the statistical behavior of a batch-gradient MCMC method.
In our practical implementation of EWSG, the non-uniform subsampling is performed efficiently via a Metropolis-Hastings chain on the data index.
arXiv Detail & Related papers (2020-02-20T18:56:18Z)
This list is automatically generated from the titles and abstracts of the papers on this site.