Related papers: Scaling Hamiltonian Monte Carlo Inference for Bayesian Neural Networks with Symmetric Splitting

Scaling Hamiltonian Monte Carlo Inference for Bayesian Neural Networks with Symmetric Splitting

URL: http://arxiv.org/abs/2010.06772v1
Date: Wed, 14 Oct 2020 01:58:34 GMT
Title: Scaling Hamiltonian Monte Carlo Inference for Bayesian Neural Networks with Symmetric Splitting
Authors: Adam D. Cobb, Brian Jalaian
Abstract summary: Hamiltonian Monte Carlo (HMC) is a Markov chain Monte Carlo approach that exhibits favourable exploration properties in high-dimensional models such as neural networks. We introduce a new integration scheme for split HMC that does not rely on symmetric gradients. Our approach demonstrates HMC as a feasible option when considering inference schemes for large-scale machine learning problems.
Score: 6.684193501969829
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Hamiltonian Monte Carlo (HMC) is a Markov chain Monte Carlo (MCMC) approach that exhibits favourable exploration properties in high-dimensional models such as neural networks. Unfortunately, HMC has limited use in large-data regimes and little work has explored suitable approaches that aim to preserve the entire Hamiltonian. In our work, we introduce a new symmetric integration scheme for split HMC that does not rely on stochastic gradients. We show that our new formulation is more efficient than previous approaches and is easy to implement with a single GPU. As a result, we are able to perform full HMC over common deep learning architectures using entire data sets. In addition, when we compare with stochastic gradient MCMC, we show that our method achieves better performance in both accuracy and uncertainty quantification. Our approach demonstrates HMC as a feasible option when considering inference schemes for large-scale machine learning problems.

Related papers

Humble your Overconfident Networks: Unlearning Overfitting via Sequential Monte Carlo Tempered Deep Ensembles [3.2254941904559917]
We introduce a scalable variant by incorporating Gradient Hamiltonian Monte Carlo proposals into Sequential Monte Carlo (SMC) methods.<n>Our resulting SMCSGHMC algorithm outperforms gradient descent ensembles across image classification, out-of-distribution detection, and transfer learning tasks.
arXiv Detail & Related papers (2025-05-16T20:10:04Z)
Parameter Expanded Stochastic Gradient Markov Chain Monte Carlo [32.46884330460211]
We propose a simple yet effective approach to enhance sample diversity in Gradient Markov Chain Monte Carlo. This approach produces a more diverse set of samples, allowing faster mixing within the same computational budget. Our experiments on image classification tasks, including OOD robustness, diversity, loss surface analyses, and a comparative study with Hamiltonian Monte Carlo, demonstrate the superiority of the proposed approach.
arXiv Detail & Related papers (2025-03-02T02:42:50Z)
Online Variational Sequential Monte Carlo [49.97673761305336]
We build upon the variational sequential Monte Carlo (VSMC) method, which provides computationally efficient and accurate model parameter estimation and Bayesian latent-state inference. Online VSMC is capable of performing efficiently, entirely on-the-fly, both parameter estimation and particle proposal adaptation.
arXiv Detail & Related papers (2023-12-19T21:45:38Z)
Entropy-based adaptive Hamiltonian Monte Carlo [19.358300726820943]
Hamiltonian Monte Carlo (HMC) is a popular Markov Chain Monte Carlo (MCMC) algorithm to sample from an unnormalized probability distribution. A leapfrog integrator is commonly used to implement HMC in practice, but its performance can be sensitive to the choice of mass matrix used. We develop a gradient-based algorithm that allows for the adaptation of the mass matrix by encouraging the leapfrog integrator to have high acceptance rates.
arXiv Detail & Related papers (2021-10-27T17:52:55Z)
DG-LMC: A Turn-key and Scalable Synchronous Distributed MCMC Algorithm [21.128416842467132]
We derive a user-friendly centralised distributed MCMC algorithm with provable scaling in high-dimensional settings. We illustrate the relevance of the proposed methodology on both synthetic and real data experiments.
arXiv Detail & Related papers (2021-06-11T10:37:14Z)
What Are Bayesian Neural Network Posteriors Really Like? [63.950151520585024]
We show that Hamiltonian Monte Carlo can achieve significant performance gains over standard and deep ensembles. We also show that deep distributions are similarly close to HMC as standard SGLD, and closer than standard variational inference.
arXiv Detail & Related papers (2021-04-29T15:38:46Z)
Sampling in Combinatorial Spaces with SurVAE Flow Augmented MCMC [83.48593305367523]
Hybrid Monte Carlo is a powerful Markov Chain Monte Carlo method for sampling from complex continuous distributions. We introduce a new approach based on augmenting Monte Carlo methods with SurVAE Flows to sample from discrete distributions. We demonstrate the efficacy of our algorithm on a range of examples from statistics, computational physics and machine learning, and observe improvements compared to alternative algorithms.
arXiv Detail & Related papers (2021-02-04T02:21:08Z)
Scaling Hidden Markov Language Models [118.55908381553056]
This work revisits the challenge of scaling HMMs to language modeling datasets. We propose methods for scaling HMMs to massive state spaces while maintaining efficient exact inference, a compact parameterization, and effective regularization.
arXiv Detail & Related papers (2020-11-09T18:51:55Z)
An adaptive Hessian approximated stochastic gradient MCMC method [12.93317525451798]
We present an adaptive Hessian approximated gradient MCMC method to incorporate local geometric information while sampling from the posterior. We adopt a magnitude-based weight pruning method to enforce the sparsity of the network.
arXiv Detail & Related papers (2020-10-03T16:22:15Z)
Non-convex Learning via Replica Exchange Stochastic Gradient MCMC [25.47669573608621]
We propose an adaptive replica exchange SGMCMC (reSGMCMC) to automatically correct the bias and study the corresponding properties. Empirically, we test the algorithm through extensive experiments on various setups and obtain the results.
arXiv Detail & Related papers (2020-08-12T15:02:59Z)
MMCGAN: Generative Adversarial Network with Explicit Manifold Prior [78.58159882218378]
We propose to employ explicit manifold learning as prior to alleviate mode collapse and stabilize training of GAN. Our experiments on both the toy data and real datasets show the effectiveness of MMCGAN in alleviating mode collapse, stabilizing training, and improving the quality of generated samples.
arXiv Detail & Related papers (2020-06-18T07:38:54Z)
Improving Sampling Accuracy of Stochastic Gradient MCMC Methods via Non-uniform Subsampling of Gradients [54.90670513852325]
We propose a non-uniform subsampling scheme to improve the sampling accuracy. EWSG is designed so that a non-uniform gradient-MCMC method mimics the statistical behavior of a batch-gradient-MCMC method. In our practical implementation of EWSG, the non-uniform subsampling is performed efficiently via a Metropolis-Hastings chain on the data index.
arXiv Detail & Related papers (2020-02-20T18:56:18Z)

This list is automatically generated from the titles and abstracts of the papers in this site.