AMAGOLD: Amortized Metropolis Adjustment for Efficient Stochastic
Gradient MCMC
- URL: http://arxiv.org/abs/2003.00193v1
- Date: Sat, 29 Feb 2020 06:57:43 GMT
- Title: AMAGOLD: Amortized Metropolis Adjustment for Efficient Stochastic
Gradient MCMC
- Authors: Ruqi Zhang, A. Feder Cooper, Christopher De Sa
- Abstract summary: Stochastic gradient Hamiltonian Monte Carlo (SGHMC) is an efficient method for sampling from continuous distributions.
We propose a novel second-order SG-MCMC algorithm---AMAGOLD---that infrequently uses Metropolis-Hastings (M-H) corrections to remove bias.
We prove AMAGOLD converges to the target distribution with a fixed, rather than a diminishing, step size, and that its convergence rate is at most a constant factor slower than a full-batch baseline.
- Score: 37.768023232677244
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Stochastic gradient Hamiltonian Monte Carlo (SGHMC) is an efficient method
for sampling from continuous distributions. It is a faster alternative to HMC:
instead of using the whole dataset at each iteration, SGHMC uses only a
subsample. This improves performance, but introduces bias that can cause SGHMC
to converge to the wrong distribution. One can prevent this using a step size
that decays to zero, but such a step size schedule can drastically slow down
convergence. To address this tension, we propose a novel second-order SG-MCMC
algorithm---AMAGOLD---that infrequently uses Metropolis-Hastings (M-H)
corrections to remove bias. The infrequency of corrections amortizes their
cost. We prove AMAGOLD converges to the target distribution with a fixed,
rather than a diminishing, step size, and that its convergence rate is at most
a constant factor slower than a full-batch baseline. We empirically demonstrate
AMAGOLD's effectiveness on synthetic distributions, Bayesian logistic
regression, and Bayesian neural networks.
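As a rough illustration of the idea described in the abstract (many cheap minibatch Hamiltonian-dynamics steps, with an occasional full-batch Metropolis-Hastings correction whose cost is amortized over the segment), here is a minimal Python sketch. It is not the exact AMAGOLD update rule from the paper: the accept test below ignores the kinetic-energy and reversibility bookkeeping that AMAGOLD handles carefully, and names such as `log_post`, `grad_log_post_minibatch`, `steps_per_segment`, and `friction` are assumptions for illustration only.

```python
import numpy as np

def amortized_mh_sghmc(theta0, log_post, grad_log_post_minibatch, data,
                       step_size=1e-2, friction=0.1,
                       n_segments=100, steps_per_segment=50,
                       batch_size=32, rng=None):
    """Minibatch Hamiltonian dynamics with one full-batch M-H test per segment.

    Illustrative sketch only; the exact AMAGOLD dynamics and correction are
    defined in the paper above.
    """
    rng = np.random.default_rng() if rng is None else rng
    theta = np.asarray(theta0, dtype=float)
    samples = []
    for _ in range(n_segments):
        theta_old = theta.copy()
        r = rng.standard_normal(theta.shape)           # resample momentum
        for _ in range(steps_per_segment):             # cheap stochastic-gradient steps
            idx = rng.choice(len(data), size=batch_size, replace=False)
            g = grad_log_post_minibatch(theta, data[idx], len(data))
            noise = rng.standard_normal(theta.shape)
            r = (1.0 - friction) * r + step_size * g \
                + np.sqrt(2.0 * friction * step_size) * noise
            theta = theta + step_size * r
        # One full-batch Metropolis-Hastings correction per segment amortizes
        # the cost of touching the whole dataset over steps_per_segment steps.
        # (A faithful correction would also account for the momenta; omitted here.)
        log_accept = log_post(theta, data) - log_post(theta_old, data)
        if np.log(rng.uniform()) >= log_accept:
            theta = theta_old                          # reject: roll the segment back
        samples.append(theta.copy())
    return np.array(samples)
```

In a sketch like this, a single full-dataset evaluation of `log_post` per segment replaces the per-step correction a naive M-H-adjusted sampler would need, which is the sense in which the correction cost is "amortized" while keeping a fixed step size.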
Related papers
- Persistent Sampling: Unleashing the Potential of Sequential Monte Carlo [0.0]
We introduce persistent sampling (PS), an extension of Sequential Monte Carlo (SMC) methods.
PS generates a growing, weighted ensemble of particles distributed across iterations.
PS consistently outperforms standard methods, achieving lower squared bias in posterior moment estimation.
arXiv Detail & Related papers (2024-07-30T10:34:40Z) - SpreadNUTS -- Moderate Dynamic Extension of Paths for No-U-Turn Sampling
& Partitioning Visited Regions [0.0]
This paper introduces modifications to a specific Hamiltonian Monte Carlo (HMC) algorithm known as the no-U-turn sampler (NUTS).
The proposed sampler, SpreadNUTS, aims to explore the sample space faster than NUTS, yielding faster convergence to the true distribution.
arXiv Detail & Related papers (2023-07-09T05:00:25Z) - Adaptive Annealed Importance Sampling with Constant Rate Progress [68.8204255655161]
Annealed Importance Sampling (AIS) synthesizes weighted samples from an intractable distribution.
We propose the Constant Rate AIS algorithm and its efficient implementation for $\alpha$-divergences.
arXiv Detail & Related papers (2023-06-27T08:15:28Z) - Differentiable Annealed Importance Sampling and the Perils of Gradient
Noise [68.44523807580438]
Annealed importance sampling (AIS) and related algorithms are highly effective tools for marginal likelihood estimation.
Differentiability is a desirable property as it would admit the possibility of optimizing marginal likelihood as an objective.
We propose a differentiable algorithm by abandoning Metropolis-Hastings steps, which further unlocks mini-batch computation.
arXiv Detail & Related papers (2021-07-21T17:10:14Z) - What Are Bayesian Neural Network Posteriors Really Like? [63.950151520585024]
We show that Hamiltonian Monte Carlo can achieve significant performance gains over standard training and deep ensembles.
We also show that deep ensemble predictive distributions are similarly close to HMC as standard SGLD, and closer than standard variational inference.
arXiv Detail & Related papers (2021-04-29T15:38:46Z) - An adaptive Hessian approximated stochastic gradient MCMC method [12.93317525451798]
We present an adaptive Hessian approximated stochastic gradient MCMC method to incorporate local geometric information while sampling from the posterior.
We adopt a magnitude-based weight pruning method to enforce the sparsity of the network.
arXiv Detail & Related papers (2020-10-03T16:22:15Z) - Langevin Monte Carlo: random coordinate descent and variance reduction [7.464874233755718]
Langevin Monte Carlo (LMC) is a popular Bayesian sampling method.
We investigate how to enhance computational efficiency through the application of RCD (random coordinate descent) on LMC.
arXiv Detail & Related papers (2020-07-26T18:14:36Z) - Balancing Rates and Variance via Adaptive Batch-Size for Stochastic
Optimization Problems [120.21685755278509]
In this work, we seek to balance the fact that an attenuating step size is required for exact convergence with the fact that a constant step size learns faster in finite time, up to an error.
Rather than fixing the minibatch size and the step size at the outset, we propose to allow these parameters to evolve adaptively.
arXiv Detail & Related papers (2020-07-02T16:02:02Z) - Variance reduction for Random Coordinate Descent-Langevin Monte Carlo [7.464874233755718]
Langevin Monte Carlo (LMC) provides fast convergence but requires computation of gradients.
In practice one uses finite-differencing approximations as surrogates, and the method is expensive in high-dimensions.
We introduce a new variance reduction approach, termed Random Coordinates Averaging Descent (RCAD), and incorporate it with both overdamped and underdamped LMC.
arXiv Detail & Related papers (2020-06-10T21:08:38Z) - Improving Sampling Accuracy of Stochastic Gradient MCMC Methods via
Non-uniform Subsampling of Gradients [54.90670513852325]
We propose a non-uniform subsampling scheme, Exponentially Weighted Stochastic Gradients (EWSG), to improve sampling accuracy.
EWSG is designed so that a stochastic-gradient MCMC method with non-uniform subsampling mimics the statistical behavior of a batch-gradient MCMC method.
In our practical implementation of EWSG, the non-uniform subsampling is performed efficiently via a Metropolis-Hastings chain on the data index (a rough sketch of such an index chain appears after this list).
arXiv Detail & Related papers (2020-02-20T18:56:18Z)
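The last entry above mentions that EWSG implements its non-uniform subsampling via a Metropolis-Hastings chain on the data index. The sketch below shows only the generic mechanism: an M-H chain over indices whose stationary distribution is proportional to a per-example weight. The weight function and helper names are assumptions for illustration, not the exact EWSG weighting defined in that paper.

```python
import numpy as np

def mh_index_chain(weight_fn, n_data, start_idx, n_steps, rng=None):
    """Draw a data index approximately in proportion to weight_fn(i)."""
    rng = np.random.default_rng() if rng is None else rng
    i = start_idx
    w_i = weight_fn(i)
    for _ in range(n_steps):
        j = int(rng.integers(n_data))        # uniform proposal over data indices
        w_j = weight_fn(j)
        # Standard M-H accept/reject on the index, targeting p(i) proportional to weight_fn(i).
        if rng.uniform() < min(1.0, w_j / max(w_i, 1e-12)):
            i, w_i = j, w_j
    return i
```

In a sampler loop one would keep the chain state across iterations and call, for example, mh_index_chain(lambda i: np.linalg.norm(grad_i(theta, i)), n_data, last_idx, n_steps=2), where grad_i is a hypothetical per-example gradient function; this keeps the per-iteration cost small while biasing index selection toward higher-weight examples.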