Statistical guarantees for stochastic Metropolis-Hastings
- URL: http://arxiv.org/abs/2310.09335v1
- Date: Fri, 13 Oct 2023 18:00:26 GMT
- Title: Statistical guarantees for stochastic Metropolis-Hastings
- Authors: Sebastian Bieringer, Gregor Kasieczka, Maximilian F. Steffen and
Mathias Trabs
- Abstract summary: By calculating acceptance probabilities on batches, a stochastic Metropolis-Hastings step saves computational costs, but reduces the effective sample size.
We show that this obstacle can be avoided by a simple correction term.
We show that credible sets and contraction rates of the stochastic Metropolis-Hastings algorithm indeed behave similarly to those obtained from the classical Metropolis-adjusted Langevin algorithm.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: A Metropolis-Hastings step is widely used for gradient-based Markov chain
Monte Carlo methods in uncertainty quantification. By calculating acceptance
probabilities on batches, a stochastic Metropolis-Hastings step saves
computational costs, but reduces the effective sample size. We show that this
obstacle can be avoided by a simple correction term. We study statistical
properties of the resulting stationary distribution of the chain if the
corrected stochastic Metropolis-Hastings approach is applied to sample from a
Gibbs posterior distribution in a nonparametric regression setting. Focusing on
deep neural network regression, we prove a PAC-Bayes oracle inequality which
yields optimal contraction rates and we analyze the diameter and show high
coverage probability of the resulting credible sets. With a numerical example
in a high-dimensional parameter space, we illustrate that credible sets and
contraction rates of the stochastic Metropolis-Hastings algorithm indeed behave
similarly to those obtained from the classical Metropolis-adjusted Langevin
algorithm.
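To make the idea in the abstract concrete, the sketch below shows a Metropolis-Hastings step whose acceptance probability is computed on a random mini-batch. The regression model, the Gaussian random-walk proposal, and the batch rescaling are illustrative assumptions, and the paper's correction term for the resulting bias is deliberately not reproduced here.

```python
import numpy as np

rng = np.random.default_rng(0)

def log_posterior_batch(theta, x_batch, y_batch, n_total):
    """Illustrative Gibbs-posterior log-density on a mini-batch:
    a negative squared loss rescaled to the full data size
    (hypothetical linear model, not the paper's network)."""
    residuals = y_batch - x_batch @ theta
    # rescale the batch loss so it is an unbiased estimate of the full loss
    return -0.5 * n_total / len(y_batch) * np.sum(residuals ** 2)

def stochastic_mh_step(theta, proposal_std, x, y, batch_size):
    """One Metropolis-Hastings step with the acceptance probability
    evaluated on a random mini-batch (the paper's correction term
    for the induced bias is not included in this sketch)."""
    n = len(y)
    idx = rng.choice(n, size=batch_size, replace=False)
    proposal = theta + proposal_std * rng.standard_normal(theta.shape)
    log_alpha = (log_posterior_batch(proposal, x[idx], y[idx], n)
                 - log_posterior_batch(theta, x[idx], y[idx], n))
    if np.log(rng.uniform()) < log_alpha:
        return proposal, True
    return theta, False
```

Evaluating the log-posterior on `batch_size` points instead of all `n` is where the computational saving comes from; without a correction, however, the stationary distribution of this chain differs from the full-data posterior.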
Related papers
- Robust Stochastic Optimization via Gradient Quantile Clipping [1.90365714903665]
We introduce a quantile clipping strategy for Stochastic Gradient Descent (SGD).
Gradient norms are clipped at quantile-based thresholds, making the iterates robust to outliers.
We propose an implementation of the algorithm using rolling quantile estimates.
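A minimal sketch of the clipping idea, assuming a history of previously observed gradient norms is available; the function name, interface, and quantile choice are hypothetical and not taken from the paper.

```python
import numpy as np

def quantile_clip(grad_norm_history, grad, q=0.9):
    """Rescale a gradient so its norm does not exceed the q-th quantile
    of recently observed gradient norms (illustrative quantile clipping)."""
    tau = np.quantile(grad_norm_history, q)
    norm = np.linalg.norm(grad)
    if norm > tau:
        # shrink the gradient onto the clipping threshold
        grad = grad * (tau / norm)
    return grad
```

Clipping at a quantile of observed norms adapts the threshold to the gradient distribution, so occasional heavy-tailed outliers are shrunk while typical gradients pass through unchanged.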
arXiv Detail & Related papers (2023-09-29T15:24:48Z) - Adaptive Annealed Importance Sampling with Constant Rate Progress [68.8204255655161]
Annealed Importance Sampling (AIS) synthesizes weighted samples from an intractable distribution.
We propose the Constant Rate AIS algorithm and its efficient implementation for $\alpha$-divergences.
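For context, a minimal sketch of plain AIS on a one-dimensional toy problem, with a linear annealing path and one Metropolis-Hastings transition per temperature; the densities, schedule, and transition kernel are illustrative assumptions, and the paper's constant-rate schedule is not implemented.

```python
import numpy as np

rng = np.random.default_rng(1)

def log_p0(x):
    """Tractable base distribution: standard normal (illustrative)."""
    return -0.5 * x ** 2

def log_p1(x):
    """Unnormalized target density (illustrative)."""
    return -0.5 * (x - 2.0) ** 2 / 0.25

def ais_log_weight(n_steps=50, mh_std=0.5):
    """One AIS run: accumulate log-weight increments along a linear
    annealing path, with a single MH move per intermediate density."""
    betas = np.linspace(0.0, 1.0, n_steps + 1)
    x = rng.standard_normal()
    log_w = 0.0
    for b_prev, b in zip(betas[:-1], betas[1:]):
        # weight increment from raising the temperature parameter
        log_w += (b - b_prev) * (log_p1(x) - log_p0(x))
        # MH move targeting the current intermediate density
        def log_gamma(z):
            return (1.0 - b) * log_p0(z) + b * log_p1(z)
        prop = x + mh_std * rng.standard_normal()
        if np.log(rng.uniform()) < log_gamma(prop) - log_gamma(x):
            x = prop
    return log_w
```

Averaging `exp(log_w)` over many independent runs gives an unbiased estimate of the ratio of normalizing constants between target and base.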
arXiv Detail & Related papers (2023-06-27T08:15:28Z) - Semi-Parametric Inference for Doubly Stochastic Spatial Point Processes: An Approximate Penalized Poisson Likelihood Approach [3.085995273374333]
Doubly-stochastic point processes model the occurrence of events over a spatial domain as an inhomogeneous process conditioned on the realization of a random intensity function.
Existing implementations of doubly-stochastic spatial models are computationally demanding, often have limited theoretical guarantees, and/or rely on restrictive assumptions.
arXiv Detail & Related papers (2023-06-11T19:48:39Z) - Preferential Subsampling for Stochastic Gradient Langevin Dynamics [3.158346511479111]
Stochastic gradient MCMC offers an unbiased estimate of the gradient of the log-posterior from a small, uniformly-weighted subsample of the data.
The resulting gradient estimator may exhibit a high variance and impact sampler performance.
We demonstrate that such an approach can maintain the same level of accuracy while substantially reducing the average subsample size that is used.
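A minimal sketch of one stochastic gradient Langevin dynamics step with a uniform mini-batch gradient estimate, assuming an illustrative Gaussian linear-regression model; the paper's preferential (non-uniform) subsampling is not shown.

```python
import numpy as np

rng = np.random.default_rng(2)

def sgld_step(theta, x, y, batch_size, step_size):
    """One SGLD step: the log-likelihood gradient is estimated on a
    uniform mini-batch and rescaled to the full data size, then injected
    Gaussian noise turns gradient descent into approximate posterior
    sampling (illustrative model with a standard normal prior)."""
    n = len(y)
    idx = rng.choice(n, size=batch_size, replace=False)
    # unbiased estimate of the full-data log-likelihood gradient
    grad_ll = (n / batch_size) * x[idx].T @ (y[idx] - x[idx] @ theta)
    grad_prior = -theta  # gradient of a standard normal log-prior
    noise = np.sqrt(step_size) * rng.standard_normal(theta.shape)
    return theta + 0.5 * step_size * (grad_ll + grad_prior) + noise
```

The `n / batch_size` rescaling keeps the mini-batch gradient unbiased, but its variance grows as the subsample shrinks, which is the sampler-performance trade-off the entry above refers to.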
arXiv Detail & Related papers (2022-10-28T14:56:18Z) - Statistical Efficiency of Score Matching: The View from Isoperimetry [96.65637602827942]
We show a tight connection between statistical efficiency of score matching and the isoperimetric properties of the distribution being estimated.
We formalize these results both in the infinite-sample regime and in the finite-sample regime.
arXiv Detail & Related papers (2022-10-03T06:09:01Z) - Efficient CDF Approximations for Normalizing Flows [64.60846767084877]
We build upon the diffeomorphic properties of normalizing flows to estimate the cumulative distribution function (CDF) over a closed region.
Our experiments on popular flow architectures and UCI datasets show a marked improvement in sample efficiency as compared to traditional estimators.
arXiv Detail & Related papers (2022-02-23T06:11:49Z) - Heavy-tailed Streaming Statistical Estimation [58.70341336199497]
We consider the task of heavy-tailed statistical estimation given streaming $p$-dimensional samples.
We design a clipped gradient descent and provide an improved analysis under a more nuanced condition on the noise of gradients.
arXiv Detail & Related papers (2021-08-25T21:30:27Z) - Differentiable Annealed Importance Sampling and the Perils of Gradient
Noise [68.44523807580438]
Annealed importance sampling (AIS) and related algorithms are highly effective tools for marginal likelihood estimation.
Differentiability is a desirable property as it would admit the possibility of optimizing marginal likelihood as an objective.
We propose a differentiable algorithm by abandoning Metropolis-Hastings steps, which further unlocks mini-batch computation.
arXiv Detail & Related papers (2021-07-21T17:10:14Z) - Decentralized Bayesian Learning with Metropolis-Adjusted Hamiltonian
Monte Carlo [15.20294178835262]
We show that Langevin and Hamiltonian methods are effective when the gradient is a random quantity.
We present the first approach to incorporating constant step-size methods with a Metropolis-adjusted HMC.
arXiv Detail & Related papers (2021-07-15T09:39:14Z) - Stochastic Learning for Sparse Discrete Markov Random Fields with
Controlled Gradient Approximation Error [10.381976180143328]
We study the $L_1$-regularized maximum likelihood estimation (MLE) problem for discrete Markov random fields (MRFs).
To address these challenges, we consider a stochastic learning framework based on stochastic proximal gradient (SPG).
We provide novel verifiable bounds to inspect and control the quality of the gradient approximation.
arXiv Detail & Related papers (2020-05-12T22:48:42Z) - Batch Stationary Distribution Estimation [98.18201132095066]
We consider the problem of approximating the stationary distribution of an ergodic Markov chain given a set of sampled transitions.
We propose a consistent estimator that is based on recovering a correction ratio function over the given data.
arXiv Detail & Related papers (2020-03-02T09:10:01Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences arising from its use.