Statistical guarantees for stochastic Metropolis-Hastings
- URL: http://arxiv.org/abs/2310.09335v1
- Date: Fri, 13 Oct 2023 18:00:26 GMT
- Title: Statistical guarantees for stochastic Metropolis-Hastings
- Authors: Sebastian Bieringer, Gregor Kasieczka, Maximilian F. Steffen and
Mathias Trabs
- Abstract summary: By calculating acceptance probabilities on batches, a stochastic Metropolis-Hastings step saves computational costs, but reduces the effective sample size.
We show that this obstacle can be avoided by a simple correction term.
We show that credible sets and contraction rates of the stochastic Metropolis-Hastings algorithm indeed behave similarly to those obtained from the classical Metropolis-adjusted Langevin algorithm.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: A Metropolis-Hastings step is widely used for gradient-based Markov chain
Monte Carlo methods in uncertainty quantification. By calculating acceptance
probabilities on batches, a stochastic Metropolis-Hastings step saves
computational costs, but reduces the effective sample size. We show that this
obstacle can be avoided by a simple correction term. We study statistical
properties of the resulting stationary distribution of the chain if the
corrected stochastic Metropolis-Hastings approach is applied to sample from a
Gibbs posterior distribution in a nonparametric regression setting. Focusing on
deep neural network regression, we prove a PAC-Bayes oracle inequality which
yields optimal contraction rates and we analyze the diameter and show high
coverage probability of the resulting credible sets. With a numerical example
in a high-dimensional parameter space, we illustrate that credible sets and
contraction rates of the stochastic Metropolis-Hastings algorithm indeed behave
similarly to those obtained from the classical Metropolis-adjusted Langevin
algorithm.
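
As a rough, self-contained illustration of the batch-based acceptance idea (not the paper's actual algorithm: the correction term below is a zero-valued placeholder and all function names are hypothetical), a minimal stochastic Metropolis-Hastings chain for a toy Gaussian-mean problem might look as follows:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data: estimate the mean of a 1-D Gaussian with known unit variance.
data = rng.normal(loc=1.5, scale=1.0, size=1000)
n = len(data)

def log_post_batch(theta, batch):
    # Unnormalized Gibbs-posterior log-density estimated from a mini-batch,
    # rescaled by n/|batch| so it matches the full-data quantity in expectation.
    return -(n / len(batch)) * 0.5 * np.sum((batch - theta) ** 2)

def correction(theta_prop, theta, batch):
    # Placeholder (assumption): zero correction recovers the uncorrected
    # stochastic MH step; the paper proposes a specific non-zero term here.
    return 0.0

def stochastic_mh_step(theta, batch, proposal_std=0.05):
    # Random-walk proposal; acceptance ratio evaluated on the mini-batch only.
    theta_prop = theta + proposal_std * rng.standard_normal()
    log_ratio = (log_post_batch(theta_prop, batch)
                 - log_post_batch(theta, batch)
                 + correction(theta_prop, theta, batch))
    return theta_prop if np.log(rng.uniform()) < log_ratio else theta

theta, chain = 0.0, []
for _ in range(5000):
    batch = rng.choice(data, size=64, replace=False)
    theta = stochastic_mh_step(theta, batch)
    chain.append(theta)

print("posterior mean estimate:", np.mean(chain[1000:]))
```

The paper's contribution is precisely a non-trivial choice of the correction term that removes the loss in effective sample size caused by plain batch-based acceptance.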
Related papers
- Likelihood-Free Adaptive Bayesian Inference via Nonparametric Distribution Matching [2.0319002824093015]
We propose Adaptive Bayesian Inference (ABI), a framework that bypasses traditional data-space discrepancies. ABI transforms the problem of measuring divergence between posterior distributions into a tractable sequence of conditional quantile regression tasks. We demonstrate that ABI significantly outperforms data-based Wasserstein, summary-based ABC, and state-of-the-art likelihood-free simulators.
arXiv Detail & Related papers (2025-05-07T17:50:14Z) - pcaGAN: Improving Posterior-Sampling cGANs via Principal Component Regularization [11.393603788068777]
In ill-posed imaging inverse problems, there can exist many hypotheses that fit both the observed measurements and prior knowledge of the true image.
We propose a fast and accurate posterior-sampling conditional generative adversarial network (cGAN) that, through a novel form of regularization, aims for correctness in the posterior mean.
arXiv Detail & Related papers (2024-11-01T14:09:28Z) - Constrained Sampling with Primal-Dual Langevin Monte Carlo [15.634831573546041]
This work considers the problem of sampling from a probability distribution known up to a normalization constant, while satisfying a set of statistical constraints specified by the expected values of general nonlinear functions.
We put forward a discrete-time primal-dual Langevin Monte Carlo algorithm (PD-LMC) that simultaneously constrains the target distribution and samples from it.
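As an illustrative sketch only (not the PD-LMC algorithm of the paper; the step sizes, toy target, and single equality constraint are assumptions), a primal-dual Langevin loop that alternates a Langevin step on the tilted log-density with a dual update on the constraint could look like this:

```python
import numpy as np

rng = np.random.default_rng(1)

# Target known up to a constant: standard 2-D Gaussian.
def grad_log_p(x):
    return -x

# One equality constraint E[g(x)] = 0 with g(x) = x[0] - 1,
# i.e. force the first marginal mean to equal 1.
def g(x):
    return x[0] - 1.0

def grad_g(x):
    return np.array([1.0, 0.0])

step, dual_step, lam = 1e-2, 1e-2, 0.0
x = np.zeros(2)
samples = []
for _ in range(20000):
    # Primal: Langevin step on the tilted log-density log p(x) - lam * g(x).
    drift = grad_log_p(x) - lam * grad_g(x)
    x = x + step * drift + np.sqrt(2 * step) * rng.standard_normal(2)
    # Dual: gradient ascent on the Lagrange multiplier using the constraint value.
    lam = lam + dual_step * g(x)
    samples.append(x.copy())

samples = np.array(samples[5000:])
print("constrained mean of x[0]:", samples[:, 0].mean())  # should be close to 1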
arXiv Detail & Related papers (2024-11-01T13:26:13Z) - NETS: A Non-Equilibrium Transport Sampler [15.58993313831079]
We propose an algorithm, termed the Non-Equilibrium Transport Sampler (NETS).
NETS can be viewed as a variant of annealed importance sampling (AIS) based on Jarzynski's equality.
We show that this drift is the minimizer of a variety of objective functions, which can all be estimated in an unbiased fashion.
arXiv Detail & Related papers (2024-10-03T17:35:38Z) - A sparse PAC-Bayesian approach for high-dimensional quantile prediction [0.0]
This paper presents a novel probabilistic machine learning approach for high-dimensional quantile prediction.
It uses a pseudo-Bayesian framework with a scaled Student-t prior and Langevin Monte Carlo for efficient computation.
Its effectiveness is validated through simulations and real-world data, where it performs competitively against established frequentist and Bayesian techniques.
arXiv Detail & Related papers (2024-09-03T08:01:01Z) - Amortizing intractable inference in diffusion models for vision, language, and control [89.65631572949702]
This paper studies amortized sampling of the posterior over data, $\mathbf{x} \sim p^{\mathrm{post}}(\mathbf{x}) \propto p(\mathbf{x})\,r(\mathbf{x})$, in a model that consists of a diffusion generative model prior $p(\mathbf{x})$ and a black-box constraint or function $r(\mathbf{x})$. We prove the correctness of a data-free learning objective, relative trajectory balance, for training a diffusion model that samples from this posterior.
arXiv Detail & Related papers (2024-05-31T16:18:46Z) - Calibrating Neural Simulation-Based Inference with Differentiable
Coverage Probability [50.44439018155837]
We propose to include a calibration term directly into the training objective of the neural model.
By introducing a relaxation of the classical formulation of calibration error we enable end-to-end backpropagation.
It is directly applicable to existing computational pipelines allowing reliable black-box posterior inference.
arXiv Detail & Related papers (2023-10-20T10:20:45Z) - Adaptive Annealed Importance Sampling with Constant Rate Progress [68.8204255655161]
Annealed Importance Sampling (AIS) synthesizes weighted samples from an intractable distribution.
We propose the Constant Rate AIS algorithm and its efficient implementation for $\alpha$-divergences.
arXiv Detail & Related papers (2023-06-27T08:15:28Z) - Semi-Parametric Inference for Doubly Stochastic Spatial Point Processes: An Approximate Penalized Poisson Likelihood Approach [3.085995273374333]
Doubly-stochastic point processes model the occurrence of events over a spatial domain as an inhomogeneous process conditioned on the realization of a random intensity function.
Existing implementations of doubly-stochastic spatial models are computationally demanding, often have limited theoretical guarantees, and/or rely on restrictive assumptions.
arXiv Detail & Related papers (2023-06-11T19:48:39Z) - Preferential Subsampling for Stochastic Gradient Langevin Dynamics [3.158346511479111]
Stochastic gradient MCMC offers an unbiased estimate of the gradient of the log-posterior from a small, uniformly-weighted subsample of the data.
The resulting gradient estimator may exhibit a high variance and impact sampler performance.
We demonstrate that such an approach can maintain the same level of accuracy while substantially reducing the average subsample size that is used.
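For context, a minimal sketch of the vanilla SGLD baseline with uniformly-weighted subsampling that the paper improves upon (the toy model, step size, and batch size are arbitrary choices, not taken from the paper):

```python
import numpy as np

rng = np.random.default_rng(2)

# Toy model: mean of a 1-D Gaussian with known unit variance, N(0, 10^2) prior.
data = rng.normal(loc=2.0, scale=1.0, size=10_000)
n = len(data)

def grad_log_prior(theta):
    return -theta / 100.0

def grad_log_lik_estimate(theta, batch):
    # Unbiased estimate of the full-data log-likelihood gradient from a
    # uniformly-weighted subsample (rescaled by n / batch size).
    return (n / len(batch)) * np.sum(batch - theta)

theta, step, batch_size = 0.0, 1e-5, 100
chain = []
for _ in range(20_000):
    batch = data[rng.integers(0, n, size=batch_size)]  # uniform subsample (with replacement)
    grad = grad_log_prior(theta) + grad_log_lik_estimate(theta, batch)
    theta = theta + 0.5 * step * grad + np.sqrt(step) * rng.standard_normal()
    chain.append(theta)

print("posterior mean estimate:", np.mean(chain[5000:]))  # close to 2.0
```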
arXiv Detail & Related papers (2022-10-28T14:56:18Z) - Statistical Efficiency of Score Matching: The View from Isoperimetry [96.65637602827942]
We show a tight connection between statistical efficiency of score matching and the isoperimetric properties of the distribution being estimated.
We formalize these results both in the infinite-sample regime and in the finite-sample regime.
arXiv Detail & Related papers (2022-10-03T06:09:01Z) - Efficient CDF Approximations for Normalizing Flows [64.60846767084877]
We build upon the diffeomorphic properties of normalizing flows to estimate the cumulative distribution function (CDF) over a closed region.
Our experiments on popular flow architectures and UCI datasets show a marked improvement in sample efficiency as compared to traditional estimators.
arXiv Detail & Related papers (2022-02-23T06:11:49Z) - Inverting brain grey matter models with likelihood-free inference: a
tool for trustable cytoarchitecture measurements [62.997667081978825]
Characterisation of the brain grey matter cytoarchitecture with quantitative sensitivity to soma density and volume remains an unsolved challenge in dMRI.
We propose a new forward model, specifically a new system of equations, requiring a few relatively sparse b-shells.
We then apply modern tools from Bayesian analysis known as likelihood-free inference (LFI) to invert our proposed model.
arXiv Detail & Related papers (2021-11-15T09:08:27Z) - Heavy-tailed Streaming Statistical Estimation [58.70341336199497]
We consider the task of heavy-tailed statistical estimation given streaming $p$-dimensional samples.
We design a clipped gradient descent and provide an improved analysis under a more nuanced condition on the noise of gradients.
arXiv Detail & Related papers (2021-08-25T21:30:27Z) - Differentiable Annealed Importance Sampling and the Perils of Gradient
Noise [68.44523807580438]
Annealed importance sampling (AIS) and related algorithms are highly effective tools for marginal likelihood estimation.
Differentiability is a desirable property as it would admit the possibility of optimizing marginal likelihood as an objective.
We propose a differentiable algorithm by abandoning Metropolis-Hastings steps, which further unlocks mini-batch computation.
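As a rough sketch of AIS with unadjusted (Metropolis-Hastings-free) Langevin transitions and Jarzynski-style incremental weights (not the authors' differentiable algorithm; the toy target, annealing schedule, and step size are assumptions):

```python
import numpy as np

rng = np.random.default_rng(3)

# Base distribution: standard normal. Unnormalized target: Gaussian N(3, 0.5^2),
# chosen so the true log normalizing-constant ratio log(0.5) is known for checking.
def log_p0(x):
    return -0.5 * x ** 2

def log_p1(x):
    return -0.5 * ((x - 3.0) / 0.5) ** 2

def log_p_beta(x, beta):
    # Geometric annealing path between base and target.
    return (1 - beta) * log_p0(x) + beta * log_p1(x)

def grad_log_p_beta(x, beta):
    return (1 - beta) * (-x) + beta * (-(x - 3.0) / 0.25)

n_particles, n_steps, step = 2000, 200, 1e-2
betas = np.linspace(0.0, 1.0, n_steps + 1)

x = rng.standard_normal(n_particles)        # particles drawn from the base
log_w = np.zeros(n_particles)
for k in range(1, n_steps + 1):
    # Jarzynski-style weight increment when the annealing parameter advances.
    log_w += log_p_beta(x, betas[k]) - log_p_beta(x, betas[k - 1])
    # Unadjusted Langevin transition at the new temperature (no MH correction).
    x = (x + step * grad_log_p_beta(x, betas[k])
         + np.sqrt(2 * step) * rng.standard_normal(n_particles))

# Normalizing-constant ratio estimate via a log-sum-exp of the weights.
log_Z = np.log(np.mean(np.exp(log_w - log_w.max()))) + log_w.max()
print("estimated log Z_1/Z_0:", log_Z)
print("reference log Z_1/Z_0:", np.log(0.5))
```

Because the Langevin kernel is not exactly invariant for its annealed target, the estimate carries a discretization bias that shrinks with the step size.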
arXiv Detail & Related papers (2021-07-21T17:10:14Z) - What Are Bayesian Neural Network Posteriors Really Like? [63.950151520585024]
We show that Hamiltonian Monte Carlo can achieve significant performance gains over standard training and deep ensembles.
We also show that deep ensemble predictive distributions are similarly close to HMC as those of standard SGLD, and closer than standard variational inference.
arXiv Detail & Related papers (2021-04-29T15:38:46Z) - Sampling-free Variational Inference for Neural Networks with
Multiplicative Activation Noise [51.080620762639434]
We propose a more efficient parameterization of the posterior approximation for sampling-free variational inference.
Our approach yields competitive results for standard regression problems and scales well to large-scale image classification tasks.
arXiv Detail & Related papers (2021-03-15T16:16:18Z) - Variational Laplace for Bayesian neural networks [25.055754094939527]
Variational Laplace exploits a local approximation of the likelihood to estimate the ELBO without the need for sampling the neural-network weights.
We show that early-stopping can be avoided by increasing the learning rate for the variance parameters.
arXiv Detail & Related papers (2021-02-27T14:06:29Z) - Variational Laplace for Bayesian neural networks [33.46810568687292]
We develop variational Laplace for Bayesian neural networks (BNNs).
We exploit a local approximation of the curvature of the likelihood to estimate the ELBO without the need for sampling the neural-network weights.
We show that early-stopping can be avoided by increasing the learning rate for the variance parameters.
arXiv Detail & Related papers (2020-11-20T15:16:18Z) - Unlabelled Data Improves Bayesian Uncertainty Calibration under
Covariate Shift [100.52588638477862]
We develop an approximate Bayesian inference scheme based on posterior regularisation.
We demonstrate the utility of our method in the context of transferring prognostic models of prostate cancer across globally diverse populations.
arXiv Detail & Related papers (2020-06-26T13:50:19Z) - Bayesian Neural Network via Stochastic Gradient Descent [0.0]
We show how gradient estimation techniques can be applied to Bayesian neural networks.
Our approach considerably outperforms previous state-of-the-art approaches for regression using Bayesian neural networks.
arXiv Detail & Related papers (2020-06-04T18:33:59Z) - Batch Stationary Distribution Estimation [98.18201132095066]
We consider the problem of approximating the stationary distribution of an ergodic Markov chain given a set of sampled transitions.
We propose a consistent estimator that is based on recovering a correction ratio function over the given data.
arXiv Detail & Related papers (2020-03-02T09:10:01Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences arising from its use.