Bounding Wasserstein distance with couplings
- URL: http://arxiv.org/abs/2112.03152v3
- Date: Thu, 2 Nov 2023 04:00:31 GMT
- Title: Bounding Wasserstein distance with couplings
- Authors: Niloy Biswas and Lester Mackey
- Abstract summary: We propose estimators based on couplings of Markov chains to assess the quality of such asymptotically biased sampling methods.
We establish theoretical guarantees for our upper bounds and show that our estimators can remain effective in high dimensions.
- Score: 26.17941985324059
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Markov chain Monte Carlo (MCMC) provides asymptotically consistent estimates
of intractable posterior expectations as the number of iterations tends to
infinity. However, in large data applications, MCMC can be computationally
expensive per iteration. This has catalyzed interest in approximating MCMC in a
manner that improves computational speed per iteration but does not produce
asymptotically consistent estimates. In this article, we propose estimators
based on couplings of Markov chains to assess the quality of such
asymptotically biased sampling methods. The estimators give empirical upper
bounds of the Wasserstein distance between the limiting distribution of the
asymptotically biased sampling method and the original target distribution of
interest. We establish theoretical guarantees for our upper bounds and show
that our estimators can remain effective in high dimensions. We apply our
quality measures to stochastic gradient MCMC, variational Bayes, and Laplace
approximations for tall data and to approximate MCMC for Bayesian logistic
regression in 4500 dimensions and Bayesian linear regression in 50000
dimensions.
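The principle behind these upper bounds is that the 1-Wasserstein distance is an infimum over couplings, W_1(mu, pi) = inf_{(X,Y)} E||X - Y||, so any concrete coupling of X ~ mu and Y ~ pi yields the empirical bound E||X - Y||. The Python sketch below is illustrative only and is not the authors' estimator: it couples a reference Langevin chain with a hypothetically biased variant through common random numbers. The Gaussian target, the bias model, and all tuning constants are assumptions chosen for illustration.

import numpy as np

# Illustrative sketch only: couple two chains with common random numbers
# and use the average terminal distance as an empirical Wasserstein bound,
# via W_1(mu, pi) <= E ||X - Y|| for any coupling (X, Y).

def grad_log_target(x):
    # Assumed standard Gaussian target: grad log pi(x) = -x.
    return -x

def grad_log_approx(x, bias=0.1):
    # Hypothetical inexact gradient, standing in for a stochastic-gradient
    # or otherwise asymptotically biased approximation.
    return -(x - bias)

def empirical_w1_bound(dim=10, n_steps=5000, step=0.01, n_reps=100, seed=0):
    rng = np.random.default_rng(seed)
    dists = []
    for _ in range(n_reps):
        x = rng.standard_normal(dim)  # reference chain
        y = x.copy()                  # approximate chain, same start
        for _ in range(n_steps):
            xi = rng.standard_normal(dim)  # shared noise: the coupling
            x = x + step * grad_log_target(x) + np.sqrt(2.0 * step) * xi
            y = y + step * grad_log_approx(y) + np.sqrt(2.0 * step) * xi
        dists.append(np.linalg.norm(x - y))
    return float(np.mean(dists))

print(empirical_w1_bound())

Note one simplification: the paper couples the approximate chain with an exact MCMC chain targeting pi, whereas here both chains are unadjusted Langevin updates, so the reported average bounds the distance between the two chains' limits rather than the distance to pi itself.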
Related papers
- A sparse PAC-Bayesian approach for high-dimensional quantile prediction [0.0]
This paper presents a novel probabilistic machine learning approach for high-dimensional quantile prediction.
It uses a pseudo-Bayesian framework with a scaled Student-t prior and Langevin Monte Carlo for efficient computation.
Its effectiveness is validated through simulations and real-world data, where it performs competitively against established frequentist and Bayesian techniques.
arXiv Detail & Related papers (2024-09-03T08:01:01Z) - Unbiased Kinetic Langevin Monte Carlo with Inexact Gradients [0.8749675983608172]
We present an unbiased method for posterior means based on kinetic Langevin dynamics.
Our proposed estimator is unbiased, attains finite variance, and satisfies a central limit theorem.
Our results demonstrate that in large-scale applications, the unbiased algorithm we present can be 2-3 orders of magnitude more efficient than the "gold-standard" randomized Hamiltonian Monte Carlo.
arXiv Detail & Related papers (2023-11-08T21:19:52Z) - Statistical guarantees for stochastic Metropolis-Hastings [0.0]
By calculating acceptance probabilities on batches, a stochastic Metropolis-Hastings step reduces computational cost but also reduces the effective sample size.
We show that this obstacle can be avoided by a simple correction term.
We show that estimates from the corrected stochastic Metropolis-Hastings algorithm indeed behave similarly to those obtained from the classical Metropolis-adjusted Langevin algorithm.
arXiv Detail & Related papers (2023-10-13T18:00:26Z) - Statistical Efficiency of Score Matching: The View from Isoperimetry [96.65637602827942]
We show a tight connection between statistical efficiency of score matching and the isoperimetric properties of the distribution being estimated.
We formalize these results both in the infinite-sample regime and in the finite-sample regime.
arXiv Detail & Related papers (2022-10-03T06:09:01Z) - Efficient CDF Approximations for Normalizing Flows [64.60846767084877]
We build upon the diffeomorphic properties of normalizing flows to estimate the cumulative distribution function (CDF) over a closed region.
Our experiments on popular flow architectures and UCI datasets show a marked improvement in sample efficiency as compared to traditional estimators.
arXiv Detail & Related papers (2022-02-23T06:11:49Z) - Learning to Estimate Without Bias [57.82628598276623]
The Gauss-Markov theorem states that the weighted least squares estimator, beta_hat = (X^T W X)^{-1} X^T W y, is a linear minimum variance unbiased estimator (MVUE) in linear models.
In this paper, we take a first step towards extending this result to nonlinear settings via deep learning with bias constraints.
A second motivation for the bias-constrained estimator (BCE) arises in applications where multiple estimates of the same unknown are averaged for improved performance.
arXiv Detail & Related papers (2021-10-24T10:23:51Z) - Asymptotic bias of inexact Markov Chain Monte Carlo methods in high dimension [0.7614628596146599]
Examples of such inexact methods include the unadjusted Langevin algorithm (ULA) and unadjusted Hamiltonian Monte Carlo (uHMC); a minimal ULA sketch appears after this list.
We show that for both ULA and uHMC, the bias depends on key quantities related to the target distribution or the stationary probability measure of the scheme.
arXiv Detail & Related papers (2021-08-02T07:34:09Z) - Amortized Conditional Normalized Maximum Likelihood: Reliable Out of Distribution Uncertainty Estimation [99.92568326314667]
We propose the amortized conditional normalized maximum likelihood (ACNML) method as a scalable general-purpose approach for uncertainty estimation.
Our algorithm builds on the conditional normalized maximum likelihood (CNML) coding scheme, which has minimax optimal properties according to the minimum description length principle.
We demonstrate that ACNML compares favorably to a number of prior techniques for uncertainty estimation in terms of calibration on out-of-distribution inputs.
arXiv Detail & Related papers (2020-11-05T08:04:34Z) - An adaptive Hessian approximated stochastic gradient MCMC method [12.93317525451798]
We present an adaptive Hessian approximated stochastic gradient MCMC method to incorporate local geometric information while sampling from the posterior.
We adopt a magnitude-based weight pruning method to enforce the sparsity of the network.
arXiv Detail & Related papers (2020-10-03T16:22:15Z) - $\gamma$-ABC: Outlier-Robust Approximate Bayesian Computation Based on a Robust Divergence Estimator [95.71091446753414]
We propose to use a nearest-neighbor-based $\gamma$-divergence estimator as a data discrepancy measure.
Our method achieves significantly higher robustness than existing discrepancy measures.
arXiv Detail & Related papers (2020-06-13T06:09:27Z) - Batch Stationary Distribution Estimation [98.18201132095066]
We consider the problem of approximating the stationary distribution of an ergodic Markov chain given a set of sampled transitions.
We propose a consistent estimator that is based on recovering a correction ratio function over the given data.
arXiv Detail & Related papers (2020-03-02T09:10:01Z)
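As referenced in the inexact-MCMC entry above, the following is a minimal unadjusted Langevin algorithm (ULA) sketch in Python; the step size controls the asymptotic bias that the entry's paper quantifies. The Gaussian target and all constants are assumptions for illustration, not an example taken from that paper.

import numpy as np

# Minimal ULA sketch: discretize the Langevin diffusion without a
# Metropolis correction, so the chain's stationary distribution carries
# a step-size-dependent bias relative to the target.

def ula(grad_log_pi, x0, step, n_steps, rng):
    x = np.array(x0, dtype=float)
    for _ in range(n_steps):
        noise = rng.standard_normal(x.shape)
        x = x + step * grad_log_pi(x) + np.sqrt(2.0 * step) * noise
    return x

rng = np.random.default_rng(1)
# Assumed standard Gaussian target: grad log pi(x) = -x.
sample = ula(lambda x: -x, x0=np.zeros(5), step=0.05, n_steps=2000, rng=rng)
print(sample)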
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information it provides and is not responsible for any consequences arising from its use.