Stochastic Approximation with Biased MCMC for Expectation Maximization
- URL: http://arxiv.org/abs/2402.17870v1
- Date: Tue, 27 Feb 2024 20:10:03 GMT
- Title: Stochastic Approximation with Biased MCMC for Expectation Maximization
- Authors: Samuel Gruffaz, Kyurae Kim, Alain Oliviero Durmus, Jacob R. Gardner
- Abstract summary: An approximation scheme with Markov chain Monte Carlo can be used to create an algorithm known as MCMC-SAEM.
MCMC-SAEM is often run withally biased MCMC algorithms, for which the consequences are theoretically less understood.
We show that ULA is more stable with respect to the choice of Langevin stepsize and can sometimes result in faster convergence.
- Score: 9.4641739998927
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The expectation maximization (EM) algorithm is a widespread method for
empirical Bayesian inference, but its expectation step (E-step) is often
intractable. Employing a stochastic approximation scheme with Markov chain
Monte Carlo (MCMC) can circumvent this issue, resulting in an algorithm known
as MCMC-SAEM. While theoretical guarantees for MCMC-SAEM have previously been
established, these results are restricted to the case where asymptotically
unbiased MCMC algorithms are used. In practice, MCMC-SAEM is often run with
asymptotically biased MCMC, for which the consequences are theoretically less
understood. In this work, we fill this gap by analyzing the asymptotics and
non-asymptotics of SAEM with biased MCMC steps, particularly the effect of
bias. We also provide numerical experiments comparing the Metropolis-adjusted
Langevin algorithm (MALA), which is asymptotically unbiased, and the unadjusted
Langevin algorithm (ULA), which is asymptotically biased, on synthetic and real
datasets. Experimental results show that ULA is more stable with respect to the
choice of Langevin stepsize and can sometimes result in faster convergence.
Related papers
- Asymptotically Optimal Change Detection for Unnormalized Pre- and Post-Change Distributions [65.38208224389027]
This paper addresses the problem of detecting changes when only unnormalized pre- and post-change distributions are accessible.
Our approach is based on the estimation of the Cumulative Sum statistics, which is known to produce optimal performance.
arXiv Detail & Related papers (2024-10-18T17:13:29Z) - On Cyclical MCMC Sampling [13.002470980542707]
We show that cyclical MCMC converges to the desired probability distribution in settings where the Markov kernels used are fast mixing.
We also show that cyclical MCMC estimates well the local shape of the target distribution around each mode, even when we do not have convergence to the target.
arXiv Detail & Related papers (2024-03-01T02:20:44Z) - Online Variational Sequential Monte Carlo [49.97673761305336]
We build upon the variational sequential Monte Carlo (VSMC) method, which provides computationally efficient and accurate model parameter estimation and Bayesian latent-state inference.
Online VSMC is capable of performing efficiently, entirely on-the-fly, both parameter estimation and particle proposal adaptation.
arXiv Detail & Related papers (2023-12-19T21:45:38Z) - SpreadNUTS -- Moderate Dynamic Extension of Paths for No-U-Turn Sampling
& Partitioning Visited Regions [0.0]
This paper introduces modifications to a specific Hamiltonian Monte Carlo (HMC) algorithm known as the no-U-turn sampler (NUTS)
NUTS aims to explore the sample space faster than NUTS, yielding a sampler that has faster convergence to the true distribution than NUTS.
arXiv Detail & Related papers (2023-07-09T05:00:25Z) - Convergence Analysis of Stochastic Gradient Descent with MCMC Estimators [8.493584276672971]
gradient descent (SGD) and its variants is essential for machine learning.
In this paper, we consider the SGD algorithm that employ the Markov Chain Monte Carlo (MCMC) estimator to compute the gradient.
It is shown that MCMC-SGD escapes from saddle points and reaches $(epsilon,epsilon1/4)$ approximate second order stationary points.
arXiv Detail & Related papers (2023-03-19T08:29:49Z) - Clipped Stochastic Methods for Variational Inequalities with
Heavy-Tailed Noise [64.85879194013407]
We prove the first high-probability results with logarithmic dependence on the confidence level for methods for solving monotone and structured non-monotone VIPs.
Our results match the best-known ones in the light-tails case and are novel for structured non-monotone problems.
In addition, we numerically validate that the gradient noise of many practical formulations is heavy-tailed and show that clipping improves the performance of SEG/SGDA.
arXiv Detail & Related papers (2022-06-02T15:21:55Z) - Learning to Estimate Without Bias [57.82628598276623]
Gauss theorem states that the weighted least squares estimator is a linear minimum variance unbiased estimation (MVUE) in linear models.
In this paper, we take a first step towards extending this result to non linear settings via deep learning with bias constraints.
A second motivation to BCE is in applications where multiple estimates of the same unknown are averaged for improved performance.
arXiv Detail & Related papers (2021-10-24T10:23:51Z) - Mean-Square Analysis with An Application to Optimal Dimension Dependence
of Langevin Monte Carlo [60.785586069299356]
This work provides a general framework for the non-asymotic analysis of sampling error in 2-Wasserstein distance.
Our theoretical analysis is further validated by numerical experiments.
arXiv Detail & Related papers (2021-09-08T18:00:05Z) - Accelerating MCMC algorithms through Bayesian Deep Networks [7.054093620465401]
Markov Chain Monte Carlo (MCMC) algorithms are commonly used for their versatility in sampling from complicated probability distributions.
As the dimension of the distribution gets larger, the computational costs for a satisfactory exploration of the sampling space become challenging.
We show an alternative way of performing adaptive MCMC, by using the outcome of Bayesian Neural Networks as the initial proposal for the Markov Chain.
arXiv Detail & Related papers (2020-11-29T04:29:00Z) - Amortized Conditional Normalized Maximum Likelihood: Reliable Out of
Distribution Uncertainty Estimation [99.92568326314667]
We propose the amortized conditional normalized maximum likelihood (ACNML) method as a scalable general-purpose approach for uncertainty estimation.
Our algorithm builds on the conditional normalized maximum likelihood (CNML) coding scheme, which has minimax optimal properties according to the minimum description length principle.
We demonstrate that ACNML compares favorably to a number of prior techniques for uncertainty estimation in terms of calibration on out-of-distribution inputs.
arXiv Detail & Related papers (2020-11-05T08:04:34Z) - Non-convex Learning via Replica Exchange Stochastic Gradient MCMC [25.47669573608621]
We propose an adaptive replica exchange SGMCMC (reSGMCMC) to automatically correct the bias and study the corresponding properties.
Empirically, we test the algorithm through extensive experiments on various setups and obtain the results.
arXiv Detail & Related papers (2020-08-12T15:02:59Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.