Spectral gap of Metropolis-within-Gibbs under log-concavity
- URL: http://arxiv.org/abs/2509.26175v1
- Date: Tue, 30 Sep 2025 12:31:22 GMT
- Title: Spectral gap of Metropolis-within-Gibbs under log-concavity
- Authors: Cecilia Secchi, Giacomo Zanella,
- Abstract summary: The Metropolis-within-Gibbs (MwG) algorithm is a widely used Markov Chain Monte Carlo method for sampling from high-dimensional distributions.<n>We study MwG with Random Walk Metropolis (RWM) updates, using proposal variances tuned to match the target's conditional variances.<n>The result shows that MwG can mix substantially faster with variance-adaptive proposals and that its mixing performance is just a constant factor worse than that of the exact Gibbs sampler.
- Score: 1.4466802614938334
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The Metropolis-within-Gibbs (MwG) algorithm is a widely used Markov Chain Monte Carlo method for sampling from high-dimensional distributions when exact conditional sampling is intractable. We study MwG with Random Walk Metropolis (RWM) updates, using proposal variances tuned to match the target's conditional variances. Assuming the target $\pi$ is a $d$-dimensional log-concave distribution with condition number $\kappa$, we establish a spectral gap lower bound of order $\mathcal{O}(1/\kappa d)$ for the random-scan version of MwG, improving on the previously available $\mathcal{O}(1/\kappa^2 d)$ bound. This is obtained by developing sharp estimates of the conductance of one-dimensional RWM kernels, which can be of independent interest. The result shows that MwG can mix substantially faster with variance-adaptive proposals and that its mixing performance is just a constant factor worse than that of the exact Gibbs sampler, thus providing theoretical support to previously observed empirical behavior.
Related papers
- Optimal Convergence Analysis of DDPM for General Distributions [11.155024379105788]
Denoising Diffusion Probabilistic Model (DDPM) is one of the most widely used samplers.<n>We provide a refined convergence analysis of the DDPM sampler.<n>We show that our convergence analysis is tight for a wide array of target distributions.
arXiv Detail & Related papers (2025-10-31T15:44:50Z) - A mixing time bound for Gibbs sampling from log-smooth log-concave distributions [0.8702432681310401]
We study the behavior of the Gibbs sampler on the class of log-smooth and strongly log-concave target distributions.<n>We show that the Gibbs sampler requires at most $Ostarleft(kappa2 n7.5left(maxleft1,sqrtfrac1nlog frac2Mgammarightright2right)$ steps to produce a sample with error no more than $gamma$ in total variation distance from a distribution with condition number $kappa$.
arXiv Detail & Related papers (2024-12-23T19:00:02Z) - Entropy contraction of the Gibbs sampler under log-concavity [0.16385815610837165]
We show that the random scan Gibbs sampler contracts in relative entropy and provide a sharp characterization of the associated contraction rate.
Our techniques are versatile and extend to Metropolis-within-Gibbs schemes and the Hit-and-Run algorithm.
arXiv Detail & Related papers (2024-10-01T16:50:36Z) - Provable Benefit of Annealed Langevin Monte Carlo for Non-log-concave Sampling [28.931489333515618]
We establish an oracle complexity of $widetildeOleft(fracdbeta2cal A2varepsilon6right)$ for the simple annealed Langevin Monte Carlo algorithm.<n>We show that $cal A$ represents the action of a curve of probability measures interpolating the target distribution $pi$ and a readily sampleable distribution.
arXiv Detail & Related papers (2024-07-24T02:15:48Z) - Variance Reduction for the Independent Metropolis Sampler [11.074080383657453]
We prove that if $pi$ is close enough under KL divergence to another density $q$, an independent sampler that obtains samples from $pi$ achieves smaller variance than i.i.d. sampling from $pi$.
We propose an adaptive independent Metropolis algorithm that adapts the proposal density such that its KL divergence with the target is being reduced.
arXiv Detail & Related papers (2024-06-25T16:38:53Z) - Adaptive Annealed Importance Sampling with Constant Rate Progress [68.8204255655161]
Annealed Importance Sampling (AIS) synthesizes weighted samples from an intractable distribution.
We propose the Constant Rate AIS algorithm and its efficient implementation for $alpha$-divergences.
arXiv Detail & Related papers (2023-06-27T08:15:28Z) - Stochastic Approximation Approaches to Group Distributionally Robust Optimization and Beyond [89.72693227960274]
This paper investigates group distributionally robust optimization (GDRO) with the goal of learning a model that performs well over $m$ different distributions.
To reduce the number of samples in each round from $m$ to 1, we cast GDRO as a two-player game, where one player conducts and the other executes an online algorithm for non-oblivious multi-armed bandits.
In the second scenario, we propose to optimize the average top-$k$ risk instead of the maximum risk, thereby mitigating the impact of distributions.
arXiv Detail & Related papers (2023-02-18T09:24:15Z) - Optimal Scaling for Locally Balanced Proposals in Discrete Spaces [65.14092237705476]
We show that efficiency of Metropolis-Hastings (M-H) algorithms in discrete spaces can be characterized by an acceptance rate that is independent of the target distribution.
Knowledge of the optimal acceptance rate allows one to automatically tune the neighborhood size of a proposal distribution in a discrete space, directly analogous to step-size control in continuous spaces.
arXiv Detail & Related papers (2022-09-16T22:09:53Z) - Non-Gaussian Component Analysis via Lattice Basis Reduction [56.98280399449707]
Non-Gaussian Component Analysis (NGCA) is a distribution learning problem.
We provide an efficient algorithm for NGCA in the regime that $A$ is discrete or nearly discrete.
arXiv Detail & Related papers (2021-12-16T18:38:02Z) - Mean-Square Analysis with An Application to Optimal Dimension Dependence
of Langevin Monte Carlo [60.785586069299356]
This work provides a general framework for the non-asymotic analysis of sampling error in 2-Wasserstein distance.
Our theoretical analysis is further validated by numerical experiments.
arXiv Detail & Related papers (2021-09-08T18:00:05Z) - Sample Complexity of Asynchronous Q-Learning: Sharper Analysis and
Variance Reduction [63.41789556777387]
Asynchronous Q-learning aims to learn the optimal action-value function (or Q-function) of a Markov decision process (MDP)
We show that the number of samples needed to yield an entrywise $varepsilon$-accurate estimate of the Q-function is at most on the order of $frac1mu_min (1-gamma)5varepsilon2+ fract_mixmu_min (1-gamma)$ up to some logarithmic factor.
arXiv Detail & Related papers (2020-06-04T17:51:00Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.