A mixing time bound for Gibbs sampling from log-smooth log-concave distributions
- URL: http://arxiv.org/abs/2412.17899v1
- Date: Mon, 23 Dec 2024 19:00:02 GMT
- Title: A mixing time bound for Gibbs sampling from log-smooth log-concave distributions
- Authors: Neha S. Wadia
- Abstract summary: We study the behavior of the Gibbs sampler on the class of log-smooth and strongly log-concave target distributions.
We show that the Gibbs sampler requires at most $O^{\star}\left(\kappa^2 n^{7.5}\left(\max\left\{1,\sqrt{\frac{1}{n}\log \frac{2M}{\gamma}}\right\}\right)^2\right)$ steps to produce a sample with error no more than $\gamma$ in total variation distance from a distribution with condition number $\kappa$.
- Abstract: The Gibbs sampler, also known as the coordinate hit-and-run algorithm, is a Markov chain that is widely used to draw samples from probability distributions in arbitrary dimensions. At each iteration of the algorithm, a randomly selected coordinate is resampled from the distribution that results from conditioning on all the other coordinates. We study the behavior of the Gibbs sampler on the class of log-smooth and strongly log-concave target distributions supported on $\mathbb{R}^n$. Assuming the initial distribution is $M$-warm with respect to the target, we show that the Gibbs sampler requires at most $O^{\star}\left(\kappa^2 n^{7.5}\left(\max\left\{1,\sqrt{\frac{1}{n}\log \frac{2M}{\gamma}}\right\}\right)^2\right)$ steps to produce a sample with error no more than $\gamma$ in total variation distance from a distribution with condition number $\kappa$.
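To make the update rule concrete, here is a minimal random-scan Gibbs sampler sketch in Python for a bivariate Gaussian target, a simple member of the log-smooth, strongly log-concave class studied in the paper. The Gaussian's coordinate conditionals are available in closed form, so each resampling step is exact; the target, correlation $\rho$, and step count are illustrative choices, not taken from the paper.

```python
import numpy as np

# Random-scan Gibbs sampler for a bivariate Gaussian N(0, Sigma),
# Sigma = [[1, rho], [rho, 1]]. Each iteration picks a coordinate
# uniformly at random and resamples it from its exact conditional.

rng = np.random.default_rng(0)
rho = 0.8                       # illustrative correlation
n_steps = 10_000
x = np.zeros(2)                 # arbitrary initial point
samples = np.empty((n_steps, 2))

for t in range(n_steps):
    i = rng.integers(2)         # coordinate to resample
    j = 1 - i                   # coordinate conditioned on
    # Conditional law: x_i | x_j ~ N(rho * x_j, 1 - rho^2).
    x[i] = rho * x[j] + np.sqrt(1 - rho**2) * rng.standard_normal()
    samples[t] = x

print("empirical correlation:", np.corrcoef(samples.T)[0, 1])  # ~ rho
```

The empirical correlation printed at the end should approach $\rho$, a quick check that the chain is targeting the intended distribution.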
Related papers
- Outsourced diffusion sampling: Efficient posterior inference in latent spaces of generative models [65.71506381302815]
We propose to amortize the cost of sampling from a posterior distribution of the form $p(\mathbf{x}\mid\mathbf{y}) \propto p_\theta(\mathbf{x})\,r(\mathbf{x},\mathbf{y})$.
For many models and constraints of interest, the posterior in the noise space is smoother than the posterior in the data space, making it more amenable to such amortized inference.
arXiv Detail & Related papers (2025-02-10T19:49:54Z) - Dimension-free Private Mean Estimation for Anisotropic Distributions [55.86374912608193]
Previous private estimators on distributions over $\mathbb{R}^d$ suffer from a curse of dimensionality.
We present an algorithm whose sample complexity has improved dependence on dimension.
arXiv Detail & Related papers (2024-11-01T17:59:53Z) - Entropy contraction of the Gibbs sampler under log-concavity [0.16385815610837165]
We show that the random scan Gibbs sampler contracts in relative entropy and provide a sharp characterization of the associated contraction rate.
Our techniques are versatile and extend to Metropolis-within-Gibbs schemes and the Hit-and-Run algorithm.
arXiv Detail & Related papers (2024-10-01T16:50:36Z) - Simple and Nearly-Optimal Sampling for Rank-1 Tensor Completion via Gauss-Jordan [49.1574468325115]
We revisit the sample and computational complexity of completing a rank-1 tensor in $\otimes_{i=1}^{N} \mathbb{R}^{d}$.
We present a characterization of the problem which admits an algorithm amounting to Gauss-Jordan on a pair of random linear systems.
arXiv Detail & Related papers (2024-08-10T04:26:19Z) - Fast parallel sampling under isoperimetry [10.619901778151336]
We show how to sample in parallel from a distribution $\pi$ over $\mathbb{R}^d$ that satisfies a log-Sobolev inequality.
We show that our algorithm outputs samples from a distribution $\hat{\pi}$ that is close to $\pi$ in Kullback--Leibler divergence.
arXiv Detail & Related papers (2024-01-17T07:24:18Z) - Robust Mean Estimation Without Moments for Symmetric Distributions [7.105512316884493]
We show that for a large class of symmetric distributions, the same error as in the Gaussian setting can be achieved efficiently.
We propose a sequence of efficient algorithms that approaches this optimal error.
Our algorithms are based on a generalization of the well-known filtering technique.
arXiv Detail & Related papers (2023-02-21T17:52:23Z) - Stochastic Approximation Approaches to Group Distributionally Robust Optimization and Beyond [89.72693227960274]
This paper investigates group distributionally robust optimization (GDRO) with the goal of learning a model that performs well over $m$ different distributions.
To reduce the number of samples in each round from $m$ to 1, we cast GDRO as a two-player game, where one player performs stochastic mirror descent and the other executes an online algorithm for non-oblivious multi-armed bandits.
In the second scenario, we propose to optimize the average top-$k$ risk instead of the maximum risk, thereby mitigating the impact of outlier distributions.
arXiv Detail & Related papers (2023-02-18T09:24:15Z) - Perfect Sampling from Pairwise Comparisons [26.396901523831534]
We study how to efficiently obtain perfect samples from a discrete distribution $\mathcal{D}$ given access only to pairwise comparisons of elements of its support.
We design a Markov chain whose stationary distribution coincides with $\mathcal{D}$ and give an algorithm to obtain exact samples using the technique of Coupling from the Past (a generic sketch of this technique appears after this list).
arXiv Detail & Related papers (2022-11-23T11:20:30Z) - An MCMC Method to Sample from Lattice Distributions [4.4044968357361745]
We introduce a Markov Chain Monte Carlo algorithm to generate samples from probability distributions supported on a $d$-dimensional lattice $\Lambda$.
We use $\pi$ as the proposal distribution and calculate the Metropolis-Hastings acceptance ratio for a well-chosen target distribution (see the generic Metropolis-Hastings sketch after this list).
arXiv Detail & Related papers (2021-01-16T15:01:53Z) - The Sample Complexity of Robust Covariance Testing [56.98280399449707]
We are given i.i.d. samples from a distribution of the form $Z = (1-\epsilon) X + \epsilon B$, where $X$ is a zero-mean Gaussian with unknown covariance, $X \sim \mathcal{N}(0, \Sigma)$.
In the absence of contamination, prior work gave a simple tester for this hypothesis testing task that uses $O(d)$ samples.
We prove a sample complexity lower bound of $\Omega(d^2)$ for $\epsilon$ an arbitrarily small constant and $\gamma$
arXiv Detail & Related papers (2020-12-31T18:24:41Z) - Sample Complexity of Asynchronous Q-Learning: Sharper Analysis and Variance Reduction [63.41789556777387]
Asynchronous Q-learning aims to learn the optimal action-value function (or Q-function) of a Markov decision process (MDP); a minimal sketch of the asynchronous update appears after this list.
We show that the number of samples needed to yield an entrywise $\varepsilon$-accurate estimate of the Q-function is at most on the order of $\frac{1}{\mu_{\min}(1-\gamma)^5\varepsilon^2} + \frac{t_{\mathrm{mix}}}{\mu_{\min}(1-\gamma)}$ up to some logarithmic factor.
arXiv Detail & Related papers (2020-06-04T17:51:00Z)
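For "Perfect Sampling from Pairwise Comparisons" above, the following is the generic Coupling-from-the-Past sketch mentioned in that entry. It runs a small Metropolis chain from every starting state with shared randomness and returns the common value at time 0 once all copies coalesce, which is then an exact draw from the stationary distribution. The four-state target and random-walk chain are illustrative assumptions, not the pairwise-comparison construction of that paper.

```python
import numpy as np

# "Grand coupling" Coupling from the Past (CFTP) on {0, ..., k-1}.
# The chain is random-walk Metropolis on the cycle Z_k with stationary
# distribution proportional to `weights`.

rng = np.random.default_rng(1)
weights = np.array([1.0, 2.0, 3.0, 4.0])   # unnormalized target (illustrative)
k = len(weights)

def update(state, u_prop, u_acc):
    # One Metropolis step driven by shared random numbers (u_prop, u_acc).
    prop = (state + (1 if u_prop < 0.5 else -1)) % k
    return prop if u_acc < min(1.0, weights[prop] / weights[state]) else state

def cftp():
    T = 1
    noise = []                          # noise[t] drives the step at time -(t+1)
    while True:
        while len(noise) < T:
            noise.append((rng.random(), rng.random()))
        states = list(range(k))         # every possible state at time -T
        for t in range(T - 1, -1, -1):  # run forward from time -T up to time 0
            u_prop, u_acc = noise[t]
            states = [update(s, u_prop, u_acc) for s in states]
        if len(set(states)) == 1:       # coalescence: the sample is exact
            return states[0]
        T *= 2                          # otherwise restart further in the past

draws = [cftp() for _ in range(2000)]
print(np.bincount(draws, minlength=k) / len(draws))  # ~ weights / weights.sum()
```

Note the key CFTP invariant: the random numbers for each time step are generated once and reused as the start time is pushed further into the past, which is what makes the output exact rather than approximate.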
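For "An MCMC Method to Sample from Lattice Distributions", here is the generic Metropolis-Hastings sketch referenced in that entry, on the integer lattice $\mathbb{Z}^2$. It illustrates only the acceptance-ratio step; the symmetric random-walk proposal and discrete-Gaussian-style target are illustrative stand-ins, not the paper's proposal construction.

```python
import numpy as np

# Metropolis-Hastings on Z^2 targeting an unnormalized discrete
# Gaussian exp(-||x||^2 / (2 s^2)). The proposal flips one coordinate
# by +/-1, which is symmetric, so the acceptance ratio reduces to a
# ratio of target densities.

rng = np.random.default_rng(2)
s = 3.0                                  # illustrative target scale

def log_target(x):
    return -(x @ x) / (2 * s**2)

x = np.zeros(2, dtype=int)
samples = []
for _ in range(20_000):
    prop = x.copy()
    prop[rng.integers(2)] += rng.choice([-1, 1])   # symmetric proposal
    # Metropolis-Hastings acceptance step (in log space for stability).
    if np.log(rng.random()) < log_target(prop) - log_target(x):
        x = prop
    samples.append(x.copy())

samples = np.array(samples)
print("sample mean:", samples.mean(axis=0))  # ~ (0, 0)
print("sample std :", samples.std(axis=0))   # roughly s per coordinate
```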
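For "Sample Complexity of Asynchronous Q-Learning", this is the minimal asynchronous Q-learning loop promised above, on a small random MDP. "Asynchronous" refers to updating only the single $(s, a)$ entry visited by one behavior trajectory at each step. The MDP, uniform behavior policy, and constant step size are illustrative assumptions, not the paper's setting or analysis.

```python
import numpy as np

# Asynchronous Q-learning on a random MDP with S states and A actions.
# One Markovian trajectory drives the updates; each step touches only
# the visited (state, action) entry of Q.

rng = np.random.default_rng(3)
S, A, gamma, eta = 5, 2, 0.9, 0.1            # sizes, discount, step size
P = rng.dirichlet(np.ones(S), size=(S, A))   # P[s, a] is a distribution over next states
R = rng.random((S, A))                       # deterministic rewards (illustrative)

Q = np.zeros((S, A))
s = 0
for _ in range(200_000):
    a = rng.integers(A)                      # uniform behavior policy
    s_next = rng.choice(S, p=P[s, a])
    # Standard Q-learning update of the single visited entry.
    Q[s, a] = (1 - eta) * Q[s, a] + eta * (R[s, a] + gamma * Q[s_next].max())
    s = s_next

print(Q)  # approximation of the optimal Q-function
```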