Improved Bound for Mixing Time of Parallel Tempering
- URL: http://arxiv.org/abs/2304.01303v1
- Date: Mon, 3 Apr 2023 19:03:22 GMT
- Title: Improved Bound for Mixing Time of Parallel Tempering
- Authors: Holden Lee, Zeyu Shen
- Abstract summary: We present a new lower bound for parallel tempering on the spectral gap that has a polynomial dependence on all parameters except $\log L$.
This improves the best existing bound which depends exponentially on the number of modes.
We complement our result with a hypothetical upper bound on the spectral gap that has an exponential dependence on $\log L$, which shows that, in some sense, our bound is tight.
- Score: 4.68299658663016
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: In the field of sampling algorithms, MCMC (Markov Chain Monte Carlo) methods
are widely used when direct sampling is not possible. However, multimodality of
target distributions often leads to slow convergence and mixing. One common
solution is parallel tempering. Though it is highly effective in practice,
theoretical guarantees on its performance are limited. In this paper, we
present a new lower bound for parallel tempering on the spectral gap that has a
polynomial dependence on all parameters except $\log L$, where $(L + 1)$ is the
number of levels. This improves the best existing bound which depends
exponentially on the number of modes. Moreover, we complement our result with a
hypothetical upper bound on the spectral gap that has an exponential dependence on
$\log L$, which shows that, in some sense, our bound is tight.
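For intuition, here is a minimal NumPy sketch of the parallel tempering scheme the paper analyzes: $(L+1)$ Metropolis chains run at inverse temperatures $\beta_0 < \dots < \beta_L = 1$, and adjacent levels attempt state swaps that preserve the joint stationary distribution. The bimodal potential, geometric temperature ladder, and step size below are illustrative assumptions, not the paper's construction.

```python
import numpy as np

rng = np.random.default_rng(0)

def U(x):
    # Potential of a bimodal target: pi(x) ∝ exp(-U(x)), modes near ±2.
    # This target is an illustrative assumption.
    return (x**2 - 4)**2 / 4

def parallel_tempering(n_iter=5000, L=4, step=0.5):
    # Inverse temperatures beta_0 < ... < beta_L = 1 on a geometric ladder;
    # level l targets pi_l(x) ∝ exp(-beta_l * U(x)).
    betas = np.geomspace(0.1, 1.0, L + 1)
    x = rng.normal(size=L + 1)          # one walker per level
    samples = []
    for _ in range(n_iter):
        # Local Metropolis move at every temperature level.
        for l in range(L + 1):
            prop = x[l] + step * rng.normal()
            if np.log(rng.random()) < betas[l] * (U(x[l]) - U(prop)):
                x[l] = prop
        # Swap move between a random adjacent pair of levels; the
        # acceptance ratio is exactly detailed balance for the exchange.
        l = rng.integers(L)
        log_acc = (betas[l] - betas[l + 1]) * (U(x[l]) - U(x[l + 1]))
        if np.log(rng.random()) < log_acc:
            x[l], x[l + 1] = x[l + 1], x[l]
        samples.append(x[L])             # track the beta = 1 (target) level
    return np.array(samples)

samples = parallel_tempering()
print("fraction of samples in right mode:", (samples > 0).mean())
```

The hot levels cross between modes easily, and the swap moves transport that mixing down to the $\beta_L = 1$ chain, which is the mechanism behind the spectral-gap bounds above.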
Related papers
- A Fast, Robust Elliptical Slice Sampling Implementation for Linearly Truncated Multivariate Normal Distributions [12.800664845601197]
Elliptical slice sampling is a rejection-free Markov chain Monte Carlo method.
The main novelty of this paper is an algorithm that computes an intersection in $\mathcal{O}(m \log m)$ time.
We show that an implementation based on this algorithm enhances numerical stability, speeds up running time, and is easy to parallelize for launching multiple Markov chains.
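For reference, a minimal sketch of generic elliptical slice sampling (Murray, Adams & MacKay, 2010) with the linear truncation encoded as a 0/$-\infty$ log-likelihood. This uses the standard bracket-shrinking loop, not the paper's $\mathcal{O}(m \log m)$ analytic intersection routine, and the constraint data are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(1)

def ess_step(f, chol_sigma, log_lik):
    # One elliptical slice sampling update for a N(0, Sigma) prior;
    # shrinkage version, not the analytic-arc routine of the paper above.
    nu = chol_sigma @ rng.normal(size=f.shape)   # ellipse auxiliary draw
    log_y = log_lik(f) + np.log(rng.random())    # slice threshold
    theta = rng.uniform(0.0, 2 * np.pi)
    lo, hi = theta - 2 * np.pi, theta
    while True:
        f_new = f * np.cos(theta) + nu * np.sin(theta)
        if log_lik(f_new) > log_y:
            return f_new
        # Shrink the bracket toward theta = 0 (the current state) and retry.
        if theta < 0:
            lo = theta
        else:
            hi = theta
        theta = rng.uniform(lo, hi)

# Truncation to {x : Ax <= b}, expressed as a 0 / -inf log-likelihood.
# A and b are hypothetical example data.
A = np.array([[1.0, 1.0]]); b = np.array([1.0])
log_lik = lambda x: 0.0 if np.all(A @ x <= b) else -np.inf
f = np.zeros(2)
chol = np.linalg.cholesky(np.eye(2))
for _ in range(1000):
    f = ess_step(f, chol, log_lik)
```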
arXiv Detail & Related papers (2024-07-15T05:40:11Z) - Faster Diffusion Sampling with Randomized Midpoints: Sequential and Parallel [10.840582511203024]
We show that our algorithm can be parallelized to run in only $\widetilde{O}(\log^2 d)$ parallel rounds.
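As an illustration of the randomized-midpoint idea, here is a hedged NumPy sketch in its sequential form, applied to overdamped Langevin dynamics rather than the diffusion-model samplers treated in the paper; the step size and target are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(2)

def randomized_midpoint_langevin(grad_U, x0, h=0.05, n_steps=2000):
    # Randomized-midpoint discretization of dX = -grad U(X) dt + sqrt(2) dB.
    # Each step queries the drift at a uniformly random time inside the
    # step, which averages out discretization bias.
    x = x0.copy()
    d = x.shape[0]
    for _ in range(n_steps):
        a = rng.random()                      # random midpoint in (0, 1)
        # Brownian increment up to time a*h, reused for the full step so
        # the midpoint and endpoint share the same underlying path.
        w_mid = np.sqrt(a * h) * rng.normal(size=d)
        x_mid = x - a * h * grad_U(x) + np.sqrt(2.0) * w_mid
        w_rest = np.sqrt((1 - a) * h) * rng.normal(size=d)
        x = x - h * grad_U(x_mid) + np.sqrt(2.0) * (w_mid + w_rest)
    return x

# Standard Gaussian target: U(x) = ||x||^2 / 2 (illustrative assumption).
x = randomized_midpoint_langevin(lambda x: x, np.zeros(2))
print(x)
```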
arXiv Detail & Related papers (2024-06-03T01:34:34Z) - A New Random Reshuffling Method for Nonsmooth Nonconvex Finite-sum Optimization [6.314057999212246]
Random reshuffling techniques are used in large-scale applications, such as neural networks.
In this paper, we show that the random reshuffling-type iterations generated by norm-PRR converge to stationary points.
Finally, we derive last-iterate convergence rates for the proposed approach.
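A minimal sketch of the underlying random reshuffling pattern, sampling component gradients without replacement each epoch; the proximal/normal-map machinery of norm-PRR itself is not reproduced, and the least-squares problem is an illustrative assumption.

```python
import numpy as np

rng = np.random.default_rng(3)

def random_reshuffling_sgd(grads, x0, lr=0.01, epochs=50):
    # grads[i](x) returns the gradient of the i-th component f_i at x.
    # Each epoch visits every component exactly once in a fresh random
    # order (sampling without replacement), unlike plain SGD.
    x = x0.copy()
    n = len(grads)
    for _ in range(epochs):
        for i in rng.permutation(n):
            x = x - lr * grads[i](x)
    return x

# Least-squares toy problem: f_i(x) = 0.5 * (a_i @ x - b_i)^2.
A = rng.normal(size=(100, 5)); b = rng.normal(size=100)
grads = [lambda x, a=A[i], y=b[i]: a * (a @ x - y) for i in range(100)]
print(random_reshuffling_sgd(grads, np.zeros(5)))
```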
arXiv Detail & Related papers (2023-12-02T07:12:00Z) - Distributed Extra-gradient with Optimal Complexity and Communication Guarantees [60.571030754252824]
We consider monotone variational inequality (VI) problems in multi-GPU settings where multiple processors/workers/clients have access to local dual vectors.
Extra-gradient, which is a de facto algorithm for monotone VI problems, has not been designed to be communication-efficient.
We propose a quantized generalized extra-gradient (Q-GenX), which is an unbiased and adaptive compression method tailored to solve VIs.
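For context, a minimal single-node, uncompressed extra-gradient sketch on a bilinear saddle problem; Q-GenX's quantization and adaptive step sizes are not reproduced, and the problem instance is an illustrative assumption.

```python
import numpy as np

def extragradient(F, z0, gamma=0.1, n_iter=500):
    # Extra-gradient for a monotone operator F: probe with a half step,
    # then update from the original point using the probed operator value.
    z = z0.copy()
    for _ in range(n_iter):
        z_half = z - gamma * F(z)        # extrapolation (lookahead) step
        z = z - gamma * F(z_half)        # update using lookahead operator
    return z

# Bilinear saddle point min_x max_y x*y, i.e. F(x, y) = (y, -x).
# Plain gradient descent-ascent spirals outward on this problem;
# the lookahead step is what makes extra-gradient converge.
F = lambda z: np.array([z[1], -z[0]])
print(extragradient(F, np.array([1.0, 1.0])))
```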
arXiv Detail & Related papers (2023-08-17T21:15:04Z) - Explicit Second-Order Min-Max Optimization Methods with Optimal Convergence Guarantee [86.05440220344755]
We propose and analyze inexact regularized Newton-type methods for finding a global saddle point of convex-concave unconstrained min-max optimization problems.
We show that the proposed methods generate iterates that remain within a bounded set and that the iterations converge to an $\epsilon$-saddle point within $O(\epsilon^{-2/3})$ iterations in terms of a restricted gap function.
arXiv Detail & Related papers (2022-10-23T21:24:37Z) - A Law of Iterated Logarithm for Multi-Agent Reinforcement Learning [3.655021726150368]
In Multi-Agent Reinforcement Learning (MARL), multiple agents interact with a common environment, as well as with each other, to solve a shared sequential decision-making problem.
We derive a novel law of the iterated logarithm for a family of distributed nonlinear stochastic approximation schemes that is useful in MARL.
arXiv Detail & Related papers (2021-10-27T08:01:17Z) - Minibatch vs Local SGD with Shuffling: Tight Convergence Bounds and Beyond [63.59034509960994]
We study shuffling-based variants: minibatch and local Random Reshuffling, which draw gradients without replacement.
For smooth functions satisfying the Polyak-Lojasiewicz condition, we obtain convergence bounds which show that these shuffling-based variants converge faster than their with-replacement counterparts.
We propose an algorithmic modification called synchronized shuffling that leads to convergence rates faster than our lower bounds in near-homogeneous settings.
arXiv Detail & Related papers (2021-10-20T02:25:25Z) - Lower Bounds on Metropolized Sampling Methods for Well-Conditioned
Distributions [29.314919044203926]
We give lower bounds on the performance of two of the most popular sampling methods in practice, the Metropolis-adjusted Langevin algorithm (MALA) and multi-step Hamiltonian Monte Carlo (HMC).
Our main result is a nearly-tight lower bound of $\widetilde{\Omega}(\kappa d)$ on the mixing time of MALA from an exponentially warm start.
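For reference, a minimal NumPy sketch of MALA on an ill-conditioned Gaussian; the condition number $\kappa$ and step size below are illustrative assumptions, not the paper's lower-bound construction.

```python
import numpy as np

rng = np.random.default_rng(4)

def mala(U, grad_U, x0, h=0.01, n_iter=5000):
    # Metropolis-adjusted Langevin: a Langevin proposal plus an
    # accept/reject step that makes pi(x) ∝ exp(-U(x)) exactly stationary.
    def log_q(y, x):
        # Log density (up to a constant) of proposing y from x.
        return -np.sum((y - x + h * grad_U(x))**2) / (4 * h)
    x = x0.copy()
    samples = []
    for _ in range(n_iter):
        y = x - h * grad_U(x) + np.sqrt(2 * h) * rng.normal(size=x.shape)
        log_acc = U(x) - U(y) + log_q(x, y) - log_q(y, x)
        if np.log(rng.random()) < log_acc:
            x = y
        samples.append(x)
    return np.array(samples)

# Ill-conditioned Gaussian target with condition number kappa = 100.
cov_inv = np.diag([1.0, 100.0])
U = lambda x: 0.5 * x @ cov_inv @ x
samples = mala(U, lambda x: cov_inv @ x, np.ones(2))
print(samples.mean(axis=0))
```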
arXiv Detail & Related papers (2021-06-10T03:47:39Z) - Linear-Sample Learning of Low-Rank Distributions [56.59844655107251]
We show that learning $k \times k$, rank-$r$ matrices to normalized $L_1$ distance requires $\Omega(\frac{kr}{\epsilon^2})$ samples.
We propose an algorithm that uses $\mathcal{O}(\frac{kr}{\epsilon^2}\log^2\frac{1}{\epsilon})$ samples, a number linear in the high dimension and nearly linear in the, typically low, rank.
arXiv Detail & Related papers (2020-09-30T19:10:32Z) - Computationally efficient sparse clustering [67.95910835079825]
We provide a finite sample analysis of a new clustering algorithm based on PCA.
We show that it achieves the minimax optimal misclustering rate in the regime $\|\theta\| \to \infty$.
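A minimal sketch of plain PCA-based two-cluster labeling (project onto the leading singular vector and split by sign); the paper's sparsity-aware algorithm and its analysis are not reproduced, and the mixture parameters are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(5)

def pca_cluster(X):
    # Two-cluster PCA clustering: center the data, take the leading right
    # singular vector, and label points by the sign of their projection.
    Xc = X - X.mean(axis=0)
    _, _, vt = np.linalg.svd(Xc, full_matrices=False)
    return (Xc @ vt[0] > 0).astype(int)

# Symmetric two-component Gaussian mixture with a sparse mean theta.
d, n = 50, 200
theta = np.zeros(d); theta[:3] = 3.0          # signal on 3 coordinates
labels = rng.integers(2, size=n)
X = (2 * labels[:, None] - 1) * theta + rng.normal(size=(n, d))
pred = pca_cluster(X)
err = min(np.mean(pred != labels), np.mean(pred == labels))  # up to label flip
print("misclustering rate:", err)
```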
arXiv Detail & Related papers (2020-05-21T17:51:30Z) - SURF: A Simple, Universal, Robust, Fast Distribution Learning Algorithm [64.13217062232874]
SURF is an algorithm for approximating distributions by piecewise polynomials.
It outperforms state-of-the-art algorithms in experiments.
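As a much-simplified illustration of the piecewise idea (a piecewise-constant fit on an equal-mass partition, not SURF's piecewise-polynomial fitting; the target and piece count are illustrative assumptions):

```python
import numpy as np

rng = np.random.default_rng(6)

def piecewise_constant_fit(samples, n_pieces=8):
    # Fit a piecewise-constant density on an equal-mass partition:
    # breakpoints at empirical quantiles, one constant height per piece.
    qs = np.quantile(samples, np.linspace(0, 1, n_pieces + 1))
    heights = (1.0 / n_pieces) / np.diff(qs)   # mass / width per piece
    return qs, heights

samples = rng.normal(size=10_000)
breaks, heights = piecewise_constant_fit(samples)
for left, right, hgt in zip(breaks[:-1], breaks[1:], heights):
    print(f"[{left:+.2f}, {right:+.2f}): density ~ {hgt:.3f}")
```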
arXiv Detail & Related papers (2020-02-22T01:03:33Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the content (including all information) and is not responsible for any consequences of its use.