Optimal design of the Barker proposal and other locally-balanced
Metropolis-Hastings algorithms
- URL: http://arxiv.org/abs/2201.01123v1
- Date: Tue, 4 Jan 2022 13:06:50 GMT
- Title: Optimal design of the Barker proposal and other locally-balanced
Metropolis-Hastings algorithms
- Authors: Jure Vogrinc, Samuel Livingstone and Giacomo Zanella
- Abstract summary: We study the class of first-order locally-balanced Metropolis--Hastings algorithms introduced in Livingstone & Zanella (2021).
To choose a specific algorithm within the class, the user must select a balancing function $g:\mathbb{R} \to \mathbb{R}$ satisfying $g(t) = tg(1/t)$, and a noise distribution for the proposal increment.
We derive an optimal choice of noise distribution for the Barker proposal, an optimal choice of balancing function under a Gaussian noise distribution, and an optimal choice of first-order locally-balanced algorithm among the entire class.
- Score: 0.2148535041822524
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We study the class of first-order locally-balanced Metropolis--Hastings
algorithms introduced in Livingstone & Zanella (2021). To choose a specific
algorithm within the class, the user must select a balancing function
$g:\mathbb{R} \to \mathbb{R}$ satisfying $g(t) = tg(1/t)$, and a noise
distribution for the proposal increment. Popular choices within the class are
the Metropolis-adjusted Langevin algorithm and the recently introduced Barker
proposal. We first establish a universal limiting optimal acceptance rate of
57% and scaling of $n^{-1/3}$ as the dimension $n$ tends to infinity among all
members of the class under mild smoothness assumptions on $g$ and when the
target distribution for the algorithm is of the product form. In particular we
obtain an explicit expression for the asymptotic efficiency of an arbitrary
algorithm in the class, as measured by expected squared jumping distance. We
then consider how to optimise this expression under various constraints. We
derive an optimal choice of noise distribution for the Barker proposal, optimal
choice of balancing function under a Gaussian noise distribution, and optimal
choice of first-order locally-balanced algorithm among the entire class, which
turns out to depend on the specific target distribution. Numerical simulations
confirm our theoretical findings and in particular show that a bi-modal choice
of noise distribution in the Barker proposal gives rise to a practical
algorithm that is consistently more efficient than the original Gaussian
version.
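To make the class concrete, the following is a minimal sketch of one Metropolis-Hastings step with the Barker proposal, i.e. the balancing function $g(t) = t/(1+t)$ with Gaussian noise (the choice $g(t) = \sqrt{t}$ instead gives MALA-type behaviour). This is an illustration written for this page, not the authors' code: the function names, the toy Gaussian target, and the step size are assumptions, and the bimodal noise distribution the abstract recommends would replace the Gaussian draw marked below.

```python
import numpy as np

def barker_step(x, log_pi, grad_log_pi, sigma, rng):
    """One Metropolis-Hastings step with the Barker proposal (a sketch).

    Barker corresponds to the balancing function g(t) = t / (1 + t).
    sigma is the noise scale; the paper's n^(-1/3) result suggests
    shrinking it as the dimension n grows.
    """
    gx = grad_log_pi(x)
    # Symmetric noise draw; the paper finds a bimodal noise distribution
    # more efficient than this Gaussian choice.
    z = sigma * rng.standard_normal(x.shape)
    # Keep the sign of z_i with probability 1 / (1 + exp(-z_i * gx_i)),
    # biasing each coordinate towards higher target density.
    keep = rng.random(x.shape) < 1.0 / (1.0 + np.exp(-z * gx))
    c = np.where(keep, z, -z)                      # signed increment
    y = x + c
    gy = grad_log_pi(y)
    # log MH ratio: log pi(y) - log pi(x)
    #   + sum_i [log(1 + e^{-c_i gx_i}) - log(1 + e^{c_i gy_i})]
    log_alpha = (log_pi(y) - log_pi(x)
                 + np.sum(np.logaddexp(0.0, -c * gx)
                          - np.logaddexp(0.0, c * gy)))
    return y if np.log(rng.random()) < log_alpha else x

if __name__ == "__main__":
    # Toy usage on a 10-dimensional standard Gaussian target.
    rng = np.random.default_rng(0)
    log_pi = lambda x: -0.5 * float(np.sum(x ** 2))
    grad_log_pi = lambda x: -x
    x = np.zeros(10)
    for _ in range(1000):
        x = barker_step(x, log_pi, grad_log_pi, sigma=0.5, rng=rng)
    print(x)
```

Per the result above, for product-form targets the scale sigma should shrink like $n^{-1/3}$ and be tuned so that roughly 57% of proposals are accepted.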
Related papers
- Semi-Bandit Learning for Monotone Stochastic Optimization [20.776114616154242]
We provide a generic online learning algorithm for a class of "monotone" problems.
Our framework applies to several fundamental problems in optimization, such as prophet inequality, Pandora's box, knapsack, matchings, and submodular optimization.
arXiv Detail & Related papers (2023-12-24T07:46:37Z)
- Optimality in Mean Estimation: Beyond Worst-Case, Beyond Sub-Gaussian, and Beyond $1+\alpha$ Moments [10.889739958035536]
We introduce a new definitional framework to analyze the fine-grained optimality of algorithms.
We show that median-of-means is neighborhood optimal, up to constant factors (a short sketch of the estimator appears after this list).
It remains open to find a neighborhood-separated estimator without constant-factor slackness.
arXiv Detail & Related papers (2023-11-21T18:50:38Z)
- Gradient-free optimization of highly smooth functions: improved analysis and a new algorithm [87.22224691317766]
This work studies problems with zero-order noisy oracle information under the assumption that the objective function is highly smooth.
We consider two kinds of zero-order projected gradient descent algorithms.
arXiv Detail & Related papers (2023-06-03T17:05:13Z)
- Fast Computation of Optimal Transport via Entropy-Regularized Extragradient Methods [75.34939761152587]
Efficient computation of the optimal transport distance between two distributions serves as a subroutine that empowers various applications.
This paper develops a scalable first-order optimization-based method that computes optimal transport to within $\varepsilon$ additive accuracy.
arXiv Detail & Related papers (2023-01-30T15:46:39Z)
- Optimal Scaling for Locally Balanced Proposals in Discrete Spaces [65.14092237705476]
We show that efficiency of Metropolis-Hastings (M-H) algorithms in discrete spaces can be characterized by an acceptance rate that is independent of the target distribution.
Knowledge of the optimal acceptance rate allows one to automatically tune the neighborhood size of a proposal distribution in a discrete space, directly analogous to step-size control in continuous spaces (a sketch of such acceptance-rate-driven tuning appears after this list).
arXiv Detail & Related papers (2022-09-16T22:09:53Z)
- A gradient estimator via L1-randomization for online zero-order optimization with two point feedback [93.57603470949266]
We present a novel gradient estimator based on two function evaluations and randomization.
We consider two types of assumptions on the noise of the zero-order oracle: canceling noise and adversarial noise.
We provide an anytime and completely data-driven algorithm, which is adaptive to all parameters of the problem.
arXiv Detail & Related papers (2022-05-27T11:23:57Z)
- A Convergent and Dimension-Independent Min-Max Optimization Algorithm [32.492526162436405]
We show that the distribution which the min-player uses to update its parameters depends on a smooth and bounded nonconvex-nonconcave function.
Our algorithm converges to an approximate local equilibrium in a number of iterations that does not depend on the dimension.
arXiv Detail & Related papers (2020-06-22T16:11:30Z)
- Stochastic Saddle-Point Optimization for Wasserstein Barycenters [69.68068088508505]
We consider the population Wasserstein barycenter problem for random probability measures supported on a finite set of points and generated by an online stream of data.
We exploit the structure of the problem to obtain a convex-concave saddle-point reformulation.
In the setting when the distribution of random probability measures is discrete, we propose an optimization algorithm and estimate its complexity.
arXiv Detail & Related papers (2020-06-11T19:40:38Z)
- Learning Halfspaces with Tsybakov Noise [50.659479930171585]
We study the learnability of halfspaces in the presence of Tsybakov noise.
We give an algorithm that achieves misclassification error $\epsilon$ with respect to the true halfspace.
arXiv Detail & Related papers (2020-06-11T14:25:02Z)
- A New Randomized Primal-Dual Algorithm for Convex Optimization with Optimal Last Iterate Rates [16.54912614895861]
We develop a novel unified randomized block-coordinate primal-dual algorithm to solve a class of nonsmooth constrained convex optimization problems.
We prove that our algorithm achieves optimal convergence rates in two cases: general convexity and strong convexity.
Our experiments show encouraging performance for the proposed method across different problem instances.
arXiv Detail & Related papers (2020-03-03T03:59:26Z)
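As referenced from the discrete-spaces entry above, here is a minimal sketch of acceptance-rate-driven step-size control. It is an illustration written for this page, not code from any of the listed papers: the Robbins-Monro update, the toy random-walk chain, and all names are assumptions. The appropriate target rate depends on the algorithm family, e.g. 0.234 for plain random-walk Metropolis, versus the 57% limit derived in the main paper for first-order locally-balanced proposals.

```python
import numpy as np

def acceptance_rate(sigma, n_steps, rng, dim=10):
    """Empirical acceptance rate of a toy random-walk Metropolis chain
    on a standard Gaussian target (a stand-in for the chain being tuned)."""
    log_pi = lambda x: -0.5 * float(np.sum(x ** 2))
    x, accepted = np.zeros(dim), 0
    for _ in range(n_steps):
        y = x + sigma * rng.standard_normal(dim)
        if np.log(rng.random()) < log_pi(y) - log_pi(x):
            x, accepted = y, accepted + 1
    return accepted / n_steps

def tune_scale(sigma0, target, rounds=25, batch=500, seed=0):
    """Robbins-Monro update of log(sigma) towards a target acceptance rate."""
    rng = np.random.default_rng(seed)
    log_sigma = np.log(sigma0)
    for t in range(1, rounds + 1):
        rate = acceptance_rate(np.exp(log_sigma), batch, rng)
        # Accepting too often means the proposals are too timid: grow sigma.
        log_sigma += (rate - target) / t
    return np.exp(log_sigma)

if __name__ == "__main__":
    # 0.234 suits the toy random-walk chain; a first-order locally-balanced
    # chain would instead target the 57% rate from the main paper.
    print(tune_scale(sigma0=1.0, target=0.234))
```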
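And, as referenced from the mean-estimation entry above, a few lines showing what the median-of-means estimator computes. The block count k is a hypothetical tuning choice for this sketch, not a value prescribed by that paper.

```python
import numpy as np

def median_of_means(samples, k):
    """Median-of-means: split the data into k blocks, average each block,
    and return the median of the block means (robust to heavy tails)."""
    blocks = np.array_split(np.asarray(samples, dtype=float), k)
    return float(np.median([block.mean() for block in blocks]))

# Toy usage: heavy-tailed data where the plain sample mean is fragile.
rng = np.random.default_rng(0)
data = rng.standard_t(df=2, size=10_000)
print(median_of_means(data, k=20))
```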
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed content (including all information) and is not responsible for any consequences of its use.