Improved dimension dependence of a proximal algorithm for sampling
- URL: http://arxiv.org/abs/2302.10081v2
- Date: Wed, 28 Jun 2023 16:09:48 GMT
- Title: Improved dimension dependence of a proximal algorithm for sampling
- Authors: Jiaojiao Fan, Bo Yuan and Yongxin Chen
- Abstract summary: We propose a sampling algorithm that achieves superior complexity bounds in all the classical settings.
Our algorithm is based on the proximal sampler introduced in \citet{lee2021structured}.
For strongly log-concave distributions, our method has complexity bound $\tilde{\mathcal{O}}(\kappa d^{1/2})$ without warm start.
For distributions satisfying the LSI, our bound is $\tilde{\mathcal{O}}(\hat\kappa d^{1/2})$, where $\hat\kappa$ is the ratio between smoothness and the LSI constant.
- Score: 16.147290924171692
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We propose a sampling algorithm that achieves superior complexity bounds in
all the classical settings (strongly log-concave, log-concave,
Logarithmic-Sobolev inequality (LSI), Poincar\'e inequality) as well as more
general settings with semi-smooth or composite potentials. Our algorithm is
based on the proximal sampler introduced in~\citet{lee2021structured}. The
performance of this proximal sampler is determined by that of the restricted
Gaussian oracle (RGO), a key step in the proximal sampler. The main
contribution of this work is an inexact realization of RGO based on approximate
rejection sampling. To bound the inexactness of RGO, we establish a new
concentration inequality for semi-smooth functions over Gaussian distributions,
extending the well-known concentration inequality for Lipschitz functions.
Applying our RGO implementation to the proximal sampler, we achieve
state-of-the-art complexity bounds in almost all settings. For instance, for
strongly log-concave distributions, our method has complexity bound
$\tilde\mathcal{O}(\kappa d^{1/2})$ without warm start, better than the minimax
bound for MALA. For distributions satisfying the LSI, our bound is $\tilde
\mathcal{O}(\hat \kappa d^{1/2})$ where $\hat \kappa$ is the ratio between
smoothness and the LSI constant, better than all existing bounds.
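As a rough illustration of the recursion the abstract describes, the sketch below alternates the Gaussian forward step with an RGO step implemented by rejection sampling around the proximal point. This is a minimal sketch assuming a smooth, convex potential $f$ with an exact rejection-sampling RGO; the step size `eta`, the call to `scipy.optimize.minimize`, and the function names are illustrative, and the paper's inexact RGO for semi-smooth or composite potentials is not reproduced.

```python
import numpy as np
from scipy.optimize import minimize

def proximal_sampler_step(x, f, grad_f, eta, rng):
    """One iteration of the alternating (proximal) sampler for pi(x) ~ exp(-f(x)).

    Toy version for a smooth convex f; the paper's approximate rejection
    sampling for semi-smooth / composite potentials is not reproduced.
    """
    d = x.shape[0]
    # Forward step: y | x ~ N(x, eta * I).
    y = x + np.sqrt(eta) * rng.standard_normal(d)

    # RGO step: sample x | y ~ exp(-f(x) - ||x - y||^2 / (2 * eta))
    # by rejection sampling around the proximal point of f at y.
    f_y = lambda z: f(z) + np.sum((z - y) ** 2) / (2 * eta)
    x_star = minimize(f_y, y).x          # prox_{eta f}(y), computed numerically
    g_star = grad_f(x_star)

    while True:
        prop = x_star + np.sqrt(eta) * rng.standard_normal(d)
        # For convex f, f(prop) >= f(x*) + <grad f(x*), prop - x*>,
        # so this log-acceptance probability is <= 0.
        log_accept = -(f(prop) - f(x_star) - g_star @ (prop - x_star))
        if np.log(rng.uniform()) < log_accept:
            return prop
```

For convex $f$ the acceptance probability is at most one, so the inner loop is a valid rejection sampler for the RGO distribution, and iterating `proximal_sampler_step` targets $\pi \propto \exp(-f)$; the paper's analysis instead controls the expected number of rejections and the error incurred when the proximal point and the acceptance test are only computed approximately.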
Related papers
- Faster Sampling via Stochastic Gradient Proximal Sampler [28.422547264326468]
Stochastic Proximal Samplers (SPS) for sampling from non-log-concave distributions are studied.
We show that the convergence to the target distribution can be guaranteed as long as the algorithm trajectory is bounded.
We provide two implementable variants based on stochastic gradient Langevin dynamics (SGLD) and the Metropolis-adjusted Langevin algorithm (MALA), giving rise to SPS-SGLD and SPS-MALA.
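The SGLD inner solver mentioned above amounts to a noisy gradient step; a generic stand-alone sketch of one such update is given below. This is the textbook SGLD recursion, not the SPS-SGLD scheme of the paper (which runs updates of this form inside each proximal step), and the function and parameter names are illustrative.

```python
import numpy as np

def sgld_step(theta, stoch_grad, step_size, rng):
    """One stochastic gradient Langevin dynamics (SGLD) update for a target
    pi(theta) ~ exp(-U(theta)):
        theta <- theta - step_size * g_hat + sqrt(2 * step_size) * xi,
    where g_hat = stoch_grad(theta) is an unbiased (e.g. minibatch) estimate
    of grad U(theta) and xi ~ N(0, I)."""
    noise = np.sqrt(2.0 * step_size) * rng.standard_normal(theta.shape)
    return theta - step_size * stoch_grad(theta) + noise
```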
arXiv Detail & Related papers (2024-05-27T00:53:18Z) - Sample Complexity for Quadratic Bandits: Hessian Dependent Bounds and Optimal Algorithms [64.10576998630981]
We show the first tight characterization of the optimal Hessian-dependent sample complexity.
A Hessian-independent algorithm universally achieves the optimal sample complexities for all Hessian instances.
The optimal sample complexities achieved by our algorithm remain valid for heavy-tailed noise distributions.
arXiv Detail & Related papers (2023-06-21T17:03:22Z) - Faster high-accuracy log-concave sampling via algorithmic warm starts [6.117084972237769]
In practice, high-accuracy samplers such as the classical Metropolis-adjusted Langevin algorithm (MALA) remain the de facto gold standard.
We improve the dimension dependence of this sampling problem to $\tilde{O}(d^{1/2})$, whereas the previous best result for MALA was $\tilde{O}(d)$.
Our main technical contribution settles this problem by establishing the first $\tilde{O}(d^{1/2})$ Rényi mixing rates for the discretized underdamped Langevin diffusion.
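For reference, one step of the MALA baseline discussed here is sketched below: a Langevin proposal followed by a Metropolis correction. This is a minimal textbook sketch with illustrative names and step size `h`; it does not include the algorithmic warm start or the underdamped analysis that the paper contributes.

```python
import numpy as np

def mala_step(x, f, grad_f, h, rng):
    """One Metropolis-adjusted Langevin (MALA) step targeting pi ~ exp(-f)."""
    d = x.shape[0]
    # Langevin proposal: y = x - h * grad f(x) + sqrt(2h) * N(0, I).
    prop = x - h * grad_f(x) + np.sqrt(2 * h) * rng.standard_normal(d)

    def log_q(a, b):
        # Log density (up to a constant) of proposing a when starting from b.
        return -np.sum((a - b + h * grad_f(b)) ** 2) / (4 * h)

    # Metropolis-Hastings acceptance ratio.
    log_alpha = (-f(prop) + log_q(x, prop)) - (-f(x) + log_q(prop, x))
    if np.log(rng.uniform()) < log_alpha:
        return prop
    return x
```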
arXiv Detail & Related papers (2023-02-20T19:27:21Z) - Optimal Algorithms for Stochastic Complementary Composite Minimization [55.26935605535377]
Inspired by regularization techniques in statistics and machine learning, we study complementary composite minimization.
We provide novel excess risk bounds, both in expectation and with high probability.
Our algorithms are nearly optimal, which we prove via novel lower complexity bounds for this class of problems.
arXiv Detail & Related papers (2022-11-03T12:40:24Z) - A Proximal Algorithm for Sampling from Non-convex Potentials [14.909442791255042]
We consider sampling problems with potentials that lack smoothness.
The potentials are assumed to be semi-smooth or the composition of multiple semi-smooth functions.
Our algorithm is based on a special case of Gibbs sampling known as the alternating sampling framework.
arXiv Detail & Related papers (2022-05-20T13:58:46Z) - Minibatch vs Local SGD with Shuffling: Tight Convergence Bounds and Beyond [63.59034509960994]
We study shuffling-based variants: minibatch and local Random Reshuffling, which draw gradients without replacement.
For smooth functions satisfying the Polyak-Lojasiewicz condition, we obtain convergence bounds which show that these shuffling-based variants converge faster than their with-replacement counterparts.
We propose an algorithmic modification called synchronized shuffling that leads to convergence rates faster than our lower bounds in near-homogeneous settings.
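The distinction between with-replacement SGD and the shuffling-based variants studied here is just the index sampling scheme. A minimal sketch of single-machine Random Reshuffling is shown below with illustrative names; the minibatch/local variants and the synchronized-shuffling modification analysed in the paper are not reproduced.

```python
import numpy as np

def random_reshuffling_sgd(w, grad_i, n, step_size, epochs, rng):
    """SGD with Random Reshuffling: each epoch visits all n component
    gradients exactly once in a freshly shuffled order (sampling without
    replacement) instead of drawing indices i.i.d. with replacement.
    grad_i(i, w) returns the gradient of the i-th component at w."""
    for _ in range(epochs):
        for i in rng.permutation(n):          # one without-replacement pass
            w = w - step_size * grad_i(i, w)
    return w
```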
arXiv Detail & Related papers (2021-10-20T02:25:25Z) - Sample-Optimal PAC Learning of Halfspaces with Malicious Noise [4.8728183994912415]
We study efficient PAC learning of halfspaces in $mathRd$ in the presence of malicious noise of Valiant(1985)
We present a new analysis for the algorithm of Awasthi et al.( 2017) and show that it essentially achieves the near-optimal sample complexity bound of $tildeO(d)$.
We extend the algorithm and analysis to the more general and stronger nasty noise model of Bbbshoutyetal (2002), showing that it is still possible to achieve near-optimal noise tolerance and sample complexity in time.
arXiv Detail & Related papers (2021-02-11T20:18:20Z) - Structured Logconcave Sampling with a Restricted Gaussian Oracle [23.781520510778716]
We give algorithms for sampling several structured logconcave families to high accuracy.
We further develop a reduction framework, inspired by proximal point methods in convex optimization.
arXiv Detail & Related papers (2020-10-07T01:43:07Z) - Sample Complexity of Asynchronous Q-Learning: Sharper Analysis and Variance Reduction [63.41789556777387]
Asynchronous Q-learning aims to learn the optimal action-value function (or Q-function) of a Markov decision process (MDP)
We show that the number of samples needed to yield an entrywise $\varepsilon$-accurate estimate of the Q-function is at most on the order of $\frac{1}{\mu_{\min}(1-\gamma)^5\varepsilon^2} + \frac{t_{\mathrm{mix}}}{\mu_{\min}(1-\gamma)}$ up to some logarithmic factor.
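The bound above concerns the plain asynchronous Q-learning recursion, which updates only the (state, action) entry visited by a single Markovian trajectory; a minimal sketch follows. The environment interface `env_step`, the uniform behavior policy, and the constant learning rate are illustrative assumptions, and the variance-reduced variant from the paper is not shown.

```python
import numpy as np

def async_q_learning(env_step, n_states, n_actions, gamma, lr, n_iters, rng):
    """Asynchronous Q-learning along one Markovian trajectory.
    env_step(s, a) -> (reward, next_state) is assumed to simulate the MDP."""
    Q = np.zeros((n_states, n_actions))
    s = 0
    for _ in range(n_iters):
        a = rng.integers(n_actions)            # uniform behavior policy (toy choice)
        r, s_next = env_step(s, a)
        target = r + gamma * Q[s_next].max()   # bootstrapped target
        Q[s, a] += lr * (target - Q[s, a])     # update only the visited entry
        s = s_next
    return Q
```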
arXiv Detail & Related papers (2020-06-04T17:51:00Z) - Breaking the Sample Size Barrier in Model-Based Reinforcement Learning with a Generative Model [50.38446482252857]
This paper is concerned with the sample efficiency of reinforcement learning, assuming access to a generative model (or simulator).
We first consider $\gamma$-discounted infinite-horizon Markov decision processes (MDPs) with state space $\mathcal{S}$ and action space $\mathcal{A}$.
We prove that a plain model-based planning algorithm suffices to achieve minimax-optimal sample complexity given any target accuracy level.
arXiv Detail & Related papers (2020-05-26T17:53:18Z) - Computationally efficient sparse clustering [67.95910835079825]
We provide a finite sample analysis of a new clustering algorithm based on PCA.
We show that it achieves the minimax optimal misclustering rate in the regime $\|\theta\| \rightarrow \infty$.
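As a point of reference for the PCA-based method summarized above, a generic two-cluster PCA sketch is shown below: project onto the leading principal direction and threshold the scores. This is only an illustration; the paper's actual algorithm, its treatment of sparsity, and the misclustering-rate analysis are not reproduced.

```python
import numpy as np

def pca_then_cluster(X):
    """Two-cluster assignment by projecting onto the top principal component
    and thresholding at zero. X is an (n, d) array of observations."""
    Xc = X - X.mean(axis=0)
    # Leading right singular vector of the centered data = top principal direction.
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    scores = Xc @ Vt[0]
    return (scores > 0).astype(int)   # cluster labels in {0, 1}
```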
arXiv Detail & Related papers (2020-05-21T17:51:30Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of the listed content (including all information) and is not responsible for any consequences of its use.