Oracle Lower Bounds for Stochastic Gradient Sampling Algorithms
- URL: http://arxiv.org/abs/2002.00291v3
- Date: Sat, 3 Jul 2021 04:12:14 GMT
- Title: Oracle Lower Bounds for Stochastic Gradient Sampling Algorithms
- Authors: Niladri S. Chatterji, Peter L. Bartlett, Philip M. Long
- Abstract summary: We consider the problem of sampling from a strongly log-concave density in $\mathbb{R}^d$.
We prove an information theoretic lower bound on the number of stochastic gradient queries of the log density needed.
- Score: 39.746670539407084
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We consider the problem of sampling from a strongly log-concave density in
$\mathbb{R}^d$, and prove an information theoretic lower bound on the number of
stochastic gradient queries of the log density needed. Several popular sampling
algorithms (including many Markov chain Monte Carlo methods) operate by using
stochastic gradients of the log density to generate a sample; our results
establish an information theoretic limit for all these algorithms.
We show that for every algorithm, there exists a well-conditioned strongly
log-concave target density for which the distribution of points generated by
the algorithm would be at least $\varepsilon$ away from the target in total
variation distance if the number of gradient queries is less than
$\Omega(\sigma^2 d/\varepsilon^2)$, where $\sigma^2 d$ is the variance of the
stochastic gradient. Our lower bound follows by combining ideas of Le Cam
deficiency, routinely used in the comparison of statistical experiments, with
standard information theoretic tools for lower bounding Bayes risk
functions. To the best of our knowledge, our results provide the first
nontrivial dimension-dependent lower bound for this problem.
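To make the oracle model concrete, here is a minimal numpy sketch (all names hypothetical) of the interaction the lower bound constrains: a sampler such as stochastic gradient Langevin dynamics that touches the target only through noisy gradient queries, each carrying per-coordinate noise variance $\sigma^2$ so that the total variance of a query scales as $\sigma^2 d$.

```python
import numpy as np

def noisy_grad_oracle(grad_log_density, x, sigma, rng):
    """Stochastic gradient oracle: the true gradient of the log density
    plus zero-mean Gaussian noise with per-coordinate variance sigma^2,
    so the total variance of a query scales as sigma^2 * d."""
    return grad_log_density(x) + sigma * rng.standard_normal(x.shape[0])

def sgld(grad_log_density, x0, n_queries, step, sigma, rng):
    """Stochastic Gradient Langevin Dynamics: one oracle query per
    iteration. The paper's bound says n_queries must be at least
    Omega(sigma^2 * d / eps^2) for eps-accuracy in total variation."""
    x = x0.copy()
    for _ in range(n_queries):
        g = noisy_grad_oracle(grad_log_density, x, sigma, rng)
        x = x + step * g + np.sqrt(2.0 * step) * rng.standard_normal(x.shape[0])
    return x

# Example: standard Gaussian target, where grad log p(x) = -x.
rng = np.random.default_rng(0)
sample = sgld(lambda x: -x, np.zeros(10), n_queries=5000,
              step=1e-2, sigma=1.0, rng=rng)
```

Under this model, the theorem says that no scheme of this form, whatever its update rule, can be $\varepsilon$-accurate in total variation with fewer than $\Omega(\sigma^2 d/\varepsilon^2)$ queries.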
Related papers
- Generalization Bounds for Label Noise Stochastic Gradient Descent [0.0]
We derive generalization error bounds for stochastic gradient descent (SGD) with label noise in non-convex settings.
Our analysis offers insights into the effect of label noise.
arXiv Detail & Related papers (2023-11-01T03:51:46Z)
- Robust Stochastic Optimization via Gradient Quantile Clipping [6.2844649973308835]
We introduce a gradient quantile clipping strategy for Stochastic Gradient Descent (SGD).
The clipping threshold is set to a quantile of recent gradient norms, which screens out heavy-tailed outliers.
We propose an efficient implementation of the algorithm based on rolling quantile estimates.
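A hedged sketch of the idea, with illustrative names and a simple rolling-window quantile estimator standing in for whatever estimator the paper actually uses:

```python
import numpy as np
from collections import deque

def quantile_clipped_sgd_step(x, grad, norm_history, q=0.9, lr=0.1):
    """One SGD step with the clipping threshold set to an empirical
    quantile of recent gradient norms (illustrative estimator only)."""
    norm = np.linalg.norm(grad)
    norm_history.append(norm)
    tau = np.quantile(np.asarray(norm_history), q)  # clipping threshold
    if norm > tau:
        grad = grad * (tau / norm)  # shrink heavy-tailed outliers
    return x - lr * grad

rng = np.random.default_rng(1)
norm_history = deque(maxlen=100)  # rolling window of gradient norms
x = np.ones(5)
for _ in range(200):
    g = 2.0 * x + rng.standard_t(df=2, size=5)  # heavy-tailed noise
    x = quantile_clipped_sgd_step(x, g, norm_history)
```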
arXiv Detail & Related papers (2023-09-29T15:24:48Z)
- Sobolev Space Regularised Pre Density Models [51.558848491038916]
We propose a new approach to non-parametric density estimation that is based on regularizing a Sobolev norm of the density.
This method is statistically consistent, and makes the inductive bias of the model clear and interpretable.
arXiv Detail & Related papers (2023-07-25T18:47:53Z)
- Kinetic Langevin MCMC Sampling Without Gradient Lipschitz Continuity --
the Strongly Convex Case [0.0]
We consider sampling from log-concave distributions in the Hamiltonian setting, without assuming that the potential has a globally Lipschitz gradient.
We propose two algorithms based on polygonal (tamed) Euler schemes to sample from the target measure, and provide non-asymptotic 2-Wasserstein distance bounds between the law of each algorithm's iterates and the target measure.
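As a rough illustration of taming, the following sketch (assumed details, not the paper's exact polygonal scheme) discretizes kinetic Langevin dynamics while rescaling the drift so a superlinearly growing gradient cannot destabilize the iterates:

```python
import numpy as np

def tamed_kinetic_langevin(grad_log_pi, x0, v0, step, n_steps, gamma, rng):
    """Euler discretization of kinetic (underdamped) Langevin dynamics
    with a tamed drift: the gradient is rescaled by 1/(1 + step*||g||),
    so a superlinearly growing gradient (no global Lipschitz constant)
    cannot blow up the iterates. Assumed details, not the exact scheme."""
    x, v = x0.copy(), v0.copy()
    for _ in range(n_steps):
        g = grad_log_pi(x)
        g_tamed = g / (1.0 + step * np.linalg.norm(g))
        v = (v - step * gamma * v + step * g_tamed
             + np.sqrt(2.0 * gamma * step) * rng.standard_normal(v.shape))
        x = x + step * v
    return x

# Target with a non-Lipschitz gradient: log pi(x) = -||x||^4 / 4.
rng = np.random.default_rng(2)
sample = tamed_kinetic_langevin(lambda x: -x * np.dot(x, x),
                                np.ones(3), np.zeros(3),
                                step=1e-3, n_steps=10000, gamma=1.0, rng=rng)
```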
arXiv Detail & Related papers (2023-01-19T12:32:41Z)
- Robust Sparse Mean Estimation via Sum of Squares [42.526664955704746]
We study the problem of high-dimensional sparse mean estimation in the presence of an $\varepsilon$-fraction of adversarial outliers.
Our algorithms follow the Sum-of-Squares based proofs-to-algorithms approach.
arXiv Detail & Related papers (2022-06-07T16:49:54Z)
- Differentiable Annealed Importance Sampling and the Perils of Gradient
Noise [68.44523807580438]
Annealed importance sampling (AIS) and related algorithms are highly effective tools for marginal likelihood estimation.
Differentiability is a desirable property as it would admit the possibility of optimizing marginal likelihood as an objective.
We propose a differentiable algorithm by abandoning Metropolis-Hastings steps, which further unlocks mini-batch computation.
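A minimal numpy sketch of the resulting estimator, assuming a geometric annealing path and unadjusted Langevin (ULA) transitions in place of Metropolis-Hastings; in practice the same computation would be written in an autodiff framework to exploit the differentiability:

```python
import numpy as np

def ais_without_mh(log_p0, grad_log_p0, log_p1, grad_log_p1,
                   x, n_anneals, step, rng):
    """Annealed importance sampling along a geometric path between p0
    and p1, with each transition an unadjusted Langevin (ULA) step
    instead of a Metropolis-Hastings kernel. With no accept/reject,
    every operation is smooth, which is what makes the estimator
    differentiable in an autodiff framework. Single-particle sketch."""
    betas = np.linspace(0.0, 1.0, n_anneals + 1)
    log_w = 0.0
    for k in range(1, n_anneals + 1):
        # Weight increment: log f_k(x) - log f_{k-1}(x).
        log_w += (betas[k] - betas[k - 1]) * (log_p1(x) - log_p0(x))
        # ULA move targeting the k-th annealed density.
        g = (1.0 - betas[k]) * grad_log_p0(x) + betas[k] * grad_log_p1(x)
        x = x + step * g + np.sqrt(2.0 * step) * rng.standard_normal(x.shape)
    return log_w  # estimates the log ratio of normalizing constants

rng = np.random.default_rng(3)
log_w = ais_without_mh(lambda x: -0.5 * np.dot(x, x), lambda x: -x,
                       lambda x: -0.5 * np.dot(x - 2.0, x - 2.0),
                       lambda x: -(x - 2.0),
                       x=rng.standard_normal(2), n_anneals=100,
                       step=1e-2, rng=rng)
```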
arXiv Detail & Related papers (2021-07-21T17:10:14Z)
- Projected Stochastic Gradient Langevin Algorithms for Constrained
Sampling and Non-Convex Learning [0.0]
Langevin algorithms are gradient descent methods with additive noise.
They have been used for decades in Markov chain Monte Carlo (MCMC) sampling.
For learning, we show that projected Langevin algorithms extend to constrained non-convex problems.
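A minimal sketch of a projected Langevin step, assuming a Euclidean-ball constraint set for illustration (the projection operator and constants are placeholders, not the paper's exact scheme):

```python
import numpy as np

def project_onto_ball(x, radius):
    """Euclidean projection onto the ball {x : ||x|| <= radius}."""
    norm = np.linalg.norm(x)
    return x if norm <= radius else x * (radius / norm)

def projected_sgld(stoch_grad, x0, step, n_steps, radius, rng):
    """Each iteration takes a Langevin step with a stochastic gradient,
    then projects back onto the constraint set so iterates stay
    feasible. Placeholder constants; not the paper's exact scheme."""
    x = x0.copy()
    for _ in range(n_steps):
        x = (x + step * stoch_grad(x)
             + np.sqrt(2.0 * step) * rng.standard_normal(x.shape))
        x = project_onto_ball(x, radius)
    return x

rng = np.random.default_rng(4)
noisy_grad = lambda x: -x + 0.1 * rng.standard_normal(x.shape)
sample = projected_sgld(noisy_grad, np.zeros(5), step=1e-2,
                        n_steps=2000, radius=3.0, rng=rng)
```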
arXiv Detail & Related papers (2020-12-22T16:19:20Z)
- Faster Convergence of Stochastic Gradient Langevin Dynamics for
Non-Log-Concave Sampling [110.88857917726276]
We provide a new convergence analysis of stochastic gradient Langevin dynamics (SGLD) for sampling from a class of distributions that can be non-log-concave.
At the core of our approach is a novel conductance analysis of SGLD using an auxiliary time-reversible Markov Chain.
arXiv Detail & Related papers (2020-10-19T15:23:18Z)
- Carathéodory Sampling for Stochastic Gradient Descent [79.55586575988292]
We present an approach that is inspired by classical results of Tchakaloff and Carathéodory about measure reduction.
We adaptively select the descent steps where the measure reduction is carried out.
We combine this with Block Coordinate Descent so that measure reduction can be done very cheaply.
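The measure-reduction step can be illustrated with a small linear program: any basic feasible solution that matches the mean of $N$ points in $\mathbb{R}^d$ is supported on at most $d+1$ of them. A hedged sketch using scipy (the paper's actual recursive algorithm is different and cheaper):

```python
import numpy as np
from scipy.optimize import linprog

def caratheodory_reduce(points):
    """Find nonnegative weights, summing to one, that reproduce the
    uniform mean of the points. A basic (vertex) solution of this LP
    has at most d + 1 nonzero weights, which is the Caratheodory /
    Tchakaloff measure-reduction guarantee."""
    n, d = points.shape
    target_mean = points.mean(axis=0)
    A_eq = np.vstack([points.T, np.ones(n)])  # match mean and total mass
    b_eq = np.append(target_mean, 1.0)
    res = linprog(c=np.zeros(n), A_eq=A_eq, b_eq=b_eq,
                  bounds=[(0.0, None)] * n, method="highs-ds")
    return res.x

rng = np.random.default_rng(5)
pts = rng.standard_normal((50, 3))
w = caratheodory_reduce(pts)
print(np.count_nonzero(w > 1e-9))  # at most d + 1 = 4 support points
```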
arXiv Detail & Related papers (2020-06-02T17:52:59Z)
- Optimal Randomized First-Order Methods for Least-Squares Problems [56.05635751529922]
This class of algorithms encompasses several randomized methods among the fastest solvers for least-squares problems.
We focus on two classical embeddings, namely, Gaussian projections and subsampled Hadamard transforms.
Our resulting algorithm yields the best complexity known for solving least-squares problems with no condition number dependence.
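As background for how such randomized solvers typically work, here is a hedged sketch of the classical sketch-and-precondition recipe with a Gaussian embedding (illustrative only; not the paper's exact method or step sizes):

```python
import numpy as np
from scipy.sparse.linalg import lsqr

def sketch_and_precondition(A, b, sketch_rows, rng):
    """Gaussian sketch S @ A, QR-factor the small sketch, and use R as
    a right preconditioner: A @ inv(R) is then well conditioned, so an
    iterative solver like LSQR converges in a few iterations regardless
    of the condition number of A."""
    n, d = A.shape
    S = rng.standard_normal((sketch_rows, n)) / np.sqrt(sketch_rows)
    _, R = np.linalg.qr(S @ A)               # small (d x d) factor
    y = lsqr(A @ np.linalg.inv(R), b)[0]     # solve preconditioned system
    return np.linalg.solve(R, y)             # undo the change of variables

rng = np.random.default_rng(6)
A = rng.standard_normal((2000, 50)) @ np.diag(np.logspace(0, 4, 50))
b = rng.standard_normal(2000)
x = sketch_and_precondition(A, b, sketch_rows=200, rng=rng)
```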
arXiv Detail & Related papers (2020-02-21T17:45:32Z)