Structured Logconcave Sampling with a Restricted Gaussian Oracle
- URL: http://arxiv.org/abs/2010.03106v4
- Date: Fri, 22 Oct 2021 06:25:01 GMT
- Title: Structured Logconcave Sampling with a Restricted Gaussian Oracle
- Authors: Yin Tat Lee, Ruoqi Shen, Kevin Tian
- Abstract summary: We give algorithms for sampling several structured logconcave families to high accuracy.
We further develop a reduction framework, inspired by proximal point methods in convex optimization.
- Score: 23.781520510778716
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We give algorithms for sampling several structured logconcave families to
high accuracy. We further develop a reduction framework, inspired by proximal
point methods in convex optimization, which bootstraps samplers for regularized
densities to improve dependences on problem conditioning. A key ingredient in
our framework is the notion of a "restricted Gaussian oracle" (RGO) for $g:
\mathbb{R}^d \rightarrow \mathbb{R}$, which is a sampler for distributions
whose negative log-likelihood sums a quadratic and $g$. By combining our
reduction framework with our new samplers, we obtain the following bounds for
sampling structured distributions to total variation distance $\epsilon$. For
composite densities $\exp(-f(x) - g(x))$, where $f$ has condition number
$\kappa$ and convex (but possibly non-smooth) $g$ admits an RGO, we obtain a
mixing time of $O(\kappa d \log^3\frac{\kappa d}{\epsilon})$, matching the
state-of-the-art non-composite bound; no composite samplers with better mixing
than general-purpose logconcave samplers were previously known. For logconcave
finite sums $\exp(-F(x))$, where $F(x) = \frac{1}{n}\sum_{i \in [n]} f_i(x)$
has condition number $\kappa$, we give a sampler querying $\widetilde{O}(n +
\kappa\max(d, \sqrt{nd}))$ gradient oracles to $\{f_i\}_{i \in [n]}$; no
high-accuracy samplers with nontrivial gradient query complexity were
previously known. For densities with condition number $\kappa$, we give an
algorithm obtaining mixing time $O(\kappa d \log^2\frac{\kappa d}{\epsilon})$,
improving the prior state-of-the-art by a logarithmic factor with a
significantly simpler analysis; we also show a zeroth-order algorithm attains
the same query complexity.
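As an illustration of the "restricted Gaussian oracle" abstraction and of the proximal-point-style alternating scheme that the reduction framework is built around, the following minimal sketch (our own illustration, not the paper's algorithm) implements an RGO for a simple nonnegative convex $g$ by rejection sampling and runs the alternating Gaussian/RGO loop, whose stationary law is proportional to $\exp(-g(x))$. The step size `eta`, the choice $g(x) = \|x\|_1$, and the rejection-based RGO implementation are illustrative assumptions only; the paper's samplers combine the same oracle with gradient queries to a well-conditioned $f$ and sharper inner procedures.

```python
# Minimal sketch (not the paper's algorithm): the RGO abstraction plus the
# proximal-point-style alternating scheme.  Assumptions: g >= 0 (here the l1
# norm), a hand-picked step size eta, and a naive rejection-based RGO.
import numpy as np


def rgo_rejection(g, y, eta, rng):
    """Sample from the density proportional to exp(-g(x) - ||x - y||^2 / (2*eta)).

    Exact rejection sampler whenever g >= 0: propose x ~ N(y, eta*I) and
    accept with probability exp(-g(x)) <= 1.
    """
    while True:
        x = y + np.sqrt(eta) * rng.standard_normal(y.shape)
        if rng.random() < np.exp(-g(x)):
            return x


def proximal_sampler(g, x0, eta, n_iters, rng):
    """Alternate y ~ N(x, eta*I) with an RGO call x ~ exp(-g(x) - ||x-y||^2/(2*eta)).

    Both conditionals of the joint density exp(-g(x) - ||x - y||^2/(2*eta))
    are sampled exactly, so the x-marginal of the stationary distribution is
    proportional to exp(-g(x)).
    """
    x = np.asarray(x0, dtype=float)
    samples = []
    for _ in range(n_iters):
        y = x + np.sqrt(eta) * rng.standard_normal(x.shape)  # Gaussian step
        x = rgo_rejection(g, y, eta, rng)                     # RGO step
        samples.append(x.copy())
    return np.array(samples)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    g = lambda x: np.abs(x).sum()  # convex, non-smooth, min g = 0
    draws = proximal_sampler(g, x0=np.zeros(2), eta=0.25, n_iters=5000, rng=rng)
    print("empirical mean:", draws[2500:].mean(axis=0))  # close to 0 for this symmetric target
```

The rejection step is exact because the acceptance probability $\exp(-g(x))$ never exceeds one when $g \ge 0$; for structured $g$ (e.g. coordinate-separable penalties or indicator functions of simple sets), the RGO can instead be implemented in closed form, which is the regime the composite results above target.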
Related papers
- Rényi-infinity constrained sampling with $d^3$ membership queries [2.209921757303168]
We propose a constrained proximal sampler, a principled and simple algorithm that possesses elegant convergence guarantees.
We show that it converges in the Rényi-infinity divergence ($\mathcal{R}_\infty$) with no query complexity overhead when starting from a warm start.
arXiv Detail & Related papers (2024-07-17T19:20:08Z) - A Unified Framework for Uniform Signal Recovery in Nonlinear Generative
Compressed Sensing [68.80803866919123]
Under nonlinear measurements, most prior results are non-uniform, i.e., they hold with high probability for a fixed $\mathbf{x}^*$ rather than for all $\mathbf{x}^*$ simultaneously.
Our framework accommodates GCS with 1-bit/uniformly quantized observations and single index models as canonical examples.
We also develop a concentration inequality that produces tighter bounds for product processes whose index sets have low metric entropy.
arXiv Detail & Related papers (2023-09-25T17:54:19Z) - Identification of Mixtures of Discrete Product Distributions in
Near-Optimal Sample and Time Complexity [6.812247730094931]
We show, for any $n \geq 2k-1$, how to achieve sample complexity and run-time complexity $(1/\zeta)^{O(k)}$.
We also extend the known lower bound of $e^{\Omega(k)}$ to match our upper bound across a broad range of $\zeta$.
arXiv Detail & Related papers (2023-09-25T09:50:15Z) - Sample Complexity Bounds for Learning High-dimensional Simplices in
Noisy Regimes [5.526935605535376]
We find a sample complexity bound for learning a simplex from noisy samples.
We show that as long as $\mathrm{SNR} \ge \Omega\big(K^{1/2}\big)$, the sample complexity of the noisy regime is of the same order as that of the noiseless case.
arXiv Detail & Related papers (2022-09-09T23:35:25Z) - Learning a Single Neuron with Adversarial Label Noise via Gradient
Descent [50.659479930171585]
We study a function of the form $\mathbf{x} \mapsto \sigma(\mathbf{w} \cdot \mathbf{x})$ for monotone activations.
The goal of the learner is to output a hypothesis vector $\mathbf{w}$ such that $F(\mathbf{w}) = C\,\epsilon$ with high probability.
arXiv Detail & Related papers (2022-06-17T17:55:43Z) - Tight Bounds on the Hardness of Learning Simple Nonparametric Mixtures [9.053430799456587]
We study the problem of learning nonparametric distributions in a finite mixture.
We establish tight bounds on the sample complexity for learning the component distributions in such models.
arXiv Detail & Related papers (2022-03-28T23:53:48Z) - Approximate Function Evaluation via Multi-Armed Bandits [51.146684847667125]
We study the problem of estimating the value of a known smooth function $f$ at an unknown point $\boldsymbol{\mu} \in \mathbb{R}^n$, where each component $\mu_i$ can be sampled via a noisy oracle.
We design an instance-adaptive algorithm that learns to sample according to the importance of each coordinate, and with probability at least $1-\delta$ returns an $\epsilon$-accurate estimate of $f(\boldsymbol{\mu})$.
arXiv Detail & Related papers (2022-03-18T18:50:52Z) - Fast Graph Sampling for Short Video Summarization using Gershgorin Disc
Alignment [52.577757919003844]
We study the problem of efficiently summarizing a short video into several paragraphs, leveraging recent progress in fast graph sampling.
Experimental results show that our algorithm achieves comparable video summarization as state-of-the-art methods, at a substantially reduced complexity.
arXiv Detail & Related papers (2021-10-21T18:43:00Z) - Hybrid Stochastic-Deterministic Minibatch Proximal Gradient:
Less-Than-Single-Pass Optimization with Nearly Optimal Generalization [83.80460802169999]
We prove that HSDMPG can attain an optimization error of order $\mathcal{O}\big(1/\sqrt{n}\big)$, which is at the order of the intrinsic excess error of a learning model.
arXiv Detail & Related papers (2020-09-18T02:18:44Z) - Optimal Robust Linear Regression in Nearly Linear Time [97.11565882347772]
We study the problem of high-dimensional robust linear regression where a learner is given access to $n$ samples from the generative model $Y = \langle X, w^* \rangle + \epsilon$.
We propose estimators for this problem under two settings: (i) $X$ is L4-L2 hypercontractive, $\mathbb{E}[XX^\top]$ has bounded condition number and $\epsilon$ has bounded variance, and (ii) $X$ is sub-Gaussian with identity second moment and $\epsilon$ is sub-Gaussian.
arXiv Detail & Related papers (2020-07-16T06:44:44Z) - Composite Logconcave Sampling with a Restricted Gaussian Oracle [23.781520510778716]
We consider sampling from composite densities on $\mathbb{R}^d$ of the form $d\pi(x) \propto \exp(-f(x) - g(x))\,dx$ for well-conditioned $f$ and convex (but possibly non-smooth) $g$.
For $f$ with condition number $\kappa$, our algorithm runs in $O\left(\kappa^2 d \log^2 \tfrac{\kappa d}{\epsilon}\right)$ iterations, each querying a gradient of $f$.
arXiv Detail & Related papers (2020-06-10T17:43:55Z)
This list is automatically generated from the titles and abstracts of the papers in this site.