Related papers: Polynomial time sampling from log-smooth distributions in fixed dimension under semi-log-concavity of the forward diffusion with application to strongly dissipative distributions

Polynomial time sampling from log-smooth distributions in fixed dimension under semi-log-concavity of the forward diffusion with application to strongly dissipative distributions

URL: http://arxiv.org/abs/2501.00565v2
Date: Mon, 27 Jan 2025 10:38:12 GMT
Title: Polynomial time sampling from log-smooth distributions in fixed dimension under semi-log-concavity of the forward diffusion with application to strongly dissipative distributions
Authors: Adrien Vacher, Omar Chehab, Anna Korba,
Abstract summary: We provide a sampling algorithm with complexity in fixed dimension.<n>We prove that our algorithm achieves an expected $epsilon$ error in $KL$ divergence.<n>As an application, we derive an exponential complexity improvement for the problem of sampling from an $L$-log-smooth distribution.
Score: 9.48556659249574
License: http://creativecommons.org/publicdomain/zero/1.0/
Abstract: In this article, we provide a stochastic sampling algorithm with polynomial complexity in fixed dimension that leverages the recent advances on diffusion models where it is shown that under mild conditions, sampling can be achieved via an accurate estimation of intermediate scores across the marginals $(p_t)_{t\ge 0}$ of the standard Ornstein-Uhlenbeck process started at $\mu$, the density we wish to sample from. The heart of our method consists into approaching these scores via a computationally cheap estimator and relating the variance of this estimator to the smoothness properties of the forward process. Under the assumption that the density to sample from is $L$-log-smooth and that the forward process is semi-log-concave: $-\nabla^2 \log(p_t) \succeq -\beta I_d$ for some $\beta \geq 0$, we prove that our algorithm achieves an expected $\epsilon$ error in $\text{KL}$ divergence in $O(d^7(L+\beta)^2L^{d+2}\epsilon^{-2(d+3)}(d+m_2(\mu))^{2(d+1)})$ time with $m_2(\mu)$ the second order moment of $\mu$. In particular, our result allows to fully transfer the problem of sampling from a log-smooth distribution into a regularity estimate problem. As an application, we derive an exponential complexity improvement for the problem of sampling from an $L$-log-smooth distribution that is $\alpha$-strongly log-concave outside some ball of radius $R$: after proving that such distributions verify the semi-log-concavity assumption, a result which might be of independent interest, we recover a $poly(R, L, \alpha^{-1}, \epsilon^{-1})$ complexity in fixed dimension which exponentially improves upon the previously known $poly(e^{LR^2}, L,\alpha^{-1}, \log(\epsilon^{-1}))$ complexity in the low precision regime.

Related papers

Improved sampling algorithms and Poincaré inequalities for non-log-concave distributions [1.9753732769115382]
We show that the query complexity of sampling from $mu$ can be $mathrmpoly(L,d)cdot left(fracLMdepsilon2right)mathcalO(L+1)$, which is in $d$ and $frac1epsilon$ when $L=mathcalO(1)$ and $M=mathrmpoly(d)$.
arXiv Detail & Related papers (2025-07-15T12:06:11Z)
On the query complexity of sampling from non-log-concave distributions [2.4253233571593547]
We study the problem of sampling from a $d$-dimensional distribution with density $p(x)propto e-f(x)$, which does not necessarily satisfy good isoperimetric conditions. We show that for a wide range of parameters, sampling is strictly easier than optimization by a super-exponential factor in the dimension $d$.
arXiv Detail & Related papers (2025-02-10T06:54:16Z)
On the sampling complexity of coherent superpositions [0.4972323953932129]
We consider the problem of sampling from the distribution of measurement outcomes when applying a POVM to a superposition. We give an algorithm which $-$ given $O(chi |c|2 log1/delta)$ such samples and calls to oracles to evaluate the probability density functions.
arXiv Detail & Related papers (2025-01-28T16:56:49Z)
Efficiently learning and sampling multimodal distributions with data-based initialization [20.575122468674536]
We consider the problem of sampling a multimodal distribution with a Markov chain given a small number of samples from the stationary measure. We show that if the Markov chain has a $k$th order spectral gap, samples from the stationary distribution will efficiently generate a sample whose conditional law is $varepsilon$-close in TV distance to the stationary measure.
arXiv Detail & Related papers (2024-11-14T01:37:02Z)
Faster Diffusion Sampling with Randomized Midpoints: Sequential and Parallel [10.840582511203024]
We show that our algorithm can be parallelized to run in only $widetilde O(log2 d)$ parallel rounds. We also show that our algorithm can be parallelized to run in only $widetilde O(log2 d)$ parallel rounds.
arXiv Detail & Related papers (2024-06-03T01:34:34Z)
Projection by Convolution: Optimal Sample Complexity for Reinforcement Learning in Continuous-Space MDPs [56.237917407785545]
We consider the problem of learning an $varepsilon$-optimal policy in a general class of continuous-space Markov decision processes (MDPs) having smooth Bellman operators. Key to our solution is a novel projection technique based on ideas from harmonic analysis. Our result bridges the gap between two popular but conflicting perspectives on continuous-space MDPs.
arXiv Detail & Related papers (2024-05-10T09:58:47Z)
Efficient Sampling of Stochastic Differential Equations with Positive Semi-Definite Models [91.22420505636006]
This paper deals with the problem of efficient sampling from a differential equation, given the drift function and the diffusion matrix. It is possible to obtain independent and identically distributed (i.i.d.) samples at precision $varepsilon$ with a cost that is $m2 d log (1/varepsilon)$ Our results suggest that as the true solution gets smoother, we can circumvent the curse of dimensionality without requiring any sort of convexity.
arXiv Detail & Related papers (2023-03-30T02:50:49Z)
Stochastic Approximation Approaches to Group Distributionally Robust Optimization and Beyond [89.72693227960274]
This paper investigates group distributionally robust optimization (GDRO) with the goal of learning a model that performs well over $m$ different distributions. To reduce the number of samples in each round from $m$ to 1, we cast GDRO as a two-player game, where one player conducts and the other executes an online algorithm for non-oblivious multi-armed bandits. In the second scenario, we propose to optimize the average top-$k$ risk instead of the maximum risk, thereby mitigating the impact of distributions.
arXiv Detail & Related papers (2023-02-18T09:24:15Z)
Near Sample-Optimal Reduction-based Policy Learning for Average Reward MDP [58.13930707612128]
This work considers the sample complexity of obtaining an $varepsilon$-optimal policy in an average reward Markov Decision Process (AMDP) We prove an upper bound of $widetilde O(H varepsilon-3 ln frac1delta)$ samples per state-action pair, where $H := sp(h*)$ is the span of bias of any optimal policy, $varepsilon$ is the accuracy and $delta$ is the failure probability.
arXiv Detail & Related papers (2022-12-01T15:57:58Z)
Optimal Sublinear Sampling of Spanning Trees and Determinantal Point Processes via Average-Case Entropic Independence [3.9586758145580014]
We design fast algorithms for repeatedly sampling from strongly Rayleigh distributions. For a graph $G=(V, E)$, we show how to approximately sample uniformly random spanning trees from $G$ in $widetildeO(lvert Vrvert)$ time per sample. For a determinantal point process on subsets of size $k$ of a ground set of $n$ elements, we show how to approximately sample in $widetildeO(komega)$ time after an initial $widetildeO(nk
arXiv Detail & Related papers (2022-04-06T04:11:26Z)
Random quantum circuits transform local noise into global white noise [118.18170052022323]
We study the distribution over measurement outcomes of noisy random quantum circuits in the low-fidelity regime. For local noise that is sufficiently weak and unital, correlations (measured by the linear cross-entropy benchmark) between the output distribution $p_textnoisy$ of a generic noisy circuit instance shrink exponentially. If the noise is incoherent, the output distribution approaches the uniform distribution $p_textunif$ at precisely the same rate.
arXiv Detail & Related papers (2021-11-29T19:26:28Z)
Wasserstein distance estimates for the distributions of numerical approximations to ergodic stochastic differential equations [0.3553493344868413]
We study the Wasserstein distance between the in distribution of an ergodic differential equation and the distribution in the strongly log-concave case. This allows us to study in a unified way a number of different approximations proposed in the literature for the overdamped and underdamped Langevin dynamics.
arXiv Detail & Related papers (2021-04-26T07:50:04Z)
Complexity of zigzag sampling algorithm for strongly log-concave distributions [6.336005544376984]
We study the computational complexity of zigzag sampling algorithm for strongly log-concave distributions. We prove that the zigzag sampling algorithm achieves $varepsilon$ error in chi-square divergence with a computational cost equivalent to $Obigl.
arXiv Detail & Related papers (2020-12-21T03:10:21Z)
Optimal Robust Linear Regression in Nearly Linear Time [97.11565882347772]
We study the problem of high-dimensional robust linear regression where a learner is given access to $n$ samples from the generative model $Y = langle X,w* rangle + epsilon$ We propose estimators for this problem under two settings: (i) $X$ is L4-L2 hypercontractive, $mathbbE [XXtop]$ has bounded condition number and $epsilon$ has bounded variance and (ii) $X$ is sub-Gaussian with identity second moment and $epsilon$ is
arXiv Detail & Related papers (2020-07-16T06:44:44Z)
Sample Complexity of Asynchronous Q-Learning: Sharper Analysis and Variance Reduction [63.41789556777387]
Asynchronous Q-learning aims to learn the optimal action-value function (or Q-function) of a Markov decision process (MDP) We show that the number of samples needed to yield an entrywise $varepsilon$-accurate estimate of the Q-function is at most on the order of $frac1mu_min (1-gamma)5varepsilon2+ fract_mixmu_min (1-gamma)$ up to some logarithmic factor.
arXiv Detail & Related papers (2020-06-04T17:51:00Z)

This list is automatically generated from the titles and abstracts of the papers in this site.