Stochastic Langevin Monte Carlo for (weakly) log-concave posterior
distributions
- URL: http://arxiv.org/abs/2301.03077v1
- Date: Sun, 8 Jan 2023 17:08:21 GMT
- Title: Stochastic Langevin Monte Carlo for (weakly) log-concave posterior
distributions
- Authors: Marelys Crespo Navas, Sébastien Gadat, Xavier Gendre
- Abstract summary: We investigate a continuous-time version of the Langevin Monte Carlo method, introduced in [WT11], which incorporates a sampling step inside the traditional over-damped Langevin diffusion.
This method is popular in machine learning for sampling from posterior distributions.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: In this paper, we investigate a continuous-time version of the Stochastic Langevin Monte Carlo method, introduced in [WT11], which incorporates a stochastic sampling step inside the traditional over-damped Langevin diffusion. This method is popular in machine learning for sampling from posterior distributions. We pay specific attention to the computational cost in terms of $n$ (the number of observations that produce the posterior distribution) and $d$ (the dimension of the ambient space where the parameter of interest lives). We derive our analysis in the weakly convex framework, parameterized with the help of the Kurdyka-Łojasiewicz (KL) inequality, which permits handling vanishing-curvature settings and is far less restrictive than the simple strongly convex case. We establish that the final horizon of simulation needed to obtain an $\varepsilon$-approximation (in terms of entropy) is of the order $( d \log(n)^2 )^{(1+r)^2} [\log^2(\varepsilon^{-1}) + n^2 d^{2(1+r)} \log^{4(1+r)}(n) ]$ with a Poissonian subsampling of parameter $\left(n ( d \log^2(n))^{1+r}\right)^{-1}$, where the parameter $r$ appears in the KL inequality and varies between $0$ (strongly convex case) and $1$ (limiting Laplace situation).
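To make the sampling mechanism concrete, below is a minimal, illustrative Python sketch of the discrete-time analogue of the dynamics discussed here: an over-damped Langevin step whose drift is an unbiased gradient estimate built from a Poissonian subsample of the $n$ observations, in the spirit of the SGLD scheme of [WT11]. The observation-wise gradients `grad_U`, the step size `gamma`, and the inclusion probability `p` are illustrative placeholders, not quantities taken from the paper's analysis.

```python
import numpy as np

def poisson_subsampled_gradient(theta, grad_U, n, p, rng):
    """Unbiased estimate of sum_i grad U_i(theta): each of the n observations
    enters the batch independently with probability p (Poissonian subsampling),
    and the resulting partial sum is rescaled by 1/p."""
    idx = np.flatnonzero(rng.random(n) < p)
    if idx.size == 0:
        return np.zeros_like(theta)
    return sum(grad_U(i, theta) for i in idx) / p

def stochastic_langevin(theta0, grad_U, n, gamma, p, n_steps, rng=None):
    """Euler discretization of the over-damped Langevin diffusion with a
    subsampled drift (SGLD-style, in the spirit of [WT11]):
        theta_{k+1} = theta_k - gamma * G_k + sqrt(2 * gamma) * N(0, I),
    where G_k is the Poisson-subsampled gradient of the potential."""
    if rng is None:
        rng = np.random.default_rng(0)
    theta = np.asarray(theta0, dtype=float).copy()
    for _ in range(n_steps):
        G = poisson_subsampled_gradient(theta, grad_U, n, p, rng)
        theta = theta - gamma * G + np.sqrt(2.0 * gamma) * rng.standard_normal(theta.shape)
    return theta

# Toy usage: Gaussian posterior built from n observation-wise potentials
# U_i(theta) = ||theta - x_i||^2 / 2, hence grad U_i(theta) = theta - x_i.
rng = np.random.default_rng(1)
n, d = 1_000, 5
data = rng.standard_normal((n, d))
grad_U = lambda i, theta: theta - data[i]
sample = stochastic_langevin(np.zeros(d), grad_U, n, gamma=1e-4, p=0.01,
                             n_steps=2_000, rng=rng)
print(sample)  # an approximate draw from the target posterior
```

In this sketch, a Poissonian subsampling parameter $p$ means each step touches $np$ observation gradients on average, which is the mechanism through which the subsampling parameter controls the per-iteration cost discussed in the abstract.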
Related papers
- Mirror Descent Algorithms with Nearly Dimension-Independent Rates for
Differentially-Private Stochastic Saddle-Point Problems [6.431793114484429]
We study differentially-private stochastic saddle-point problems in the polyhedral setting.
We show that our algorithms attain a rate of $\sqrt{\log(d)}/\sqrt{n} + \log(d)/[n\varepsilon]^{2/5}$ with constant success probability.
arXiv Detail & Related papers (2024-03-05T12:28:00Z) - Efficient Sampling on Riemannian Manifolds via Langevin MCMC [51.825900634131486]
We study the task of efficiently sampling from a Gibbs distribution $d\pi^* = e^{h}\, d\mathrm{vol}_g$ over a Riemannian manifold $M$ via (geometric) Langevin MCMC.
Our results apply to general settings where $\pi^*$ can be non-log-concave and $M$ can have negative Ricci curvature.
arXiv Detail & Related papers (2024-02-15T22:59:14Z) - Efficient Sampling of Stochastic Differential Equations with Positive
Semi-Definite Models [91.22420505636006]
This paper deals with the problem of efficient sampling from a stochastic differential equation, given the drift function and the diffusion matrix.
It is possible to obtain independent and identically distributed (i.i.d.) samples at precision $\varepsilon$ with a cost that is $m^2 d \log(1/\varepsilon)$.
Our results suggest that as the true solution gets smoother, we can circumvent the curse of dimensionality without requiring any sort of convexity.
arXiv Detail & Related papers (2023-03-30T02:50:49Z) - Stochastic Approximation Approaches to Group Distributionally Robust
Optimization [96.26317627118912]
We study group distributionally robust optimization (GDRO).
Online learning techniques reduce the number of samples required in each round from $m$ to $1$, keeping the same sample complexity.
A novel formulation of weighted GDRO allows us to derive distribution-dependent convergence rates.
arXiv Detail & Related papers (2023-02-18T09:24:15Z) - Concentration of the Langevin Algorithm's Stationary Distribution [29.244631254978597]
We show that for any nontrivial stepsize $\eta > 0$, $\pi_\eta$ is sub-exponential when the potential is convex.
We also show that these concentration bounds extend to all iterates along the trajectory of the Langevin Algorithm.
arXiv Detail & Related papers (2022-12-24T01:39:02Z) - Penalized Overdamped and Underdamped Langevin Monte Carlo Algorithms for Constrained Sampling [17.832449046193933]
We consider the constrained sampling problem where the goal is to sample from a target distribution $\pi(x) \propto e^{-f(x)}$ when $x$ is constrained.
Motivated by penalty methods, we convert the constrained problem into an unconstrained sampling problem by introducing a penalty function for constraint violations.
For PSGLD and PSGULMC, when the potential is strongly convex and smooth, we obtain an iteration complexity of order $\tilde{\mathcal{O}}(d/\varepsilon^{18})$.
arXiv Detail & Related papers (2022-11-29T18:43:22Z) - Hamiltonian Monte Carlo for efficient Gaussian sampling: long and random
steps [0.0]
Hamiltonian Monte Carlo (HMC) is a Markov chain algorithm for sampling from a high-dimensional distribution with density $e^{-f(x)}$.
We show that HMC can sample from a distribution that is $\varepsilon$-close in total variation distance using $\widetilde{O}(\sqrt{\kappa}\, d^{1/4} \log(1/\varepsilon))$ gradient queries.
arXiv Detail & Related papers (2022-09-26T15:29:29Z) - Convergence of Langevin Monte Carlo in Chi-Squared and Renyi Divergence [8.873449722727026]
For convex and first-order smooth potentials, we show that the LMC algorithm achieves the rate estimate $\widetilde{\mathcal{O}}(d\epsilon^{-1})$, which improves the previously known rates in both of these metrics.
arXiv Detail & Related papers (2020-07-22T18:18:28Z) - Optimal Robust Linear Regression in Nearly Linear Time [97.11565882347772]
We study the problem of high-dimensional robust linear regression where a learner is given access to $n$ samples from the generative model $Y = \langle X, w^* \rangle + \epsilon$.
We propose estimators for this problem under two settings: (i) $X$ is L4-L2 hypercontractive, $\mathbb{E}[XX^\top]$ has bounded condition number and $\epsilon$ has bounded variance, and (ii) $X$ is sub-Gaussian with identity second moment and $\epsilon$ is sub-Gaussian.
arXiv Detail & Related papers (2020-07-16T06:44:44Z) - Linear Time Sinkhorn Divergences using Positive Features [51.50788603386766]
Solving optimal transport with an entropic regularization requires computing an $n\times n$ kernel matrix that is repeatedly applied to a vector.
We propose to use instead ground costs of the form $c(x,y) = -\log\langle\varphi(x),\varphi(y)\rangle$, where $\varphi$ is a map from the ground space onto the positive orthant $\mathbb{R}^r_+$ with $r \ll n$; a toy sketch of the resulting linear-time iterations appears after this list.
arXiv Detail & Related papers (2020-06-12T10:21:40Z) - Sample Complexity of Asynchronous Q-Learning: Sharper Analysis and
Variance Reduction [63.41789556777387]
Asynchronous Q-learning aims to learn the optimal action-value function (or Q-function) of a Markov decision process (MDP).
We show that the number of samples needed to yield an entrywise $\varepsilon$-accurate estimate of the Q-function is at most on the order of $\frac{1}{\mu_{\min}(1-\gamma)^5\varepsilon^2} + \frac{t_{\mathrm{mix}}}{\mu_{\min}(1-\gamma)}$ up to some logarithmic factor.
arXiv Detail & Related papers (2020-06-04T17:51:00Z)
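As referenced in the "Linear Time Sinkhorn Divergences using Positive Features" entry above, the following is a small, illustrative Python sketch of why a factorized Gibbs kernel $K = \Phi_x \Phi_y^\top$ makes each Sinkhorn iteration linear in the number of points: every matrix-vector product goes through the $r$-dimensional feature space instead of the full $n \times n$ kernel. The random positive feature map, sizes, and iteration count below are placeholder assumptions, not the construction from that paper.

```python
import numpy as np

def sinkhorn_with_positive_features(a, b, Phi_x, Phi_y, n_iters=200):
    """Sinkhorn iterations when the Gibbs kernel factorizes as
    K = Phi_x @ Phi_y.T with strictly positive entries: each matrix-vector
    product costs O((n + m) * r) instead of the O(n * m) of a dense kernel."""
    u = np.ones_like(a)
    v = np.ones_like(b)
    for _ in range(n_iters):
        u = a / (Phi_x @ (Phi_y.T @ v))   # K v computed through the r-dim feature space
        v = b / (Phi_y @ (Phi_x.T @ u))   # K^T u computed through the r-dim feature space
    return u, v

# Toy usage with a random positive feature map (illustrative only; the paper
# constructs specific positive feature maps, which are not reproduced here).
rng = np.random.default_rng(0)
n, m, r = 500, 400, 20
Phi_x = np.abs(rng.standard_normal((n, r))) + 1e-3   # phi(x_i) in the positive orthant R^r_+
Phi_y = np.abs(rng.standard_normal((m, r))) + 1e-3
a = np.full(n, 1.0 / n)   # source marginal
b = np.full(m, 1.0 / m)   # target marginal
u, v = sinkhorn_with_positive_features(a, b, Phi_x, Phi_y)
```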
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information provided and is not responsible for any consequences of its use.