Stochastic Langevin Monte Carlo for (weakly) log-concave posterior
distributions
- URL: http://arxiv.org/abs/2301.03077v1
- Date: Sun, 8 Jan 2023 17:08:21 GMT
- Title: Stochastic Langevin Monte Carlo for (weakly) log-concave posterior
distributions
- Authors: Marelys Crespo Navas, Sébastien Gadat, Xavier Gendre
- Abstract summary: We investigate a continuous-time version of the Langevin Monte Carlo method, introduced in [WT11], which incorporates a sampling step inside the traditional over-damped Langevin diffusion.
This method is popular in machine learning for sampling from posterior distributions.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: In this paper, we investigate a continuous-time version of the Stochastic Langevin Monte Carlo method, introduced in [WT11], which incorporates a stochastic sampling step inside the traditional over-damped Langevin diffusion. This method is popular in machine learning for sampling from posterior distributions. We pay specific attention to the computational cost in terms of $n$ (the number of observations that produce the posterior distribution) and $d$ (the dimension of the ambient space where the parameter of interest lives). We derive our analysis in the weakly convex framework, parameterized with the help of the Kurdyka-Łojasiewicz (KL) inequality, which permits handling vanishing-curvature settings and is far less restrictive than the simple strongly convex case. We establish that the final horizon of simulation needed to obtain an $\varepsilon$-approximation (in terms of entropy) is of the order $( d \log(n)^2 )^{(1+r)^2} [\log^2(\varepsilon^{-1}) + n^2 d^{2(1+r)} \log^{4(1+r)}(n) ]$ with a Poissonian subsampling of parameter $\left(n ( d \log^2(n))^{1+r}\right)^{-1}$, where the parameter $r$ appears in the KL inequality and varies between $0$ (strongly convex case) and $1$ (limiting Laplace situation).
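To make the sampling mechanism concrete, below is a minimal, illustrative Python sketch of the discrete-time analogue of the dynamics discussed here: an over-damped Langevin step whose drift is an unbiased gradient estimate built from a Poissonian subsample of the $n$ observations, in the spirit of the SGLD scheme of [WT11]. The observation-wise gradients `grad_U`, the step size `gamma`, and the inclusion probability `p` are illustrative placeholders, not quantities taken from the paper's analysis.

```python
import numpy as np

def poisson_subsampled_gradient(theta, grad_U, n, p, rng):
    """Unbiased estimate of sum_i grad U_i(theta): each of the n observations
    enters the batch independently with probability p (Poissonian subsampling),
    and the resulting partial sum is rescaled by 1/p."""
    idx = np.flatnonzero(rng.random(n) < p)
    if idx.size == 0:
        return np.zeros_like(theta)
    return sum(grad_U(i, theta) for i in idx) / p

def stochastic_langevin(theta0, grad_U, n, gamma, p, n_steps, rng=None):
    """Euler discretization of the over-damped Langevin diffusion with a
    subsampled drift (SGLD-style, in the spirit of [WT11]):
        theta_{k+1} = theta_k - gamma * G_k + sqrt(2 * gamma) * N(0, I),
    where G_k is the Poisson-subsampled gradient of the potential."""
    if rng is None:
        rng = np.random.default_rng(0)
    theta = np.asarray(theta0, dtype=float).copy()
    for _ in range(n_steps):
        G = poisson_subsampled_gradient(theta, grad_U, n, p, rng)
        theta = theta - gamma * G + np.sqrt(2.0 * gamma) * rng.standard_normal(theta.shape)
    return theta

# Toy usage: Gaussian posterior built from n observation-wise potentials
# U_i(theta) = ||theta - x_i||^2 / 2, hence grad U_i(theta) = theta - x_i.
rng = np.random.default_rng(1)
n, d = 1_000, 5
data = rng.standard_normal((n, d))
grad_U = lambda i, theta: theta - data[i]
sample = stochastic_langevin(np.zeros(d), grad_U, n, gamma=1e-4, p=0.01,
                             n_steps=2_000, rng=rng)
print(sample)  # an approximate draw from the target posterior
```

In this sketch, a Poissonian subsampling parameter $p$ means each step touches $np$ observation gradients on average, which is the mechanism through which the subsampling parameter controls the per-iteration cost discussed in the abstract.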
Related papers
- Mirror Descent Algorithms with Nearly Dimension-Independent Rates for
Differentially-Private Stochastic Saddle-Point Problems [6.431793114484429]
We study differentially-private stochastic saddle-point problems in the polyhedral setting.
We show that our algorithms attain a rate of $\sqrt{\log(d)}/\sqrt{n} + \log(d)/[n\varepsilon]^{2/5}$ with constant success probability.
arXiv Detail & Related papers (2024-03-05T12:28:00Z) - Efficient Sampling on Riemannian Manifolds via Langevin MCMC [51.825900634131486]
We study the task of efficiently sampling from a Gibbs distribution $d\pi^* = e^{h}\, d\mathrm{vol}_g$ over a Riemannian manifold $M$ via (geometric) Langevin MCMC.
Our results apply to general settings where $\pi^*$ can be non-log-concave and $M$ can have negative Ricci curvature.
arXiv Detail & Related papers (2024-02-15T22:59:14Z) - Efficient Sampling of Stochastic Differential Equations with Positive
Semi-Definite Models [91.22420505636006]
This paper deals with the problem of efficient sampling from a stochastic differential equation, given the drift function and the diffusion matrix.
It is possible to obtain independent and identically distributed (i.i.d.) samples at precision $\varepsilon$ with a cost that is $m^2 d \log(1/\varepsilon)$.
Our results suggest that as the true solution gets smoother, we can circumvent the curse of dimensionality without requiring any sort of convexity.
arXiv Detail & Related papers (2023-03-30T02:50:49Z) - Stochastic Approximation Approaches to Group Distributionally Robust
Optimization [96.26317627118912]
We study group distributionally robust optimization (GDRO).
Online learning techniques reduce the number of samples required in each round from $m$ to $1$, keeping the same sample complexity.
A novel formulation of weighted GDRO allows us to derive distribution-dependent convergence rates.
arXiv Detail & Related papers (2023-02-18T09:24:15Z) - Concentration of the Langevin Algorithm's Stationary Distribution [29.244631254978597]
We show that for any nontrivial stepsize $\eta > 0$, $\pi_\eta$ is sub-exponential when the potential is convex.
We also show that these concentration bounds extend to all iterates along the trajectory of the Langevin Algorithm.
arXiv Detail & Related papers (2022-12-24T01:39:02Z) - Penalized Overdamped and Underdamped Langevin Monte Carlo Algorithms for Constrained Sampling [17.832449046193933]
We consider the constrained sampling problem where the goal is to sample from a target distribution $\pi(x) \propto e^{-f(x)}$ when $x$ is constrained.
Motivated by penalty methods, we convert the constrained problem into an unconstrained sampling problem by introducing a penalty function for constraint violations.
For PSGLD and PSGULMC, when the potential is strongly convex and smooth, we obtain an iteration complexity of order $\tilde{\mathcal{O}}(d/\varepsilon^{18})$.
arXiv Detail & Related papers (2022-11-29T18:43:22Z) - Hamiltonian Monte Carlo for efficient Gaussian sampling: long and random
steps [0.0]
Hamiltonian Monte Carlo (HMC) is a Markov chain algorithm for sampling from a high-dimensional distribution with density $e^{-f(x)}$.
We show that HMC can sample from a distribution that is $\varepsilon$-close in total variation distance using $\widetilde{O}(\sqrt{\kappa}\, d^{1/4} \log(1/\varepsilon))$ gradient queries.
arXiv Detail & Related papers (2022-09-26T15:29:29Z) - Convergence of Langevin Monte Carlo in Chi-Squared and Renyi Divergence [8.873449722727026]
For convex and first-order smooth potentials, we show that the LMC algorithm achieves the rate estimate $\widetilde{\mathcal{O}}(d\epsilon^{-1})$, which improves the previously known rates in both of these metrics.
arXiv Detail & Related papers (2020-07-22T18:18:28Z) - Optimal Robust Linear Regression in Nearly Linear Time [97.11565882347772]
We study the problem of high-dimensional robust linear regression where a learner is given access to $n$ samples from the generative model $Y = \langle X, w^* \rangle + \epsilon$.
We propose estimators for this problem under two settings: (i) $X$ is L4-L2 hypercontractive, $\mathbb{E}[XX^\top]$ has bounded condition number and $\epsilon$ has bounded variance, and (ii) $X$ is sub-Gaussian with identity second moment and $\epsilon$ is sub-Gaussian.
arXiv Detail & Related papers (2020-07-16T06:44:44Z) - Linear Time Sinkhorn Divergences using Positive Features [51.50788603386766]
Solving optimal transport with an entropic regularization requires computing an $n\times n$ kernel matrix that is repeatedly applied to a vector.
We propose to use instead ground costs of the form $c(x,y) = -\log\langle\varphi(x),\varphi(y)\rangle$, where $\varphi$ is a map from the ground space onto the positive orthant $\mathbb{R}^r_+$ with $r \ll n$; a toy sketch of the resulting linear-time iterations appears after this list.
arXiv Detail & Related papers (2020-06-12T10:21:40Z) - Sample Complexity of Asynchronous Q-Learning: Sharper Analysis and
Variance Reduction [63.41789556777387]
Asynchronous Q-learning aims to learn the optimal action-value function (or Q-function) of a Markov decision process (MDP).
We show that the number of samples needed to yield an entrywise $\varepsilon$-accurate estimate of the Q-function is at most on the order of $\frac{1}{\mu_{\min}(1-\gamma)^5\varepsilon^2} + \frac{t_{\mathrm{mix}}}{\mu_{\min}(1-\gamma)}$ up to some logarithmic factor.
arXiv Detail & Related papers (2020-06-04T17:51:00Z)
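As referenced in the "Linear Time Sinkhorn Divergences using Positive Features" entry above, the following is a small, illustrative Python sketch of why a factorized Gibbs kernel $K = \Phi_x \Phi_y^\top$ makes each Sinkhorn iteration linear in the number of points: every matrix-vector product goes through the $r$-dimensional feature space instead of the full $n \times n$ kernel. The random positive feature map, sizes, and iteration count below are placeholder assumptions, not the construction from that paper.

```python
import numpy as np

def sinkhorn_with_positive_features(a, b, Phi_x, Phi_y, n_iters=200):
    """Sinkhorn iterations when the Gibbs kernel factorizes as
    K = Phi_x @ Phi_y.T with strictly positive entries: each matrix-vector
    product costs O((n + m) * r) instead of the O(n * m) of a dense kernel."""
    u = np.ones_like(a)
    v = np.ones_like(b)
    for _ in range(n_iters):
        u = a / (Phi_x @ (Phi_y.T @ v))   # K v computed through the r-dim feature space
        v = b / (Phi_y @ (Phi_x.T @ u))   # K^T u computed through the r-dim feature space
    return u, v

# Toy usage with a random positive feature map (illustrative only; the paper
# constructs specific positive feature maps, which are not reproduced here).
rng = np.random.default_rng(0)
n, m, r = 500, 400, 20
Phi_x = np.abs(rng.standard_normal((n, r))) + 1e-3   # phi(x_i) in the positive orthant R^r_+
Phi_y = np.abs(rng.standard_normal((m, r))) + 1e-3
a = np.full(n, 1.0 / n)   # source marginal
b = np.full(m, 1.0 / m)   # target marginal
u, v = sinkhorn_with_positive_features(a, b, Phi_x, Phi_y)
```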
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information provided and is not responsible for any consequences of its use.