Dimension-free convergence rates for gradient Langevin dynamics in RKHS
- URL: http://arxiv.org/abs/2003.00306v2
- Date: Thu, 26 Mar 2020 10:05:35 GMT
- Title: Dimension-free convergence rates for gradient Langevin dynamics in RKHS
- Authors: Boris Muzellec, Kanji Sato, Mathurin Massias, Taiji Suzuki
- Abstract summary: Gradient Langevin dynamics (GLD) and SGLD have attracted considerable attention lately.
We provide a convergence analysis of GLD and SGLD when the optimization space is an infinite-dimensional Hilbert space.
- Score: 47.198067414691174
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Gradient Langevin dynamics (GLD) and stochastic GLD (SGLD) have attracted
considerable attention lately, as a way to provide convergence guarantees in a
non-convex setting. However, the known rates grow exponentially with the
dimension of the space. In this work, we provide a convergence analysis of GLD
and SGLD when the optimization space is an infinite dimensional Hilbert space.
More precisely, we derive non-asymptotic, dimension-free convergence rates for
GLD/SGLD when performing regularized non-convex optimization in a reproducing
kernel Hilbert space. Among other tools, the convergence analysis relies on the
properties of a stochastic differential equation, its discrete-time Galerkin
approximation, and the geometric ergodicity of the associated Markov chains.
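For intuition, below is a minimal, hypothetical sketch of the GLD/SGLD update on a finite Galerkin truncation of an RKHS, using random Fourier features for a Gaussian kernel and a simple ridge-regularized squared loss. All names, dimensions, and hyperparameters are illustrative placeholders, not the paper's setup; the toy loss is convex, whereas the paper's analysis covers regularized non-convex objectives with the same update rule.

```python
# Sketch: (stochastic) gradient Langevin dynamics on a finite-dimensional
# Galerkin approximation of an RKHS (here: random Fourier features).
# Update: theta <- theta - eta * grad(loss + lam*||theta||^2) + sqrt(2*eta/beta) * xi
import numpy as np

rng = np.random.default_rng(0)

# Toy regression data (hypothetical).
n, d, m = 200, 5, 100                      # samples, input dim, Galerkin dimension
X = rng.normal(size=(n, d))
y = np.sin(X[:, 0]) + 0.1 * rng.normal(size=n)

# Random Fourier features approximating a Gaussian kernel.
W = rng.normal(size=(d, m))
b = rng.uniform(0.0, 2.0 * np.pi, size=m)
Phi = np.sqrt(2.0 / m) * np.cos(X @ W + b)  # (n, m) feature map

lam, beta, eta, n_steps, batch = 1e-2, 1e3, 1e-2, 2000, 32
theta = np.zeros(m)                         # coefficients in the truncated RKHS basis

for k in range(n_steps):
    idx = rng.choice(n, size=batch, replace=False)   # mini-batch (SGLD); use all rows for GLD
    resid = Phi[idx] @ theta - y[idx]
    grad = Phi[idx].T @ resid / batch + lam * theta  # gradient of regularized loss
    noise = rng.normal(size=m)
    theta = theta - eta * grad + np.sqrt(2.0 * eta / beta) * noise  # Langevin step

print("final training MSE:", np.mean((Phi @ theta - y) ** 2))
```

Raising the inverse temperature beta recovers (stochastic) gradient descent as a limiting case, while the injected Gaussian noise is what yields convergence guarantees beyond convexity.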
Related papers
- Curvature-Independent Last-Iterate Convergence for Games on Riemannian
Manifolds [77.4346324549323]
We show that a step size agnostic to the curvature of the manifold achieves a curvature-independent and linear last-iterate convergence rate.
To the best of our knowledge, the possibility of curvature-independent rates and/or last-iterate convergence has not been considered before.
arXiv Detail & Related papers (2023-06-29T01:20:44Z) - Convergence of mean-field Langevin dynamics: Time and space
discretization, stochastic gradient, and variance reduction [49.66486092259376]
The mean-field Langevin dynamics (MFLD) is a nonlinear generalization of the Langevin dynamics that incorporates a distribution-dependent drift.
Recent works have shown that MFLD globally minimizes an entropy-regularized convex functional in the space of measures.
We provide a framework to prove a uniform-in-time propagation of chaos for MFLD that takes into account the errors due to finite-particle approximation, time-discretization, and gradient approximation.
arXiv Detail & Related papers (2023-06-12T16:28:11Z) - Implicit Bias of Gradient Descent for Logistic Regression at the Edge of
Stability [69.01076284478151]
In machine learning optimization, gradient descent (GD) often operates at the edge of stability (EoS).
This paper studies the convergence and implicit bias of constant-stepsize GD for logistic regression on linearly separable data in the EoS regime.
arXiv Detail & Related papers (2023-05-19T16:24:47Z) - Geometric ergodicity of SGLD via reflection coupling [7.483270274259962]
We consider the ergodicity of the Stochastic Gradient Langevin Dynamics (SGLD) algorithm in non-convex settings.
We prove Wasserstein contraction of SGLD when the target distribution is log-concave only outside some compact set.
arXiv Detail & Related papers (2023-01-17T09:24:32Z) - From Gradient Flow on Population Loss to Learning with Stochastic
Gradient Descent [50.4531316289086]
Stochastic Gradient Descent (SGD) has been the method of choice for learning large-scale non-convex models.
An overarching goal of the paper is to provide general conditions under which SGD converges, assuming that gradient flow (GF) on the population loss converges.
We provide a unified analysis for GD/SGD not only for classical settings like convex losses, but also for more complex problems including phase retrieval and matrix square root.
arXiv Detail & Related papers (2022-10-13T03:55:04Z) - Time-independent Generalization Bounds for SGLD in Non-convex Settings [23.833787505938858]
We establish generalization error bounds for stochastic gradient Langevin dynamics (SGLD) with constant learning rate under dissipativity and smoothness assumptions.
Our analysis also supports variants that use different discretization methods, incorporate Euclidean projections, or use non-isotropic noise.
arXiv Detail & Related papers (2021-11-25T02:31:52Z) - Faster Convergence of Stochastic Gradient Langevin Dynamics for
Non-Log-Concave Sampling [110.88857917726276]
We provide a new convergence analysis of stochastic gradient Langevin dynamics (SGLD) for sampling from a class of distributions that can be non-log-concave.
At the core of our approach is a novel conductance analysis of SGLD using an auxiliary time-reversible Markov Chain.
arXiv Detail & Related papers (2020-10-19T15:23:18Z) - Non-Convex Optimization via Non-Reversible Stochastic Gradient Langevin
Dynamics [27.097121544378528]
Stochastic Gradient Langevin Dynamics (SGLD) is a powerful algorithm for optimizing a non-convex objective.
Non-reversible SGLD (NSGLD) is based on the discretization of a non-reversible Langevin diffusion.
arXiv Detail & Related papers (2020-04-06T17:11:03Z)