Stochastic Gradient Langevin Dynamics with Variance Reduction
- URL: http://arxiv.org/abs/2102.06759v1
- Date: Fri, 12 Feb 2021 20:22:56 GMT
- Title: Stochastic Gradient Langevin Dynamics with Variance Reduction
- Authors: Zhishen Huang, Stephen Becker
- Abstract summary: Stochastic gradient Langevin dynamics (SGLD) has gained the attention of optimization researchers due to its global optimization properties. This paper proves an improved convergence property to local minimizers of nonconvex objective functions using SGLD accelerated by variance reduction.
- Score: 6.243995448840211
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Stochastic gradient Langevin dynamics (SGLD) has gained the attention of
optimization researchers due to its global optimization properties. This paper
proves an improved convergence property to local minimizers of nonconvex
objective functions using SGLD accelerated by variance reductions. Moreover, we
prove an ergodicity property of the SGLD scheme, which gives insights on its
potential to find global minimizers of nonconvex objectives.
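The scheme analyzed in the abstract pairs the SGLD update with a variance-reduced gradient estimator. The following is a minimal sketch, assuming a finite-sum objective f = (1/n) sum_i f_i and an SVRG-style estimator; the function names and the toy quadratic are illustrative, not taken from the paper:

```python
import numpy as np

def vr_sgld_step(theta, snapshot, full_grad_snap, grad_i, idx,
                 step_size, inv_temp, rng):
    """One SGLD step driven by an SVRG-style variance-reduced gradient:
    the minibatch gradient at theta is corrected by the same minibatch
    evaluated at a snapshot point plus the snapshot's full gradient
    (an unbiased estimator with reduced variance), followed by the
    injected Gaussian noise characteristic of Langevin dynamics."""
    g = np.mean([grad_i(theta, i) for i in idx], axis=0)
    g_snap = np.mean([grad_i(snapshot, i) for i in idx], axis=0)
    vr_grad = g - g_snap + full_grad_snap
    noise = rng.standard_normal(theta.shape)
    return theta - step_size * vr_grad + np.sqrt(2.0 * step_size / inv_temp) * noise

# Toy run on f_i(x) = 0.5 * ||x - a_i||^2, so grad_i(x) = x - a_i and the
# minimizer of the averaged objective is mean(a).
rng = np.random.default_rng(0)
a = rng.standard_normal((200, 2))
grad_i = lambda x, i: x - a[i]
theta = snapshot = np.zeros(2)
for t in range(2000):
    if t % 100 == 0:                      # refresh the snapshot periodically
        snapshot = theta.copy()
        full_grad_snap = snapshot - a.mean(axis=0)
    theta = vr_sgld_step(theta, snapshot, full_grad_snap, grad_i,
                         rng.choice(200, 10), 0.01, 100.0, rng)
```

For these quadratic components the minibatch correction cancels exactly, so the estimator equals the full gradient; in general it is unbiased, with variance that shrinks as the iterate approaches the snapshot.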
Related papers
- Characterizing Dynamical Stability of Stochastic Gradient Descent in Overparameterized Learning [0.0]
We characterize global minima that are dynamically stable/unstable for both deterministic and stochastic gradient descent.
In particular, we introduce a characteristic Lyapunov exponent which depends on the local dynamics around a global minimum.
arXiv Detail & Related papers (2024-07-29T17:40:04Z)
- Langevin Dynamics: A Unified Perspective on Optimization via Lyapunov Potentials [15.718093624695552]
We analyze the convergence of Stochastic Gradient Langevin Dynamics (SGLD) to global minima through the lens of Lyapunov potentials.
We provide 1) improved guarantees for SGLD for optimization in the settings of previous works, 2) the first finite gradient complexity guarantee for SGLD, and 3) a proof that if continuous-time Langevin dynamics succeeds for optimization, then discrete-time SGLD succeeds under mild regularity assumptions.
arXiv Detail & Related papers (2024-07-05T05:34:10Z)
- Convergence of mean-field Langevin dynamics: Time and space discretization, stochastic gradient, and variance reduction [49.66486092259376]
The mean-field Langevin dynamics (MFLD) is a nonlinear generalization of the Langevin dynamics that incorporates a distribution-dependent drift.
Recent works have shown that MFLD globally minimizes an entropy-regularized convex functional in the space of measures.
We provide a framework to prove a uniform-in-time propagation of chaos for MFLD that takes into account the errors due to finite-particle approximation, time-discretization, and gradient approximation.
arXiv Detail & Related papers (2023-06-12T16:28:11Z)
- Implicit Bias of Gradient Descent for Logistic Regression at the Edge of Stability [69.01076284478151]
In machine learning optimization, gradient descent (GD) often operates at the edge of stability (EoS).
This paper studies the convergence and implicit bias of constant-stepsize GD for logistic regression on linearly separable data in the EoS regime.
arXiv Detail & Related papers (2023-05-19T16:24:47Z)
- Stability and Generalization Analysis of Gradient Methods for Shallow Neural Networks [59.142826407441106]
We study the generalization behavior of shallow neural networks (SNNs) by leveraging the concept of algorithmic stability.
We consider gradient descent (GD) and stochastic gradient descent (SGD) to train SNNs, and for both we develop consistent excess risk bounds.
arXiv Detail & Related papers (2022-09-19T18:48:00Z)
- Low-Precision Stochastic Gradient Langevin Dynamics [70.69923368584588]
We provide the first study of low-precision Stochastic Gradient Langevin Dynamics (SGLD), showing that its costs can be significantly reduced without sacrificing performance.
We develop a new quantization function for SGLD that preserves the variance in each update step.
We demonstrate that low-precision SGLD achieves comparable performance to full-precision SGLD with only 8 bits on a variety of deep learning tasks.
arXiv Detail & Related papers (2022-06-20T17:25:41Z)
- Improved Convergence Rate of Stochastic Gradient Langevin Dynamics with Variance Reduction and its Application to Optimization [50.83356836818667]
Stochastic Gradient Langevin Dynamics is one of the most fundamental algorithms for solving nonconvex optimization problems.
In this paper, we study two variants of this kind, namely the Variance Reduced Langevin Dynamics and the Recursive Gradient Langevin Dynamics.
arXiv Detail & Related papers (2022-03-30T11:39:00Z)
- On the Double Descent of Random Features Models Trained with SGD [78.0918823643911]
We study properties of random features (RF) regression in high dimensions optimized by stochastic gradient descent (SGD).
We derive precise non-asymptotic error bounds of RF regression under both constant and adaptive step-size SGD setting.
We observe the double descent phenomenon both theoretically and empirically.
arXiv Detail & Related papers (2021-10-13T17:47:39Z)
- Non-Convex Optimization via Non-Reversible Stochastic Gradient Langevin Dynamics [27.097121544378528]
Stochastic Gradient Langevin Dynamics (SGLD) is a powerful algorithm for optimizing a non-convex objective function.
Non-reversible SGLD (NSGLD) is based on discretization of a non-reversible Langevin diffusion.
arXiv Detail & Related papers (2020-04-06T17:11:03Z)
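NSGLD replaces the reversible drift -grad f with -(I + J) grad f for a skew-symmetric J, which breaks detailed balance while keeping the Gibbs measure stationary in continuous time. A minimal sketch under that standard construction (names and the toy objective are illustrative, not the paper's code):

```python
import numpy as np

def nsgld_step(theta, grad, step_size, inv_temp, J, rng):
    """One non-reversible SGLD step: the drift is preconditioned by (I + J)
    with J skew-symmetric, which breaks detailed balance but preserves the
    Gibbs stationary distribution in the continuous-time limit."""
    drift = (np.eye(len(theta)) + J) @ grad(theta)
    noise = rng.standard_normal(theta.shape)
    return theta - step_size * drift + np.sqrt(2.0 * step_size / inv_temp) * noise

# Toy run on f(x) = 0.5 * ||x||^2 with a 2-D skew-symmetric perturbation.
rng = np.random.default_rng(2)
J = np.array([[0.0, 1.0], [-1.0, 0.0]])   # J^T = -J
grad = lambda x: x                         # gradient of 0.5 * ||x||^2
theta = np.ones(2)
for _ in range(2000):
    theta = nsgld_step(theta, grad, step_size=0.01, inv_temp=100.0, J=J, rng=rng)
```

The skew-symmetric term adds a rotational component to the drift; it does not change where the chain concentrates, but non-reversible diffusions can mix faster than their reversible counterparts.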
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information provided and is not responsible for any consequences of its use.