Non-Convex Optimization via Non-Reversible Stochastic Gradient Langevin
Dynamics
- URL: http://arxiv.org/abs/2004.02823v2
- Date: Tue, 2 Jun 2020 20:49:37 GMT
- Title: Non-Convex Optimization via Non-Reversible Stochastic Gradient Langevin
Dynamics
- Authors: Yuanhan Hu, Xiaoyu Wang, Xuefeng Gao, Mert Gurbuzbalaban, Lingjiong
Zhu
- Abstract summary: Stochastic Gradient Langevin Dynamics (SGLD) is a powerful algorithm for optimizing a non-convex objective.
NSGLD is based on a discretization of the non-reversible Langevin diffusion.
- Score: 27.097121544378528
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Stochastic Gradient Langevin Dynamics (SGLD) is a powerful algorithm for
optimizing a non-convex objective, where a controlled and properly scaled
Gaussian noise is added to the stochastic gradients to steer the iterates
towards a global minimum. SGLD is based on the overdamped Langevin diffusion
which is reversible in time. By adding an anti-symmetric matrix to the drift
term of the overdamped Langevin diffusion, one gets a non-reversible diffusion
that converges to the same stationary distribution with a faster convergence
rate. In this paper, we study the non-reversible Stochastic Gradient Langevin
Dynamics (NSGLD) which is based on discretization of the non-reversible
Langevin diffusion. We provide finite-time performance bounds for the global
convergence of NSGLD for solving stochastic non-convex optimization problems.
Our results lead to non-asymptotic guarantees for both population and empirical
risk minimization problems. Numerical experiments for Bayesian independent
component analysis and neural network models show that NSGLD can outperform
SGLD with proper choices of the anti-symmetric matrix.
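To make the update concrete, below is a minimal sketch of the NSGLD iterate described in the abstract, i.e. an Euler discretization of the non-reversible diffusion dX_t = -(I + J) grad f(X_t) dt + sqrt(2/beta) dB_t with an anti-symmetric matrix J. The step size, inverse temperature, the particular J, and the toy objective are illustrative assumptions, not the paper's tuned choices.

```python
import numpy as np

rng = np.random.default_rng(0)

def nsgld_step(x, stoch_grad, J, eta=1e-3, beta=1e3):
    """One NSGLD iterate: x - eta*(I + J) g(x) + sqrt(2*eta/beta) * N(0, I).

    J = 0 recovers the usual SGLD update; any anti-symmetric J (J.T == -J)
    preserves the stationary distribution of the continuous-time diffusion.
    """
    g = stoch_grad(x)                        # noisy gradient of the objective
    drift = g + J @ g                        # (I + J) applied to the gradient
    noise = rng.standard_normal(x.shape)     # fresh Gaussian noise each step
    return x - eta * drift + np.sqrt(2.0 * eta / beta) * noise

# Toy usage on a non-convex objective f(x) = sum((x^2 - 1)^2) in 2D.
J = np.array([[0.0, 1.0],
              [-1.0, 0.0]])                  # simplest 2x2 anti-symmetric matrix
grad_f = lambda x: 4.0 * x * (x**2 - 1.0)    # exact gradient as a stand-in for a stochastic one
x = np.array([2.0, -2.0])
for _ in range(5000):
    x = nsgld_step(x, grad_f, J)
```

With J = 0 the same loop runs plain SGLD, which is the baseline the paper's experiments compare against.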
Related papers
- Faster Sampling via Stochastic Gradient Proximal Sampler [28.422547264326468]
Stochastic Proximal Samplers (SPS) for sampling from non-log-concave distributions are studied.
We show that the convergence to the target distribution can be guaranteed as long as the algorithm trajectory is bounded.
We provide two implementable variants based on stochastic gradient Langevin dynamics (SGLD) and the Metropolis-adjusted Langevin algorithm (MALA), giving rise to SPS-SGLD and SPS-MALA.
arXiv Detail & Related papers (2024-05-27T00:53:18Z) - Symmetric Mean-field Langevin Dynamics for Distributional Minimax
Problems [78.96969465641024]
We extend mean-field Langevin dynamics to minimax optimization over probability distributions for the first time with symmetric and provably convergent updates.
We also study time and particle discretization regimes and prove a new uniform-in-time propagation of chaos result.
arXiv Detail & Related papers (2023-12-02T13:01:29Z) - Convergence of mean-field Langevin dynamics: Time and space
discretization, stochastic gradient, and variance reduction [49.66486092259376]
The mean-field Langevin dynamics (MFLD) is a nonlinear generalization of the Langevin dynamics that incorporates a distribution-dependent drift.
Recent works have shown that MFLD globally minimizes an entropy-regularized convex functional in the space of measures.
We provide a framework to prove a uniform-in-time propagation of chaos for MFLD that takes into account the errors due to finite-particle approximation, time-discretization, and gradient approximation.
arXiv Detail & Related papers (2023-06-12T16:28:11Z) - Constrained Langevin Algorithms with L-mixing External Random Variables [0.0]
Langevin sampling algorithms are gradient-based methods augmented with additive noise.
Non-asymptotic analysis of Langevin algorithms for learning has been extensively explored.
We show that the Langevin algorithm achieves an explicit non-asymptotic deviation bound from its target distribution.
arXiv Detail & Related papers (2022-05-27T18:44:35Z) - Improved Convergence Rate of Stochastic Gradient Langevin Dynamics with
Variance Reduction and its Application to Optimization [50.83356836818667]
Stochastic Gradient Langevin Dynamics is one of the most fundamental algorithms for solving non-convex optimization problems.
In this paper, two variants of this kind are studied, namely the Variance Reduced Langevin Dynamics and the Recursive Gradient Langevin Dynamics; a generic variance-reduced Langevin step is sketched after this list.
arXiv Detail & Related papers (2022-03-30T11:39:00Z) - Nonconvex Stochastic Scaled-Gradient Descent and Generalized Eigenvector
Problems [98.34292831923335]
Motivated by the problem of online correlation analysis, we propose the Stochastic Scaled-Gradient Descent (SSD) algorithm.
We bring these ideas together in an application to online correlation analysis, deriving for the first time an optimal one-time-scale algorithm with an explicit rate of local convergence to normality.
arXiv Detail & Related papers (2021-12-29T18:46:52Z) - A Contour Stochastic Gradient Langevin Dynamics Algorithm for
Simulations of Multi-modal Distributions [17.14287157979558]
We propose an adaptively weighted stochastic gradient Langevin dynamics (SGLD) algorithm for learning in big data statistics.
The proposed algorithm is tested on benchmark datasets including CIFAR100.
arXiv Detail & Related papers (2020-10-19T19:20:47Z) - Faster Convergence of Stochastic Gradient Langevin Dynamics for
Non-Log-Concave Sampling [110.88857917726276]
We provide a new convergence analysis of stochastic gradient Langevin dynamics (SGLD) for sampling from a class of distributions that can be non-log-concave.
At the core of our approach is a novel conductance analysis of SGLD using an auxiliary time-reversible Markov Chain.
arXiv Detail & Related papers (2020-10-19T15:23:18Z) - Dimension-free convergence rates for gradient Langevin dynamics in RKHS [47.198067414691174]
Gradient Langevin dynamics (GLD) and SGLD have attracted considerable attention lately.
We provide a convergence analysis of GLD and SGLD when the underlying space is infinite-dimensional.
arXiv Detail & Related papers (2020-02-29T17:14:13Z)