Exit Time Analysis for Approximations of Gradient Descent Trajectories
Around Saddle Points
- URL: http://arxiv.org/abs/2006.01106v2
- Date: Tue, 1 Feb 2022 19:28:56 GMT
- Title: Exit Time Analysis for Approximations of Gradient Descent Trajectories
Around Saddle Points
- Authors: Rishabh Dixit, Mert Gurbuzbalaban, and Waheed U. Bajwa
- Abstract summary: This paper puts forth a rigorous geometric analysis of the gradient-descent method around strict-saddle neighborhoods.
It provides a key result that can be used to generate an approximate gradient trajectory for any given initial conditions.
The analysis leads to a linear exit-time solution for the gradient-descent method under certain necessary initial conditions.
- Score: 9.66353475649122
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: This paper considers the problem of understanding the exit time for
trajectories of gradient-related first-order methods from saddle neighborhoods
under some initial boundary conditions. Given the `flat' geometry around saddle
points, first-order methods can struggle to escape these regions in a fast
manner due to the small magnitudes of gradients encountered. In particular,
while it is known that gradient-related first-order methods escape
strict-saddle neighborhoods, existing analytic techniques do not explicitly
leverage the local geometry around saddle points in order to control behavior
of gradient trajectories. It is in this context that this paper puts forth a
rigorous geometric analysis of the gradient-descent method around strict-saddle
neighborhoods using matrix perturbation theory. In doing so, it provides a key
result that can be used to generate an approximate gradient trajectory for any
given initial conditions. In addition, the analysis leads to a linear exit-time
solution for the gradient-descent method under certain necessary initial
conditions, which explicitly bring out the dependence on problem dimension,
conditioning of the saddle neighborhood, and more, for a class of strict-saddle
functions.
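For intuition about the quantity being analyzed, the sketch below runs plain gradient descent on a two-dimensional quadratic whose Hessian has one negative eigenvalue (a strict saddle at the origin) and counts the iterations needed to leave a fixed-radius neighborhood. This is a toy illustration only, not the paper's matrix-perturbation analysis; the function, step size, starting point, and radius are all assumed for the example.

```python
# Toy strict-saddle escape experiment (illustrative assumptions throughout):
# f(x) = 0.5 * x^T H x with H = diag(pos_curv, -neg_curv), so the origin is a
# strict saddle and grad f(x) = H x.
import numpy as np

def exit_time(neg_curv, pos_curv=1.0, step=0.1, radius=1.0, seed_unstable=1e-6):
    """Iterations for gradient descent to leave the ball of radius `radius` around the saddle."""
    H = np.diag([pos_curv, -neg_curv])           # one negative eigenvalue -> strict saddle
    x = np.array([radius / 2.0, seed_unstable])  # start inside the ball, tiny unstable component
    for k in range(200_000):
        if np.linalg.norm(x) > radius:           # trajectory has exited the saddle neighborhood
            return k
        x = x - step * (H @ x)                   # gradient-descent step
    return None                                  # did not exit within the iteration budget

for neg_curv in (1.0, 0.5, 0.1, 0.05):
    print(f"negative curvature {neg_curv:>4}: exits after {exit_time(neg_curv)} iterations")
```

In this toy setting, smaller negative curvature (a flatter unstable direction) means a longer escape, which is the kind of dependence on the conditioning of the saddle neighborhood that the exit-time analysis above makes explicit.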
Related papers
- Riemannian stochastic optimization methods avoid strict saddle points [68.80251170757647]
We show that the policies under study avoid strict saddle points / submanifolds with probability 1.
This result provides an important sanity check as it shows that, almost always, the limit state of an algorithm can only be a local minimizer.
arXiv Detail & Related papers (2023-11-04T11:12:24Z) - Accelerated gradient methods for nonconvex optimization: Escape
trajectories from strict saddle points and convergence to local minima [9.66353475649122]
This paper considers the problem of understanding the behavior of a general class of accelerated gradient methods on nonconvex functions.
It shows that Nesterov's accelerated gradient method (NAG) with variable momentum avoids strict saddle points almost surely.
arXiv Detail & Related papers (2023-07-13T19:11:07Z) - Curvature-Independent Last-Iterate Convergence for Games on Riemannian
Manifolds [77.4346324549323]
We show that a step size agnostic to the curvature of the manifold achieves a curvature-independent and linear last-iterate convergence rate.
To the best of our knowledge, the possibility of curvature-independent rates and/or last-iterate convergence has not been considered before.
arXiv Detail & Related papers (2023-06-29T01:20:44Z) - Convex and Non-convex Optimization Under Generalized Smoothness [69.69521650503431]
The analysis of convex and non-convex optimization methods often requires Lipschitz continuity of the gradient, which limits the class of functions the analysis covers.
Recent work generalizes the gradient setting via the non-uniform smoothness condition.
arXiv Detail & Related papers (2023-06-02T04:21:59Z) - Almost Sure Saddle Avoidance of Stochastic Gradient Methods without the
Bounded Gradient Assumption [11.367487348673793]
We prove that various stochastic gradient methods, including stochastic gradient descent (SGD), the stochastic heavy-ball (SHB) method, and the stochastic Nesterov's accelerated gradient (SNAG) method, almost surely avoid any strict saddle manifold.
This is the first time such results are obtained for SHB and SNAG methods.
arXiv Detail & Related papers (2023-02-15T18:53:41Z) - Min-Max Optimization Made Simple: Approximating the Proximal Point
Method via Contraction Maps [77.8999425439444]
We present a first-order method that admits near-optimal convergence rates for convex/concave min-max problems.
Our work is based on the fact that the update rule of the Proximal Point method can be approximated up to any prescribed accuracy.
arXiv Detail & Related papers (2023-01-10T12:18:47Z) - HOUDINI: Escaping from Moderately Constrained Saddles [14.277428617774875]
We show that (noisy) gradient descent methods can escape from saddle points under a logarithmic number of inequality constraints.
Our results hold for both regular and noisy gradient descent.
arXiv Detail & Related papers (2022-05-27T03:36:27Z) - Point Cloud Denoising via Momentum Ascent in Gradient Fields [72.93429911044903]
A gradient-based method was proposed to estimate the gradient fields from the noisy point clouds using neural networks.
We develop a momentum gradient ascent method that leverages the information of previous iterations in determining the trajectories of the points.
Experiments demonstrate that the proposed method outperforms state-of-the-art approaches with a variety of point clouds, noise types, and noise levels.
arXiv Detail & Related papers (2022-02-21T10:21:40Z) - Leveraging Non-uniformity in First-order Non-convex Optimization [93.6817946818977]
Non-uniform refinement of objective functions leads to Non-uniform Smoothness (NS) and the Non-uniform Łojasiewicz inequality (NL).
The new definitions inspire new geometry-aware first-order methods that converge to global optimality faster than the classical $\Omega(1/t^2)$ lower bounds.
arXiv Detail & Related papers (2021-05-13T04:23:07Z) - Boundary Conditions for Linear Exit Time Gradient Trajectories Around
Saddle Points: Analysis and Algorithm [9.69596041242667]
An analysis of the boundary conditions on initial points under which gradient-descent trajectories have linear exit time from strict-saddle neighborhoods is presented.
An algorithm that exploits these boundary conditions is also presented.
arXiv Detail & Related papers (2021-01-07T16:59:15Z)