Boundary Conditions for Linear Exit Time Gradient Trajectories Around
Saddle Points: Analysis and Algorithm
- URL: http://arxiv.org/abs/2101.02625v1
- Date: Thu, 7 Jan 2021 16:59:15 GMT
- Title: Boundary Conditions for Linear Exit Time Gradient Trajectories Around
Saddle Points: Analysis and Algorithm
- Authors: Rishabh Dixit and Waheed U. Bajwa
- Abstract summary: The paper analyzes discrete gradient trajectories around strict-saddle neighborhoods and derives sufficient conditions under which these trajectories escape such neighborhoods in linear time.
Based on this analysis, it proposes the Curvature Conditioned Regularized Gradient Descent (CCRGD) algorithm and establishes its convergence to a local minimum within a landscape containing at most a given number of strict-saddle points.
- Score: 9.69596041242667
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Gradient-related first-order methods have become the workhorse of large-scale
numerical optimization problems. Many of these problems involve nonconvex
objective functions with multiple saddle points, which necessitates an
understanding of the behavior of discrete trajectories of first-order methods
within the geometrical landscape of these functions. This paper concerns
convergence of first-order discrete methods to a local minimum of nonconvex
optimization problems that comprise strict saddle points within the geometrical
landscape. To this end, it focuses on analysis of discrete gradient
trajectories around saddle neighborhoods, derives sufficient conditions under
which these trajectories can escape strict-saddle neighborhoods in linear time,
explores the contractive and expansive dynamics of these trajectories in
neighborhoods of strict-saddle points that are characterized by gradients of
moderate magnitude, characterizes the non-curving nature of these trajectories,
and highlights the inability of these trajectories to re-enter the
neighborhoods around strict-saddle points after exiting them. Based on these
insights and analyses, the paper then proposes a simple variant of the vanilla
gradient descent algorithm, termed Curvature Conditioned Regularized Gradient
Descent (CCRGD) algorithm, which utilizes a check for an initial boundary
condition to ensure its trajectories can escape strict-saddle neighborhoods in
linear time. Convergence analysis of the CCRGD algorithm, which includes its
rate of convergence to a local minimum within a geometrical landscape that has
a maximum number of strict-saddle points, is also presented in the paper.
Numerical experiments are then provided on a test function as well as a
low-rank matrix factorization problem to evaluate the efficacy of the proposed
algorithm.
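The abstract describes CCRGD only at a high level, so the following is a minimal sketch of how a curvature-conditioned boundary check on the initial point might be combined with vanilla gradient descent. The specific test used here (estimating the nearby saddle, projecting the displacement onto the negative-curvature directions, and nudging the point when that projection is too small) is an illustrative assumption rather than the paper's exact condition, and the names `grad_f`, `hess_f`, `saddle_radius`, and `delta` are hypothetical.

```python
import numpy as np

def ccrgd_sketch(grad_f, hess_f, x0, step=1e-2, tol=1e-8, max_iter=10000,
                 saddle_radius=1e-1, delta=1e-3):
    """Illustrative curvature-conditioned gradient descent (a sketch, not the
    paper's exact CCRGD). Before running vanilla gradient descent, check an
    initial boundary condition: if the starting point lies in an apparent
    strict-saddle neighborhood (small gradient, Hessian with a negative
    eigenvalue) but has almost no component along the negative-curvature
    (escape) directions, nudge it along the most negative eigenvector so the
    subsequent gradient trajectory can leave the neighborhood quickly."""
    x = np.asarray(x0, dtype=float).copy()

    # --- initial boundary-condition check (assumed form) ---
    g, H = grad_f(x), hess_f(x)
    if np.linalg.norm(g) < saddle_radius:            # possibly near a critical point
        eigvals, eigvecs = np.linalg.eigh(H)
        if eigvals[0] < 0:                           # strict saddle: negative curvature
            x_saddle = x - np.linalg.solve(H, g)     # rough Newton estimate of the saddle
            escape_dirs = eigvecs[:, eigvals < 0]    # unstable (escape) subspace
            proj = escape_dirs.T @ (x - x_saddle)    # component along escape directions
            if np.linalg.norm(proj) < delta:         # boundary condition violated
                x = x + delta * eigvecs[:, 0]        # nudge along most negative curvature

    # --- vanilla gradient descent afterwards ---
    for _ in range(max_iter):
        g = grad_f(x)
        if np.linalg.norm(g) < tol:
            break
        x = x - step * g
    return x
```

On a toy objective such as f(x, y) = (x^2 - 1)^2 + y^2, which has a strict saddle at the origin and minima at (+1, 0) and (-1, 0), starting this sketch at (0, 0.01) triggers the nudge and the descent settles near one of the minima, whereas plain gradient descent from the same point converges to the saddle; the paper's own experiments instead use a dedicated test function and a low-rank matrix factorization problem.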
Related papers
- Trust-Region Sequential Quadratic Programming for Stochastic Optimization with Random Models [57.52124921268249]
We propose a trust-region sequential quadratic programming method to find both first- and second-order stationary points.
To converge to first-order stationary points, our method computes a gradient step in each iteration, defined by minimizing an approximation of the objective subject to the problem constraints within a trust region.
To converge to second-order stationary points, our method additionally computes an eigen step to explore the negative curvature of the reduced Hessian matrix (a generic sketch of such an eigen step appears after this list).
arXiv Detail & Related papers (2024-09-24T04:39:47Z) - Riemannian stochastic optimization methods avoid strict saddle points [68.80251170757647]
We show that the methods under study avoid strict saddle points / submanifolds with probability 1.
This result provides an important sanity check as it shows that, almost always, the limit state of an algorithm can only be a local minimizer.
arXiv Detail & Related papers (2023-11-04T11:12:24Z) - Accelerated gradient methods for nonconvex optimization: Escape
trajectories from strict saddle points and convergence to local minima [9.66353475649122]
This paper considers the problem of understanding the convergence behavior of a general class of accelerated gradient methods on nonconvex functions.
It shows that Nesterov's accelerated gradient method (NAG) with variable momentum avoids strict saddle points almost surely.
arXiv Detail & Related papers (2023-07-13T19:11:07Z) - Curvature-Independent Last-Iterate Convergence for Games on Riemannian
Manifolds [77.4346324549323]
We show that a step size agnostic to the curvature of the manifold achieves a curvature-independent and linear last-iterate convergence rate.
To the best of our knowledge, the possibility of curvature-independent rates and/or last-iterate convergence has not been considered before.
arXiv Detail & Related papers (2023-06-29T01:20:44Z) - Convex and Non-convex Optimization Under Generalized Smoothness [69.69521650503431]
The classical analysis of convex and non-convex optimization methods often requires Lipschitz continuity of the gradient, which limits the class of functions to which the analysis applies.
Recent work generalizes this setting via a non-uniform smoothness condition.
arXiv Detail & Related papers (2023-06-02T04:21:59Z) - A Stochastic-Gradient-based Interior-Point Algorithm for Solving Smooth Bound-Constrained Optimization Problems [12.29270365918848]
The proposed algorithm differs from other interior-point methods in how it enforces the bound constraints.
It is shown that, with a careful balance between the projection and step-size sequences, the proposed algorithm attains convergence guarantees in both deterministic and stochastic settings.
arXiv Detail & Related papers (2023-04-28T15:30:43Z) - Subgradient methods near active manifolds: saddle point avoidance, local
convergence, and asymptotic normality [4.709588811674973]
We show that aiming and subgradient approximation fully expose the smooth substructure of the problem.
We prove these properties hold for a wide class of problems, including cone reducible/decomposable functions and generic semialgebraic problems.
The normality results appear to be new even in the most classical setting.
arXiv Detail & Related papers (2021-08-26T15:02:16Z) - Stochasticity helps to navigate rough landscapes: comparing
gradient-descent-based algorithms in the phase retrieval problem [8.164433158925593]
We investigate how gradient-descent-based algorithms, such as gradient descent, its persistent stochastic variant, and Langevin dynamics, navigate the rough landscape of the phase retrieval problem.
We apply dynamical mean-field theory from statistical physics to characterize the full trajectories of these algorithms in their continuous-time limit, with a warm start, and for large system sizes.
arXiv Detail & Related papers (2021-03-08T17:06:18Z) - Conditional gradient methods for stochastically constrained convex
minimization [54.53786593679331]
We propose two novel conditional gradient-based methods for solving structured convex optimization problems.
The most important feature of our framework is that only a subset of the constraints is processed at each iteration.
Our algorithms rely on variance reduction and smoothing used in conjunction with conditional gradient steps, and are accompanied by rigorous convergence guarantees.
arXiv Detail & Related papers (2020-07-07T21:26:35Z) - Exit Time Analysis for Approximations of Gradient Descent Trajectories
Around Saddle Points [9.66353475649122]
This paper puts forth a rigorous geometric analysis of the gradient-descent method around strict-saddle neighborhoods.
It provides a key result that can be used to generate an approximate gradient trajectory for any given initial conditions.
The analysis leads to a linear exit-time solution for the gradient-descent method under certain necessary initial conditions.
arXiv Detail & Related papers (2020-06-01T17:47:00Z) - Second-Order Guarantees in Centralized, Federated and Decentralized
Nonconvex Optimization [64.26238893241322]
Simple algorithms have been shown to lead to good empirical results in many contexts.
Several works have pursued rigorous analytical justification for this behavior in the study of nonconvex optimization problems.
A key insight in these analyses is that perturbations play a critical role in allowing local descent algorithms to escape saddle points efficiently.
arXiv Detail & Related papers (2020-03-31T16:54:22Z)
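The eigen step mentioned in the trust-region SQP entry above rests on a standard idea: when the (reduced) Hessian exhibits negative curvature, step along the eigenvector of its most negative eigenvalue, oriented so as not to increase the objective to first order. The sketch below illustrates that idea for an unconstrained model and is not the method of that paper; `grad`, `hess`, and `radius` are assumed names.

```python
import numpy as np

def negative_curvature_eigen_step(grad, hess, radius):
    """Generic negative-curvature (eigen) step: move along the eigenvector of
    the most negative Hessian eigenvalue, scaled to the trust-region radius and
    oriented so the step does not point uphill to first order."""
    eigvals, eigvecs = np.linalg.eigh(hess)
    if eigvals[0] >= 0:
        return np.zeros_like(grad)        # no negative curvature: nothing to do
    v = eigvecs[:, 0]                     # direction of most negative curvature
    if grad @ v > 0:                      # orient the step downhill (or neutrally)
        v = -v
    return radius * v
```

In a trust-region SQP method the analogous step would be computed in the reduced space defined by the linearized constraints; this sketch ignores constraints entirely.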