Riemannian stochastic optimization methods avoid strict saddle points
- URL: http://arxiv.org/abs/2311.02374v1
- Date: Sat, 4 Nov 2023 11:12:24 GMT
- Title: Riemannian stochastic optimization methods avoid strict saddle points
- Authors: Ya-Ping Hsieh and Mohammad Reza Karimi and Andreas Krause and
Panayotis Mertikopoulos
- Abstract summary: We show that the policies under study avoid strict saddle points / submanifolds with probability 1.
This result provides an important sanity check as it shows that, almost always, the limit state of an algorithm can only be a local minimizer.
- Score: 68.80251170757647
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Many modern machine learning applications - from online principal component
analysis to covariance matrix identification and dictionary learning - can be
formulated as minimization problems on Riemannian manifolds, and are typically
solved with a Riemannian stochastic gradient method (or some variant thereof).
However, in many cases of interest, the resulting minimization problem is not
geodesically convex, so the convergence of the chosen solver to a desirable
solution - i.e., a local minimizer - is by no means guaranteed. In this paper,
we study precisely this question, that is, whether stochastic Riemannian
optimization algorithms are guaranteed to avoid saddle points with probability
1. For generality, we study a family of retraction-based methods which, in
addition to having a potentially much lower per-iteration cost relative to
Riemannian gradient descent, include other widely used algorithms, such as
natural policy gradient methods and mirror descent in ordinary convex spaces.
In this general setting, we show that, under mild assumptions for the ambient
manifold and the oracle providing gradient information, the policies under
study avoid strict saddle points / submanifolds with probability 1, from any
initial condition. This result provides an important sanity check for the use
of gradient methods on manifolds as it shows that, almost always, the limit
state of a stochastic Riemannian algorithm can only be a local minimizer.
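To make the family of methods studied here concrete, the following is a minimal sketch (not the authors' code) of a retraction-based Riemannian stochastic gradient step on the unit sphere, using the projection retraction $R_x(v) = (x+v)/\|x+v\|$; the toy PCA-style objective, noise model, and $1/t$ step-size schedule are illustrative assumptions.

```python
import numpy as np

def sphere_retraction(x, v):
    """Projection retraction on the unit sphere: pull x + v back onto the sphere."""
    y = x + v
    return y / np.linalg.norm(y)

def riemannian_sgd_step(x, euclidean_grad, step_size, rng, noise_scale=0.1):
    """One retraction-based stochastic gradient step on the sphere.

    A noisy Euclidean gradient (mimicking a stochastic first-order oracle) is
    projected onto the tangent space at x, giving the Riemannian gradient for
    the induced metric, and the retraction maps the update back to the sphere.
    """
    noisy_grad = euclidean_grad + noise_scale * rng.standard_normal(x.shape)
    riem_grad = noisy_grad - np.dot(noisy_grad, x) * x  # tangent-space projection
    return sphere_retraction(x, -step_size * riem_grad)

# Illustrative toy problem (an assumption, not from the paper): minimize
# f(x) = -x^T A x on the sphere, whose minimizers are leading eigenvectors of A.
rng = np.random.default_rng(0)
A = rng.standard_normal((5, 5)); A = (A + A.T) / 2
x = rng.standard_normal(5); x /= np.linalg.norm(x)
for t in range(1, 2001):
    grad = -2 * A @ x                          # Euclidean gradient of f at x
    x = riemannian_sgd_step(x, grad, step_size=1.0 / t, rng=rng)
```

With a vanishing Robbins-Monro step-size schedule such as the $1/t$ rule above, the paper's result says that trajectories of this kind avoid strict saddle points almost surely, so their limit can only be a local minimizer.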
Related papers
- Trust-Region Sequential Quadratic Programming for Stochastic Optimization with Random Models [57.52124921268249]
We propose a Trust-Region Sequential Quadratic Programming method to find both first- and second-order stationary points.
To converge to first-order stationary points, our method computes a gradient step in each iteration, defined by minimizing an approximation of the objective subject to a trust-region constraint.
To converge to second-order stationary points, our method additionally computes an eigen step to explore the negative curvature of the reduced Hessian matrix.
arXiv Detail & Related papers (2024-09-24T04:39:47Z)
- Curvature-Independent Last-Iterate Convergence for Games on Riemannian Manifolds [77.4346324549323]
We show that a step size agnostic to the curvature of the manifold achieves a curvature-independent and linear last-iterate convergence rate.
To the best of our knowledge, the possibility of curvature-independent rates and/or last-iterate convergence has not been considered before.
arXiv Detail & Related papers (2023-06-29T01:20:44Z)
- Manifold Free Riemannian Optimization [4.484251538832438]
A principled framework for solving optimization problems with a smooth manifold $\mathcal{M}$ is proposed.
We use a noiseless sample set of the cost function $(x_i, y_i) \in \mathcal{M} \times \mathbb{R}$ and the intrinsic dimension of the manifold $\mathcal{M}$.
arXiv Detail & Related papers (2022-09-07T16:19:06Z)
- Riemannian Natural Gradient Methods [21.14740680011498]
We introduce the notion of the Fisher information matrix in the manifold setting; the resulting method can be viewed as a natural extension of the natural gradient method.
We establish the almost-sure global convergence of our proposed method under standard assumptions.
Numerical experiments on applications arising from machine learning demonstrate the advantages of the proposed method over state-of-the-art ones.
arXiv Detail & Related papers (2022-07-15T04:33:10Z)
- The Dynamics of Riemannian Robbins-Monro Algorithms [101.29301565229265]
We propose a family of Riemannian algorithms generalizing and extending the seminal stochastic approximation framework of Robbins and Monro.
Compared to their Euclidean counterparts, Riemannian algorithms are much less understood due to the lack of a global linear structure on the manifold.
We provide a general template of almost sure convergence results that mirrors and extends the existing theory for Euclidean Robbins-Monro schemes.
arXiv Detail & Related papers (2022-06-14T12:30:11Z)
- Faster One-Sample Stochastic Conditional Gradient Method for Composite Convex Minimization [61.26619639722804]
We propose a conditional gradient method (CGM) for minimizing convex finite-sum objectives formed as a sum of smooth and non-smooth terms.
The proposed method, equipped with a stochastic average gradient (SAG) estimator, requires only one sample per iteration. Nevertheless, it guarantees fast convergence rates on par with more sophisticated variance reduction techniques.
arXiv Detail & Related papers (2022-02-26T19:10:48Z)
- A Riemannian smoothing steepest descent method for non-Lipschitz optimization on submanifolds [15.994495285950293]
A smoothing steepest descent method is proposed to minimize a non-Lipschitz function on submanifolds.
We prove that any accumulation point of the sequence generated by the method is a limiting stationary point of the original non-Lipschitz problem.
Numerical experiments are conducted to demonstrate the efficiency of the proposed method.
arXiv Detail & Related papers (2021-04-09T05:38:28Z)
- Gradient-Free Methods for Saddle-Point Problem [125.99533416395765]
We generalize the approach of Gasnikov et al. (2017), which allows one to solve (stochastic) convex optimization problems with an inexact gradient-free oracle.
Our approach reduces the required number of oracle calls by a factor of $\frac{n}{\log n}$.
In the second part of the paper, we analyze the case when such an assumption cannot be made and propose a general approach for modifying the method to handle this setting.
arXiv Detail & Related papers (2020-05-12T16:44:27Z)
- Stochastic Zeroth-order Riemannian Derivative Estimation and Optimization [15.78743548731191]
We propose an oracle version of the Gaussian smoothing function to overcome the difficulty caused by the non-linearity of the manifold.
We demonstrate the applicability of our algorithms through numerical results and real-world applications, including black-box stiffness control for robotics and black-box attacks on neural networks.
arXiv Detail & Related papers (2020-03-25T06:58:19Z)
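For the zeroth-order entry above, a minimal Euclidean sketch of the underlying Gaussian-smoothing idea is given below (the cited paper adapts this estimator to the Riemannian setting via an oracle construction; the toy objective and parameters are illustrative assumptions, not the authors' code).

```python
import numpy as np

def gaussian_smoothing_grad(f, x, mu=1e-2, num_samples=20, rng=None):
    """Two-point Gaussian-smoothing gradient estimate of f at x.

    Averages (f(x + mu*u) - f(x)) / mu * u over Gaussian directions u, which is
    an unbiased estimator of the gradient of the smoothed function
    f_mu(x) = E[f(x + mu*u)], so f is accessed only through function values.
    """
    rng = rng or np.random.default_rng()
    fx = f(x)
    g = np.zeros_like(x)
    for _ in range(num_samples):
        u = rng.standard_normal(x.shape)
        g += (f(x + mu * u) - fx) / mu * u
    return g / num_samples

# Illustrative use on an assumed toy objective: minimize ||x - 1||^2 without gradients.
f = lambda x: float(np.sum((x - 1.0) ** 2))
x = np.zeros(4)
rng = np.random.default_rng(0)
for _ in range(200):
    x -= 0.1 * gaussian_smoothing_grad(f, x, rng=rng)
# x is now close to the minimizer [1, 1, 1, 1], up to estimator noise.
```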