Optimality and Stability in Non-Convex Smooth Games
- URL: http://arxiv.org/abs/2002.11875v3
- Date: Thu, 3 Feb 2022 16:29:21 GMT
- Title: Optimality and Stability in Non-Convex Smooth Games
- Authors: Guojun Zhang, Pascal Poupart and Yaoliang Yu
- Abstract summary: An interesting concept is known as the local minimax point, which strongly correlates with the widely-known gradient descent ascent algorithm.
Gradient algorithms can converge to local/global minimax points in the non-degenerate case, but they often fail in general cases.
This implies the necessity of either novel algorithms or concepts beyond saddle points and minimax points in non-convex smooth games.
- Score: 39.365235811852365
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Convergence to a saddle point for convex-concave functions has been studied
for decades, while recent years have seen a surge of interest in non-convex
(zero-sum) smooth games, motivated by their recent wide applications. It
remains an intriguing research challenge how local optimal points are defined
and which algorithm can converge to such points. An interesting concept is
known as the local minimax point, which strongly correlates with the
widely-known gradient descent ascent algorithm. This paper aims to provide a
comprehensive analysis of local minimax points, such as their relation with
other solution concepts and their optimality conditions. We find that local
saddle points can be regarded as a special type of local minimax points, called
uniformly local minimax points, under mild continuity assumptions. In
(non-convex) quadratic games, we show that local minimax points are (in some
sense) equivalent to global minimax points. Finally, we study the stability of
gradient algorithms near local minimax points. Although gradient algorithms can
converge to local/global minimax points in the non-degenerate case, they would
often fail in general cases. This implies the necessity of either novel
algorithms or concepts beyond saddle points and minimax points in non-convex
smooth games.
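As a concrete illustration of the stability issue discussed above (an assumed toy example, not code or analysis from the paper), the sketch below runs plain simultaneous gradient descent ascent on the bilinear game f(x, y) = x * y, whose unique minimax/saddle point is the origin; the iterates spiral outward instead of converging.

```python
# Minimal sketch (not from the paper): simultaneous gradient descent ascent (GDA)
# on the bilinear game f(x, y) = x * y. Its global minimax/saddle point is (0, 0),
# yet the GDA iterates spiral outward, illustrating how gradient algorithms can
# fail to converge near (degenerate) minimax points.

def gda(x, y, lr=0.1, steps=200):
    for _ in range(steps):
        gx, gy = y, x                      # partial gradients of f(x, y) = x * y
        x, y = x - lr * gx, y + lr * gy    # descent step in x, ascent step in y
    return x, y

print(gda(1.0, 1.0))  # magnitudes grow instead of shrinking toward (0, 0)
```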
Related papers
- A first-order method for nonconvex-nonconcave minimax problems under a local Kurdyka-Łojasiewicz condition [1.534667887016089]
We study a class of nonconvex-nonconcave minimax problems in which the inner problem satisfies a local Kurdyka-Łojasiewicz (KL) condition that varies with the outer minimization variable. In particular, as an algorithm progresses toward a stationary point of the problem, the region over which the KL condition holds may shrink, resulting in a potentially more ill-conditioned landscape.
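For reference, a standard exponent form of the local KL inequality (a textbook-style definition assumed here for illustration, not quoted from this paper) requires, for all $x$ near a stationary point $\bar{x}$ with $f(x) > f(\bar{x})$,
$$\mathrm{dist}\bigl(0, \partial f(x)\bigr) \;\ge\; c\,\bigl(f(x) - f(\bar{x})\bigr)^{\theta}, \qquad c > 0,\ \theta \in [0, 1),$$
and the point above is that the neighborhood and constants in such a condition may change with the outer minimization variable.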
arXiv Detail & Related papers (2025-07-02T17:45:10Z) - On the Hardness of Meaningful Local Guarantees in Nonsmooth Nonconvex Optimization [37.41427897807821]
We study the complexity of nonsmooth nonconvex optimization under cryptographic assumptions.
Local algorithms acting on Lipschitz functions cannot, in the worst case, provide meaningful local guarantees in terms of function value in subexponential time.
arXiv Detail & Related papers (2024-09-16T14:35:00Z) - Riemannian stochastic optimization methods avoid strict saddle points [68.80251170757647]
We show that the methods under study avoid strict saddle points / submanifolds with probability 1.
This result provides an important sanity check as it shows that, almost always, the limit state of an algorithm can only be a local minimizer.
arXiv Detail & Related papers (2023-11-04T11:12:24Z) - Two-timescale Extragradient for Finding Local Minimax Points [8.056359341994941]
We show that the two-timescale extragradient method can be a viable solution to minimax problems.
This work provably improves upon all previous results on finding local minimax points.
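Below is a minimal sketch of a two-timescale extragradient update for min_x max_y f(x, y), with an assumed toy quadratic game and step sizes; it illustrates the idea rather than reproducing the paper's exact algorithm or guarantees.

```python
# Illustrative two-timescale extragradient step (not the paper's exact method):
# the max player y uses a larger step size (tau * eta), and both players update
# using gradients evaluated at an extrapolated (look-ahead) point.

def two_timescale_eg_step(x, y, grad_x, grad_y, eta=0.05, tau=10.0):
    # extrapolation (look-ahead) step
    x_half = x - eta * grad_x(x, y)
    y_half = y + tau * eta * grad_y(x, y)
    # update step using gradients at the extrapolated point
    return x - eta * grad_x(x_half, y_half), y + tau * eta * grad_y(x_half, y_half)

# Toy quadratic game f(x, y) = -0.5 * x**2 + 2 * x * y - y**2:
# (0, 0) is a (local and global) minimax point but not a local saddle point.
grad_x = lambda x, y: -x + 2 * y
grad_y = lambda x, y: 2 * x - 2 * y
x, y = 1.0, 1.0
for _ in range(500):
    x, y = two_timescale_eg_step(x, y, grad_x, grad_y)
print(x, y)  # the iterates approach the minimax point at the origin
```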
arXiv Detail & Related papers (2023-05-25T16:52:26Z) - Decentralized Riemannian Algorithm for Nonconvex Minimax Problems [82.50374560598493]
Minimax algorithms have been developed to solve many machine learning problems, such as training neural networks.
In this paper, we propose two types of decentralized Riemannian minimax algorithms.
For the stochastic setting, we propose DRSGDA and prove that our method achieves a gradient complexity guarantee.
arXiv Detail & Related papers (2023-02-08T01:42:45Z) - Efficiently Escaping Saddle Points in Bilevel Optimization [48.925688192913]
Bilevel optimization is one of the fundamental problems in machine learning.
Recent developments in bilevel optimization provide the first fundamental nonasymptotic multi-step analysis.
arXiv Detail & Related papers (2022-02-08T07:10:06Z) - Faster Perturbed Stochastic Gradient Methods for Finding Local Minima [92.99933928528797]
We propose Pullback, a faster perturbed gradient framework for finding local minima.
We show that Pullback with gradient estimators such as SARAH/SP and STORM can find $(\epsilon, \epsilon_H)$-approximate local minima within $\tilde{O}(\epsilon^{-3} + \epsilon_H^{-6})$.
The core idea of our framework is a step-size "pullback" scheme to control the average movement of the gradient evaluations.
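As a rough illustration of the perturbed-gradient idea behind this line of work (a generic sketch with assumed parameters, not the paper's Pullback scheme or its step-size control):

```python
import numpy as np

# Generic perturbed gradient descent sketch (illustrative of this family of methods,
# NOT the paper's Pullback framework): when the gradient is small the iterate may be
# near a saddle point, so isotropic noise is injected to help escape it.

def perturbed_gd(grad, x0, lr=0.05, noise_radius=0.1, grad_tol=1e-3, steps=5000, seed=0):
    rng = np.random.default_rng(seed)
    x = np.asarray(x0, dtype=float)
    for _ in range(steps):
        g = grad(x)
        if np.linalg.norm(g) < grad_tol:
            x = x + noise_radius * rng.standard_normal(x.shape)  # random kick near a stationary point
        else:
            x = x - lr * g
    return x

# Example: f(x0, x1) = x0**4 / 4 - x0**2 / 2 + x1**2 / 2 has a strict saddle at the
# origin and local minima at (+-1, 0); the perturbation lets the iterate leave the saddle.
grad = lambda x: np.array([x[0] ** 3 - x[0], x[1]])
print(perturbed_gd(grad, [0.0, 0.0]))  # ends up near one of the minima (+-1, 0)
```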
arXiv Detail & Related papers (2021-10-25T07:20:05Z) - Escaping Saddle Points in Distributed Newton's Method with Communication
Efficiency and Byzantine Resilience [49.379254729853756]
We consider the problem of optimizing a non-convex loss function (with saddle points) in a distributed framework in the presence of Byzantine machines.
We robustify the cubic-regularized Newton algorithm such that it avoids the saddle points and the fake local minima efficiently.
We obtain theoretical guarantees for our proposed scheme under several approximate settings, including (sub-sampled) gradients and Hessians.
arXiv Detail & Related papers (2021-03-17T03:53:58Z) - Escaping Saddle Points for Nonsmooth Weakly Convex Functions via
Perturbed Proximal Algorithms [0.0]
Main results are based on a novel characterization of $\epsilon$-approximate local minima for nonsmooth functions.
We show that under standard assumptions, the perturbed proximal point, perturbed proximal gradient and perturbed proximal linear algorithms find $\epsilon$-approximate local minima for nonsmooth weakly convex functions.
arXiv Detail & Related papers (2021-02-04T19:17:13Z) - The Complexity of Constrained Min-Max Optimization [29.57458485068705]
We show that an approximate local min-max point of large enough approximation is guaranteed to exist.
More importantly, we show that computing an approximate fixed point of Gradient Descent/Ascent is PPAD-complete.
Our result is the first to show an exponential separation between two fundamental optimization problems.
arXiv Detail & Related papers (2020-09-21T05:54:12Z)