STay-ON-the-Ridge: Guaranteed Convergence to Local Minimax Equilibrium
in Nonconvex-Nonconcave Games
- URL: http://arxiv.org/abs/2210.09769v1
- Date: Tue, 18 Oct 2022 11:33:30 GMT
- Title: STay-ON-the-Ridge: Guaranteed Convergence to Local Minimax Equilibrium
in Nonconvex-Nonconcave Games
- Authors: Constantinos Daskalakis, Noah Golowich, Stratis Skoulakis and Manolis
Zampetakis
- Abstract summary: We propose a method that is guaranteed to converge to a local min-max equilibrium for smooth nonconvex-nonconcave objectives.
Our method is designed to satisfy a topological property that guarantees the avoidance of limit cycles, motivated by the topological nature of the problem.
- Score: 34.04699387005116
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Min-max optimization problems involving nonconvex-nonconcave objectives have
found important applications in adversarial training and other multi-agent
learning settings. Yet, no known gradient descent-based method is guaranteed to
converge to (even local notions of) min-max equilibrium in the
nonconvex-nonconcave setting. For all known methods, there exist relatively
simple objectives for which they cycle or exhibit other undesirable behavior
different from converging to a point, let alone to some game-theoretically
meaningful one~\cite{flokas2019poincare,hsieh2021limits}. The only known
convergence guarantees hold under the strong assumption that the initialization
is very close to a local min-max equilibrium~\cite{wang2019solving}. Moreover,
the afore-described challenges are not just theoretical curiosities. All known
methods are unstable in practice, even in simple settings.
We propose the first method that is guaranteed to converge to a local min-max
equilibrium for smooth nonconvex-nonconcave objectives. Our method is
second-order and provably escapes limit cycles as long as it is initialized at
an easy-to-find initial point. Both the definition of our method and its
convergence analysis are motivated by the topological nature of the problem. In
particular, our method is not designed to decrease some potential function,
such as the distance of its iterate from the set of local min-max equilibria or
the projected gradient of the objective, but is designed to satisfy a
topological property that guarantees the avoidance of cycles and implies its
convergence.
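To make the cycling phenomenon from the abstract concrete, here is a minimal Python sketch. The bilinear objective f(x, y) = xy, the step size, and the iteration count are illustrative choices, not taken from the paper; this shows the classic failure case for gradient descent-ascent, not the STay-ON-the-Ridge method itself.

```python
import numpy as np

# On f(x, y) = x * y the unique min-max equilibrium is (0, 0), yet
# simultaneous gradient descent-ascent (GDA) spirals away from it:
# each step applies the scaled rotation [[1, -lr], [lr, 1]], whose
# eigenvalues have modulus sqrt(1 + lr^2) > 1.

def gda(x, y, lr=0.1, steps=200):
    """Simultaneous GDA: descend in x, ascend in y."""
    for _ in range(steps):
        gx, gy = y, x                    # grad_x f = y, grad_y f = x
        x, y = x - lr * gx, y + lr * gy
    return x, y

x0, y0 = 1.0, 1.0
xT, yT = gda(x0, y0)
# Distance from the equilibrium grows by sqrt(1 + lr^2) per step.
print(np.hypot(x0, y0), "->", np.hypot(xT, yT))
```

At (0, 0) the gradient vanishes and the standard second-order local min-max conditions hold (both diagonal Hessian blocks of f are zero), so the point GDA fails to find here is exactly the kind of equilibrium a method with the paper's guarantee must reach.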
Related papers
- Trust-Region Sequential Quadratic Programming for Stochastic Optimization with Random Models [57.52124921268249]
We propose a Trust-Region Sequential Quadratic Programming method to find both first- and second-order stationary points.
To converge to first-order stationary points, our method computes a gradient step in each iteration, defined by minimizing a quadratic approximation of the objective subject to a trust-region constraint.
To converge to second-order stationary points, our method additionally computes an eigen step to explore the negative curvature of the reduced Hessian matrix.
arXiv Detail & Related papers (2024-09-24T04:39:47Z)
- Can Decentralized Stochastic Minimax Optimization Algorithms Converge Linearly for Finite-Sum Nonconvex-Nonconcave Problems? [56.62372517641597]
Decentralized minimax optimization has been actively studied in the past few years due to its applications in a wide range of machine learning problems.
This paper develops two novel decentralized minimax optimization algorithms for finite-sum nonconvex-nonconcave problems.
arXiv Detail & Related papers (2023-04-24T02:19:39Z)
- Beyond first-order methods for non-convex non-concave min-max optimization [6.097141897343098]
We show improvements attainable beyond the monotone and Minty condition settings.
Specifically, we provide a new understanding of discrete-time min-max optimization.
We also present a continuous-time analysis, with guarantees that match those for the discrete-time setting.
arXiv Detail & Related papers (2023-04-17T15:55:40Z)
- PAPAL: A Provable PArticle-based Primal-Dual ALgorithm for Mixed Nash Equilibrium [58.26573117273626]
We consider the problem of computing mixed Nash equilibria for nonconvex-nonconcave objective functions in two-player zero-sum continuous games.
We present novel insights into particle-based algorithms for continuous distribution strategies.
arXiv Detail & Related papers (2023-03-02T05:08:15Z)
- Cyclic Block Coordinate Descent With Variance Reduction for Composite Nonconvex Optimization [26.218670461973705]
We propose cyclic block coordinate methods with non-asymptotic gradient norm guarantees for solving composite nonconvex optimization problems.
Our results demonstrate the efficacy of the proposed cyclic variance-reduced scheme in training deep neural nets.
arXiv Detail & Related papers (2022-12-09T19:17:39Z)
- Stochastic Gradient Descent-Ascent and Consensus Optimization for Smooth Games: Convergence Analysis under Expected Co-coercivity [49.66890309455787]
We introduce the expected co-coercivity condition, explain its benefits, and provide the first last-iterate convergence guarantees for stochastic gradient descent-ascent (SGDA) and stochastic consensus optimization (SCO).
We prove linear convergence of both methods to a neighborhood of the solution when they use a constant step size.
Our convergence guarantees hold under the arbitrary sampling paradigm, and we give insights into the complexity of minibatching.
arXiv Detail & Related papers (2021-06-30T18:32:46Z)
- On the Convergence of Stochastic Extragradient for Bilinear Games with Restarted Iteration Averaging [96.13485146617322]
We present an analysis of the Stochastic ExtraGradient (SEG) method with constant step size, and present variations of the method that yield favorable convergence (see the sketch after this list).
We prove that when augmented with iteration averaging, SEG provably converges to the Nash equilibrium, and such a rate is provably accelerated by incorporating a scheduled restarting procedure.
arXiv Detail & Related papers (2021-06-30T17:51:36Z)
- Direct-Search for a Class of Stochastic Min-Max Problems [0.0]
We investigate the use of direct-search methods that only access the objective function through an oracle.
We prove convergence of this technique under mild assumptions.
Our analysis is the first to address the convergence of a direct-search method for min-max objectives in a stochastic setting.
arXiv Detail & Related papers (2021-02-22T22:23:58Z)
- Stochastic optimization with momentum: convergence, fluctuations, and traps avoidance [0.0]
In this paper, a general stochastic optimization procedure is studied, unifying several variants of stochastic gradient descent such as, among others, the stochastic heavy ball method, the Stochastic Nesterov Accelerated Gradient (S-NAG), and the widely used Adam method.
The avoidance of traps is studied by viewing the algorithm as a noisy discretization of a non-autonomous ordinary differential equation.
arXiv Detail & Related papers (2020-12-07T19:14:49Z)
- Efficient Methods for Structured Nonconvex-Nonconcave Min-Max Optimization [98.0595480384208]
We propose a generalization of the extragradient method which provably converges to a stationary point.
The algorithm applies not only to Euclidean spaces, but also to general $\ell_p$-normed finite-dimensional vector spaces.
arXiv Detail & Related papers (2020-10-31T21:35:42Z)
- The Landscape of the Proximal Point Method for Nonconvex-Nonconcave Minimax Optimization [10.112779201155005]
Minimax optimization has become a central tool in machine learning with applications in robust optimization, reinforcement learning, GANs, etc.
These applications are often nonconvex-nonconcave, but the existing theory is unable to identify and deal with the fundamental difficulties this poses.
arXiv Detail & Related papers (2020-06-15T18:17:00Z)
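As promised in the restarted-iteration-averaging entry above, here is a minimal hedged sketch of stochastic extragradient (SEG) with a constant step size on a noisy bilinear game. The objective f(x, y) = xy, the Gaussian noise level, and the step size are illustrative assumptions; the sketch shows only the qualitative effect of iterate averaging, not that paper's exact algorithm or restart schedule.

```python
import numpy as np

rng = np.random.default_rng(0)

def noisy_grad(x, y, sigma=0.1):
    # Stochastic oracle for f(x, y) = x * y: grad_x = y, grad_y = x,
    # each corrupted by Gaussian noise of standard deviation sigma.
    return y + sigma * rng.normal(), x + sigma * rng.normal()

def seg(x, y, lr=0.05, steps=2000):
    """Stochastic extragradient with constant step size and running average."""
    avg = np.array([x, y], dtype=float)
    for k in range(1, steps + 1):
        gx, gy = noisy_grad(x, y)
        xe, ye = x - lr * gx, y + lr * gy   # extrapolation (lookahead) step
        gx, gy = noisy_grad(xe, ye)
        x, y = x - lr * gx, y + lr * gy     # update using extrapolated gradient
        avg += (np.array([x, y]) - avg) / k  # running average of iterates
    return (x, y), avg

(last_x, last_y), avg = seg(1.0, 1.0)
# The last iterate hovers in a noise-dominated ball around the Nash
# equilibrium (0, 0); the averaged iterate is typically much closer.
print(np.hypot(last_x, last_y), np.hypot(*avg))
```

A scheduled restart in the spirit of that paper would periodically reset the running average and restart the dynamics from the averaged point to accelerate convergence.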