Limiting Behaviors of Nonconvex-Nonconcave Minimax Optimization via
Continuous-Time Systems
- URL: http://arxiv.org/abs/2010.10628v2
- Date: Thu, 4 Mar 2021 18:09:47 GMT
- Title: Limiting Behaviors of Nonconvex-Nonconcave Minimax Optimization via
Continuous-Time Systems
- Authors: Benjamin Grimmer, Haihao Lu, Pratik Worah, Vahab Mirrokni
- Abstract summary: We study the limiting behaviors of three classic minimax algorithms: gradient descent (AGDA), ascentGDA, and the extragradient method (EGM)
Numerically, we observe that all limiting behaviors can arise in Generative Adrial Networks (GAN) and are easily demonstrated for a range of GAN problems.
- Score: 10.112779201155005
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Unlike nonconvex optimization, where gradient descent is guaranteed to
converge to a local optimizer, algorithms for nonconvex-nonconcave minimax
optimization can have topologically different solution paths: sometimes
converging to a solution, sometimes never converging and instead following a
limit cycle, and sometimes diverging. In this paper, we study the limiting
behaviors of three classic minimax algorithms: gradient descent ascent (GDA),
alternating gradient descent ascent (AGDA), and the extragradient method (EGM).
Numerically, we observe that all of these limiting behaviors can arise in
Generative Adversarial Networks (GAN) training and are easily demonstrated for
a range of GAN problems. To explain these different behaviors, we study the
high-order resolution continuous-time dynamics that correspond to each
algorithm, which results in the sufficient (and almost necessary) conditions
for the local convergence by each method. Moreover, this ODE perspective allows
us to characterize the phase transition between these different limiting
behaviors caused by introducing regularization as Hopf Bifurcations.
Related papers
- Convergence of mean-field Langevin dynamics: Time and space
discretization, stochastic gradient, and variance reduction [49.66486092259376]
The mean-field Langevin dynamics (MFLD) is a nonlinear generalization of the Langevin dynamics that incorporates a distribution-dependent drift.
Recent works have shown that MFLD globally minimizes an entropy-regularized convex functional in the space of measures.
We provide a framework to prove a uniform-in-time propagation of chaos for MFLD that takes into account the errors due to finite-particle approximation, time-discretization, and gradient approximation.
arXiv Detail & Related papers (2023-06-12T16:28:11Z) - Accelerated First-Order Optimization under Nonlinear Constraints [73.2273449996098]
We exploit between first-order algorithms for constrained optimization and non-smooth systems to design a new class of accelerated first-order algorithms.
An important property of these algorithms is that constraints are expressed in terms of velocities instead of sparse variables.
arXiv Detail & Related papers (2023-02-01T08:50:48Z) - Convergence Rates of Two-Time-Scale Gradient Descent-Ascent Dynamics for
Solving Nonconvex Min-Max Problems [2.0305676256390934]
We characterize the finite-time performance of the continuous-time variant of simultaneous gradient descent-ascent algorithm.
Our results on the behavior of continuous-time algorithm may be used to enhance the convergence properties of its discrete-time counterpart.
arXiv Detail & Related papers (2021-12-17T15:51:04Z) - Faster Algorithm and Sharper Analysis for Constrained Markov Decision
Process [56.55075925645864]
The problem of constrained decision process (CMDP) is investigated, where an agent aims to maximize the expected accumulated discounted reward subject to multiple constraints.
A new utilities-dual convex approach is proposed with novel integration of three ingredients: regularized policy, dual regularizer, and Nesterov's gradient descent dual.
This is the first demonstration that nonconcave CMDP problems can attain the lower bound of $mathcal O (1/epsilon)$ for all complexity optimization subject to convex constraints.
arXiv Detail & Related papers (2021-10-20T02:57:21Z) - Quadratic Unconstrained Binary Optimisation via Quantum-Inspired
Annealing [58.720142291102135]
We present a classical algorithm to find approximate solutions to instances of quadratic unconstrained binary optimisation.
We benchmark our approach for large scale problem instances with tuneable hardness and planted solutions.
arXiv Detail & Related papers (2021-08-18T09:26:17Z) - Efficient Methods for Structured Nonconvex-Nonconcave Min-Max
Optimization [98.0595480384208]
We propose a generalization extraient spaces which converges to a stationary point.
The algorithm applies not only to general $p$-normed spaces, but also to general $p$-dimensional vector spaces.
arXiv Detail & Related papers (2020-10-31T21:35:42Z) - Conditional Gradient Methods for Convex Optimization with General Affine
and Nonlinear Constraints [8.643249539674612]
This paper presents new conditional gradient methods for solving convex optimization problems with general affine and nonlinear constraints.
We first present a new constraint extrapolated condition gradient (CoexCG) method that can achieve an $cal O (1/epsilon2)$ iteration complexity for both smooth and structured nonsmooth function constrained convex optimization.
We further develop novel variants of CoexCG, namely constraint extrapolated and dual regularized conditional gradient (CoexDurCG) methods, that can achieve similar iteration complexity to CoexCG but allow adaptive selection for algorithmic parameters.
arXiv Detail & Related papers (2020-06-30T23:49:38Z) - Global Convergence and Variance-Reduced Optimization for a Class of
Nonconvex-Nonconcave Minimax Problems [39.13924898972611]
Non minimaxewicz problems appear frequently in emerging machine learning applications generative adversarial networks and adversarial learning.
GDA algorithms with constant size can potentially diverge even in the convex setting.
AGDA algorithm converges globally at a rate that attains a sub rate.
arXiv Detail & Related papers (2020-02-22T04:20:37Z) - Towards Better Understanding of Adaptive Gradient Algorithms in
Generative Adversarial Nets [71.05306664267832]
Adaptive algorithms perform gradient updates using the history of gradients and are ubiquitous in training deep neural networks.
In this paper we analyze a variant of OptimisticOA algorithm for nonconcave minmax problems.
Our experiments show that adaptive GAN non-adaptive gradient algorithms can be observed empirically.
arXiv Detail & Related papers (2019-12-26T22:10:10Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.