Related papers: Limiting Behaviors of Nonconvex-Nonconcave Minimax Optimization via Continuous-Time Systems

Limiting Behaviors of Nonconvex-Nonconcave Minimax Optimization via Continuous-Time Systems

URL: http://arxiv.org/abs/2010.10628v2
Date: Thu, 4 Mar 2021 18:09:47 GMT
Title: Limiting Behaviors of Nonconvex-Nonconcave Minimax Optimization via Continuous-Time Systems
Authors: Benjamin Grimmer, Haihao Lu, Pratik Worah, Vahab Mirrokni
Abstract summary: We study the limiting behaviors of three classic minimax algorithms: gradient descent (AGDA), ascentGDA, and the extragradient method (EGM) Numerically, we observe that all limiting behaviors can arise in Generative Adrial Networks (GAN) and are easily demonstrated for a range of GAN problems.
Score: 10.112779201155005
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Unlike nonconvex optimization, where gradient descent is guaranteed to converge to a local optimizer, algorithms for nonconvex-nonconcave minimax optimization can have topologically different solution paths: sometimes converging to a solution, sometimes never converging and instead following a limit cycle, and sometimes diverging. In this paper, we study the limiting behaviors of three classic minimax algorithms: gradient descent ascent (GDA), alternating gradient descent ascent (AGDA), and the extragradient method (EGM). Numerically, we observe that all of these limiting behaviors can arise in Generative Adversarial Networks (GAN) training and are easily demonstrated for a range of GAN problems. To explain these different behaviors, we study the high-order resolution continuous-time dynamics that correspond to each algorithm, which results in the sufficient (and almost necessary) conditions for the local convergence by each method. Moreover, this ODE perspective allows us to characterize the phase transition between these different limiting behaviors caused by introducing regularization as Hopf Bifurcations.

Related papers

Convergence of mean-field Langevin dynamics: Time and space discretization, stochastic gradient, and variance reduction [49.66486092259376]
The mean-field Langevin dynamics (MFLD) is a nonlinear generalization of the Langevin dynamics that incorporates a distribution-dependent drift. Recent works have shown that MFLD globally minimizes an entropy-regularized convex functional in the space of measures. We provide a framework to prove a uniform-in-time propagation of chaos for MFLD that takes into account the errors due to finite-particle approximation, time-discretization, and gradient approximation.
arXiv Detail & Related papers (2023-06-12T16:28:11Z)
Accelerated First-Order Optimization under Nonlinear Constraints [73.2273449996098]
We exploit between first-order algorithms for constrained optimization and non-smooth systems to design a new class of accelerated first-order algorithms. An important property of these algorithms is that constraints are expressed in terms of velocities instead of sparse variables.
arXiv Detail & Related papers (2023-02-01T08:50:48Z)
Convergence Rates of Two-Time-Scale Gradient Descent-Ascent Dynamics for Solving Nonconvex Min-Max Problems [2.0305676256390934]
We characterize the finite-time performance of the continuous-time variant of simultaneous gradient descent-ascent algorithm. Our results on the behavior of continuous-time algorithm may be used to enhance the convergence properties of its discrete-time counterpart.
arXiv Detail & Related papers (2021-12-17T15:51:04Z)
Faster Algorithm and Sharper Analysis for Constrained Markov Decision Process [56.55075925645864]
The problem of constrained decision process (CMDP) is investigated, where an agent aims to maximize the expected accumulated discounted reward subject to multiple constraints. A new utilities-dual convex approach is proposed with novel integration of three ingredients: regularized policy, dual regularizer, and Nesterov's gradient descent dual. This is the first demonstration that nonconcave CMDP problems can attain the lower bound of $mathcal O (1/epsilon)$ for all complexity optimization subject to convex constraints.
arXiv Detail & Related papers (2021-10-20T02:57:21Z)
Quadratic Unconstrained Binary Optimisation via Quantum-Inspired Annealing [58.720142291102135]
We present a classical algorithm to find approximate solutions to instances of quadratic unconstrained binary optimisation. We benchmark our approach for large scale problem instances with tuneable hardness and planted solutions.
arXiv Detail & Related papers (2021-08-18T09:26:17Z)
Efficient Methods for Structured Nonconvex-Nonconcave Min-Max Optimization [98.0595480384208]
We propose a generalization extraient spaces which converges to a stationary point. The algorithm applies not only to general $p$-normed spaces, but also to general $p$-dimensional vector spaces.
arXiv Detail & Related papers (2020-10-31T21:35:42Z)
Conditional Gradient Methods for Convex Optimization with General Affine and Nonlinear Constraints [8.643249539674612]
This paper presents new conditional gradient methods for solving convex optimization problems with general affine and nonlinear constraints. We first present a new constraint extrapolated condition gradient (CoexCG) method that can achieve an $cal O (1/epsilon2)$ iteration complexity for both smooth and structured nonsmooth function constrained convex optimization. We further develop novel variants of CoexCG, namely constraint extrapolated and dual regularized conditional gradient (CoexDurCG) methods, that can achieve similar iteration complexity to CoexCG but allow adaptive selection for algorithmic parameters.
arXiv Detail & Related papers (2020-06-30T23:49:38Z)
Global Convergence and Variance-Reduced Optimization for a Class of Nonconvex-Nonconcave Minimax Problems [39.13924898972611]
Non minimaxewicz problems appear frequently in emerging machine learning applications generative adversarial networks and adversarial learning. GDA algorithms with constant size can potentially diverge even in the convex setting. AGDA algorithm converges globally at a rate that attains a sub rate.
arXiv Detail & Related papers (2020-02-22T04:20:37Z)
Towards Better Understanding of Adaptive Gradient Algorithms in Generative Adversarial Nets [71.05306664267832]
Adaptive algorithms perform gradient updates using the history of gradients and are ubiquitous in training deep neural networks. In this paper we analyze a variant of OptimisticOA algorithm for nonconcave minmax problems. Our experiments show that adaptive GAN non-adaptive gradient algorithms can be observed empirically.
arXiv Detail & Related papers (2019-12-26T22:10:10Z)
Blended Conditional Gradients: the unconditioning of conditional gradients [21.157733932548695]
We present a blended conditional gradient approach for minimizing a smooth convex function over a polytope P. We achieve linear convergence for strongly convex functions, along with good practical performance. The algorithm is lazy, making use of inexpensive inexact solutions of the linear programming subproblem.
arXiv Detail & Related papers (2018-05-18T16:21:02Z)

This list is automatically generated from the titles and abstracts of the papers in this site.