Related papers: Two-Timescale Gradient Descent Ascent Algorithms for Nonconvex Minimax Optimization

Two-Timescale Gradient Descent Ascent Algorithms for Nonconvex Minimax Optimization

URL: http://arxiv.org/abs/2408.11974v2
Date: Thu, 26 Sep 2024 16:48:34 GMT
Title: Two-Timescale Gradient Descent Ascent Algorithms for Nonconvex Minimax Optimization
Authors: Tianyi Lin, Chi Jin, Michael. I. Jordan,
Abstract summary: We provide a unified analysis of two-timescale gradient ascent (TTGDA) for solving structured non minimax optimization problems. Our contribution is to design TTGDA algorithms are effective beyond the setting.
Score: 77.3396841985172
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: We provide a unified analysis of two-timescale gradient descent ascent (TTGDA) for solving structured nonconvex minimax optimization problems in the form of $\min_\textbf{x} \max_{\textbf{y} \in Y} f(\textbf{x}, \textbf{y})$, where the objective function $f(\textbf{x}, \textbf{y})$ is nonconvex in $\textbf{x}$ and concave in $\textbf{y}$, and the constraint set $Y \subseteq \mathbb{R}^n$ is convex and bounded. In the convex-concave setting, the single-timescale gradient descent ascent (GDA) algorithm is widely used in applications and has been shown to have strong convergence guarantees. In more general settings, however, it can fail to converge. Our contribution is to design TTGDA algorithms that are effective beyond the convex-concave setting, efficiently finding a stationary point of the function $\Phi(\cdot) := \max_{\textbf{y} \in Y} f(\cdot, \textbf{y})$. We also establish theoretical bounds on the complexity of solving both smooth and nonsmooth nonconvex-concave minimax optimization problems. To the best of our knowledge, this is the first systematic analysis of TTGDA for nonconvex minimax optimization, shedding light on its superior performance in training generative adversarial networks (GANs) and in other real-world application problems.

Related papers

Obtaining Lower Query Complexities through Lightweight Zeroth-Order Proximal Gradient Algorithms [65.42376001308064]
We propose two variance reduced ZO estimators for complex gradient problems. We improve the state-of-the-art function complexities from $mathcalOleft(minfracdn1/2epsilon2, fracdepsilon3right)$ to $tildecalOleft(fracdepsilon2right)$.
arXiv Detail & Related papers (2024-10-03T15:04:01Z)
Single-Loop Stochastic Algorithms for Difference of Max-Structured Weakly Convex Functions [41.43895948769255]
We show a class of non-smooth non-asymptotic fairness problems in the form of $min_x[yin Yphi(x, y) - max_zin Zpsix(x, z)]$. We propose an envelope approximate gradient SMAG, the first method for solving these problems, provide a state-of-the-art non-asymptotic convergence rate.
arXiv Detail & Related papers (2024-05-28T20:52:46Z)
Nearly Optimal Regret for Decentralized Online Convex Optimization [53.433398074919]
Decentralized online convex optimization (D-OCO) aims to minimize a sequence of global loss functions using only local computations and communications. We develop novel D-OCO algorithms that can respectively reduce the regret bounds for convex and strongly convex functions. Our algorithms are nearly optimal in terms of $T$, $n$, and $rho$.
arXiv Detail & Related papers (2024-02-14T13:44:16Z)
Decentralized Riemannian Algorithm for Nonconvex Minimax Problems [82.50374560598493]
The minimax algorithms for neural networks have been developed to solve many problems. In this paper, we propose two types of minimax algorithms. For the setting, we propose DRSGDA and prove that our method achieves a gradient.
arXiv Detail & Related papers (2023-02-08T01:42:45Z)
ReSQueing Parallel and Private Stochastic Convex Optimization [59.53297063174519]
We introduce a new tool for BFG convex optimization (SCO): a Reweighted Query (ReSQue) estimator for the gradient of a function convolved with a (Gaussian) probability density. We develop algorithms achieving state-of-the-art complexities for SCO in parallel and private settings.
arXiv Detail & Related papers (2023-01-01T18:51:29Z)
Nonsmooth Nonconvex-Nonconcave Minimax Optimization: Primal-Dual Balancing and Iteration Complexity Analysis [28.575516056239575]
We introduce a novel analysis for PLDA, the key are our newly developed nonsmooth and dual error iterations. When $thetain [0,12]$, PLDA achieves the optimal $mathcalO()$ under mild assumptions.
arXiv Detail & Related papers (2022-09-22T07:12:48Z)
Optimal Extragradient-Based Bilinearly-Coupled Saddle-Point Optimization [116.89941263390769]
We consider the smooth convex-concave bilinearly-coupled saddle-point problem, $min_mathbfxmax_mathbfyF(mathbfx) + H(mathbfx,mathbfy)$, where one has access to first-order oracles for $F$, $G$ as well as the bilinear coupling function $H$. We present a emphaccelerated gradient-extragradient (AG-EG) descent-ascent algorithm that combines extragrad
arXiv Detail & Related papers (2022-06-17T06:10:20Z)
A Two-Timescale Framework for Bilevel Optimization: Complexity Analysis and Application to Actor-Critic [142.1492359556374]
Bilevel optimization is a class of problems which exhibit a two-level structure. We propose a two-timescale approximation (TTSA) algorithm for tackling such a bilevel problem. We show that a two-timescale natural actor-critic policy optimization algorithm can be viewed as a special case of our TTSA framework.
arXiv Detail & Related papers (2020-07-10T05:20:02Z)
Gradient Free Minimax Optimization: Variance Reduction and Faster Convergence [120.9336529957224]
In this paper, we denote the non-strongly setting on the magnitude of a gradient-free minimax optimization problem. We show that a novel zeroth-order variance reduced descent algorithm achieves the best known query complexity.
arXiv Detail & Related papers (2020-06-16T17:55:46Z)
A Unified Single-loop Alternating Gradient Projection Algorithm for Nonconvex-Concave and Convex-Nonconcave Minimax Problems [8.797831153231664]
We develop an efficient algorithm for solving minimax problems with theoretical general convexnon objective guarantees. We show that the proposed algorithm can be used to solve both noncaveepsilon concave (strongly) and (strongly) convexnonconcave minimax problems.
arXiv Detail & Related papers (2020-06-03T04:00:52Z)
On Gradient Descent Ascent for Nonconvex-Concave Minimax Problems [86.92205445270427]
We consider non-con minimax problems, $min_mathbfx max_mathhidoty f(mathbfdoty)$ efficiently.
arXiv Detail & Related papers (2019-06-02T03:03:45Z)

This list is automatically generated from the titles and abstracts of the papers in this site.