Nonsmooth Nonconvex-Nonconcave Minimax Optimization: Primal-Dual
Balancing and Iteration Complexity Analysis
- URL: http://arxiv.org/abs/2209.10825v3
- Date: Wed, 26 Jul 2023 23:09:35 GMT
- Title: Nonsmooth Nonconvex-Nonconcave Minimax Optimization: Primal-Dual
Balancing and Iteration Complexity Analysis
- Authors: Jiajin Li, Linglingzhi Zhu and Anthony Man-Cho So
- Abstract summary: We introduce a novel convergence analysis framework for smoothed PLDA, the key components of which are our newly developed nonsmooth primal and dual error bounds.
When $\theta \in [0, \frac{1}{2}]$, smoothed PLDA achieves the optimal iteration complexity of $\mathcal{O}(\epsilon^{-2})$ under mild assumptions.
- Score: 28.575516056239575
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Nonconvex-nonconcave minimax optimization has gained widespread interest over
the last decade. However, most existing works focus on variants of gradient
descent-ascent (GDA) algorithms, which are only applicable to smooth
nonconvex-concave settings. To address this limitation, we propose a novel
algorithm named smoothed proximal linear descent-ascent (smoothed PLDA), which
can effectively handle a broad range of structured nonsmooth
nonconvex-nonconcave minimax problems. Specifically, we consider the setting
where the primal function has a nonsmooth composite structure and the dual
function possesses the Kurdyka-Lojasiewicz (KL) property with exponent $\theta
\in [0,1)$. We introduce a novel convergence analysis framework for smoothed
PLDA, the key components of which are our newly developed nonsmooth primal
error bound and dual error bound. Using this framework, we show that smoothed
PLDA can find both $\epsilon$-game-stationary points and
$\epsilon$-optimization-stationary points of the problems of interest in
$\mathcal{O}(\epsilon^{-2\max\{2\theta,1\}})$ iterations. Furthermore, when
$\theta \in [0,\frac{1}{2}]$, smoothed PLDA achieves the optimal iteration
complexity of $\mathcal{O}(\epsilon^{-2})$. To further demonstrate the
effectiveness and wide applicability of our analysis framework, we show that
a certain class of max-structured problems possesses the KL property with exponent
$\theta=0$ under mild assumptions. As a by-product, we establish
algorithm-independent quantitative relationships among various stationarity
concepts, which may be of independent interest.
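To make the algorithmic template concrete, here is a minimal sketch of a smoothed proximal-linear descent-ascent loop on a toy problem whose primal part has the composite form $h(c(x))$ with $h$ convex and $c$ smooth. The toy objective, step sizes, and the derivative-free inner solver are illustrative assumptions, not the paper's algorithm specification or experiments.

```python
# Sketch: smoothed prox-linear descent-ascent on
#   min_x max_{y in [-1,1]^m}  h(c(x)) + y^T A x - (rho/2)||y||^2,
# with h = l1-norm (nonsmooth convex) composed with a smooth map c.
# All problem data and step sizes are illustrative assumptions.
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(0)
n, m, rho = 5, 3, 0.5
A = rng.standard_normal((m, n))
h = lambda u: np.linalg.norm(u, 1)
c = lambda x: x**2 - 1.0           # smooth inner map
Jc = lambda x: np.diag(2.0 * x)    # Jacobian of c

x, y = rng.standard_normal(n), np.zeros(m)
z = x.copy()                       # smoothing (proximal) center
alpha, beta, mu, p = 0.5, 0.2, 0.1, 2.0

for t in range(200):
    # Primal prox-linear step: linearize c at x, keep h intact, and add
    # proximal and smoothing terms; solved by a generic derivative-free solver.
    def subproblem(u):
        return (h(c(x) + Jc(x) @ (u - x)) + y @ A @ u
                + 0.5 / alpha * np.sum((u - x) ** 2)
                + 0.5 * p * np.sum((u - z) ** 2))
    x = minimize(subproblem, x, method="Nelder-Mead").x
    # Dual projected gradient ascent step over [-1, 1]^m.
    y = np.clip(y + beta * (A @ x - rho * y), -1.0, 1.0)
    # Exponential averaging of the smoothing center.
    z = z + mu * (x - z)

print("final objective:", h(c(x)) + y @ A @ x - 0.5 * rho * y @ y)
```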
Related papers
- Two-Timescale Gradient Descent Ascent Algorithms for Nonconvex Minimax Optimization [77.3396841985172]
We provide a unified analysis of two-timescale gradient descent ascent (TTGDA) for solving structured nonconvex minimax optimization problems.
Our contribution is to design TTGDA algorithms that are effective beyond the nonconvex-concave setting.
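A minimal sketch of the two-timescale idea, assuming a simple smooth toy objective: the updates are plain GDA, but the dual step size is taken an order of magnitude larger than the primal one. The objective and the 10:1 ratio are illustrative assumptions.

```python
# Sketch: two-timescale GDA on the toy objective
#   f(x, y) = x @ y - 0.5||y||^2 + 0.1||x||^2   (illustrative assumption)
import numpy as np

def grad_x(x, y): return y + 0.2 * x
def grad_y(x, y): return x - y

x, y = np.ones(4), np.zeros(4)
eta_x, eta_y = 0.01, 0.1            # dual timescale ~10x faster than primal
for t in range(2000):
    x = x - eta_x * grad_x(x, y)    # slow primal descent
    y = y + eta_y * grad_y(x, y)    # fast dual ascent
print("x, y near the stationary point (0, 0):", x, y)
```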
arXiv Detail & Related papers (2024-08-21T20:14:54Z)
- Private Stochastic Convex Optimization with Heavy Tails: Near-Optimality from Simple Reductions [19.008521454738425]
We study differentially private stochastic convex optimization (DP-SCO) with heavy-tailed gradients, where we assume a $k^{\text{th}}$-moment bound on the Lipschitz constants of sample functions rather than a uniform bound.
We propose a new reduction-based approach that enables us to obtain the first optimal rates (up to logarithmic factors) in the heavy-tailed setting, achieving error $G \cdot \frac{1}{\sqrt{n}} + G_k \cdot (\frac{\sqrt{d}}{n\epsilon})^{1-1/k}$.
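The reductions in the paper are more involved; the sketch below only shows the basic clip-and-noise gradient primitive that private optimization under heavy tails builds on. The clipping threshold C and noise multiplier sigma are illustrative assumptions, and calibrating sigma to a target privacy budget is omitted.

```python
# Sketch: the clip-and-noise private gradient primitive (illustrative only).
import numpy as np

def private_mean_gradient(per_sample_grads, C, sigma, rng):
    # Clip each per-sample gradient to l2 norm <= C, average, add Gaussian noise.
    clipped = [g * min(1.0, C / (np.linalg.norm(g) + 1e-12))
               for g in per_sample_grads]
    mean = np.mean(clipped, axis=0)
    return mean + rng.normal(0.0, sigma * C / len(clipped), size=mean.shape)

rng = np.random.default_rng(1)
grads = [rng.standard_t(df=3, size=5) for _ in range(100)]  # heavy-tailed draws
print(private_mean_gradient(grads, C=2.0, sigma=1.0, rng=rng))
```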
arXiv Detail & Related papers (2024-06-04T21:26:29Z)
- Decentralized Weakly Convex Optimization Over the Stiefel Manifold [28.427697270742947]
We focus on weakly convex optimization over the Stiefel manifold in the decentralized setting, where a connected network of agents cooperatively minimizes a sum of local objectives.
We propose a method called the decentralized Riemannian subgradient method (DRSM) and establish its iteration complexity for forcing a natural stationarity measure below a prescribed tolerance.
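A minimal sketch of one such decentralized step, assuming a ring network and an $\ell_1$-type local objective: each agent gossip-averages with its neighbors, takes a Riemannian subgradient step, and retracts to the Stiefel manifold via polar decomposition. All problem data and the step-size schedule are illustrative assumptions.

```python
# Sketch: decentralized Riemannian subgradient steps on St(n, p).
import numpy as np

def retract(X):
    # Polar retraction: nearest point on the Stiefel manifold.
    U, _, Vt = np.linalg.svd(X, full_matrices=False)
    return U @ Vt

def tangent_proj(X, G):
    # Project a Euclidean (sub)gradient onto the tangent space at X.
    XtG = X.T @ G
    return G - X @ (XtG + XtG.T) / 2.0

rng = np.random.default_rng(2)
n, p, agents = 6, 2, 4
B = [rng.standard_normal((n, n)) for _ in range(agents)]   # local data
subgrad = lambda i, X: B[i].T @ np.sign(B[i] @ X)          # subgradient of ||B_i X||_1

W = np.zeros((agents, agents))                             # ring gossip weights
for i in range(agents):
    W[i, i], W[i, (i - 1) % agents], W[i, (i + 1) % agents] = 0.5, 0.25, 0.25

X = [retract(rng.standard_normal((n, p))) for _ in range(agents)]
for t in range(100):
    gamma = 0.1 / np.sqrt(t + 1)                           # diminishing step size
    mixed = [sum(W[i, j] * X[j] for j in range(agents)) for i in range(agents)]
    X = [retract(mixed[i] - gamma * tangent_proj(X[i], subgrad(i, X[i])))
         for i in range(agents)]
print("consensus error:", sum(np.linalg.norm(X[i] - X[0]) for i in range(agents)))
```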
arXiv Detail & Related papers (2023-03-31T02:56:23Z)
- Stochastic Inexact Augmented Lagrangian Method for Nonconvex Expectation Constrained Optimization [88.0031283949404]
Many real-world problems have complicated nonconvex functional constraints and involve a large number of data points.
Our proposed method outperforms an existing method with the previously best-known complexity result.
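A minimal sketch of an inexact augmented Lagrangian loop for a single inequality constraint: the inner minimization is solved inexactly by a few gradient steps, then the multiplier and penalty are updated. The paper's method is stochastic and handles expectation constraints; the deterministic toy problem, inner iteration count, and penalty schedule here are illustrative assumptions.

```python
# Sketch: inexact augmented Lagrangian for  min f(x)  s.t.  g(x) <= 0.
import numpy as np

f  = lambda x: np.sum((x - 2.0) ** 2)      # objective (illustrative)
gf = lambda x: 2.0 * (x - 2.0)
g  = lambda x: np.sum(x**2) - 1.0          # constraint g(x) <= 0
gg = lambda x: 2.0 * x

x, lam, beta = np.zeros(3), 0.0, 1.0
for k in range(50):
    for _ in range(20):                    # inexact inner solve
        s = max(0.0, lam + beta * g(x))    # active multiplier of penalty term
        x = x - 0.05 * (gf(x) + s * gg(x)) # gradient step on the AL function
    lam = max(0.0, lam + beta * g(x))      # dual (multiplier) update
    beta = min(10.0, 1.05 * beta)          # mild penalty growth
print(x, "constraint value:", g(x))
```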
arXiv Detail & Related papers (2022-12-19T14:48:54Z)
- Adaptive Stochastic Variance Reduction for Non-convex Finite-Sum Minimization [52.25843977506935]
We propose an adaptive variance-reduction method, called AdaSpider, for $L$-smooth, non-convex functions with a finite-sum structure.
In doing so, we are able to compute an $\epsilon$-stationary point with $\tilde{O}(n + \sqrt{n}/\epsilon^2)$ calls to the stochastic gradient oracle.
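The SPIDER-type recursive estimator that underlies AdaSpider is easy to sketch: a full gradient every q steps and a cheap recursive correction in between. The least-squares finite sum below is an illustrative assumption, and AdaSpider's actual contribution, the adaptive step size, is replaced by a fixed $\eta$ for brevity.

```python
# Sketch: SPIDER-style recursive variance-reduced gradient estimator.
import numpy as np

rng = np.random.default_rng(3)
n, d = 64, 5
A = rng.standard_normal((n, d))
b = rng.standard_normal(n)
grad_i = lambda i, x: (A[i] @ x - b[i]) * A[i]   # per-sample gradient
full_grad = lambda x: A.T @ (A @ x - b) / n

x, x_prev, q, eta = np.zeros(d), np.zeros(d), 8, 0.05
for t in range(80):
    if t % q == 0:
        v = full_grad(x)                          # periodic full gradient
    else:
        i = rng.integers(n)
        v = grad_i(i, x) - grad_i(i, x_prev) + v  # recursive correction
    x_prev = x
    x = x - eta * v
print("grad norm:", np.linalg.norm(full_grad(x)))
```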
arXiv Detail & Related papers (2022-11-03T14:41:46Z)
- Oracle Complexity in Nonsmooth Nonconvex Optimization [49.088972349825085]
It is well-known that given a smooth, Lipschitz, bounded-from-below function, standard gradient-based methods can find approximate stationary points; for nonsmooth functions, one can instead seek stationary points of a smooth approximation of the objective.
In this paper, we prove an inherent trade-off between oracle complexity and the smoothness of that approximation.
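The randomized-smoothing device at the heart of this trade-off can be sketched quickly: averaging $f$ over a small random perturbation yields a smooth surrogate whose gradient can be estimated from function values alone. The test function, radius $\delta$, and sample count below are illustrative assumptions.

```python
# Sketch: two-point randomized-smoothing gradient estimator for nonsmooth f.
import numpy as np

f = lambda x: np.abs(x).sum()                    # nonsmooth test function

def smoothed_grad_estimate(x, delta, samples, rng):
    d = x.size
    g = np.zeros(d)
    for _ in range(samples):
        u = rng.standard_normal(d)
        u /= np.linalg.norm(u)                   # uniform direction on the sphere
        g += (f(x + delta * u) - f(x - delta * u)) / (2 * delta) * u * d
    return g / samples

rng = np.random.default_rng(4)
x = np.array([0.5, -0.3, 0.0])
print(smoothed_grad_estimate(x, delta=0.01, samples=2000, rng=rng))
```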
arXiv Detail & Related papers (2021-04-14T10:42:45Z)
- The Complexity of Nonconvex-Strongly-Concave Minimax Optimization [43.07732143522183]
This paper establishes the complexity of finding approximate stationary points of nonconvex-strongly-concave (NC-SC) smooth minimax problems.
Our lower bounds are obtained via a constructed sequence of NC-SC subproblems, covering both the general and the averaged smooth finite-sum settings.
In the finite-sum setting, our proposed algorithm achieves a nearly tight dependence on the condition number.
arXiv Detail & Related papers (2021-03-29T18:53:57Z)
- Optimal Epoch Stochastic Gradient Descent Ascent Methods for Min-Max Optimization [61.66927778694731]
The epoch stochastic gradient descent method (a.k.a. Epoch-GD) proposed by Hazan and Kale (2011) was deemed a breakthrough for stochastic strongly convex minimization.
Our result is the first to show that Epoch-GDA can achieve the optimal rate of $O(1/T)$ for the duality gap of strongly-convex-strongly-concave (SCSC) min-max problems.
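A minimal sketch of an epoch-wise SGDA scheme in this spirit: within each epoch, run stochastic gradient descent ascent with a fixed step size, then restart from the averaged iterate with the step size halved and the epoch length doubled. The noisy toy problem and the exact epoch schedule are illustrative assumptions, not a transcription of the paper's algorithm.

```python
# Sketch: epoch-wise SGDA with restarts on the SCSC toy problem
#   f(x, y) = 0.5 x^2 + x y - 0.5 y^2   (saddle point at (0, 0)).
import numpy as np

rng = np.random.default_rng(5)
gx = lambda x, y: x + y + 0.1 * rng.standard_normal()   # noisy gradients
gy = lambda x, y: x - y + 0.1 * rng.standard_normal()

x, y, eta, T = 3.0, -2.0, 0.5, 8
for epoch in range(10):
    xs, ys = [], []
    for _ in range(T):
        x, y = x - eta * gx(x, y), y + eta * gy(x, y)
        xs.append(x); ys.append(y)
    x, y = np.mean(xs), np.mean(ys)   # restart from the epoch average
    eta *= 0.5                        # halve the step size
    T *= 2                            # double the epoch length
print(x, y)                           # should approach the saddle point (0, 0)
```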
arXiv Detail & Related papers (2020-02-13T02:16:57Z)
- Complexity of Finding Stationary Points of Nonsmooth Nonconvex Functions [84.49087114959872]
We provide the first non-asymptotic analysis for finding stationary points of nonsmooth, nonconvex functions.
In particular, we study Hadamard semi-differentiable functions, perhaps the largest class of nonsmooth functions for which the chain rule of calculus holds.
arXiv Detail & Related papers (2020-02-10T23:23:04Z)