Complexity Lower Bounds for Nonconvex-Strongly-Concave Min-Max
Optimization
- URL: http://arxiv.org/abs/2104.08708v1
- Date: Sun, 18 Apr 2021 04:30:01 GMT
- Title: Complexity Lower Bounds for Nonconvex-Strongly-Concave Min-Max
Optimization
- Authors: Haochuan Li, Yi Tian, Jingzhao Zhang, Ali Jadbabaie
- Abstract summary: We provide a first-order oracle complexity lower bound for finding stationary points of min-max optimization problems.
Our analysis shows that the upper bound is optimal in the $\epsilon$ and $\kappa$ dependence up to logarithmic factors.
It suggests that there is a significant gap between the upper bound $\mathcal{O}(\kappa^3 \epsilon^{-4})$ in (Lin et al., 2020a) and our lower bound in the condition number dependence.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We provide a first-order oracle complexity lower bound for finding stationary
points of min-max optimization problems where the objective function is smooth,
nonconvex in the minimization variable, and strongly concave in the
maximization variable. We establish a lower bound of
$\Omega\left(\sqrt{\kappa}\epsilon^{-2}\right)$ for deterministic oracles,
where $\epsilon$ defines the level of approximate stationarity and $\kappa$ is
the condition number. Our analysis shows that the upper bound achieved in (Lin
et al., 2020b) is optimal in the $\epsilon$ and $\kappa$ dependence up to
logarithmic factors. For stochastic oracles, we provide a lower bound of
$\Omega\left(\sqrt{\kappa}\epsilon^{-2} + \kappa^{1/3}\epsilon^{-4}\right)$. It
suggests that there is a significant gap between the upper bound
$\mathcal{O}(\kappa^3 \epsilon^{-4})$ in (Lin et al., 2020a) and our lower
bound in the condition number dependence.
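For concreteness, the setting in the abstract admits a standard formalization; the sketch below uses conventional notation for this problem class ($f$, $\Phi$, $L$, $\mu$ are the usual symbols, not taken verbatim from the paper):

```latex
% Nonconvex-strongly-concave min-max problem:
%   f is L-smooth, nonconvex in x, and mu-strongly concave in y,
%   with condition number kappa = L / mu.
\min_{x \in \mathbb{R}^m} \; \max_{y \in \mathbb{R}^n} \; f(x, y)

% Primal function and approximate stationarity:
%   Phi(x) = \max_y f(x, y); a point x is epsilon-stationary if
\left\| \nabla \Phi(x) \right\| \le \epsilon

% The deterministic lower bound then says that, in the worst case,
% any first-order method needs
\Omega\!\left( \sqrt{\kappa}\,\epsilon^{-2} \right)
% gradient evaluations to find such a point.
```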
Related papers
- On the Complexity of First-Order Methods in Stochastic Bilevel
Optimization [9.649991673557167]
We consider the problem of finding stationary points in Bilevel optimization when the lower-level problem is unconstrained and strongly convex.
Existing approaches tie their analyses to a genie algorithm that knows lower-level solutions and, therefore, need not query any points far from them.
We propose a simple first-order method that converges to an $\epsilon$-stationary point using $O(\epsilon^{-6})$, $O(\epsilon^{-4})$ access to first-order $y^*$-aware oracles.
arXiv Detail & Related papers (2024-02-11T04:26:35Z) - The Computational Complexity of Finding Stationary Points in Non-Convex Optimization [53.86485757442486]
Finding approximate stationary points, i.e., points where the gradient is approximately zero, of nonconvex but smooth objective functions is a central computational problem.
We show that finding approximate KKT points in constrained optimization is reducible to finding approximate stationary points in unconstrained optimization but the converse is impossible.
arXiv Detail & Related papers (2023-10-13T14:52:46Z) - Accelerating Inexact HyperGradient Descent for Bilevel Optimization [84.00488779515206]
We present a method for solving general nonconvex-strongly-convex bilevel optimization problems.
Our results also improve upon the existing complexity for finding second-order stationary points in nonconvex-strongly-convex problems.
arXiv Detail & Related papers (2023-06-30T20:36:44Z) - The First Optimal Algorithm for Smooth and
Strongly-Convex-Strongly-Concave Minimax Optimization [88.91190483500932]
In this paper, we revisit the smooth and strongly-convex-strongly-concave minimax optimization problem.
Existing state-of-the-art methods do not match the lower bound $\Omega\left(\sqrt{\kappa_x\kappa_y}\log\frac{1}{\epsilon}\right)$.
We fix this fundamental issue by providing the first algorithm with $\mathcal{O}\left(\sqrt{\kappa_x\kappa_y}\log^3(\kappa_x\kappa_y)\log\frac{1}{\epsilon}\right)$ complexity.
arXiv Detail & Related papers (2022-05-11T17:33:07Z) - Finding Second-Order Stationary Point for Nonconvex-Strongly-Concave
Minimax Problem [16.689304539024036]
In this paper, we consider the non-asymptotic behavior of finding a second-order stationary point for the minimax problem.
For high-dimensional problems, we propose a method that avoids the expensive cost of a second-order oracle by solving the cubic sub-problem via gradient descent and Chebyshev expansion.
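The Chebyshev-expansion idea mentioned in this summary, applying a matrix function such as an inverse using only matrix-vector (e.g., Hessian-vector) products, can be sketched as follows. This is an illustrative reconstruction, not the paper's exact routine; the function name and the assumption that eigenvalue bounds `mu`, `L` are known are ours:

```python
import numpy as np

def cheb_inverse_apply(matvec, v, mu, L, K=60):
    """Approximate A^{-1} v using only matrix-vector products, via a
    degree-(K-1) Chebyshev expansion of f(x) = 1/x on [mu, L], where
    the spectrum of the symmetric positive definite A lies in [mu, L]."""
    # Chebyshev interpolation coefficients of f(x) = 1/x on [mu, L]
    theta = np.pi * (np.arange(K) + 0.5) / K
    nodes = 0.5 * (L - mu) * np.cos(theta) + 0.5 * (L + mu)
    fx = 1.0 / nodes
    c = (2.0 / K) * np.array([np.sum(fx * np.cos(k * theta)) for k in range(K)])

    # Affine map A -> Ahat with spectrum in [-1, 1]:
    # Ahat = (2A - (L + mu) I) / (L - mu), applied implicitly via matvec
    def mv_hat(w):
        return (2.0 * matvec(w) - (L + mu) * w) / (L - mu)

    # Three-term recurrence T_{k+1}(Ahat) v = 2 Ahat T_k(Ahat) v - T_{k-1}(Ahat) v
    T_prev, T_curr = v, mv_hat(v)
    result = 0.5 * c[0] * T_prev + c[1] * T_curr
    for k in range(2, K):
        T_prev, T_curr = T_curr, 2.0 * mv_hat(T_curr) - T_prev
        result = result + c[k] * T_curr
    return result
```

The expansion converges geometrically at a rate governed by $\sqrt{L/\mu}$, which is why such schemes are attractive in high dimensions: no Hessian is ever formed, only products against it.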
arXiv Detail & Related papers (2021-10-10T14:54:23Z) - Thinking Inside the Ball: Near-Optimal Minimization of the Maximal Loss [41.17536985461902]
We prove an oracle complexity lower bound scaling as $\Omega(N\epsilon^{-2/3})$, showing that our dependence on $N$ is optimal up to polylogarithmic factors.
We develop methods with improved complexity bounds of $\tilde{O}(N\epsilon^{-2/3} + \sqrt{N}\epsilon^{-8/3})$ in the non-smooth case and $\tilde{O}(N\epsilon^{-2/3} + \sqrt{N}\epsilon^{-1})$ in the smooth case.
arXiv Detail & Related papers (2021-05-04T21:49:15Z) - Private Stochastic Convex Optimization: Optimal Rates in $\ell_1$
Geometry [69.24618367447101]
Up to logarithmic factors, the optimal excess population loss of any $(\varepsilon,\delta)$-differentially private algorithm is $\sqrt{\log(d)/n} + \sqrt{d}/(\varepsilon n)$.
We show that when the loss functions satisfy additional smoothness assumptions, the excess loss is upper bounded (up to logarithmic factors) by $\sqrt{\log(d)/n} + (\log(d)/(\varepsilon n))^{2/3}$.
arXiv Detail & Related papers (2021-03-02T06:53:44Z) - Streaming Complexity of SVMs [110.63976030971106]
We study the space complexity of solving the bias-regularized SVM problem in the streaming model.
We show that for both problems, for dimensions of $\frac{1}{\lambda\epsilon}$, one can obtain streaming algorithms with space polynomially smaller than $\frac{1}{\lambda\epsilon}$.
arXiv Detail & Related papers (2020-07-07T17:10:00Z) - Agnostic Q-learning with Function Approximation in Deterministic
Systems: Tight Bounds on Approximation Error and Sample Complexity [94.37110094442136]
We study the problem of agnostic $Q$-learning with function approximation in deterministic systems.
We show that if $\delta = O\left(\rho/\sqrt{\dim_E}\right)$, then one can find the optimal policy using $O\left(\dim_E\right)$ trajectories.
arXiv Detail & Related papers (2020-02-17T18:41:49Z)