Faster Stochastic Algorithms for Minimax Optimization under
Polyak--{\L}ojasiewicz Conditions
- URL: http://arxiv.org/abs/2307.15868v1
- Date: Sat, 29 Jul 2023 02:26:31 GMT
- Title: Faster Stochastic Algorithms for Minimax Optimization under
Polyak--{\L}ojasiewicz Conditions
- Authors: Lesi Chen, Boyuan Yao, Luo Luo
- Abstract summary: We propose SPIDER-GDA for solving the finite-sum problem of the form $\min_x \max_y f(x,y)\triangleq \frac{1}{n} \sum_{i=1}^n f_i(x,y)$.
We prove SPIDER-GDA could find an $\epsilon$-optimal solution within ${\mathcal O}\left((n + \sqrt{n}\,\kappa_x\kappa_y^2)\log (1/\epsilon)\right)$ SFO complexity.
- Score: 12.459354707528819
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: This paper considers stochastic first-order algorithms for minimax
optimization under Polyak--{\L}ojasiewicz (PL) conditions. We propose
SPIDER-GDA for solving the finite-sum problem of the form $\min_x \max_y
f(x,y)\triangleq \frac{1}{n} \sum_{i=1}^n f_i(x,y)$, where the objective
function $f(x,y)$ is $\mu_x$-PL in $x$ and $\mu_y$-PL in $y$; and each
$f_i(x,y)$ is $L$-smooth. We prove SPIDER-GDA could find an $\epsilon$-optimal
solution within ${\mathcal O}\left((n + \sqrt{n}\,\kappa_x\kappa_y^2)\log
(1/\epsilon)\right)$ stochastic first-order oracle (SFO) complexity, which is
better than the state-of-the-art method whose SFO upper bound is ${\mathcal
O}\big((n + n^{2/3}\kappa_x\kappa_y^2)\log (1/\epsilon)\big)$, where
$\kappa_x\triangleq L/\mu_x$ and $\kappa_y\triangleq L/\mu_y$. For the
ill-conditioned case, we provide an accelerated algorithm to reduce the
computational cost further. It achieves $\tilde{{\mathcal
O}}\big((n+\sqrt{n}\,\kappa_x\kappa_y)\log^2 (1/\epsilon)\big)$ SFO upper bound
when $\kappa_y \gtrsim \sqrt{n}$. Our ideas can also be applied to the more
general setting in which the objective function satisfies the PL condition in only one
variable. Numerical experiments validate the superiority of the proposed methods.
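A quick note on notation: $f(\cdot,y)$ being $\mu_x$-PL in $x$ means $\|\nabla_x f(x,y)\|^2 \ge 2\mu_x\left(f(x,y)-\min_{x'} f(x',y)\right)$ for all $x$, and symmetrically for $y$ (with $\max_{y'} f(x,y')$ in place of the min); this is weaker than strong convexity/concavity and allows nonconvex objectives. To make the SPIDER-GDA idea concrete, below is a minimal sketch of a SPIDER-style variance-reduced gradient descent-ascent loop on a hypothetical toy finite-sum quadratic objective. It is not the authors' exact algorithm: the step sizes, epoch length $q$, mini-batch size, and the simultaneous (rather than alternating) update are illustrative assumptions.

```python
import numpy as np

# Sketch only: a SPIDER-style variance-reduced gradient descent-ascent loop
# on a toy finite-sum minimax objective
#   f(x, y) = (1/n) * sum_i [ 0.5||x||^2 + x^T A_i y - 0.5||y||^2 + b_i^T x + c_i^T y ],
# which is strongly-convex-strongly-concave and hence PL in x and PL in y.
# Step sizes, epoch length q, and batch size are illustrative, not the paper's.

rng = np.random.default_rng(0)
n, d = 64, 5
A = rng.standard_normal((n, d, d)) * 0.1
b = rng.standard_normal((n, d))
c = rng.standard_normal((n, d))

def grad_i(i, x, y):
    """Gradients of the i-th component f_i with respect to x and y."""
    gx = x + A[i] @ y + b[i]
    gy = A[i].T @ x - y + c[i]
    return gx, gy

def full_grad(x, y):
    """Exact gradient of the finite-sum objective (average over all components)."""
    gs = [grad_i(i, x, y) for i in range(n)]
    return np.mean([g[0] for g in gs], axis=0), np.mean([g[1] for g in gs], axis=0)

def spider_gda(T=200, q=8, batch=8, eta_x=0.05, eta_y=0.1):
    x, y = np.zeros(d), np.zeros(d)
    x_prev, y_prev = x.copy(), y.copy()
    for t in range(T):
        if t % q == 0:
            # Checkpoint: recompute the exact full gradient every q iterations.
            vx, vy = full_grad(x, y)
        else:
            # SPIDER recursion: correct the running estimator with a mini-batch
            # of gradient differences between the current and previous iterates.
            idx = rng.choice(n, size=batch, replace=False)
            dx, dy = np.zeros(d), np.zeros(d)
            for i in idx:
                gx_new, gy_new = grad_i(i, x, y)
                gx_old, gy_old = grad_i(i, x_prev, y_prev)
                dx += (gx_new - gx_old) / batch
                dy += (gy_new - gy_old) / batch
            vx, vy = vx + dx, vy + dy
        x_prev, y_prev = x.copy(), y.copy()
        # Descent step in x and ascent step in y using the variance-reduced estimators.
        x = x - eta_x * vx
        y = y + eta_y * vy
    return x, y

x_hat, y_hat = spider_gda()
gx, gy = full_grad(x_hat, y_hat)
print("||grad_x|| =", np.linalg.norm(gx), " ||grad_y|| =", np.linalg.norm(gy))
```

Roughly speaking, the checkpoint-plus-recursion structure is how SPIDER-type estimators trade occasional full gradients (cost $n$ oracle calls every $q$ iterations) against cheap mini-batch corrections; the $\sqrt{n}$ factor in the SFO bound arises when $q$ and the batch size are chosen on the order of $\sqrt{n}$.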
Related papers
- Efficient Continual Finite-Sum Minimization [52.5238287567572]
We propose a key twist on finite-sum minimization, dubbed continual finite-sum minimization.
Our approach significantly improves upon the $\mathcal{O}(n/\epsilon)$ FOs that $\mathrm{StochasticGradientDescent}$ requires.
We also prove that there is no natural first-order method with $\mathcal{O}\left(n/\epsilon^{\alpha}\right)$ gradient complexity for $\alpha < 1/4$, establishing that the first-order complexity of our method is nearly tight.
arXiv Detail & Related papers (2024-06-07T08:26:31Z) - On the Complexity of Finite-Sum Smooth Optimization under the
Polyak-{\L}ojasiewicz Condition [14.781921087738967]
This paper considers the optimization problem of the form $\min_{{\bf x}\in{\mathbb R}^d} f({\bf x})\triangleq \frac{1}{n}\sum_{i=1}^n f_i({\bf x})$, where $f(\cdot)$ satisfies the Polyak--{\L}ojasiewicz (PL) condition with parameter $\mu$ and $\{f_i(\cdot)\}_{i=1}^n$ is $L$-mean-squared smooth.
arXiv Detail & Related papers (2024-02-04T17:14:53Z) - An Optimal Algorithm for Strongly Convex Min-min Optimization [79.11017157526815]
Existing optimal first-order methods require $\mathcal{O}\left(\sqrt{\max\{\kappa_x,\kappa_y\}}\,\log(1/\epsilon)\right)$ computations of both $\nabla_x f(x,y)$ and $\nabla_y f(x,y)$.
We propose a new algorithm that only requires $\mathcal{O}\left(\sqrt{\kappa_x}\,\log(1/\epsilon)\right)$ computations of $\nabla_x f(x,y)$.
arXiv Detail & Related papers (2022-12-29T19:26:12Z) - The First Optimal Algorithm for Smooth and
Strongly-Convex-Strongly-Concave Minimax Optimization [88.91190483500932]
In this paper, we revisit the smooth and strongly-convex-strongly-concave minimax optimization problem.
Existing state-of-the-art methods do not match the lower bound $\Omega\left(\sqrt{\kappa_x\kappa_y}\,\log\frac{1}{\epsilon}\right)$; the best known upper bound is $\mathcal{O}\left(\sqrt{\kappa_x\kappa_y}\,\log^3(\kappa_x\kappa_y)\log\frac{1}{\epsilon}\right)$.
We fix this fundamental issue by providing the first algorithm with $\mathcal{O}\left(\sqrt{\kappa_x\kappa_y}\,\log\frac{1}{\epsilon}\right)$ gradient complexity.
arXiv Detail & Related papers (2022-05-11T17:33:07Z) - Decentralized Stochastic Variance Reduced Extragradient Method [25.21457349137344]
This paper studies decentralized convex-concave minimax optimization problems of the form $\min_{\bf x}\max_{\bf y} f({\bf x},{\bf y}) \triangleq \frac{1}{m}\sum_{i=1}^m f_i({\bf x},{\bf y})$.
arXiv Detail & Related papers (2022-02-01T16:06:20Z) - Finding Second-Order Stationary Point for Nonconvex-Strongly-Concave
Minimax Problem [16.689304539024036]
In this paper, we consider the non-asymptotic behavior of finding a second-order stationary point for a minimax problem.
For high-dimensional problems, we propose an inexact variant that avoids the expensive cost of the second-order oracle by solving the cubic sub-problem via gradient descent and Chebyshev expansion.
arXiv Detail & Related papers (2021-10-10T14:54:23Z) - Near Optimal Stochastic Algorithms for Finite-Sum Unbalanced
Convex-Concave Minimax Optimization [41.432757205864796]
This paper considers first-order algorithms for convex-concave minimax problems of the form $\min_{\bf x}\max_{\bf y} f({\bf x},{\bf y})$.
Our methods can be used to solve more general unbalanced minimax problems and are also near optimal.
arXiv Detail & Related papers (2021-06-03T11:30:32Z) - Optimal Regret Algorithm for Pseudo-1d Bandit Convex Optimization [51.23789922123412]
We study online learning with bandit feedback (i.e. learner has access to only zeroth-order oracle) where cost/reward functions admit a "pseudo-1d" structure.
We show a lower bound of $\min(\sqrt{dT}, T^{3/4})$ for the regret of any algorithm, where $T$ is the number of rounds.
We propose a new algorithm that combines randomized online gradient descent with a kernelized exponential weights method to exploit the pseudo-1d structure effectively.
arXiv Detail & Related papers (2021-02-15T08:16:51Z) - Streaming Complexity of SVMs [110.63976030971106]
We study the space complexity of solving the bias-regularized SVM problem in the streaming model.
We show that for both problems, for dimension $d = \frac{1}{\lambda\epsilon}$, one can obtain streaming algorithms with space polynomially smaller than $\frac{1}{\lambda\epsilon}$.
arXiv Detail & Related papers (2020-07-07T17:10:00Z) - Near-Optimal Algorithms for Minimax Optimization [115.21519161773287]
The paper presents the first algorithm with $\tilde{\mathcal{O}}\left(\sqrt{\kappa_{\bf x}\kappa_{\bf y}}\right)$ gradient complexity, matching the lower bound up to logarithmic factors.
The paper also presents algorithms that match or outperform all existing methods in these settings in terms of complexity.
arXiv Detail & Related papers (2020-02-05T16:49:09Z)
This list is automatically generated from the titles and abstracts of the papers on this site.