Thinking Inside the Ball: Near-Optimal Minimization of the Maximal Loss
- URL: http://arxiv.org/abs/2105.01778v1
- Date: Tue, 4 May 2021 21:49:15 GMT
- Title: Thinking Inside the Ball: Near-Optimal Minimization of the Maximal Loss
- Authors: Yair Carmon, Arun Jambulapati, Yujia Jin, Aaron Sidford
- Abstract summary: We prove an oracle complexity lower bound scaling as $\Omega(N\epsilon^{-2/3})$, showing that our dependence on $N$ is optimal up to polylogarithmic factors.
We develop methods with improved complexity bounds of $\tilde{O}(N\epsilon^{-2/3} + \epsilon^{-8/3})$ in the non-smooth case and $\tilde{O}(N\epsilon^{-2/3} + \sqrt{N}\epsilon^{-1})$ in the $O(1/\epsilon)$-smooth case.
- Score: 41.17536985461902
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We characterize the complexity of minimizing $\max_{i\in[N]} f_i(x)$ for
convex, Lipschitz functions $f_1,\ldots, f_N$. For non-smooth functions,
existing methods require $O(N\epsilon^{-2})$ queries to a first-order oracle to
compute an $\epsilon$-suboptimal point and $\tilde{O}(N\epsilon^{-1})$ queries
if the $f_i$ are $O(1/\epsilon)$-smooth. We develop methods with improved
complexity bounds of $\tilde{O}(N\epsilon^{-2/3} + \epsilon^{-8/3})$ in the
non-smooth case and $\tilde{O}(N\epsilon^{-2/3} + \sqrt{N}\epsilon^{-1})$ in
the $O(1/\epsilon)$-smooth case. Our methods consist of a recently proposed
ball optimization oracle acceleration algorithm (which we refine) and a careful
implementation of said oracle for the softmax function. We also prove an oracle
complexity lower bound scaling as $\Omega(N\epsilon^{-2/3})$, showing that our
dependence on $N$ is optimal up to polylogarithmic factors.
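To make the softmax construction concrete, below is a minimal NumPy sketch, assuming illustrative affine losses $f_i(x) = a_i^\top x + b_i$ and parameter choices not taken from the paper: it replaces $\max_i f_i(x)$ with the smooth proxy $\epsilon \log \sum_i e^{f_i(x)/\epsilon}$ and runs plain gradient descent on it. It illustrates only the smoothing step, not the paper's accelerated ball-oracle method.

```python
import numpy as np

def minimize_softmax_proxy(A, b, eps=0.1, steps=3000):
    """Minimize eps * logsumexp((Ax + b)/eps), a smooth proxy for
    max_i (a_i^T x + b_i): it upper-bounds the max within eps*log(N)
    and is (||A||^2/eps)-smooth.  Plain gradient descent here; the
    paper instead accelerates via a ball optimization oracle."""
    N, d = A.shape
    x = np.zeros(d)
    lr = eps / np.linalg.norm(A, 2) ** 2   # 1/L step for the smooth proxy
    for _ in range(steps):
        z = (A @ x + b) / eps
        w = np.exp(z - z.max())            # numerically stable softmax
        w /= w.sum()
        x -= lr * (A.T @ w)                # gradient of the proxy
    return x, (A @ x + b).max()

# Toy usage with random affine losses (illustrative only).
rng = np.random.default_rng(0)
A, b = rng.standard_normal((50, 5)), rng.standard_normal(50)
x_hat, max_val = minimize_softmax_proxy(A, b)
print(f"approximate minimal max loss: {max_val:.4f}")
```

Since the proxy overestimates the max by at most $\epsilon \log N$, minimizing it to accuracy $O(\epsilon)$ suffices for an $\tilde{O}(\epsilon)$-suboptimal point of the original max.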
Related papers
- Second-Order Min-Max Optimization with Lazy Hessians [17.17389531402505]
This paper studies second-order methods for convex-concave minimax optimization.
We show that the computation cost can be reduced by reusing the Hessian across iterations.
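For intuition, here is a toy sketch of the lazy-Hessian idea, under the assumption of a smooth, strongly convex single objective (the paper's setting is convex-concave minimax): the Hessian is recomputed and factored only once every $m$ steps, and the cached factorization is reused in between.

```python
import numpy as np

def lazy_newton(grad, hess, x0, m=5, iters=20):
    """Newton-type iteration that recomputes and factors the Hessian only
    once every m steps, reusing the cached factorization in between (the
    'lazy Hessian' idea).  Illustrative sketch, not the paper's method."""
    x = x0.copy()
    for t in range(iters):
        if t % m == 0:
            L = np.linalg.cholesky(hess(x))              # refresh every m steps
        g = grad(x)
        d = np.linalg.solve(L.T, np.linalg.solve(L, g))  # solve H d = g
        x -= d
    return x

# Toy usage: minimize 0.5 x^T Q x - c^T x for a fixed PD matrix Q.
Q = np.array([[3.0, 1.0], [1.0, 2.0]])
c = np.array([1.0, -1.0])
x_opt = lazy_newton(lambda x: Q @ x - c, lambda x: Q, np.zeros(2))
print(x_opt, np.linalg.solve(Q, c))  # the two should agree
```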
arXiv Detail & Related papers (2024-10-12T15:30:17Z)
- Efficient Continual Finite-Sum Minimization
We propose a key twist on finite-sum minimization, dubbed continual finite-sum minimization.
Our approach significantly improves upon the $\mathcal{O}(n/\epsilon)$ FOs that $\mathrm{StochasticGradientDescent}$ requires.
We also prove that there is no natural first-order method with $\mathcal{O}(n/\epsilon^{\alpha})$ gradient complexity for $\alpha < 1/4$, establishing that the first-order complexity of our method is nearly tight.
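A naive sketch of the continual finite-sum setup, with an illustrative quadratic choice of $f_j$ (an assumption, not taken from the paper): each prefix problem $\min_x \frac{1}{i}\sum_{j\le i} f_j(x)$ is solved by warm-started gradient descent. The baseline below spends $\Theta(i)$ FOs per inner step; the paper's contribution is doing substantially better.

```python
import numpy as np

def continual_prefix_minimization(grads, d, lr=0.05, inner_steps=300):
    """Continual finite-sum setup: for each prefix i = 1..n, approximate
    argmin_x (1/i) * sum_{j<=i} f_j(x), warm-starting from the previous
    prefix solution.  Naive full-gradient inner solver for illustration."""
    x = np.zeros(d)
    solutions = []
    for i in range(1, len(grads) + 1):
        for _ in range(inner_steps):
            g = sum(gf(x) for gf in grads[:i]) / i   # prefix-average gradient
            x -= lr * g
        solutions.append(x.copy())
    return solutions

# Toy usage: f_j(x) = 0.5 * ||x - t_j||^2, so the i-th prefix solution is
# the mean of the first i targets.
rng = np.random.default_rng(1)
targets = rng.standard_normal((10, 3))
grads = [(lambda x, t=t: x - t) for t in targets]
sols = continual_prefix_minimization(grads, d=3)
print(np.allclose(sols[-1], targets.mean(axis=0), atol=1e-3))
```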
arXiv Detail & Related papers (2024-06-07T08:26:31Z)
- A Whole New Ball Game: A Primal Accelerated Method for Matrix Games and Minimizing the Maximum of Smooth Functions [44.655316553524855]
We design algorithms for minimizing $\max_{i\in[n]} f_i(x)$ over a $d$-dimensional Euclidean or simplex domain.
When each $f_i$ is $1$-Lipschitz and $1$-smooth, our method computes an $\epsilon$-approximate solution.
arXiv Detail & Related papers (2023-11-17T22:07:18Z)
- Faster Stochastic Algorithms for Minimax Optimization under Polyak--{\L}ojasiewicz Conditions [12.459354707528819]
We propose SPIDER-GDA for solving the finite-sum problem of the form $\min_x \max_y f(x,y) \triangleq \frac{1}{n}\sum_{i=1}^{n} f_i(x,y)$.
We prove SPIDER-GDA can find an $\epsilon$-optimal solution within $\mathcal{O}\left((n + \sqrt{n}\,\kappa_x\kappa_y^2)\log(1/\epsilon)\right)$ stochastic first-order oracle calls.
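For reference, the sketch below shows the plain gradient descent ascent loop that SPIDER-GDA builds on, with an illustrative quadratic saddle problem of my own choosing; the actual algorithm replaces the exact gradients with SPIDER variance-reduced estimates over the finite sum, which is not shown here.

```python
import numpy as np

def gda(grad_x, grad_y, x0, y0, lr=0.05, steps=2000):
    """Plain simultaneous gradient descent ascent on min_x max_y f(x, y).
    SPIDER-GDA substitutes variance-reduced stochastic estimates built
    from f = (1/n) sum_i f_i; this sketch shows only the outer dynamics."""
    x, y = x0.copy(), y0.copy()
    for _ in range(steps):
        gx, gy = grad_x(x, y), grad_y(x, y)  # evaluate before updating
        x -= lr * gx   # descent in x
        y += lr * gy   # ascent in y
    return x, y

# Toy usage: f(x, y) = 0.5 x^2 + x*y - 0.5 y^2 has its saddle at (0, 0).
x_s, y_s = gda(lambda x, y: x + y, lambda x, y: x - y,
               np.array([1.0]), np.array([1.0]))
print(x_s, y_s)  # both converge to ~0
```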
arXiv Detail & Related papers (2023-07-29T02:26:31Z)
- An Optimal Algorithm for Strongly Convex Min-min Optimization [79.11017157526815]
Existing optimal first-order methods require $\mathcal{O}(\sqrt{\max\{\kappa_x,\kappa_y\}}\log(1/\epsilon))$ computations of both $\nabla_x f(x,y)$ and $\nabla_y f(x,y)$.
We propose a new algorithm that only requires $\mathcal{O}(\sqrt{\kappa_x}\log(1/\epsilon))$ computations of $\nabla_x f(x,y)$.
arXiv Detail & Related papers (2022-12-29T19:26:12Z)
- The First Optimal Acceleration of High-Order Methods in Smooth Convex Optimization [88.91190483500932]
We study the fundamental open question of finding the optimal high-order algorithm for solving smooth convex minimization problems.
Existing methods fall short because they require performing a complex binary search procedure, which makes them neither optimal nor practical.
We fix this fundamental issue by providing the first algorithm with $\mathcal{O}\left(\epsilon^{-2/(p+1)}\right)$ $p$th-order oracle complexity.
arXiv Detail & Related papers (2022-05-19T16:04:40Z)
- Private Stochastic Convex Optimization: Optimal Rates in $\ell_1$ Geometry [69.24618367447101]
Up to logarithmic factors, the optimal excess population loss of any $(\varepsilon,\delta)$-differentially private optimizer is $\sqrt{\log(d)/n} + \sqrt{d}/(\varepsilon n)$.
We show that when the loss functions satisfy additional smoothness assumptions, the excess loss is upper bounded (up to logarithmic factors) by $\sqrt{\log(d)/n} + (\log(d)/(\varepsilon n))^{2/3}$.
arXiv Detail & Related papers (2021-03-02T06:53:44Z)
- High-Order Oracle Complexity of Smooth and Strongly Convex Optimization [31.714928102950584]
We consider the complexity of optimizing a highly smooth (Lipschitz $k$-th order derivative) and strongly convex function, via calls to a $k$-th order oracle.
We characterize the worst-case oracle complexity, for any fixed $k$, of optimizing the function up to accuracy $\epsilon$.
arXiv Detail & Related papers (2020-10-13T19:18:15Z)
- On the Complexity of Minimizing Convex Finite Sums Without Using the Indices of the Individual Functions [62.01594253618911]
We exploit the finite noise structure of finite sums to derive a matching $O(n^2)$ upper bound under the global oracle model.
Following a similar approach, we propose a novel adaptation of SVRG which is both compatible with oracles and achieves complexity bounds of $\tilde{O}\big((n^2+n\sqrt{L/\mu})\log(1/\epsilon)\big)$ and $O(n\sqrt{L/\epsilon})$ for $\mu>0$ and $\mu=0$, respectively.
arXiv Detail & Related papers (2020-02-09T03:39:46Z)
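For context, here is a sketch of classical SVRG, the baseline this paper adapts; note that each inner step samples an explicit index $i$, which is precisely what the index-free oracle model studied in the paper forbids. Parameter choices are illustrative assumptions.

```python
import numpy as np

def svrg(grads, x0, lr=0.05, epochs=15, m=None, seed=0):
    """Classical SVRG on (1/n) sum_i f_i: each epoch anchors a full
    gradient, and inner steps use the variance-reduced estimator
    g_i(x) - g_i(x_ref) + mu.  Sketch of the standard method, not the
    paper's index-free variant."""
    n = len(grads)
    m = m or 2 * n
    x = x0.copy()
    rng = np.random.default_rng(seed)
    for _ in range(epochs):
        x_ref = x.copy()
        mu = sum(g(x_ref) for g in grads) / n     # full gradient anchor
        for _ in range(m):
            i = rng.integers(n)                   # explicit index access,
            x -= lr * (grads[i](x) - grads[i](x_ref) + mu)  # disallowed by
        # the oracle model this paper studies.
    return x

# Toy usage: f_i(x) = 0.5 (x - t_i)^2, whose average is minimized at mean(t).
targets = np.arange(5, dtype=float).reshape(5, 1)
grads = [(lambda x, t=t: x - t) for t in targets]
print(svrg(grads, np.zeros(1)))  # ~2.0, the mean of the targets
```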
This list is automatically generated from the titles and abstracts of the papers on this site.