The First Optimal Acceleration of High-Order Methods in Smooth Convex
Optimization
- URL: http://arxiv.org/abs/2205.09647v1
- Date: Thu, 19 May 2022 16:04:40 GMT
- Title: The First Optimal Acceleration of High-Order Methods in Smooth Convex
Optimization
- Authors: Dmitry Kovalev, Alexander Gasnikov
- Abstract summary: We study the fundamental open question of finding the optimal high-order algorithm for solving smooth convex minimization problems.
Existing state-of-the-art methods fall short of the known lower bound because they require performing a complex binary search procedure, which makes them neither optimal nor practical.
We fix this fundamental issue by providing the first algorithm with $\mathcal{O}\left(\epsilon^{-2/(3p+1)}\right)$ $p$-th order oracle complexity.
- Score: 88.91190483500932
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In this paper, we study the fundamental open question of finding the optimal
high-order algorithm for solving smooth convex minimization problems. Arjevani
et al. (2019) established the lower bound
$\Omega\left(\epsilon^{-2/(3p+1)}\right)$ on the number of $p$-th order
oracle calls required by an algorithm to find an $\epsilon$-accurate solution
to the problem, where the $p$-th order oracle stands for the computation of the
objective function value and the derivatives up to the order $p$. However, the
existing state-of-the-art high-order methods of Gasnikov et al. (2019b); Bubeck
et al. (2019); Jiang et al. (2019) achieve the oracle complexity
$\mathcal{O}\left(\epsilon^{-2/(3p+1)} \log (1/\epsilon)\right)$, which does
not match the lower bound. The reason for this is that these algorithms require
performing a complex binary search procedure, which makes them neither optimal
nor practical. We fix this fundamental issue by providing the first algorithm
with $\mathcal{O}\left(\epsilon^{-2/(3p+1)}\right)$ $p$-th order oracle
complexity.
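To make the oracle model concrete, below is a minimal sketch of a single $p$-th order step for $p=2$: a cubic-regularized Newton step in the spirit of Nesterov and Polyak, the basic building block that accelerated tensor methods iterate. This illustrates the oracle model only, not the paper's optimal algorithm; the toy objective, step sizes, and crude subproblem solver are all illustrative choices.

```python
import numpy as np

# Toy smooth convex objective: f(x) = 0.25 * ||x||^4,
# with gradient (x.x) * x and Hessian (x.x) * I + 2 * x x^T.
def grad(x):
    return np.dot(x, x) * x

def hess(x):
    return np.dot(x, x) * np.eye(len(x)) + 2.0 * np.outer(x, x)

def cubic_newton_step(x, M, inner_iters=200, lr=0.01):
    """One p-th order oracle call (p = 2) plus a model minimization.

    Minimizes the cubic model m(h) = <g, h> + 0.5 h^T H h + (M/6) ||h||^3
    inexactly by plain gradient descent in h; real implementations use
    more careful subproblem solvers.
    """
    g, H = grad(x), hess(x)  # the second-order oracle call
    h = np.zeros_like(x)
    for _ in range(inner_iters):
        grad_m = g + H @ h + 0.5 * M * np.linalg.norm(h) * h  # gradient of m(h)
        h -= lr * grad_m
    return x + h

x = np.array([1.0, -2.0, 0.5])
for _ in range(10):
    x = cubic_newton_step(x, M=10.0)
print(np.linalg.norm(x))  # shrinks toward 0, the unique minimizer
```

The paper's contribution is an acceleration scheme that drives such steps to the $\mathcal{O}\left(\epsilon^{-2/(3p+1)}\right)$ oracle complexity without the per-iteration binary search of earlier accelerated tensor methods.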
Related papers
- Second-Order Min-Max Optimization with Lazy Hessians [17.17389531402505]
This paper studies second-order methods for convex-concave minimax optimization.
We show that the computation cost can be reduced by reusing the Hessian across iterations, as sketched below.
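The lazy-Hessian idea is easiest to see in plain minimization: recompute and factor the Hessian only every few iterations and reuse the stale factorization in between, so most steps cost one gradient plus a back-solve. A minimal sketch, simplified from the minimax setting of the paper (function names and the quadratic test problem are illustrative):

```python
import numpy as np

def lazy_newton(grad, hess, x0, reuse=5, iters=20):
    # Refresh the (factored) Hessian only every `reuse` iterations;
    # intermediate steps reuse it, trading step quality for cost.
    x = x0
    for k in range(iters):
        if k % reuse == 0:
            L = np.linalg.cholesky(hess(x))  # expensive, done rarely
        g = grad(x)
        x = x - np.linalg.solve(L.T, np.linalg.solve(L, g))  # cheap solves
    return x

# Strongly convex quadratic test: f(x) = 0.5 x^T A x - b^T x.
A = np.array([[3.0, 1.0], [1.0, 2.0]])
b = np.array([1.0, 1.0])
x = lazy_newton(lambda x: A @ x - b, lambda x: A, np.zeros(2))
print(x, np.linalg.solve(A, b))  # both give the exact minimizer
```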
arXiv Detail & Related papers (2024-10-12T15:30:17Z)
- On the Complexity of First-Order Methods in Stochastic Bilevel Optimization [9.649991673557167]
We consider the problem of finding stationary points in bilevel optimization when the lower-level problem is unconstrained and strongly convex.
Existing approaches tie their analyses to a genie algorithm that knows lower-level solutions and, therefore, need not query any points far from them.
We propose a simple first-order method that converges to an $\epsilon$-stationary point using $O(\epsilon^{-6})$, $O(\epsilon^{-4})$ access to first-order $y^*$-aware oracles.
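For orientation, when the lower-level problem $\min_y g(x,y)$ is strongly convex with unique solution $y^*(x)$, the hypergradient of $F(x) = f(x, y^*(x))$ is given by the standard implicit-function identity

$$\nabla F(x) = \nabla_x f(x, y^*(x)) - \nabla^2_{xy} g(x, y^*(x)) \left[\nabla^2_{yy} g(x, y^*(x))\right]^{-1} \nabla_y f(x, y^*(x)),$$

which is why analyses, including the genie comparison above, revolve around how accurately an algorithm tracks $y^*(x)$; fully first-order methods must estimate this quantity without forming the Hessian terms.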
arXiv Detail & Related papers (2024-02-11T04:26:35Z)
- The Computational Complexity of Finding Stationary Points in Non-Convex Optimization [53.86485757442486]
Finding approximate stationary points, i.e., points where the gradient is approximately zero, of non-convex but smooth objective functions is a fundamental computational problem.
We show that finding approximate KKT points in constrained optimization is reducible to finding approximate stationary points in unconstrained optimization, but the converse is impossible.
arXiv Detail & Related papers (2023-10-13T14:52:46Z)
- Accelerating Inexact HyperGradient Descent for Bilevel Optimization [84.00488779515206]
We present a method for solving general nonconvex-strongly-convex bilevel optimization problems.
Our results also improve upon the existing complexity for finding second-order stationary points in nonconvex-strongly-concave minimax problems.
arXiv Detail & Related papers (2023-06-30T20:36:44Z)
- Near-Optimal Nonconvex-Strongly-Convex Bilevel Optimization with Fully First-Order Oracles [14.697733592222658]
We show that a first-order method can converge at the near-optimal $\tilde{\mathcal{O}}(\epsilon^{-2})$ rate of second-order methods.
Our analysis further leads to simple first-order algorithms that achieve similar convergence rates for finding second-order stationary points.
arXiv Detail & Related papers (2023-06-26T17:07:54Z)
- The First Optimal Algorithm for Smooth and Strongly-Convex-Strongly-Concave Minimax Optimization [88.91190483500932]
In this paper, we revisit the smooth and strongly-convex-strongly-concave minimax optimization problem.
Existing state-of-the-art methods attain gradient complexity $\mathcal{O}\left(\sqrt{\kappa_x\kappa_y}\log^3(\kappa_x\kappa_y)\log\frac{1}{\epsilon}\right)$, which does not match the lower bound $\Omega\left(\sqrt{\kappa_x\kappa_y}\log\frac{1}{\epsilon}\right)$.
We fix this fundamental issue by providing the first algorithm with $\mathcal{O}\left(\sqrt{\kappa_x\kappa_y}\log\frac{1}{\epsilon}\right)$ gradient complexity.
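For context, the textbook baseline for this class $\min_x \max_y f(x,y)$ is the extragradient method: gradients are evaluated at a look-ahead point, then used to step from the current point. The sketch below is that baseline with an illustrative toy saddle, not the paper's optimal algorithm:

```python
def extragradient(gx, gy, x, y, lr=0.1, iters=500):
    # min over x, max over y: descend in x, ascend in y, with a
    # look-ahead (extrapolation) step before each actual update.
    for _ in range(iters):
        xh, yh = x - lr * gx(x, y), y + lr * gy(x, y)    # extrapolate
        x, y = x - lr * gx(xh, yh), y + lr * gy(xh, yh)  # update
    return x, y

# Toy strongly-convex-strongly-concave saddle:
# f(x, y) = 0.5 x^2 + 2 x y - 0.5 y^2, saddle point at (0, 0).
gx = lambda x, y: x + 2 * y   # df/dx
gy = lambda x, y: 2 * x - y   # df/dy
print(extragradient(gx, gy, 1.0, -1.0))  # both coordinates -> 0
```

The optimal method of the paper improves the logarithmic factors over such baselines, matching the lower bound above.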
arXiv Detail & Related papers (2022-05-11T17:33:07Z)
- Near-Optimal Lower Bounds For Convex Optimization For All Orders of Smoothness [26.71898403195793]
We study the complexity of optimizing highly smooth convex functions.
For a positive integer $p$, we want to find an $\epsilon$-approximate minimum of a convex function $f$.
We prove a new lower bound that matches the best known upper bound (up to log factors), and holds not only for randomized algorithms, but also for quantum algorithms.
arXiv Detail & Related papers (2021-12-02T10:51:43Z)
- Thinking Inside the Ball: Near-Optimal Minimization of the Maximal Loss [41.17536985461902]
We prove an oracle complexity lower bound scaling as $\Omega(N\epsilon^{-2/3})$, showing that our dependence on $N$ is optimal up to polylogarithmic factors.
We develop methods with improved complexity bounds of $\tilde{O}(N\epsilon^{-2/3} + \sqrt{N}\epsilon^{-8/3})$ in the non-smooth case and $\tilde{O}(N\epsilon^{-2/3} + \sqrt{N}\epsilon^{-1})$ in the smooth case.
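A standard device in this line of work, and a useful way to read these bounds, is softmax smoothing of the max: for a smoothing parameter $\mu > 0$,

$$\max_{i \le N} f_i(x) \;\le\; \mu \log \sum_{i=1}^{N} \exp\big(f_i(x)/\mu\big) \;\le\; \max_{i \le N} f_i(x) + \mu \log N,$$

so the log-sum-exp surrogate is uniformly $\mu \log N$-accurate while having smoothness constant growing like $1/\mu$ (up to problem constants); the trade-off between $N$ and $\epsilon$ in such bounds enters through the choice of $\mu$.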
arXiv Detail & Related papers (2021-05-04T21:49:15Z)
- Streaming Complexity of SVMs [110.63976030971106]
We study the space complexity of solving the bias-regularized SVM problem in the streaming model.
We show that for both problems, for dimensions of $\frac{1}{\lambda\epsilon}$, one can obtain streaming algorithms with space polynomially smaller than $\frac{1}{\lambda\epsilon}$.
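For reference, one common form of the bias-regularized SVM objective (our reading of the term: the bias $b$ is folded into the regularizer; conventions vary) over examples $(x_i, y_i)$ with $y_i \in \{\pm 1\}$ is

$$F_\lambda(w, b) = \frac{\lambda}{2}\left(\|w\|^2 + b^2\right) + \frac{1}{n} \sum_{i=1}^{n} \max\big(0,\, 1 - y_i(\langle w, x_i \rangle + b)\big),$$

which makes the quantity $\frac{1}{\lambda\epsilon}$ in the space bounds a natural coupling of regularization strength and target accuracy.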
arXiv Detail & Related papers (2020-07-07T17:10:00Z)
- Second-order Conditional Gradient Sliding [79.66739383117232]
We present the Second-Order Conditional Gradient Sliding (SOCGS) algorithm.
The SOCGS algorithm converges quadratically in primal gap after a finite number of linearly convergent iterations.
It is useful when the feasible region can only be accessed efficiently through a linear optimization oracle.
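The access model here is the linear minimization oracle (LMO) of conditional-gradient (Frank-Wolfe) methods. A minimal first-order baseline over the probability simplex is sketched below (the LMO and test problem are illustrative); SOCGS layers inexact second-order steps on top of this same oracle model.

```python
import numpy as np

def frank_wolfe(grad, lmo, x, iters=200):
    # Projection-free optimization over a convex set C: each iteration
    # calls the LMO, s = argmin_{s in C} <grad(x), s>, and moves toward s.
    for k in range(iters):
        s = lmo(grad(x))
        gamma = 2.0 / (k + 2)  # classic step-size schedule
        x = (1 - gamma) * x + gamma * s
    return x

# The simplex LMO returns the vertex with the smallest gradient entry.
def simplex_lmo(g):
    s = np.zeros_like(g)
    s[np.argmin(g)] = 1.0
    return s

c = np.array([0.2, 0.5, 0.3])    # lies in the simplex
x0 = np.array([1.0, 0.0, 0.0])
print(frank_wolfe(lambda x: 2 * (x - c), simplex_lmo, x0))  # -> c
```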
arXiv Detail & Related papers (2020-02-20T17:52:18Z)