The First Optimal Acceleration of High-Order Methods in Smooth Convex
Optimization
- URL: http://arxiv.org/abs/2205.09647v1
- Date: Thu, 19 May 2022 16:04:40 GMT
- Title: The First Optimal Acceleration of High-Order Methods in Smooth Convex
Optimization
- Authors: Dmitry Kovalev, Alexander Gasnikov
- Abstract summary: We study the fundamental open question of finding the optimal high-order algorithm for solving smooth convex minimization problems.
Existing state-of-the-art methods fall short of the known lower bound because they require performing a complex binary search procedure, which makes them neither optimal nor practical.
We fix this fundamental issue by providing the first algorithm with $\mathcal{O}\left(\epsilon^{-2/(3p+1)}\right)$ $p$-th order oracle complexity.
- Score: 88.91190483500932
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In this paper, we study the fundamental open question of finding the optimal
high-order algorithm for solving smooth convex minimization problems. Arjevani
et al. (2019) established the lower bound
$\Omega\left(\epsilon^{-2/(3p+1)}\right)$ on the number of $p$-th order
oracle calls required by an algorithm to find an $\epsilon$-accurate solution
to the problem, where the $p$-th order oracle stands for the computation of the
objective function value and the derivatives up to the order $p$. However, the
existing state-of-the-art high-order methods of Gasnikov et al. (2019b); Bubeck
et al. (2019); Jiang et al. (2019) achieve the oracle complexity
$\mathcal{O}\left(\epsilon^{-2/(3p+1)} \log (1/\epsilon)\right)$, which does
not match the lower bound. The reason for this is that these algorithms require
performing a complex binary search procedure, which makes them neither optimal
nor practical. We fix this fundamental issue by providing the first algorithm
with $\mathcal{O}\left(\epsilon^{-2/(3p+1)}\right)$ $p$-th order oracle
complexity.
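To make the oracle model concrete, below is a minimal sketch of a single $p$-th order step for $p=2$: a cubic-regularized Newton step in the spirit of Nesterov and Polyak, the basic building block that accelerated tensor methods iterate. This illustrates the oracle model only, not the paper's optimal algorithm; the toy objective, step sizes, and crude subproblem solver are all illustrative choices.

```python
import numpy as np

# Toy smooth convex objective: f(x) = 0.25 * ||x||^4,
# with gradient (x.x) * x and Hessian (x.x) * I + 2 * x x^T.
def grad(x):
    return np.dot(x, x) * x

def hess(x):
    return np.dot(x, x) * np.eye(len(x)) + 2.0 * np.outer(x, x)

def cubic_newton_step(x, M, inner_iters=200, lr=0.01):
    """One p-th order oracle call (p = 2) plus a model minimization.

    Minimizes the cubic model m(h) = <g, h> + 0.5 h^T H h + (M/6) ||h||^3
    inexactly by plain gradient descent in h; real implementations use
    more careful subproblem solvers.
    """
    g, H = grad(x), hess(x)  # the second-order oracle call
    h = np.zeros_like(x)
    for _ in range(inner_iters):
        grad_m = g + H @ h + 0.5 * M * np.linalg.norm(h) * h  # gradient of m(h)
        h -= lr * grad_m
    return x + h

x = np.array([1.0, -2.0, 0.5])
for _ in range(10):
    x = cubic_newton_step(x, M=10.0)
print(np.linalg.norm(x))  # shrinks toward 0, the unique minimizer
```

The paper's contribution is an acceleration scheme that drives such steps to the $\mathcal{O}\left(\epsilon^{-2/(3p+1)}\right)$ oracle complexity without the per-iteration binary search of earlier accelerated tensor methods.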
Related papers
- Second-Order Min-Max Optimization with Lazy Hessians [17.17389531402505]
This paper studies second-order methods for convex-concave minimax optimization.
We show that the computation cost can be reduced by reusing the Hessian across iterations, as sketched below.
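The lazy-Hessian idea is easiest to see in plain minimization: recompute and factor the Hessian only every few iterations and reuse the stale factorization in between, so most steps cost one gradient plus a back-solve. A minimal sketch, simplified from the minimax setting of the paper (function names and the quadratic test problem are illustrative):

```python
import numpy as np

def lazy_newton(grad, hess, x0, reuse=5, iters=20):
    # Refresh the (factored) Hessian only every `reuse` iterations;
    # intermediate steps reuse it, trading step quality for cost.
    x = x0
    for k in range(iters):
        if k % reuse == 0:
            L = np.linalg.cholesky(hess(x))  # expensive, done rarely
        g = grad(x)
        x = x - np.linalg.solve(L.T, np.linalg.solve(L, g))  # cheap solves
    return x

# Strongly convex quadratic test: f(x) = 0.5 x^T A x - b^T x.
A = np.array([[3.0, 1.0], [1.0, 2.0]])
b = np.array([1.0, 1.0])
x = lazy_newton(lambda x: A @ x - b, lambda x: A, np.zeros(2))
print(x, np.linalg.solve(A, b))  # both give the exact minimizer
```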
arXiv Detail & Related papers (2024-10-12T15:30:17Z)
- On the Complexity of First-Order Methods in Stochastic Bilevel Optimization [9.649991673557167]
We consider the problem of finding stationary points in bilevel optimization when the lower-level problem is unconstrained and strongly convex.
Existing approaches tie their analyses to a genie algorithm that knows lower-level solutions and, therefore, need not query any points far from them.
We propose a simple first-order method that converges to an $\epsilon$-stationary point using $O(\epsilon^{-6})$, $O(\epsilon^{-4})$ access to first-order $y^*$-aware oracles.
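For orientation, when the lower-level problem $\min_y g(x,y)$ is strongly convex with unique solution $y^*(x)$, the hypergradient of $F(x) = f(x, y^*(x))$ is given by the standard implicit-function identity

$$\nabla F(x) = \nabla_x f(x, y^*(x)) - \nabla^2_{xy} g(x, y^*(x)) \left[\nabla^2_{yy} g(x, y^*(x))\right]^{-1} \nabla_y f(x, y^*(x)),$$

which is why analyses, including the genie comparison above, revolve around how accurately an algorithm tracks $y^*(x)$; fully first-order methods must estimate this quantity without forming the Hessian terms.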
arXiv Detail & Related papers (2024-02-11T04:26:35Z)
- The Computational Complexity of Finding Stationary Points in Non-Convex Optimization [53.86485757442486]
Finding approximate stationary points, i.e., points where the gradient is approximately zero, of non-convex but smooth objective functions is a fundamental computational problem.
We show that finding approximate KKT points in constrained optimization is reducible to finding approximate stationary points in unconstrained optimization, but the converse is impossible.
arXiv Detail & Related papers (2023-10-13T14:52:46Z)
- Accelerating Inexact HyperGradient Descent for Bilevel Optimization [84.00488779515206]
We present a method for solving general nonconvex-strongly-convex bilevel optimization problems.
Our results also improve upon the existing complexity for finding second-order stationary points in nonconvex-strongly-concave minimax problems.
arXiv Detail & Related papers (2023-06-30T20:36:44Z)
- Near-Optimal Nonconvex-Strongly-Convex Bilevel Optimization with Fully First-Order Oracles [14.697733592222658]
We show that a first-order method can converge at the near-optimal $\tilde{\mathcal{O}}(\epsilon^{-2})$ rate of second-order methods.
Our analysis further leads to simple first-order algorithms that achieve similar convergence rates for finding second-order stationary points.
arXiv Detail & Related papers (2023-06-26T17:07:54Z)
- The First Optimal Algorithm for Smooth and Strongly-Convex-Strongly-Concave Minimax Optimization [88.91190483500932]
In this paper, we revisit the smooth and strongly-convex-strongly-concave minimax optimization problem.
Existing state-of-the-art methods attain gradient complexity $\mathcal{O}\left(\sqrt{\kappa_x\kappa_y}\log^3(\kappa_x\kappa_y)\log\frac{1}{\epsilon}\right)$, which does not match the lower bound $\Omega\left(\sqrt{\kappa_x\kappa_y}\log\frac{1}{\epsilon}\right)$.
We fix this fundamental issue by providing the first algorithm with $\mathcal{O}\left(\sqrt{\kappa_x\kappa_y}\log\frac{1}{\epsilon}\right)$ gradient complexity.
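For context, the textbook baseline for this class $\min_x \max_y f(x,y)$ is the extragradient method: gradients are evaluated at a look-ahead point, then used to step from the current point. The sketch below is that baseline with an illustrative toy saddle, not the paper's optimal algorithm:

```python
def extragradient(gx, gy, x, y, lr=0.1, iters=500):
    # min over x, max over y: descend in x, ascend in y, with a
    # look-ahead (extrapolation) step before each actual update.
    for _ in range(iters):
        xh, yh = x - lr * gx(x, y), y + lr * gy(x, y)    # extrapolate
        x, y = x - lr * gx(xh, yh), y + lr * gy(xh, yh)  # update
    return x, y

# Toy strongly-convex-strongly-concave saddle:
# f(x, y) = 0.5 x^2 + 2 x y - 0.5 y^2, saddle point at (0, 0).
gx = lambda x, y: x + 2 * y   # df/dx
gy = lambda x, y: 2 * x - y   # df/dy
print(extragradient(gx, gy, 1.0, -1.0))  # both coordinates -> 0
```

The optimal method of the paper improves the logarithmic factors over such baselines, matching the lower bound above.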
arXiv Detail & Related papers (2022-05-11T17:33:07Z)
- Near-Optimal Lower Bounds For Convex Optimization For All Orders of Smoothness [26.71898403195793]
We study the complexity of optimizing highly smooth convex functions.
For a positive integer $p$, we want to find an $\epsilon$-approximate minimum of a convex function $f$.
We prove a new lower bound that matches the best known upper bound (up to log factors), and holds not only for randomized algorithms, but also for quantum algorithms.
arXiv Detail & Related papers (2021-12-02T10:51:43Z)
- Thinking Inside the Ball: Near-Optimal Minimization of the Maximal Loss [41.17536985461902]
We prove an oracle complexity lower bound scaling as $\Omega(N\epsilon^{-2/3})$, showing that our dependence on $N$ is optimal up to polylogarithmic factors.
We develop methods with improved complexity bounds of $\tilde{O}(N\epsilon^{-2/3} + \sqrt{N}\epsilon^{-8/3})$ in the non-smooth case and $\tilde{O}(N\epsilon^{-2/3} + \sqrt{N}\epsilon^{-1})$ in the smooth case.
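A standard device in this line of work, and a useful way to read these bounds, is softmax smoothing of the max: for a smoothing parameter $\mu > 0$,

$$\max_{i \le N} f_i(x) \;\le\; \mu \log \sum_{i=1}^{N} \exp\big(f_i(x)/\mu\big) \;\le\; \max_{i \le N} f_i(x) + \mu \log N,$$

so the log-sum-exp surrogate is uniformly $\mu \log N$-accurate while having smoothness constant growing like $1/\mu$ (up to problem constants); the trade-off between $N$ and $\epsilon$ in such bounds enters through the choice of $\mu$.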
arXiv Detail & Related papers (2021-05-04T21:49:15Z)
- Streaming Complexity of SVMs [110.63976030971106]
We study the space complexity of solving the bias-regularized SVM problem in the streaming model.
We show that for both problems, for dimensions of $\frac{1}{\lambda\epsilon}$, one can obtain streaming algorithms with space polynomially smaller than $\frac{1}{\lambda\epsilon}$.
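For reference, one common form of the bias-regularized SVM objective (our reading of the term: the bias $b$ is folded into the regularizer; conventions vary) over examples $(x_i, y_i)$ with $y_i \in \{\pm 1\}$ is

$$F_\lambda(w, b) = \frac{\lambda}{2}\left(\|w\|^2 + b^2\right) + \frac{1}{n} \sum_{i=1}^{n} \max\big(0,\, 1 - y_i(\langle w, x_i \rangle + b)\big),$$

which makes the quantity $\frac{1}{\lambda\epsilon}$ in the space bounds a natural coupling of regularization strength and target accuracy.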
arXiv Detail & Related papers (2020-07-07T17:10:00Z)
- Second-order Conditional Gradient Sliding [79.66739383117232]
We present the Second-Order Conditional Gradient Sliding (SOCGS) algorithm.
The SOCGS algorithm converges quadratically in primal gap after a finite number of linearly convergent iterations.
It is useful when the feasible region can only be accessed efficiently through a linear optimization oracle.
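The access model here is the linear minimization oracle (LMO) of conditional-gradient (Frank-Wolfe) methods. A minimal first-order baseline over the probability simplex is sketched below (the LMO and test problem are illustrative); SOCGS layers inexact second-order steps on top of this same oracle model.

```python
import numpy as np

def frank_wolfe(grad, lmo, x, iters=200):
    # Projection-free optimization over a convex set C: each iteration
    # calls the LMO, s = argmin_{s in C} <grad(x), s>, and moves toward s.
    for k in range(iters):
        s = lmo(grad(x))
        gamma = 2.0 / (k + 2)  # classic step-size schedule
        x = (1 - gamma) * x + gamma * s
    return x

# The simplex LMO returns the vertex with the smallest gradient entry.
def simplex_lmo(g):
    s = np.zeros_like(g)
    s[np.argmin(g)] = 1.0
    return s

c = np.array([0.2, 0.5, 0.3])    # lies in the simplex
x0 = np.array([1.0, 0.0, 0.0])
print(frank_wolfe(lambda x: 2 * (x - c), simplex_lmo, x0))  # -> c
```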
arXiv Detail & Related papers (2020-02-20T17:52:18Z)