High-Order Oracle Complexity of Smooth and Strongly Convex Optimization
- URL: http://arxiv.org/abs/2010.06642v2
- Date: Wed, 28 Apr 2021 13:06:24 GMT
- Title: High-Order Oracle Complexity of Smooth and Strongly Convex Optimization
- Authors: Guy Kornowski, Ohad Shamir
- Abstract summary: We consider the complexity of optimizing a highly smooth (Lipschitz $k$-th order derivative) and strongly convex function, via calls to a $k$-th order oracle.
We prove that the worst-case oracle complexity for any fixed $k$ to optimize the function up to accuracy $\epsilon$ is on the order of $\left(\frac{\mu_k D^{k-1}}{\lambda}\right)^{\frac{2}{3k+1}}+\log\log\left(\frac{1}{\epsilon}\right)$.
- Score: 31.714928102950584
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In this note, we consider the complexity of optimizing a highly smooth
(Lipschitz $k$-th order derivative) and strongly convex function, via calls to
a $k$-th order oracle which returns the value and first $k$ derivatives of the
function at a given point, and where the dimension is unrestricted. Extending
the techniques introduced in Arjevani et al. [2019], we prove that the
worst-case oracle complexity for any fixed $k$ to optimize the function up to
accuracy $\epsilon$ is on the order of $\left(\frac{\mu_k
D^{k-1}}{\lambda}\right)^{\frac{2}{3k+1}}+\log\log\left(\frac{1}{\epsilon}\right)$
(in sufficiently high dimension, and up to log factors independent of
$\epsilon$), where $\mu_k$ is the Lipschitz constant of the $k$-th derivative,
$D$ is the initial distance to the optimum, and $\lambda$ is the strong
convexity parameter.
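As a rough numeric illustration (not code from the paper), the stated bound can be evaluated directly; the helper name below and the choice to drop constants and the $\epsilon$-independent log factors are my own:

```python
import math

def oracle_complexity_bound(mu_k, D, lam, k, eps):
    """Leading-order oracle complexity from the abstract:
    (mu_k * D^(k-1) / lam)^(2/(3k+1)) + log log(1/eps),
    ignoring constants and log factors independent of eps.
    Requires eps < 1/e so that log(1/eps) > 1."""
    leading = (mu_k * D ** (k - 1) / lam) ** (2.0 / (3 * k + 1))
    return leading + math.log(math.log(1.0 / eps))
```

As a sanity check, for $k=1$ the leading term reduces to $(\mu_1/\lambda)^{1/2}$, the familiar square-root condition-number dependence of accelerated first-order methods.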
Related papers
- Quantum Algorithms for Non-smooth Non-convex Optimization [30.576546266390714]
This paper considers the problem of finding a $(\delta,\epsilon)$-Goldstein stationary point of a Lipschitz continuous objective.
We construct a zeroth-order quantum estimator for the surrogate oracle function.
arXiv Detail & Related papers (2024-10-21T16:52:26Z) - An Optimal Stochastic Algorithm for Decentralized Nonconvex Finite-sum
Optimization [25.21457349137344]
We prove that DEAREST requires at most $\mathcal{O}(m+\sqrt{mn}L\varepsilon^{-2})$ incremental first-order oracle (IFO) calls and $\mathcal{O}(L\varepsilon^{-2}/\sqrt{1-\lambda_W})$ communication rounds.
arXiv Detail & Related papers (2022-10-25T11:37:11Z) - The First Optimal Acceleration of High-Order Methods in Smooth Convex
Optimization [88.91190483500932]
We study the fundamental open question of finding the optimal high-order algorithm for solving smooth convex minimization problems.
The reason is that these algorithms require performing a complex binary search procedure, which makes them neither optimal nor practical.
We fix this fundamental issue by providing the first algorithm with $\mathcal{O}\left(\epsilon^{-2/(3p+1)}\right)$ $p$th-order oracle complexity.
arXiv Detail & Related papers (2022-05-19T16:04:40Z) - Near-Optimal Lower Bounds For Convex Optimization For All Orders of
Smoothness [26.71898403195793]
We study the complexity of optimizing highly smooth convex functions.
For a positive integer $p$, we want to find an $\epsilon$-approximate minimum of a convex function $f$.
We prove a new lower bound that matches this bound (up to log factors), and holds not only for randomized algorithms, but also for quantum algorithms.
arXiv Detail & Related papers (2021-12-02T10:51:43Z) - Thinking Inside the Ball: Near-Optimal Minimization of the Maximal Loss [41.17536985461902]
We prove an oracle complexity lower bound scaling as $\Omega(N\epsilon^{-2/3})$, showing that our dependence on $N$ is optimal up to polylogarithmic factors.
We develop methods with improved complexity bounds of $\tilde{O}(N\epsilon^{-2/3} + \sqrt{N}\epsilon^{-8/3})$ in the non-smooth case and $\tilde{O}(N\epsilon^{-2/3} + \sqrt{N}\epsilon^{-1})$ in the smooth case.
arXiv Detail & Related papers (2021-05-04T21:49:15Z) - Private Stochastic Convex Optimization: Optimal Rates in $\ell_1$
Geometry [69.24618367447101]
Up to logarithmic factors, the optimal excess population loss of any $(\varepsilon,\delta)$-differentially private algorithm is $\sqrt{\log(d)/n} + \sqrt{d}/(\varepsilon n)$.
We show that when the loss functions satisfy additional smoothness assumptions, the excess loss is upper bounded (up to logarithmic factors) by $\sqrt{\log(d)/n} + (\log(d)/(\varepsilon n))^{2/3}$.
arXiv Detail & Related papers (2021-03-02T06:53:44Z) - Finding Global Minima via Kernel Approximations [90.42048080064849]
We consider the global minimization of smooth functions based solely on function evaluations.
In this paper, we consider an approach that jointly models the function to approximate and finds a global minimum.
arXiv Detail & Related papers (2020-12-22T12:59:30Z) - Projection Efficient Subgradient Method and Optimal Nonsmooth
Frank-Wolfe Method [54.93433440034386]
We find a feasible $\epsilon$-suboptimal solution using only $O(\epsilon^{-1})$ PO calls and optimal $O(\epsilon^{-2})$ FO calls.
Our experiments confirm that these methods achieve significant speedups over the state-of-the-art, for a problem with costly PO and LMO calls.
arXiv Detail & Related papers (2020-10-05T08:16:56Z) - Streaming Complexity of SVMs [110.63976030971106]
We study the space complexity of solving the bias-regularized SVM problem in the streaming model.
We show that for both problems, for dimension on the order of $\frac{1}{\lambda\epsilon}$, one can obtain streaming algorithms with space polynomially smaller than $\frac{1}{\lambda\epsilon}$.
arXiv Detail & Related papers (2020-07-07T17:10:00Z) - Second-Order Information in Non-Convex Stochastic Optimization: Power
and Limitations [54.42518331209581]
We find an algorithm which finds an $\epsilon$-approximate stationary point (with $\|\nabla F(x)\| \le \epsilon$) using $(\epsilon,\gamma)$-approximate random points.
Our lower bounds here are novel even in the noiseless case.
arXiv Detail & Related papers (2020-06-24T04:41:43Z)
This list is automatically generated from the titles and abstracts of the papers in this site.