High-Order Oracle Complexity of Smooth and Strongly Convex Optimization
- URL: http://arxiv.org/abs/2010.06642v2
- Date: Wed, 28 Apr 2021 13:06:24 GMT
- Title: High-Order Oracle Complexity of Smooth and Strongly Convex Optimization
- Authors: Guy Kornowski, Ohad Shamir
- Abstract summary: We consider the complexity of optimizing a highly smooth (Lipschitz $k$-th order derivative) and strongly convex function, via calls to a $k$-th order oracle.
We prove that the worst-case oracle complexity for any fixed $k$ to optimize the function up to accuracy $\epsilon$ is on the order of $\left(\frac{\mu_k D^{k-1}}{\lambda}\right)^{\frac{2}{3k+1}}+\log\log\left(\frac{1}{\epsilon}\right)$.
- Score: 31.714928102950584
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In this note, we consider the complexity of optimizing a highly smooth
(Lipschitz $k$-th order derivative) and strongly convex function, via calls to
a $k$-th order oracle which returns the value and first $k$ derivatives of the
function at a given point, and where the dimension is unrestricted. Extending
the techniques introduced in Arjevani et al. [2019], we prove that the
worst-case oracle complexity for any fixed $k$ to optimize the function up to
accuracy $\epsilon$ is on the order of $\left(\frac{\mu_k
D^{k-1}}{\lambda}\right)^{\frac{2}{3k+1}}+\log\log\left(\frac{1}{\epsilon}\right)$
(in sufficiently high dimension, and up to log factors independent of
$\epsilon$), where $\mu_k$ is the Lipschitz constant of the $k$-th derivative,
$D$ is the initial distance to the optimum, and $\lambda$ is the strong
convexity parameter.
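As a rough numeric illustration (not code from the paper), the stated bound can be evaluated directly; the helper name below and the choice to drop constants and the $\epsilon$-independent log factors are my own:

```python
import math

def oracle_complexity_bound(mu_k, D, lam, k, eps):
    """Leading-order oracle complexity from the abstract:
    (mu_k * D^(k-1) / lam)^(2/(3k+1)) + log log(1/eps),
    ignoring constants and log factors independent of eps.
    Requires eps < 1/e so that log(1/eps) > 1."""
    leading = (mu_k * D ** (k - 1) / lam) ** (2.0 / (3 * k + 1))
    return leading + math.log(math.log(1.0 / eps))
```

As a sanity check, for $k=1$ the leading term reduces to $(\mu_1/\lambda)^{1/2}$, the familiar square-root condition-number dependence of accelerated first-order methods.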
Related papers
- Quantum Algorithms for Non-smooth Non-convex Optimization [30.576546266390714]
This paper considers the problem of finding a $(\delta,\epsilon)$-Goldstein stationary point of a Lipschitz continuous objective.
We construct a zeroth-order quantum estimator for the surrogate oracle function.
arXiv Detail & Related papers (2024-10-21T16:52:26Z) - An Optimal Stochastic Algorithm for Decentralized Nonconvex Finite-sum
Optimization [25.21457349137344]
We prove that DEAREST requires at most $\mathcal{O}(m+\sqrt{mn}L\varepsilon^{-2})$ incremental first-order oracle (IFO) calls and $\mathcal{O}(L\varepsilon^{-2}/\sqrt{1-\lambda_W})$ communication rounds.
arXiv Detail & Related papers (2022-10-25T11:37:11Z) - The First Optimal Acceleration of High-Order Methods in Smooth Convex
Optimization [88.91190483500932]
We study the fundamental open question of finding the optimal high-order algorithm for solving smooth convex minimization problems.
The reason is that these algorithms require performing a complex binary search procedure, which makes them neither optimal nor practical.
We fix this fundamental issue by providing the first algorithm with $\mathcal{O}\left(\epsilon^{-2/(3p+1)}\right)$ $p$th-order oracle complexity.
arXiv Detail & Related papers (2022-05-19T16:04:40Z) - Near-Optimal Lower Bounds For Convex Optimization For All Orders of
Smoothness [26.71898403195793]
We study the complexity of optimizing highly smooth convex functions.
For a positive integer $p$, we want to find an $\epsilon$-approximate minimum of a convex function $f$.
We prove a new lower bound that matches this bound (up to log factors), and holds not only for randomized algorithms, but also for quantum algorithms.
arXiv Detail & Related papers (2021-12-02T10:51:43Z) - Thinking Inside the Ball: Near-Optimal Minimization of the Maximal Loss [41.17536985461902]
We prove an oracle complexity lower bound scaling as $\Omega(N\epsilon^{-2/3})$, showing that our dependence on $N$ is optimal up to polylogarithmic factors.
We develop methods with improved complexity bounds of $\tilde{O}(N\epsilon^{-2/3} + \sqrt{N}\epsilon^{-8/3})$ in the non-smooth case and $\tilde{O}(N\epsilon^{-2/3} + \sqrt{N}\epsilon^{-1})$ in the smooth case.
arXiv Detail & Related papers (2021-05-04T21:49:15Z) - Private Stochastic Convex Optimization: Optimal Rates in $\ell_1$
Geometry [69.24618367447101]
Up to logarithmic factors, the optimal excess population loss of any $(\varepsilon,\delta)$-differentially private algorithm is $\sqrt{\log(d)/n} + \sqrt{d}/(\varepsilon n)$.
We show that when the loss functions satisfy additional smoothness assumptions, the excess loss is upper bounded (up to logarithmic factors) by $\sqrt{\log(d)/n} + (\log(d)/(\varepsilon n))^{2/3}$.
arXiv Detail & Related papers (2021-03-02T06:53:44Z) - Finding Global Minima via Kernel Approximations [90.42048080064849]
We consider the global minimization of smooth functions based solely on function evaluations.
In this paper, we consider an approach that jointly models the function to approximate and finds a global minimum.
arXiv Detail & Related papers (2020-12-22T12:59:30Z) - Projection Efficient Subgradient Method and Optimal Nonsmooth
Frank-Wolfe Method [54.93433440034386]
We find a feasible $\epsilon$-suboptimal solution using only $O(\epsilon^{-1})$ PO calls and optimal $O(\epsilon^{-2})$ FO calls.
Our experiments confirm that these methods achieve significant speedups over the state-of-the-art, for a problem with costly PO and LMO calls.
arXiv Detail & Related papers (2020-10-05T08:16:56Z) - Streaming Complexity of SVMs [110.63976030971106]
We study the space complexity of solving the bias-regularized SVM problem in the streaming model.
We show that for both problems, for dimension on the order of $\frac{1}{\lambda\epsilon}$, one can obtain streaming algorithms with space polynomially smaller than $\frac{1}{\lambda\epsilon}$.
arXiv Detail & Related papers (2020-07-07T17:10:00Z) - Second-Order Information in Non-Convex Stochastic Optimization: Power
and Limitations [54.42518331209581]
We find an algorithm which finds an $\epsilon$-approximate stationary point (with $\|\nabla F(x)\| \le \epsilon$) using $(\epsilon,\gamma)$-approximate random points.
Our lower bounds here are novel even in the noiseless case.
arXiv Detail & Related papers (2020-06-24T04:41:43Z)
This list is automatically generated from the titles and abstracts of the papers in this site.