Quasi-Newton Steps for Efficient Online Exp-Concave Optimization
- URL: http://arxiv.org/abs/2211.01357v1
- Date: Wed, 2 Nov 2022 17:57:21 GMT
- Title: Quasi-Newton Steps for Efficient Online Exp-Concave Optimization
- Authors: Zakaria Mhammedi and Khashayar Gatmiry
- Abstract summary: Online Newton Step (ONS) can guarantee a $O(d\ln T)$ bound on its regret after $T$ rounds.
In this paper, we side-step generalized projections by using a self-concordant barrier as a regularizer.
Our final algorithm can be viewed as a more efficient version of ONS.
- Score: 10.492474737007722
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The aim of this paper is to design computationally-efficient and optimal
algorithms for the online and stochastic exp-concave optimization settings.
Typical algorithms for these settings, such as the Online Newton Step (ONS),
can guarantee a $O(d\ln T)$ bound on their regret after $T$ rounds, where $d$
is the dimension of the feasible set. However, such algorithms perform
so-called generalized projections whenever their iterates step outside the
feasible set. Such generalized projections require $\Omega(d^3)$ arithmetic
operations even for simple sets such as a Euclidean ball, making the total runtime
of ONS of order $d^3 T$ after $T$ rounds, in the worst-case. In this paper, we
side-step generalized projections by using a self-concordant barrier as a
regularizer to compute the Newton steps. This ensures that the iterates are
always within the feasible set without requiring projections. This approach
still requires the computation of the inverse of the Hessian of the barrier at
every step. However, using the stability properties of the Newton steps, we
show that the inverse of the Hessians can be efficiently approximated via
Taylor expansions for most rounds, resulting in a $O(d^2 T +d^\omega \sqrt{T})$
total computational complexity, where $\omega$ is the exponent of matrix
multiplication. In the stochastic setting, we show that this translates into a
$O(d^3/\epsilon)$ computational complexity for finding an $\epsilon$-suboptimal
point, answering an open question by Koren 2013. We first show these new
results for the simple case where the feasible set is a Euclidean ball. Then,
to move to general convex sets, we use a reduction to Online Convex Optimization
over the Euclidean ball. Our final algorithm can be viewed as a more efficient
version of ONS.
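As a rough illustration of the approach described above, here is a hypothetical Python sketch (not the paper's actual algorithm): it uses the standard log-barrier of the unit Euclidean ball as the self-concordant regularizer, accumulates ONS-style gradient outer products, and on most rounds reuses a cached Hessian inverse through a truncated Neumann (Taylor) series, recomputing an exact inverse only occasionally. The function names, step size `eta`, and `refresh_every` schedule are placeholder assumptions.

```python
import numpy as np

def ball_barrier_hessian(x):
    """Hessian of the log-barrier R(x) = -log(1 - ||x||^2) of the unit
    Euclidean ball (a standard self-concordant barrier; illustrative choice)."""
    d = x.shape[0]
    s = 1.0 - x @ x                        # slack 1 - ||x||^2, positive inside the ball
    return (2.0 / s) * np.eye(d) + (4.0 / s ** 2) * np.outer(x, x)

def neumann_apply(H, H_inv_cached, g, terms=3):
    """Approximate H^{-1} g from a cached inverse A = H_inv_cached via the
    truncated Neumann (Taylor) series sum_k (I - A H)^k A g.  Each extra term
    costs only matrix-vector products, i.e. O(d^2) arithmetic."""
    v = H_inv_cached @ g
    out = v.copy()
    for _ in range(terms - 1):
        v = v - H_inv_cached @ (H @ v)     # v <- (I - A H) v
        out += v
    return out

def barrier_quasi_newton(grad_fn, d, T, eta=0.1, refresh_every=10):
    """Toy ONS-style loop over the unit ball without projections.  The step
    size, damping, and refresh schedule are placeholders; the paper's analysis
    prescribes when the Taylor approximation is accurate enough to use."""
    x = np.zeros(d)
    A = np.eye(d)                          # ONS-style accumulated curvature
    H_inv = None
    for t in range(T):
        g = grad_fn(x, t)
        A += np.outer(g, g)                # exp-concave curvature surrogate
        H = A + ball_barrier_hessian(x)    # barrier curvature discourages boundary hits
        if H_inv is None or t % refresh_every == 0:
            H_inv = np.linalg.inv(H)       # exact inverse on a few rounds only
            step = H_inv @ g
        else:
            step = neumann_apply(H, H_inv, g)  # cheap approximate Newton direction
        x = x - eta * step                 # damped quasi-Newton step
    return x

# Usage example with a hypothetical online least-squares loss sequence.
if __name__ == "__main__":
    rng = np.random.default_rng(0)
    a, b = rng.normal(size=(100, 5)), rng.normal(size=100)
    x_hat = barrier_quasi_newton(lambda x, t: 2 * (a[t] @ x - b[t]) * a[t], d=5, T=100)
    print(x_hat)
```

The occasional exact inversions and the cheap series updates loosely mirror the $O(d^2 T + d^\omega \sqrt{T})$ breakdown stated in the abstract; the actual algorithm controls the approximation error using the stability of consecutive Newton steps.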
Related papers
- Dueling Optimization with a Monotone Adversary [35.850072415395395]
We study the problem of dueling optimization with a monotone adversary, which is a generalization of (noiseless) dueling convex optimization.
The goal is to design an online algorithm to find a minimizer $\mathbf{x}^*$ for a function $f\colon \mathcal{X} \to \mathbb{R}$, where $\mathcal{X} \subseteq \mathbb{R}^d$.
arXiv Detail & Related papers (2023-11-18T23:55:59Z) - A Whole New Ball Game: A Primal Accelerated Method for Matrix Games and
Minimizing the Maximum of Smooth Functions [44.655316553524855]
We design algorithms for minimizing $\max_{i\in[n]} f_i(x)$ over a $d$-dimensional Euclidean or simplex domain.
When each $f_i$ is $1$-Lipschitz and $1$-smooth, our method computes an $\epsilon$-approximate solution.
arXiv Detail & Related papers (2023-11-17T22:07:18Z) - An Oblivious Stochastic Composite Optimization Algorithm for Eigenvalue
Optimization Problems [76.2042837251496]
We introduce two oblivious mirror descent algorithms based on a complementary composite setting.
Remarkably, both algorithms work without prior knowledge of the Lipschitz constant or smoothness of the objective function.
We show how to extend our framework to scale and demonstrate the efficiency and robustness of our methods on large scale semidefinite programs.
arXiv Detail & Related papers (2023-06-30T08:34:29Z) - Projection-Free Online Convex Optimization via Efficient Newton
Iterations [10.492474737007722]
This paper presents new projection-free algorithms for Online Convex Optimization (OCO) over a convex domain $\mathcal{K} \subset \mathbb{R}^d$.
arXiv Detail & Related papers (2023-06-19T18:48:53Z) - Projection-free Online Exp-concave Optimization [21.30065439295409]
We present an LOO-based ONS-style algorithm which, using $O(T)$ calls to an LOO overall, guarantees a worst-case regret bounded by $\widetilde{O}(n^{2/3}T^{2/3})$.
Our algorithm is most interesting in an important and plausible low-dimensional data scenario.
arXiv Detail & Related papers (2023-02-09T18:58:05Z) - A Newton-CG based barrier-augmented Lagrangian method for general nonconvex conic optimization [53.044526424637866]
In this paper we consider finding an approximate second-order stationary point (SOSP) of general nonconvex conic optimization, i.e., minimizing a twice differentiable function subject to nonconvex conic constraints.
In particular, we propose a Newton-CG based barrier-augmented Lagrangian method for finding an approximate SOSP.
arXiv Detail & Related papers (2023-01-10T20:43:29Z) - ReSQueing Parallel and Private Stochastic Convex Optimization [59.53297063174519]
We introduce a new tool for stochastic convex optimization (SCO): a Reweighted Stochastic Query (ReSQue) estimator for the gradient of a function convolved with a (Gaussian) probability density.
We develop algorithms achieving state-of-the-art complexities for SCO in parallel and private settings.
arXiv Detail & Related papers (2023-01-01T18:51:29Z) - Second-order optimization with lazy Hessians [55.51077907483634]
We analyze Newton's method with lazy Hessian updates for solving general, possibly non-linear, optimization problems.
We reuse a previously seen Hessian for several iterations while computing new gradients at each step of the method.
arXiv Detail & Related papers (2022-12-01T18:58:26Z) - Exploiting the Curvature of Feasible Sets for Faster Projection-Free
Online Learning [8.461907111368628]
We develop new efficient projection-free algorithms for Online Convex Optimization (OCO).
We develop an OCO algorithm that makes two calls to an LO Oracle per round and achieves the near-optimal $\widetilde{O}(\sqrt{T})$ regret.
We also present an algorithm for general convex sets that makes an expected $\widetilde{O}(d)$ calls to an LO Oracle per round.
arXiv Detail & Related papers (2022-05-23T17:13:46Z) - Private Stochastic Convex Optimization: Optimal Rates in $\ell_1$
Geometry [69.24618367447101]
Up to logarithmic factors, the optimal excess population loss of any $(\varepsilon,\delta)$-differentially private algorithm is $\sqrt{\log(d)/n} + \sqrt{d}/(\varepsilon n)$.
We show that when the loss functions satisfy additional smoothness assumptions, the excess loss is upper bounded (up to logarithmic factors) by $\sqrt{\log(d)/n} + (\log(d)/(\varepsilon n))^{2/3}$.
arXiv Detail & Related papers (2021-03-02T06:53:44Z) - Optimal Regret Algorithm for Pseudo-1d Bandit Convex Optimization [51.23789922123412]
We study online learning with bandit feedback (i.e., the learner has access only to a zeroth-order oracle) where cost/reward functions admit a "pseudo-1d" structure.
We show a lower bound of $\min(\sqrt{dT}, T^{3/4})$ for the regret of any algorithm, where $T$ is the number of rounds.
We propose a new algorithm, sbcalg, that combines randomized online gradient descent with a kernelized exponential weights method to exploit the pseudo-1d structure effectively.
arXiv Detail & Related papers (2021-02-15T08:16:51Z)