Fast Rates in Online Convex Optimization by Exploiting the Curvature of
Feasible Sets
- URL: http://arxiv.org/abs/2402.12868v1
- Date: Tue, 20 Feb 2024 09:59:33 GMT
- Title: Fast Rates in Online Convex Optimization by Exploiting the Curvature of
Feasible Sets
- Authors: Taira Tsuchiya, Shinji Ito
- Abstract summary: In online linear optimization, it is known that if the average gradient of loss functions is larger than a certain value, the curvature of feasible sets can be exploited.
This paper reveals that algorithms adaptive to the curvature of loss functions can also leverage the curvature of feasible sets.
- Score: 42.37773914630974
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In this paper, we explore online convex optimization (OCO) and introduce a
new analysis that provides fast rates by exploiting the curvature of feasible
sets. In online linear optimization, it is known that if the average gradient
of loss functions is larger than a certain value, the curvature of feasible
sets can be exploited by the follow-the-leader (FTL) algorithm to achieve a
logarithmic regret. This paper reveals that algorithms adaptive to the
curvature of loss functions can also leverage the curvature of feasible sets.
We first prove that if an optimal decision is on the boundary of a feasible set
and the gradient of an underlying loss function is non-zero, then the algorithm
achieves a regret upper bound of $O(\rho \log T)$ in stochastic environments.
Here, $\rho > 0$ is the radius of the smallest sphere that includes the optimal
decision and encloses the feasible set. Our approach, unlike existing ones, can
work directly with convex loss functions, exploiting the curvature of loss
functions simultaneously, and can achieve the logarithmic regret only with a
local property of feasible sets. Additionally, it achieves an $O(\sqrt{T})$
regret even in adversarial environments where FTL suffers an $\Omega(T)$
regret, and attains an $O(\rho \log T + \sqrt{C \rho \log T})$ regret bound in
corrupted stochastic environments with corruption level $C$. Furthermore, by
extending our analysis, we establish a regret upper bound of
$O\Big(T^{\frac{q-2}{2(q-1)}} (\log T)^{\frac{q}{2(q-1)}}\Big)$ for
$q$-uniformly convex feasible sets, where uniformly convex sets include
strongly convex sets and $\ell_p$-balls for $p \in (1,\infty)$. This bound
bridges the gap between the $O(\log T)$ regret bound for strongly convex sets
($q=2$) and the $O(\sqrt{T})$ regret bound for non-curved sets ($q\to\infty$).
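To make the mechanism above concrete, here is a minimal Python sketch (an illustration under simple assumptions, not the paper's algorithm): follow-the-leader over the Euclidean unit ball, a strongly convex feasible set, with stochastic linear losses whose mean gradient is bounded away from zero, which is the regime where the set's curvature yields logarithmic regret.

```python
import numpy as np

# Minimal follow-the-leader (FTL) sketch on the Euclidean unit ball (a strongly
# convex feasible set) with stochastic linear losses f_t(x) = <g_t, x>.
# Illustration of the curvature effect discussed above, not the paper's method.
rng = np.random.default_rng(0)
T, d = 10_000, 5
mu = np.array([1.0, 0.5, 0.0, -0.3, 0.2])   # mean gradient, bounded away from zero

cum_grad = np.zeros(d)
loss = 0.0
for t in range(T):
    # FTL plays argmin_{||x|| <= 1} <cumulative gradient, x> = -G/||G|| (any point if G = 0).
    norm = np.linalg.norm(cum_grad)
    x_t = -cum_grad / norm if norm > 0 else np.zeros(d)
    g_t = mu + 0.1 * rng.standard_normal(d)  # stochastic gradient centered at mu
    loss += g_t @ x_t
    cum_grad += g_t

# Regret against the best fixed decision in hindsight, -G_T/||G_T||.
regret = loss + np.linalg.norm(cum_grad)
print(f"empirical regret after T={T} rounds: {regret:.3f}")  # stays small (log-like growth)
```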
Related papers
- Online Convex Optimization with a Separation Oracle [10.225358400539719]
We introduce a new projection-free algorithm for Online Convex Optimization (OCO) with a state-of-the-art regret guarantee.
Our algorithm achieves a regret bound of $\widetilde{O}(\sqrt{dT} + \kappa d)$, while requiring only $\widetilde{O}(1)$ calls to a separation oracle per round.
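For background on the oracle model mentioned here (a generic illustration, not the construction in the cited paper), a separation oracle either certifies that a query point lies in the feasible set or returns a hyperplane separating it from the set; for a Euclidean ball it can be written as follows.

```python
import numpy as np

def separation_oracle_ball(x: np.ndarray, radius: float = 1.0):
    """Separation oracle for the Euclidean ball of the given radius.

    Returns (True, None) if x is feasible, otherwise (False, s) with a unit
    vector s such that <s, x> > max_{y in ball} <s, y>, i.e. a separating
    hyperplane. Generic illustration of the oracle model only.
    """
    norm = np.linalg.norm(x)
    if norm <= radius:
        return True, None
    return False, x / norm
```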
arXiv Detail & Related papers (2024-10-03T13:35:08Z) - Two-Timescale Gradient Descent Ascent Algorithms for Nonconvex Minimax Optimization [77.3396841985172]
We provide a unified analysis of two-timescale gradient descent ascent (TTGDA) for solving structured nonconvex minimax optimization problems.
Our contribution is to design TTGDA algorithms that are effective beyond the convex-concave setting.
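For readers unfamiliar with the acronym, a generic two-timescale gradient descent ascent loop for $\min_x \max_y f(x,y)$ is sketched below; the step sizes, assumptions, and the specific algorithms designed in the cited paper differ.

```python
import numpy as np

def ttgda(grad_x, grad_y, x0, y0, eta_x=1e-3, eta_y=1e-1, steps=10_000):
    """Generic two-timescale gradient descent ascent for min_x max_y f(x, y).

    The max player y moves on a faster timescale (eta_y >> eta_x), which is the
    defining feature of TTGDA; this is an illustrative sketch, not the cited
    paper's specific algorithms or step-size schedules.
    """
    x, y = np.asarray(x0, dtype=float), np.asarray(y0, dtype=float)
    for _ in range(steps):
        x = x - eta_x * grad_x(x, y)   # slow descent step for the min player
        y = y + eta_y * grad_y(x, y)   # fast ascent step for the max player
    return x, y

# Toy usage on f(x, y) = 0.5*x^2 + x*y - 0.5*y^2, whose saddle point is (0, 0).
x_star, y_star = ttgda(lambda x, y: x + y, lambda x, y: x - y, [1.0], [1.0])
print(x_star, y_star)  # both approach 0
```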
arXiv Detail & Related papers (2024-08-21T20:14:54Z) - Optimal and Efficient Algorithms for Decentralized Online Convex Optimization [51.00357162913229]
Decentralized online convex optimization (D-OCO) is designed to minimize a sequence of global loss functions using only local computations and communications.
We develop a novel D-OCO algorithm that can reduce the regret bounds for convex and strongly convex functions to $\tilde{O}(n\rho^{-1/4}\sqrt{T})$ and $\tilde{O}(n\rho^{-1/2}\log T)$, respectively.
Our analysis reveals that the projection-free variant can achieve $O(nT^{3/4})$ and $O(n\ldots)$ regret bounds.
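To make the D-OCO setting concrete, the sketch below shows one round of a generic decentralized online gradient descent template, in which each learner averages neighbors' iterates through a gossip matrix and then takes a local projected gradient step; it illustrates the problem setup only, not the cited paper's optimal or projection-free algorithms.

```python
import numpy as np

def d_oco_round(X, local_grads, P, eta, project):
    """One round of a generic decentralized online gradient descent template.

    X           : (n, d) array; row i is learner i's current decision.
    local_grads : list of n callables; local_grads[i](x) is the gradient of
                  learner i's current local loss at x.
    P           : (n, n) doubly stochastic gossip matrix for the network.
    project     : Euclidean projection onto the shared feasible set.
    Illustrative sketch of the D-OCO template only.
    """
    X_mixed = P @ X                                            # gossip/consensus step
    G = np.stack([g(X_mixed[i]) for i, g in enumerate(local_grads)])
    return np.stack([project(x) for x in X_mixed - eta * G])   # local projected step
```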
arXiv Detail & Related papers (2024-02-14T13:44:16Z) - Breaking the Lower Bound with (Little) Structure: Acceleration in
Non-Convex Stochastic Optimization with Heavy-Tailed Noise [28.780192812703948]
We consider the optimization problem with smooth but not necessarily convex objectives in the heavy-tailed noise regime.
We show that one can achieve a faster rate than that dictated by the lower bound $\Omega(T^{\frac{1-p}{3p-2}})$ with only a tiny bit of structure.
We also establish that it guarantees a high-probability convergence rate of $O(\log(T/\delta)\,T^{\frac{1-p}{3p-2}})$ under a mild condition.
arXiv Detail & Related papers (2023-02-14T00:23:42Z) - Beyond Uniform Smoothness: A Stopped Analysis of Adaptive SGD [38.221784575853796]
This work considers the problem of finding a first-order stationary point of a non-convex function with a potentially unbounded smoothness constant using a stochastic gradient oracle.
We develop a technique that allows us to prove $\mathcal{O}\big(\frac{\mathrm{poly}\log(T)}{\sqrt{T}}\big)$ convergence rates without assuming uniform bounds on the noise.
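For context, the adaptive SGD family studied in this line of work is typified by AdaGrad-Norm, whose step size shrinks with the accumulated squared gradient norms; the sketch below is a generic version, not necessarily the exact variant or assumptions analyzed in the cited paper.

```python
import numpy as np

def adagrad_norm(grad_oracle, x0, eta=1.0, b0=1.0, steps=10_000):
    """Generic AdaGrad-Norm: x_{t+1} = x_t - (eta / b_t) * g_t, where
    b_t^2 = b_0^2 + sum of squared gradient norms observed so far.
    Sketch of the adaptive step size only, not the cited paper's exact method."""
    x = np.asarray(x0, dtype=float)
    b_sq = b0 ** 2
    for _ in range(steps):
        g = grad_oracle(x)                  # stochastic gradient at x
        b_sq += float(g @ g)                # accumulate squared gradient norms
        x = x - (eta / np.sqrt(b_sq)) * g   # step size adapts without noise bounds
    return x
```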
arXiv Detail & Related papers (2023-02-13T18:13:36Z) - Improved Dynamic Regret for Online Frank-Wolfe [54.690867216880356]
We investigate the dynamic regret of online Frank-Wolfe (OFW), which is an efficient projection-free algorithm for online convex optimization.
In this paper, we derive improved dynamic regret bounds for OFW by extending the fast convergence rates of FW from offline optimization to online optimization.
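As a reminder of the algorithm family, Frank-Wolfe replaces projections with a linear minimization oracle over the feasible set; the sketch below runs a generic online variant over the Euclidean unit ball and is illustrative only, not the improved-dynamic-regret method of the cited paper.

```python
import numpy as np

def lmo_ball(g, radius=1.0):
    """Linear minimization oracle for the Euclidean ball: argmin_{||v||<=r} <g, v>."""
    n = np.linalg.norm(g)
    return np.zeros_like(g) if n == 0 else -radius * g / n

def online_frank_wolfe(grads, x0, radius=1.0):
    """Generic projection-free online Frank-Wolfe sketch over the Euclidean ball.

    grads: iterable of callables; the t-th callable returns the gradient of the
    round-t loss at the current decision. In the literature the oracle is
    typically applied to a surrogate of the cumulative losses; the
    instantaneous gradient is used here for brevity.
    """
    x = np.asarray(x0, dtype=float)
    for t, grad in enumerate(grads, start=1):
        v = lmo_ball(grad(x), radius)   # projection-free direction
        gamma = 2.0 / (t + 2)           # standard Frank-Wolfe step-size schedule
        x = (1 - gamma) * x + gamma * v
    return x
```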
arXiv Detail & Related papers (2023-02-11T07:19:51Z) - Nearly Optimal Algorithms for Level Set Estimation [21.83736847203543]
We provide a new approach to the level set estimation problem by relating it to recent adaptive experimental design methods for linear bandits.
We show that our bounds are nearly optimal, namely, our upper bounds match existing lower bounds for threshold linear bandits.
arXiv Detail & Related papers (2021-11-02T17:45:02Z) - Private Stochastic Convex Optimization: Optimal Rates in $\ell_1$
Geometry [69.24618367447101]
Up to logarithmic factors, the optimal excess population loss of any $(\varepsilon,\delta)$-differentially private algorithm is $\sqrt{\log(d)/n} + \sqrt{d}/(\varepsilon n)$.
We show that when the loss functions satisfy additional smoothness assumptions, the excess loss is upper bounded (up to logarithmic factors) by $\sqrt{\log(d)/n} + (\log(d)/(\varepsilon n))^{2/3}$.
arXiv Detail & Related papers (2021-03-02T06:53:44Z) - Optimal Regret Algorithm for Pseudo-1d Bandit Convex Optimization [51.23789922123412]
We study online learning with bandit feedback (i.e. the learner has access only to a zeroth-order oracle) where cost/reward functions admit a "pseudo-1d" structure.
We show a lower bound of $\min(\sqrt{dT}, T^{3/4})$ for the regret of any algorithm, where $T$ is the number of rounds.
We propose a new algorithm that combines randomized online gradient descent with a kernelized exponential weights method to exploit the pseudo-1d structure effectively.
arXiv Detail & Related papers (2021-02-15T08:16:51Z) - Streaming Complexity of SVMs [110.63976030971106]
We study the space complexity of solving the bias-regularized SVM problem in the streaming model.
We show that for both problems, for dimensions of $\frac{1}{\lambda\epsilon}$, one can obtain streaming algorithms with space polynomially smaller than $\frac{1}{\lambda\epsilon}$.
arXiv Detail & Related papers (2020-07-07T17:10:00Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.