A Simple, Optimal and Efficient Algorithm for Online Exp-Concave Optimization
- URL: http://arxiv.org/abs/2512.23190v1
- Date: Mon, 29 Dec 2025 03:59:51 GMT
- Title: A Simple, Optimal and Efficient Algorithm for Online Exp-Concave Optimization
- Authors: Yi-Han Wang, Peng Zhao, Zhi-Hua Zhou
- Abstract summary: Online eXp-concave Optimization (OXO) is a fundamental problem in online learning, and the standard algorithm, Online Newton Step (ONS), faces a computational bottleneck due to the Mahalanobis projections required at each round. This paper proposes a simple variant of ONS, LightONS, which reduces the total runtime to $O(d^2 T + d^\omega \sqrt{T \log T})$ while preserving an optimal regret. This design enables LightONS to serve as an efficient plug-in replacement of ONS in broader scenarios.
- Score: 72.1530234801628
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Online eXp-concave Optimization (OXO) is a fundamental problem in online learning. The standard algorithm, Online Newton Step (ONS), balances statistical optimality and computational practicality, guaranteeing an optimal regret of $O(d \log T)$, where $d$ is the dimension and $T$ is the time horizon. ONS faces a computational bottleneck due to the Mahalanobis projections at each round. This step costs $\Omega(d^\omega)$ arithmetic operations for bounded domains, even for the unit ball, where $\omega \in (2,3]$ is the matrix-multiplication exponent. As a result, the total runtime can reach $\tilde{O}(d^\omega T)$, particularly when iterates frequently oscillate near the domain boundary. For Stochastic eXp-concave Optimization (SXO), computational cost is also a challenge. Deploying ONS with online-to-batch conversion for SXO requires $T = \tilde{O}(d/\varepsilon)$ rounds to achieve an excess risk of $\varepsilon$, and thereby necessitates an $\tilde{O}(d^{\omega+1}/\varepsilon)$ runtime. A COLT'13 open problem posed by Koren [2013] asks for an SXO algorithm with runtime less than $\tilde{O}(d^{\omega+1}/\varepsilon)$. This paper proposes a simple variant of ONS, LightONS, which reduces the total runtime to $O(d^2 T + d^\omega \sqrt{T \log T})$ while preserving the optimal $O(d \log T)$ regret. LightONS implies an SXO method with runtime $\tilde{O}(d^3/\varepsilon)$, thereby answering the open problem. Importantly, LightONS preserves the elegant structure of ONS by leveraging domain-conversion techniques from parameter-free online learning to introduce a hysteresis mechanism that delays expensive Mahalanobis projections until necessary. This design enables LightONS to serve as an efficient plug-in replacement of ONS in broader scenarios, even beyond regret minimization, including gradient-norm adaptive regret, parametric stochastic bandits, and memory-efficient online learning.
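The Mahalanobis projection that dominates ONS's per-round cost can be made concrete. Below is a minimal sketch (this is plain ONS, not the paper's LightONS): the projection onto the unit ball in the $A$-norm is solved via its KKT system $(A + \lambda I)x = Ay$ with bisection on the multiplier $\lambda \ge 0$. The function names, the step size `gamma`, and the regularizer `eps` are illustrative choices, not taken from the paper.

```python
import numpy as np

def mahalanobis_project_ball(y, A, tol=1e-9):
    """Project y onto the unit Euclidean ball in the norm induced by A:
    argmin_{||x|| <= 1} (x - y)^T A (x - y).
    KKT gives (A + lam*I) x = A y; ||x(lam)|| decreases in lam, so bisect.
    Each call needs matrix solves, which is the per-round bottleneck
    discussed in the abstract."""
    if np.linalg.norm(y) <= 1.0:
        return y  # already feasible, nothing to do
    Ay = A @ y
    eye = np.eye(len(y))
    lo, hi = 0.0, 1.0
    # grow hi until x(hi) lands inside the ball
    while np.linalg.norm(np.linalg.solve(A + hi * eye, Ay)) > 1.0:
        hi *= 2.0
    while hi - lo > tol * (1.0 + hi):
        mid = 0.5 * (lo + hi)
        if np.linalg.norm(np.linalg.solve(A + mid * eye, Ay)) > 1.0:
            lo = mid
        else:
            hi = mid
    return np.linalg.solve(A + hi * eye, Ay)

def ons(grads_fn, d, T, gamma=0.5, eps=1.0):
    """Vanilla ONS on the unit ball; grads_fn(t, x) returns the loss gradient.
    LightONS's hysteresis mechanism defers the projection below; here it
    is paid on every round."""
    x = np.zeros(d)
    A = eps * np.eye(d)
    for t in range(T):
        g = grads_fn(t, x)
        A += np.outer(g, g)                       # rank-one curvature update
        y = x - (1.0 / gamma) * np.linalg.solve(A, g)
        x = mahalanobis_project_ball(y, A)        # the expensive step
    return x
```

In this sketch the per-round cost is dominated by the linear solves inside the projection, matching the $\Omega(d^\omega)$ bottleneck the abstract attributes to bounded domains such as the unit ball.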
Related papers
- Adaptivity and Universality: Problem-dependent Universal Regret for Online Convex Optimization [64.88607416000376]
We introduce UniGrad, a novel approach that achieves both universality and adaptivity, with two distinct realizations: UniGrad.Correct and UniGrad.Bregman. Both methods achieve universal regret guarantees that adapt to gradient variation, simultaneously attaining $\mathcal{O}(\log V_T)$ regret for strongly convex functions and $\mathcal{O}(d \log V_T)$ regret for exp-concave functions.
arXiv Detail & Related papers (2025-11-25T05:23:10Z)
- Optimal and Efficient Algorithms for Decentralized Online Convex Optimization [51.00357162913229]
Decentralized online convex optimization (D-OCO) is designed to minimize a sequence of global loss functions using only local computations and communications. We develop a novel D-OCO algorithm that can reduce the regret bounds for convex and strongly convex functions to $\tilde{O}(n\rho^{-1/4}\sqrt{T})$ and $\tilde{O}(n\rho^{-1/2}\log T)$. Our analysis reveals that the projection-free variant can achieve $O(nT^{3/4})$ and $O(n
arXiv Detail & Related papers (2024-02-14T13:44:16Z)
- Near-Optimal Algorithms for Private Online Optimization in the Realizable Regime [74.52487417350221]
We consider online learning problems in the realizable setting, where there is a zero-loss solution.
We propose new Differentially Private (DP) algorithms that obtain near-optimal regret bounds.
arXiv Detail & Related papers (2023-02-27T21:19:24Z)
- Projection-free Online Exp-concave Optimization [21.30065439295409]
We present an LOO-based ONS-style algorithm which, using $O(T)$ calls to a LOO overall, guarantees a worst-case regret bounded by $\widetilde{O}(n^{2/3}T^{2/3})$.
Our algorithm is most interesting in an important and plausible low-dimensional data scenario.
arXiv Detail & Related papers (2023-02-09T18:58:05Z)
- Quasi-Newton Steps for Efficient Online Exp-concave Optimization [10.492474737007722]
Online Newton Step (ONS) can guarantee an $O(d \ln T)$ bound on its regret after $T$ rounds.
In this paper, we sidestep generalized projections by using a self-concordant barrier as a regularizer.
Our final algorithm can be viewed as a more efficient version of ONS.
arXiv Detail & Related papers (2022-11-02T17:57:21Z)
- Online Convex Optimization with Continuous Switching Constraint [78.25064451417082]
We introduce the problem of online convex optimization with continuous switching constraint.
We show that, for strongly convex functions, the regret bound can be improved to $O(\log T)$ for $S=\Omega(\log T)$, and $O(\min\{T/\exp(S)+S,\,T\})$ for $S=O(\log T)$.
arXiv Detail & Related papers (2021-03-21T11:43:35Z)
- Optimal Regret Algorithm for Pseudo-1d Bandit Convex Optimization [51.23789922123412]
We study online learning with bandit feedback (i.e. learner has access to only zeroth-order oracle) where cost/reward functions admit a "pseudo-1d" structure.
We show a lower bound of $\min(\sqrt{dT}, T^{3/4})$ for the regret of any algorithm, where $T$ is the number of rounds.
We propose a new algorithm sbcalg that combines randomized online gradient descent with a kernelized exponential weights method to exploit the pseudo-1d structure effectively.
arXiv Detail & Related papers (2021-02-15T08:16:51Z)
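Several of the entries above build on the classical exponential-weights (Hedge) method, e.g. the kernelized exponential-weights component of the pseudo-1d bandit algorithm. For reference, here is a minimal generic Hedge sketch over $K$ experts; the function name and the fixed learning rate `eta` are illustrative and not taken from any of the papers listed.

```python
import numpy as np

def exponential_weights(loss_matrix, eta):
    """Generic exponential-weights (Hedge) update over K experts.
    loss_matrix: T x K array of per-round expert losses in [0, 1].
    Returns the T x K array of weight vectors played each round."""
    T, K = loss_matrix.shape
    log_w = np.zeros(K)          # keep log-weights for numerical stability
    weights = []
    for t in range(T):
        w = np.exp(log_w - log_w.max())
        w /= w.sum()             # normalize to a probability vector
        weights.append(w)
        log_w -= eta * loss_matrix[t]  # exponential down-weighting by loss
    return np.array(weights)
```

With a suitable $\eta \asymp \sqrt{\log K / T}$, this scheme attains $O(\sqrt{T \log K})$ regret against the best fixed expert; the bandit algorithms above replace the observed losses with randomized estimates built from zeroth-order feedback.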
This list is automatically generated from the titles and abstracts of the papers in this site.