Online Strongly Convex Optimization with Unknown Delays
- URL: http://arxiv.org/abs/2103.11354v1
- Date: Sun, 21 Mar 2021 10:16:15 GMT
- Title: Online Strongly Convex Optimization with Unknown Delays
- Authors: Yuanyu Wan, Wei-Wei Tu, Lijun Zhang
- Abstract summary: We investigate the problem of online convex optimization with unknown delays.
We first extend the delayed variant of OGD for strongly convex functions.
We establish a better regret bound of $O(d\log T)$, where $d$ is the maximum delay.
- Score: 30.931538196386672
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We investigate the problem of online convex optimization with unknown delays,
in which the feedback of a decision arrives with an arbitrary delay. Previous
studies have presented a delayed variant of online gradient descent (OGD), and
achieved the regret bound of $O(\sqrt{T+D})$ by only utilizing the convexity
condition, where $D$ is the sum of delays over $T$ rounds. In this paper, we
further exploit the strong convexity to improve the regret bound. Specifically,
we first extend the delayed variant of OGD for strongly convex functions, and
establish a better regret bound of $O(d\log T)$, where $d$ is the maximum
delay. The essential idea is to let the learning rate decay with the total
number of received feedback linearly. Furthermore, we consider the more
challenging bandit setting, and obtain similar theoretical guarantees by
incorporating the classical multi-point gradient estimator into our extended
method. To the best of our knowledge, this is the first work that solves online
strongly convex optimization under the general delayed setting.
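The learning-rate idea in the abstract can be illustrated with a minimal sketch. It is not the authors' reference implementation: it assumes a one-dimensional decision set $[lo, hi]$, a known strong-convexity modulus `lam`, the step size $1/(\lambda n)$ where $n$ is the number of gradients received so far, and the convention that a gradient queried in round $t$ with delay $d_t \ge 1$ arrives at the end of round $t + d_t - 1$. The names `delayed_ogd_strongly_convex` and `project_interval` are hypothetical.

```python
def project_interval(x, lo, hi):
    """Euclidean projection of a scalar onto the interval [lo, hi]."""
    return max(lo, min(hi, x))


def delayed_ogd_strongly_convex(grad_fns, delays, lam, x0=0.0, lo=-1.0, hi=1.0):
    """Sketch of delayed OGD for lam-strongly convex losses (assumed form).

    grad_fns[t](x) returns the gradient of the round-t loss at x.
    delays[t] >= 1: the round-t gradient arrives at the end of round
    t + delays[t] - 1 (delay 1 means no extra delay).
    Returns the list of decisions played in each round.
    """
    arrivals = {}   # arrival round -> gradients that become available then
    received = 0    # total number of gradients received so far
    x = x0
    decisions = []
    for t in range(len(grad_fns)):
        decisions.append(x)
        # Query the gradient now; the learner only sees it after the delay.
        arrivals.setdefault(t + delays[t] - 1, []).append(grad_fns[t](x))
        batch = arrivals.pop(t, [])
        if batch:
            received += len(batch)
            # Step size decays with the number of received gradients,
            # not with the round index t.
            eta = 1.0 / (lam * received)
            x = project_interval(x - eta * sum(batch), lo, hi)
    return decisions


# Toy usage: f_t(x) = 0.5 * (x - 0.5)^2 for every round, so lam = 1.
T = 2000
grads = [lambda x: x - 0.5] * T
no_delay = delayed_ogd_strongly_convex(grads, [1] * T, lam=1.0)
delayed = delayed_ogd_strongly_convex(grads, [3] * T, lam=1.0)
```

In this toy run the delayed learner stays at its initial point until the first gradient arrives, then moves toward the minimizer 0.5 using stale gradients. Tying the step size to the feedback count means rounds in which nothing arrives do not shrink later steps.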
Related papers
- Stochastic Approximation with Delayed Updates: Finite-Time Rates under Markovian Sampling [73.5602474095954]
We study the non-asymptotic performance of approximation schemes with delayed updates under Markovian sampling.
Our theoretical findings shed light on the finite-time effects of delays for a broad class of algorithms.
arXiv Detail & Related papers (2024-02-19T03:08:02Z)
- Improved Regret for Bandit Convex Optimization with Delayed Feedback [50.46856739179311]
We investigate bandit convex optimization (BCO) with delayed feedback, where only the loss value of the action is revealed under a delay.
We develop a novel algorithm, and prove that it enjoys a regret bound of $O(\sqrt{n}T^{3/4}+\sqrt{dT})$ in general.
We show that the proposed algorithm can improve the regret bound to $O((nT)^{2/3}\log^{1/3}T+d\log T)$ for strongly convex functions.
arXiv Detail & Related papers (2024-02-14T13:08:26Z)
- Non-stationary Online Convex Optimization with Arbitrary Delays [50.46856739179311]
This paper investigates the delayed online convex optimization (OCO) in non-stationary environments.
We first propose a simple algorithm, namely DOGD, which performs a gradient descent step for each delayed gradient according to their arrival order.
We develop an improved algorithm, which reduces those dynamic regret bounds achieved by DOGD to $O(\sqrt{\bar{d}T(P_T+1)})$.
arXiv Detail & Related papers (2023-05-20T07:54:07Z)
- Improved Dynamic Regret for Online Frank-Wolfe [54.690867216880356]
We investigate the dynamic regret of online Frank-Wolfe (OFW), which is an efficient projection-free algorithm for online convex optimization.
In this paper, we derive improved dynamic regret bounds for OFW by extending the fast convergence rates of FW from offline optimization to online optimization.
arXiv Detail & Related papers (2023-02-11T07:19:51Z)
- Online Convex Optimization with Stochastic Constraints: Zero Constraint Violation and Bandit Feedback [0.0]
We propose a variant of the drift-plus-penalty algorithm that guarantees $O(\sqrt{T})$ expected regret and zero constraint violation.
Our algorithm is oblivious to the length of the time horizon $T$, in contrast to the vanilla drift-plus-penalty method.
arXiv Detail & Related papers (2023-01-26T18:04:26Z)
- Procrastinated Tree Search: Black-box Optimization with Delayed, Noisy, and Multi-fidelity Feedback [11.064341598231195]
In black-box optimization problems, we aim to maximize an unknown objective function that is accessible only through the feedback of an evaluation or simulation oracle.
We propose a generic extension of hierarchical optimistic tree search (HOO), called ProCrastinated Tree Search (PCTS).
We provide a generic proof technique to quantify the regret of PCTS under delayed, noisy, and multi-fidelity feedback.
arXiv Detail & Related papers (2021-10-14T08:55:41Z)
- Distributed stochastic optimization with large delays [59.95552973784946]
One of the most widely used methods for solving large-scale optimization problems is distributed asynchronous gradient descent (DASGD).
We show that DASGD converges to a global optimal implementation model under the same delay assumptions.
arXiv Detail & Related papers (2021-07-06T21:59:49Z)
- Lazy OCO: Online Convex Optimization on a Switching Budget [34.936641201844054]
We study a variant of online convex optimization where the player is permitted to switch decisions at most $S$ times in expectation throughout $T$ rounds.
Similar problems have been addressed in prior work for the discrete decision set setting, and more recently in the continuous setting but only with an adaptive adversary.
arXiv Detail & Related papers (2021-02-07T14:47:19Z)
- Projection-free Online Learning over Strongly Convex Sets [24.517908972536432]
We study the special case of online learning over strongly convex sets, for which we first prove that OFW can enjoy a better regret bound of $O(T^{2/3})$ for general convex losses.
We show that it achieves a regret bound of $O(T^{2/3})$ over general convex sets and a better regret bound of $O(\sqrt{T})$ over strongly convex sets.
arXiv Detail & Related papers (2020-10-16T05:42:50Z)
- Adapting to Delays and Data in Adversarial Multi-Armed Bandits [7.310043452300736]
We analyze variants of the Exp3 algorithm that tune their step-size using only information available at the time of the decisions.
We obtain regret guarantees that adapt to the observed (rather than the worst-case) sequences of delays and/or losses.
arXiv Detail & Related papers (2020-10-12T20:53:52Z)
- Efficient Optimistic Exploration in Linear-Quadratic Regulators via Lagrangian Relaxation [107.06364966905821]
We study the exploration-exploitation dilemma in the linear quadratic regulator (LQR) setting.
Inspired by the extended value iteration algorithm used in optimistic algorithms for finite MDPs, we propose to relax the optimistic optimization of OFU-LQ.
We show that an $\epsilon$-optimistic controller can be computed efficiently by solving at most $O\big(\log(1/\epsilon)\big)$ Riccati equations.
arXiv Detail & Related papers (2020-07-13T16:30:47Z)