Revisiting Smoothed Online Learning
- URL: http://arxiv.org/abs/2102.06933v1
- Date: Sat, 13 Feb 2021 14:15:55 GMT
- Title: Revisiting Smoothed Online Learning
- Authors: Lijun Zhang, Wei Jiang, Shiyin Lu, Tianbao Yang
- Abstract summary: We investigate the problem of smoothed online learning, in which the online learner suffers both a hitting cost and a switching cost.
To bound the competitive ratio, we assume the hitting cost is known to the learner in each round, and investigate the greedy algorithm which simply minimizes the weighted sum of the hitting cost and the switching cost.
- Score: 70.09792747315323
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In this paper, we revisit the problem of smoothed online learning, in which
the online learner suffers both a hitting cost and a switching cost, and target
two performance metrics: competitive ratio and dynamic regret with switching
cost. To bound the competitive ratio, we assume the hitting cost is known to
the learner in each round, and investigate the greedy algorithm which simply
minimizes the weighted sum of the hitting cost and the switching cost. Our
theoretical analysis shows that the greedy algorithm, although straightforward,
is $1+ \frac{2}{\alpha}$-competitive for $\alpha$-polyhedral functions,
$1+O(\frac{1}{\lambda})$-competitive for $\lambda$-quadratic growth functions,
and $1 + \frac{2}{\sqrt{\lambda}}$-competitive for convex and
$\lambda$-quadratic growth functions. To bound the dynamic regret with
switching cost, we follow the standard setting of online convex optimization,
in which the hitting cost is convex but hidden from the learner before making
predictions. We modify Ader, an existing algorithm designed for dynamic regret,
slightly to take into account the switching cost when measuring the
performance. The proposed algorithm, named Smoothed Ader, attains an optimal
$O(\sqrt{T(1+P_T)})$ bound for dynamic regret with switching cost, where $P_T$
is the path-length of the comparator sequence. Furthermore, if the hitting cost
is accessible at the beginning of each round, we obtain a similar guarantee
without the bounded gradient condition.
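The greedy rule analyzed for the competitive-ratio results is simple to state: once the round's hitting cost is revealed, play the point minimizing the sum of the hitting cost and the switching cost. Below is a minimal Python sketch of this rule; the one-dimensional decision set, the unit weight on the switching cost, and the quadratic hitting costs in the demo are illustrative assumptions, not details taken from the paper.

```python
# Minimal sketch of the greedy rule for smoothed online learning:
# each round, after observing the hitting cost f_t, play
#   x_t = argmin_x f_t(x) + |x - x_{t-1}|
# (1-D decision set and unit switching-cost weight are assumptions).
import numpy as np
from scipy.optimize import minimize_scalar

def greedy_smoothed_ol(hitting_costs, x0=0.0):
    """Run the greedy rule over a sequence of revealed hitting costs.

    hitting_costs: iterable of callables f_t, each observed before the
    learner commits to x_t (the known-cost setting used in the paper's
    competitive-ratio analysis).
    """
    x_prev, decisions, total = x0, [], 0.0
    for f in hitting_costs:
        # Minimize the sum of hitting cost and switching cost.
        res = minimize_scalar(lambda x: f(x) + abs(x - x_prev))
        x_t = res.x
        total += f(x_t) + abs(x_t - x_prev)  # hitting + switching
        decisions.append(x_t)
        x_prev = x_t
    return decisions, total

# Demo: quadratic hitting costs (lambda-quadratic growth, lambda = 2)
# whose minimizers drift over time -- a hypothetical instance.
rng = np.random.default_rng(0)
minimizers = np.cumsum(rng.normal(size=10))
costs = [lambda x, c=c: (x - c) ** 2 for c in minimizers]
decisions, total_cost = greedy_smoothed_ol(costs)
print(f"total hitting + switching cost: {total_cost:.3f}")
```

In the dynamic-regret setting, by contrast, $f_t$ is revealed only after $x_t$ is chosen, so this greedy rule is unavailable; that is the regime Smoothed Ader is designed for.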
Related papers
- Nearly Minimax Optimal Regret for Learning Linear Mixture Stochastic
Shortest Path [80.60592344361073]
We study the Stochastic Shortest Path (SSP) problem with a linear mixture transition kernel.
An agent repeatedly interacts with an environment and seeks to reach a certain goal state while minimizing the cumulative cost.
Existing works often assume a strictly positive lower bound on the cost function or an upper bound on the expected length of the optimal policy.
arXiv Detail & Related papers (2024-02-14T07:52:00Z) - Best of Both Worlds Guarantees for Smoothed Online Quadratic Optimization [9.449153668916098]
We study the smoothed online quadratic optimization (SOQO) problem where, at each round $t$, a player plays an action $x_t$ in response to a quadratic hitting cost and an additional squared $\ell_2$-norm cost for switching actions.
This problem class has strong connections to a wide range of application domains including smart grid management, adaptive control, and data center management.
We present a best-of-both-worlds algorithm that obtains robust adversarial performance while simultaneously achieving near-optimal stochastic performance.
arXiv Detail & Related papers (2023-10-31T22:59:23Z) - Online Convex Optimization with Switching Cost and Delayed Gradients [7.903539618132858]
An online algorithm can choose its action using only gradient information about the previous objective function.
We show that the competitive ratio for the OCO problem with quadratic switching cost is at most $4(L + 5) + \frac{16(L + 5)}{\mu}$.
We conclude that the optimal competitive ratio for the quadratic and linear switching costs are fundamentally different in the limited information setting.
arXiv Detail & Related papers (2023-10-18T11:06:06Z) - Non-stationary Projection-free Online Learning with Dynamic and Adaptive
Regret Guarantees [36.746745619968024]
We investigate non-stationary projection-free online learning, and choose dynamic regret and adaptive regret to measure the performance.
Our results are the first general-case dynamic regret bounds for projection-free online learning, and can recover the existing $\mathcal{O}(T^{3/4})$ static regret.
We propose a projection-free method to attain an $\tilde{\mathcal{O}}(\tau^{3/4})$ adaptive regret bound for any interval with length $\tau$.
arXiv Detail & Related papers (2023-05-19T15:02:10Z) - Optimal Stochastic Non-smooth Non-convex Optimization through
Online-to-Non-convex Conversion [56.92236659731376]
We present new algorithms for optimizing non-smooth, non-convex stochastic objectives based on a novel analysis technique.
For deterministic second-order smooth objectives, applying advanced optimistic online learning techniques enables a new $O(\delta^{0.5})$ call complexity to recover optimal or best-known results.
arXiv Detail & Related papers (2023-02-07T22:09:20Z) - Smoothed Online Convex Optimization Based on Discounted-Normal-Predictor [68.17855675511602]
We investigate an online prediction strategy named Discounted-Normal-Predictor (Kapralov and Panigrahy, 2010) for smoothed online convex optimization (SOCO).
We show that the proposed algorithm can minimize the adaptive regret with switching cost in every interval.
arXiv Detail & Related papers (2022-05-02T08:48:22Z) - Online Convex Optimization with Continuous Switching Constraint [78.25064451417082]
We introduce the problem of online convex optimization with continuous switching constraint.
We show that, for strongly convex functions, the regret bound can be improved to $O(\log T)$ for $S = \Omega(\log T)$, and to $O(\min\{T/\exp(S) + S, T\})$ for $S = O(\log T)$.
arXiv Detail & Related papers (2021-03-21T11:43:35Z) - Lazy OCO: Online Convex Optimization on a Switching Budget [34.936641201844054]
We study a variant of online convex optimization where the player is permitted to switch decisions at most $S$ times in expectation throughout $T$ rounds.
Similar problems have been addressed in prior work for the discrete decision set setting, and more recently in the continuous setting but only with an adaptive adversary.
arXiv Detail & Related papers (2021-02-07T14:47:19Z) - Dynamic Regret of Convex and Smooth Functions [93.71361250701075]
We investigate online convex optimization in non-stationary environments.
We choose the dynamic regret as the performance measure.
We show that it is possible to further enhance the dynamic regret by exploiting the smoothness condition.
arXiv Detail & Related papers (2020-07-07T14:10:57Z)