Best-Case Lower Bounds in Online Learning
- URL: http://arxiv.org/abs/2106.12688v1
- Date: Wed, 23 Jun 2021 23:24:38 GMT
- Title: Best-Case Lower Bounds in Online Learning
- Authors: Cristóbal Guzmán, Nishant A. Mehta, and Ali Mortazavi
- Abstract summary: Much of the work in online learning focuses on the study of sublinear upper bounds on the regret.
In this work, we initiate the study of best-case lower bounds in online convex optimization.
We show that the linearized version of FTRL can attain negative linear regret.
- Score: 9.01310450044549
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Much of the work in online learning focuses on the study of sublinear upper
bounds on the regret. In this work, we initiate the study of best-case lower
bounds in online convex optimization, wherein we bound the largest improvement
an algorithm can obtain relative to the single best action in hindsight. This
problem is motivated by the goal of better understanding the adaptivity of a
learning algorithm. Another motivation comes from fairness: it is known that
best-case lower bounds are instrumental in obtaining algorithms for
decision-theoretic online learning (DTOL) that satisfy a notion of group
fairness. Our contributions are a general method to provide best-case lower
bounds in Follow The Regularized Leader (FTRL) algorithms with time-varying
regularizers, which we use to show that best-case lower bounds are of the same
order as existing upper regret bounds: this includes situations with a fixed
learning rate, decreasing learning rates, timeless methods, and adaptive
gradient methods. In stark contrast, we show that the linearized version of
FTRL can attain negative linear regret. Finally, in DTOL with two experts and
binary predictions, we fully characterize the best-case sequences, which
provides a finer understanding of the best-case lower bounds.
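For context, the following is a brief sketch of the standard definitions behind the abstract (standard notation, not taken verbatim from the paper): the regret against the best fixed action, what a best-case lower bound asserts, and the FTRL and linearized-FTRL updates with time-varying regularizers R_t.

```latex
\documentclass{article}
\usepackage{amsmath}
\DeclareMathOperator*{\argmin}{arg\,min}
\begin{document}
\begin{align*}
  \mathrm{Regret}_T &= \sum_{t=1}^{T} \ell_t(x_t) - \min_{x \in \mathcal{X}} \sum_{t=1}^{T} \ell_t(x)
    && \text{(regret against the best fixed action in hindsight)} \\
  \mathrm{Regret}_T &\ge -B_T
    && \text{(a best-case lower bound: the regret is at least } -B_T\text{)} \\
  x_{t+1} &= \argmin_{x \in \mathcal{X}} \Big\{ \sum_{s=1}^{t} \ell_s(x) + R_t(x) \Big\}
    && \text{(FTRL with time-varying regularizer } R_t\text{)} \\
  x_{t+1} &= \argmin_{x \in \mathcal{X}} \Big\{ \sum_{s=1}^{t} \langle g_s, x \rangle + R_t(x) \Big\},
    \quad g_s = \nabla \ell_s(x_s)
    && \text{(linearized FTRL)}
\end{align*}
\end{document}
```

As a concrete illustration of the linearized-FTRL setting the abstract studies, the sketch below runs linearized FTRL with quadratic regularizers R_t(x) = ||x||^2 / (2 * eta_t) and a decreasing learning rate eta_t on the Euclidean ball; the function and variable names are our own assumptions, not the paper's.

```python
# Illustrative sketch only: linearized FTRL on the Euclidean ball with a
# quadratic, time-varying regularizer R_t(x) = ||x||^2 / (2 * eta_t),
# eta_t = eta0 / sqrt(t).  Assumed notation, not code from the paper.
import numpy as np

def linearized_ftrl(grad_fn, d, T, radius=1.0, eta0=1.0):
    """Play x_1, ..., x_T; grad_fn(t, x) returns the gradient g_t of the round-t loss at x."""
    g_sum = np.zeros(d)              # running sum of the linearized losses' gradients
    x = np.zeros(d)                  # x_1 minimizes the regularizer alone
    plays = []
    for t in range(1, T + 1):
        plays.append(x.copy())
        g_sum += grad_fn(t, x)
        eta = eta0 / np.sqrt(t)      # time-varying regularizer strength
        x = -eta * g_sum             # unconstrained minimizer of <g_sum, x> + ||x||^2 / (2*eta)
        norm = np.linalg.norm(x)
        if norm > radius:            # Euclidean projection back onto the feasible ball
            x *= radius / norm
    return plays

if __name__ == "__main__":
    # Linear losses l_t(x) = <z_t, x> with random z_t (linearization is exact here).
    rng = np.random.default_rng(0)
    T, d = 100, 2
    zs = rng.standard_normal((T, d))
    plays = linearized_ftrl(lambda t, x: zs[t - 1], d=d, T=T)
    learner_loss = sum(zs[t] @ plays[t] for t in range(T))
    best_fixed = -np.linalg.norm(zs.sum(axis=0))   # min over the unit ball of <sum_t z_t, u>
    print("regret of linearized FTRL on this sequence:", round(float(learner_loss - best_fixed), 3))
```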
Related papers
- Gradient-Variation Online Learning under Generalized Smoothness [56.38427425920781]
Gradient-variation online learning aims to achieve regret guarantees that scale with variations in the gradients of the online functions.
Recent efforts in neural network optimization suggest a generalized smoothness condition, allowing smoothness to correlate with gradient norms.
We provide applications to fast-rate convergence in games and to extended adversarial optimization.
arXiv Detail & Related papers (2024-08-17T02:22:08Z) - Model-based RL as a Minimalist Approach to Horizon-Free and Second-Order Bounds [59.875550175217874]
We show that a simple Model-based Reinforcement Learning scheme achieves strong regret and sample bounds in online and offline RL settings.
We highlight that our algorithms are simple, fairly standard, and indeed have been extensively studied in the RL literature.
arXiv Detail & Related papers (2024-08-16T19:52:53Z) - More Benefits of Being Distributional: Second-Order Bounds for
Reinforcement Learning [58.626683114119906]
We show that Distributional Reinforcement Learning (DistRL) can obtain second-order bounds in both online and offline RL.
Our results are the first second-order bounds for low-rank MDPs and for offline RL.
arXiv Detail & Related papers (2024-02-11T13:25:53Z) - Learning-Augmented Algorithms for Online Linear and Semidefinite
Programming [9.849604820019394]
Semidefinite programming (SDP) is a unifying framework that generalizes both linear and quadratically-constrained programming.
There are known impossibility results for approximating the optimal solution when the constraints of covering SDPs arrive in an online fashion.
We show that if the predictor is accurate, we can efficiently bypass these impossibility results and achieve a constant-factor approximation to the optimal solution.
arXiv Detail & Related papers (2022-09-21T19:16:29Z) - Precise Regret Bounds for Log-loss via a Truncated Bayesian Algorithm [14.834625066344582]
We study sequential general online regression, also known as sequential probability assignment, under logarithmic loss.
We focus on obtaining tight, often matching, lower and upper bounds on the sequential minimax regret, defined as the excess loss incurred over a class of experts.
arXiv Detail & Related papers (2022-05-07T22:03:00Z) - Smoothed Online Learning is as Easy as Statistical Learning [77.00766067963195]
We provide the first oracle-efficient, no-regret algorithms in this setting.
We show that if a function class is learnable in the classical setting, then there is an oracle-efficient, no-regret algorithm for contextual bandits.
arXiv Detail & Related papers (2022-02-09T19:22:34Z) - Beyond Value-Function Gaps: Improved Instance-Dependent Regret Bounds
for Episodic Reinforcement Learning [50.44564503645015]
We provide improved gap-dependent regret bounds for reinforcement learning in finite episodic Markov decision processes.
We prove tighter upper regret bounds for optimistic algorithms and accompany them with new information-theoretic lower bounds for a large class of MDPs.
arXiv Detail & Related papers (2021-07-02T20:36:05Z) - Lower Bounds on Cross-Entropy Loss in the Presence of Test-time
Adversaries [33.53470955144013]
In this paper, we determine optimal lower bounds on the cross-entropy loss in the presence of test-time adversaries, along with the corresponding optimal classification outputs.
We also propose and provide a proof of correctness for a bespoke algorithm to compute this lower bound efficiently.
We use our lower bounds as a diagnostic tool to determine the effectiveness of current robust training methods and find a gap from optimality at larger budgets.
arXiv Detail & Related papers (2021-04-16T21:41:28Z) - An Asymptotically Optimal Primal-Dual Incremental Algorithm for
Contextual Linear Bandits [129.1029690825929]
We introduce a novel algorithm improving over the state-of-the-art along multiple dimensions.
We establish minimax optimality for any learning horizon in the special case of non-contextual linear bandits.
arXiv Detail & Related papers (2020-10-23T09:12:47Z)
This list is automatically generated from the titles and abstracts of the papers on this site.