Optimistically Tempered Online Learning
- URL: http://arxiv.org/abs/2301.07530v2
- Date: Wed, 14 Feb 2024 15:16:47 GMT
- Title: Optimistically Tempered Online Learning
- Authors: Maxime Haddouche and Olivier Wintenberger and Benjamin Guedj
- Abstract summary: Optimistic Online Learning algorithms exploit expert advices assumed optimistically to be always useful.
We develop the emphoptimistically tempered (OT) online learning framework as well as OT adaptations of online algorithms.
Our algorithms come with sound theoretical guarantees in the form of dynamic regret bounds.
- Score: 19.12634663761194
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Optimistic Online Learning algorithms have been developed to exploit expert
advices, assumed optimistically to be always useful. However, it is legitimate
to question the relevance of such advices \emph{w.r.t.} the learning
information provided by gradient-based online algorithms. In this work, we
challenge the confidence assumption on the expert and develop the
\emph{optimistically tempered} (OT) online learning framework as well as OT
adaptations of online algorithms. Our algorithms come with sound theoretical
guarantees in the form of dynamic regret bounds, and we eventually provide
experimental validation of the usefulness of the OT approach.
Related papers
- Learning-Augmented Algorithms for Online Concave Packing and Convex Covering Problems [4.9826534303287335]
We present learning-augmented algorithmic frameworks for two fundamental optimizations settings.
For online packing with concave objectives, we present a simple but overarching strategy that switches between the advice and the state-of-the-art online algorithm.
We show that our algorithms break impossibility results when the advice is accurate, while maintaining comparable performance with state-of-the-art classical online algorithms even when the advice is erroneous.
arXiv Detail & Related papers (2024-11-13T04:27:25Z) - Gradient-Variation Online Learning under Generalized Smoothness [56.38427425920781]
gradient-variation online learning aims to achieve regret guarantees that scale with variations in gradients of online functions.
Recent efforts in neural network optimization suggest a generalized smoothness condition, allowing smoothness to correlate with gradient norms.
We provide the applications for fast-rate convergence in games and extended adversarial optimization.
arXiv Detail & Related papers (2024-08-17T02:22:08Z) - A Simple Learning-Augmented Algorithm for Online Packing with Concave Objectives [4.9826534303287335]
We introduce and analyze a simple learning-augmented algorithm for online packing problems with linear constraints and concave objectives.
We further raise the problem of understanding necessary and sufficient conditions for when such simple black-box solutions may be optimal.
arXiv Detail & Related papers (2024-06-05T18:39:28Z) - RLIF: Interactive Imitation Learning as Reinforcement Learning [56.997263135104504]
We show how off-policy reinforcement learning can enable improved performance under assumptions that are similar but potentially even more practical than those of interactive imitation learning.
Our proposed method uses reinforcement learning with user intervention signals themselves as rewards.
This relaxes the assumption that intervening experts in interactive imitation learning should be near-optimal and enables the algorithm to learn behaviors that improve over the potential suboptimal human expert.
arXiv Detail & Related papers (2023-11-21T21:05:21Z) - Efficient Methods for Non-stationary Online Learning [67.3300478545554]
We present efficient methods for optimizing dynamic regret and adaptive regret, which reduce the number of projections per round from $mathcalO(log T)$ to $1$.
Our technique hinges on the reduction mechanism developed in parameter-free online learning and requires non-trivial twists on non-stationary online methods.
arXiv Detail & Related papers (2023-09-16T07:30:12Z) - Boosting for Online Convex Optimization [64.15578413206715]
We consider the decision-making framework of online convex optimization with a large number of experts.
We define a weak learning algorithm as a mechanism that guarantees approximate regret against a base class of experts.
We give an efficient boosting algorithm that guarantees near-optimal regret against the convex hull of the base class.
arXiv Detail & Related papers (2021-02-18T12:30:49Z) - Optimal Robustness-Consistency Trade-offs for Learning-Augmented Online
Algorithms [85.97516436641533]
We study the problem of improving the performance of online algorithms by incorporating machine-learned predictions.
The goal is to design algorithms that are both consistent and robust.
We provide the first set of non-trivial lower bounds for competitive analysis using machine-learned predictions.
arXiv Detail & Related papers (2020-10-22T04:51:01Z) - Meta-Gradient Reinforcement Learning with an Objective Discovered Online [54.15180335046361]
We propose an algorithm based on meta-gradient descent that discovers its own objective, flexibly parameterised by a deep neural network.
Because the objective is discovered online, it can adapt to changes over time.
On the Atari Learning Environment, the meta-gradient algorithm adapts over time to learn with greater efficiency.
arXiv Detail & Related papers (2020-07-16T16:17:09Z) - A Modern Introduction to Online Learning [15.974402990630402]
Online learning refers to the framework of minimization of regret under worst-case assumptions.
I present first-order and second-order algorithms for online learning with convex losses.
arXiv Detail & Related papers (2019-12-31T08:16:31Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.