Optimal anytime regret with two experts
- URL: http://arxiv.org/abs/2002.08994v2
- Date: Thu, 26 Aug 2021 21:19:03 GMT
- Title: Optimal anytime regret with two experts
- Authors: Nicholas J. A. Harvey, Christopher Liaw, Edwin Perkins, Sikander
Randhawa
- Abstract summary: We design the first minimax optimal algorithm for minimizing regret in the anytime setting.
The algorithm is designed by considering a continuous analogue of the regret problem, which is solved using ideas from calculus.
- Score: 11.959180302329358
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We consider the classical problem of prediction with expert advice. In the
fixed-time setting, where the time horizon is known in advance, algorithms that
achieve the optimal regret are known when there are two, three, or four experts
or when the number of experts is large. Much less is known about the problem in
the anytime setting, where the time horizon is not known in advance. No minimax
optimal algorithm was previously known in the anytime setting, regardless of
the number of experts. Even for the case of two experts, Luo and Schapire have
left open the problem of determining the optimal algorithm.
We design the first minimax optimal algorithm for minimizing regret in the
anytime setting. We consider the case of two experts, and prove that the
optimal regret is $\gamma \sqrt{t} / 2$ at all time steps $t$, where $\gamma$
is a natural constant that arose 35 years ago in studying fundamental
properties of Brownian motion. The algorithm is designed by considering a
continuous analogue of the regret problem, which is solved using ideas from
stochastic calculus.
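To make the regret quantity concrete, here is a minimal sketch of the two-expert anytime regret game. The strategy below is plain follow-the-leader, used only as an illustrative placeholder, not the paper's minimax-optimal algorithm (which is derived via stochastic calculus); the alternating loss sequence is a standard example showing why an unrandomized leader-following rule suffers linear regret.

```python
# Sketch of the two-expert anytime regret quantity.
# "follow_the_leader" is a deliberately naive placeholder strategy,
# NOT the minimax-optimal algorithm from the paper.

def follow_the_leader(cum_losses):
    """Pick the expert with smaller cumulative loss (ties -> expert 0)."""
    return 0 if cum_losses[0] <= cum_losses[1] else 1

def anytime_regret(losses, strategy):
    """Return the algorithm's regret at every time step t = 1..T.

    losses[t] = (loss of expert 0, loss of expert 1) at round t.
    Anytime regret at time t = algorithm's cumulative loss minus the
    cumulative loss of the better expert so far.
    """
    cum = [0.0, 0.0]
    alg = 0.0
    regrets = []
    for l0, l1 in losses:
        pick = strategy(cum)
        alg += (l0, l1)[pick]
        cum[0] += l0
        cum[1] += l1
        regrets.append(alg - min(cum))
    return regrets

# Alternating losses make follow-the-leader pay every single round,
# while each expert pays only half the time: regret grows like t/2.
T = 8
losses = [(1, 0) if t % 2 == 0 else (0, 1) for t in range(T)]
print(anytime_regret(losses, follow_the_leader))
# -> [1.0, 1.0, 2.0, 2.0, 3.0, 3.0, 4.0, 4.0]
```

Against such sequences a deterministic leader-follower incurs regret linear in $t$, whereas the paper's algorithm guarantees regret at most $\gamma \sqrt{t} / 2$ at every time step $t$, with no knowledge of the horizon.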
Related papers
- On the price of exact truthfulness in incentive-compatible online learning with bandit feedback: A regret lower bound for WSU-UX [8.861995276194435]
In one view of the classical game of prediction with expert advice with binary outcomes, each expert maintains an adversarially chosen belief.
We consider a recently introduced, strategic variant of this problem with selfish (reputation-seeking) experts.
We show, via explicit construction of loss sequences, that the algorithm suffers a worst-case regret lower bound of $\Omega(T^{2/3})$.
arXiv Detail & Related papers (2024-04-08T02:41:32Z) - Time-Varying Gaussian Process Bandits with Unknown Prior [18.93478528448966]
PE-GP-UCB is capable of solving time-varying Bayesian optimisation problems.
It relies on checking whether the observed function values are consistent with one of the candidate priors.
arXiv Detail & Related papers (2024-02-02T18:52:16Z) - Streaming Algorithms for Learning with Experts: Deterministic Versus
Robust [62.98860182111096]
In the online learning with experts problem, an algorithm must make a prediction about an outcome on each of $T$ days (or times).
The goal is to make a prediction with the minimum cost, specifically compared to the best expert in the set.
We show a space lower bound of $\widetilde{\Omega}\left(\frac{nM}{RT}\right)$ for any deterministic algorithm that achieves regret $R$ when the best expert makes $M$ mistakes.
arXiv Detail & Related papers (2023-03-03T04:39:53Z) - Optimal Tracking in Prediction with Expert Advice [0.0]
We study the prediction with expert advice setting, where the aim is to produce a decision by combining the decisions generated by a set of experts.
We achieve the minimax optimal dynamic regret under the prediction with expert advice setting.
Our algorithms are the first to produce such universally optimal, adaptive and truly online guarantees with no prior knowledge.
arXiv Detail & Related papers (2022-08-07T12:29:54Z) - Nearly Optimal Algorithms for Linear Contextual Bandits with Adversarial
Corruptions [98.75618795470524]
We study the linear contextual bandit problem in the presence of adversarial corruption.
We propose a new algorithm based on the principle of optimism in the face of uncertainty.
arXiv Detail & Related papers (2022-05-13T17:58:58Z) - Memory Bounds for the Experts Problem [53.67419690563877]
Online learning with expert advice is a fundamental problem of sequential prediction.
The goal is to process a stream of expert predictions and make predictions with minimum cost.
An algorithm is judged by how well it does compared to the best expert in the set.
arXiv Detail & Related papers (2022-04-21T01:22:18Z) - Efficient and Optimal Fixed-Time Regret with Two Experts [5.650647159993238]
Prediction with expert advice is a foundational problem in online learning.
In instances with $T$ rounds and $n$ experts, the classical Multiplicative Weights Update method suffers at most $\sqrt{(T/2)\ln n}$ regret when $T$ is known beforehand.
When the number of experts $n$ is small/fixed, algorithms with better regret guarantees exist.
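The classical Multiplicative Weights Update (Hedge) baseline mentioned above can be sketched in a few lines. This is an illustrative implementation of the standard algorithm with the usual fixed-time tuning, not the improved small-$n$ algorithms this entry refers to; the learning-rate choice $\eta = \sqrt{8 \ln n / T}$ is the textbook setting that yields the $\sqrt{(T/2)\ln n}$ bound for losses in $[0, 1]$.

```python
import math
import random

def hedge(losses, eta):
    """Multiplicative Weights / Hedge over a loss matrix with entries in [0, 1].

    losses[t][i] is the loss of expert i at round t.
    Returns (algorithm's expected cumulative loss, best expert's cumulative loss).
    """
    n = len(losses[0])
    weights = [1.0] * n
    alg_loss = 0.0
    for round_losses in losses:
        total = sum(weights)
        # Expected loss of playing an expert drawn proportionally to its weight.
        alg_loss += sum(w / total * l for w, l in zip(weights, round_losses))
        # Exponential down-weighting of experts that incurred loss.
        weights = [w * math.exp(-eta * l) for w, l in zip(weights, round_losses)]
    best = min(sum(col) for col in zip(*losses))
    return alg_loss, best

T, n = 10_000, 2
random.seed(0)
losses = [[random.random() for _ in range(n)] for _ in range(T)]
eta = math.sqrt(8 * math.log(n) / T)  # fixed-time tuning; requires knowing T
alg, best = hedge(losses, eta)
bound = math.sqrt((T / 2) * math.log(n))
print(f"regret = {alg - best:.2f}, bound = {bound:.2f}")
```

The guarantee $L_{\mathrm{alg}} - L_{\mathrm{best}} \le \ln n / \eta + \eta T / 8 = \sqrt{(T/2)\ln n}$ is worst-case, so the printed regret stays under the bound for any loss sequence, not just this random one. Note the tuning depends on $T$, which is precisely what the anytime setting of the main paper forbids.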
arXiv Detail & Related papers (2022-03-15T01:07:09Z) - Efficient and Optimal Algorithms for Contextual Dueling Bandits under
Realizability [59.81339109121384]
We study the $K$-armed contextual dueling bandit problem, a sequential decision making setting in which the learner uses contextual information to make two decisions, but only observes preference-based feedback suggesting that one decision was better than the other.
We provide a new algorithm that achieves the optimal regret rate for a new notion of best response regret, which is a strictly stronger performance measure than those considered in prior works.
arXiv Detail & Related papers (2021-11-24T07:14:57Z) - Double Coverage with Machine-Learned Advice [100.23487145400833]
We study the fundamental online $k$-server problem in a learning-augmented setting.
We show that our algorithm achieves for any $k$ an almost optimal consistency-robustness tradeoff.
arXiv Detail & Related papers (2021-03-02T11:04:33Z) - Excursion Search for Constrained Bayesian Optimization under a Limited
Budget of Failures [62.41541049302712]
We propose a novel decision maker grounded in control theory that controls the amount of risk we allow in the search as a function of a given budget of failures.
Our algorithm uses the failures budget more efficiently in a variety of optimization experiments, and generally achieves lower regret, than state-of-the-art methods.
arXiv Detail & Related papers (2020-05-15T09:54:09Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the content (including all information) and is not responsible for any consequences of its use.