Efficient and Optimal Fixed-Time Regret with Two Experts
- URL: http://arxiv.org/abs/2203.07577v1
- Date: Tue, 15 Mar 2022 01:07:09 GMT
- Title: Efficient and Optimal Fixed-Time Regret with Two Experts
- Authors: Laura Greenstreet, Nicholas J. A. Harvey, Victor Sanches Portella
- Abstract summary: Prediction with expert advice is a foundational problem in online learning.
In instances with $T$ rounds and $n$ experts, the classical Multiplicative Weights Update method suffers at most $\sqrt{(T/2)\ln n}$ regret when $T$ is known beforehand.
When the number of experts $n$ is small/fixed, algorithms with better regret guarantees exist.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Prediction with expert advice is a foundational problem in online learning.
In instances with $T$ rounds and $n$ experts, the classical Multiplicative
Weights Update method suffers at most $\sqrt{(T/2)\ln n}$ regret when $T$ is
known beforehand. Moreover, this is asymptotically optimal when both $T$ and
$n$ grow to infinity. However, when the number of experts $n$ is small/fixed,
algorithms with better regret guarantees exist. In 1967, Cover gave a dynamic
programming algorithm for the two-experts problem, restricted to $\{0,1\}$ costs,
that suffers at most $\sqrt{T/2\pi} + O(1)$ regret with $O(T^2)$ pre-processing
time. In this work, we propose an optimal algorithm for prediction with two
experts' advice that works even for costs in $[0,1]$ and with $O(1)$ processing
time per turn. Our algorithm builds on recent work on the experts problem
that uses techniques and tools from stochastic calculus.
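For context, the baseline being improved upon is the exponentially weighted forecaster behind the $\sqrt{(T/2)\ln n}$ bound. Below is a minimal sketch of that textbook method with the fixed-horizon learning rate $\eta = \sqrt{8\ln(n)/T}$; the function name and the random stand-in losses are illustrative, and this is not the paper's algorithm.

```python
import math
import random

def exponential_weights(T: int, n: int):
    """Textbook exponentially weighted forecaster for n experts over a
    known horizon T. With eta = sqrt(8 ln(n) / T) and losses in [0, 1],
    the standard analysis bounds regret by sqrt((T/2) ln n).
    A sketch for context only -- not the algorithm proposed in the paper."""
    eta = math.sqrt(8 * math.log(n) / T)
    cum_loss = [0.0] * n  # cumulative loss of each expert
    for _ in range(T):
        # Play a distribution with weights exponential in cumulative losses.
        w = [math.exp(-eta * L) for L in cum_loss]
        total = sum(w)
        p = [wi / total for wi in w]
        # Adversary reveals this round's costs (random stand-ins here).
        losses = [random.random() for _ in range(n)]
        cum_loss = [L + c for L, c in zip(cum_loss, losses)]
        yield p, losses
```

For $n = 2$ this guarantee is $\sqrt{(T/2)\ln 2} \approx 0.589\sqrt{T}$, whereas the optimal constant matched by the paper's algorithm is $\sqrt{T/2\pi} \approx 0.399\sqrt{T}$; that gap is what motivates algorithms tailored to a fixed number of experts.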
Related papers
- Streaming Algorithms for Learning with Experts: Deterministic Versus Robust
In the online learning with experts problem, an algorithm must make a prediction about an outcome on each of $T$ days (or times).
The goal is to incur the minimum cost relative to the best expert in the set.
We show a space lower bound of $\widetilde{\Omega}\left(\frac{nM}{RT}\right)$ for any deterministic algorithm that achieves regret $R$ when the best expert makes $M$ mistakes.
arXiv Detail & Related papers (2023-03-03T04:39:53Z)
- Mind the gap: Achieving a super-Grover quantum speedup by jumping to the end
We present a quantum algorithm that has rigorous runtime guarantees for several families of binary optimization problems.
We show that the algorithm finds the optimal solution in time $O^*(2^{(0.5-c)n})$ for an $n$-independent constant $c$.
We also show that this guarantee holds for a large fraction of random instances from the $k$-spin model and for any fully satisfiable or slightly frustrated $k$-CSP formula.
arXiv Detail & Related papers (2022-12-03T02:45:23Z)
- Logarithmic Regret from Sublinear Hints
We show that an algorithm can obtain $O(\log T)$ regret with just $O(\sqrt{T})$ hints under a natural query model.
We also show that $o(\sqrt{T})$ hints cannot guarantee better than $\Omega(\sqrt{T})$ regret.
arXiv Detail & Related papers (2021-11-09T16:50:18Z)
- Minimal Expected Regret in Linear Quadratic Control
We devise an online learning algorithm and provide guarantees on its expected regret.
The regret at time $T$ is upper bounded by $\widetilde{O}((d_u+d_x)\sqrt{d_x T})$ when the system matrices $A$ and $B$ are unknown.
arXiv Detail & Related papers (2021-09-29T14:07:21Z)
- Optimal Regret Algorithm for Pseudo-1d Bandit Convex Optimization
We study online learning with bandit feedback (i.e., the learner has access only to a zeroth-order oracle) where the cost/reward functions admit a "pseudo-1d" structure.
We show a lower bound of $\min(\sqrt{dT}, T^{3/4})$ on the regret of any algorithm, where $T$ is the number of rounds.
We propose a new algorithm that combines randomized online gradient descent with a kernelized exponential weights method to exploit the pseudo-1d structure effectively.
arXiv Detail & Related papers (2021-02-15T08:16:51Z)
- Impossible Tuning Made Possible: A New Expert Algorithm and Its Applications
We show that it is possible to achieve regret $O\left(\sqrt{(\ln d)\sum_t \ell_{t,i}^2}\right)$ simultaneously for all experts $i$ in a $T$-round, $d$-expert problem.
Our algorithm is based on the Mirror Descent framework with a correction term and a weighted entropy regularizer.
arXiv Detail & Related papers (2021-02-01T18:34:21Z)
- Taking a hint: How to leverage loss predictors in contextual bandits?
We study learning in contextual bandits with the help of loss predictors.
We show that the optimal regret is $\mathcal{O}(\min\{\sqrt{T}, \sqrt{\mathcal{E}}\,T^{1/3}\})$ when the predictors' total error $\mathcal{E}$ is known.
arXiv Detail & Related papers (2020-03-04T07:36:38Z)
- Optimal anytime regret with two experts
We design the first minimax optimal algorithm for minimizing regret in the anytime setting.
The algorithm is designed by considering a continuous analogue of the regret problem, which is solved using ideas from stochastic calculus; the short calculation after this list illustrates the continuous-time intuition.
arXiv Detail & Related papers (2020-02-20T20:04:32Z)
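As a rough illustration of the continuous analogue behind the last entry (and of where the main paper's $\sqrt{T/2\pi}$ constant comes from): with two experts, the gap between the experts' cumulative costs behaves like a simple random walk, and the regret of a good forecaster tracks roughly half the walk's absolute value. The scaling-limit calculation below is a back-of-the-envelope intuition sketch, not the papers' actual derivation.

```latex
% In the scaling limit, the random-walk gap S_T becomes a standard
% Brownian motion B_T, with B_T ~ N(0, T). For X ~ N(0, sigma^2),
% E|X| = sigma * sqrt(2/pi), so:
\[
  \mathbb{E}\lvert B_T\rvert = \sqrt{\frac{2T}{\pi}}
  \quad\Longrightarrow\quad
  \frac{1}{2}\,\mathbb{E}\lvert B_T\rvert = \sqrt{\frac{T}{2\pi}},
\]
% matching the optimal fixed-time regret sqrt(T/2pi) + O(1) that Cover
% obtained for {0,1} costs and that the main paper achieves for [0,1]
% costs with O(1) processing time per turn.
```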