Impossible Tuning Made Possible: A New Expert Algorithm and Its
Applications
- URL: http://arxiv.org/abs/2102.01046v1
- Date: Mon, 1 Feb 2021 18:34:21 GMT
- Title: Impossible Tuning Made Possible: A New Expert Algorithm and Its
Applications
- Authors: Liyu Chen, Haipeng Luo, Chen-Yu Wei
- Abstract summary: We show that it is possible to achieve regret $O\left(\sqrt{(\ln d)\sum_t \ell_{t,i}^2}\right)$ simultaneously for all experts $i$ in a $T$-round $d$-expert problem.
Our algorithm is based on the Mirror Descent framework with a correction term and a weighted entropy regularizer.
- Score: 37.6975819766632
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We resolve the long-standing "impossible tuning" issue for the classic expert
problem and show that it is in fact possible to achieve regret
$O\left(\sqrt{(\ln d)\sum_t \ell_{t,i}^2}\right)$ simultaneously for all expert
$i$ in a $T$-round $d$-expert problem where $\ell_{t,i}$ is the loss for expert
$i$ in round $t$. Our algorithm is based on the Mirror Descent framework with a
correction term and a weighted entropy regularizer. While natural, the
algorithm has not been studied before and requires a careful analysis. We also
generalize the bound to $O\left(\sqrt{(\ln d)\sum_t
(\ell_{t,i}-m_{t,i})^2}\right)$ for any prediction vector $m_t$ that the
learner receives, and recover or improve many existing results by choosing
different $m_t$. Furthermore, we use the same framework to create a master
algorithm that combines a set of base algorithms and learns the best one with
little overhead. The new guarantee of our master allows us to derive many new
results for both the expert problem and more generally Online Linear
Optimization.
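The update described in the abstract can be illustrated with a minimal sketch: a multiplicative-weights iteration with a second-order correction term added to each loss. This is only an illustration of the general idea, not the paper's exact algorithm; the paper's weighted entropy regularizer, per-expert learning-rate tuning, and constants differ, and the function and variable names below are my own.

```python
import numpy as np

def corrected_mw(losses, etas):
    """Multiplicative weights with a second-order correction term.

    A simplified sketch in the spirit of the Mirror Descent framework
    described above; the paper's exact regularizer and tuning differ.
    `losses` is a (T, d) array of expert losses in [0, 1] and `etas`
    is a (d,) array of per-expert learning rates (illustrative choice).
    Returns the algorithm's total loss.
    """
    T, d = losses.shape
    w = np.ones(d) / d               # uniform prior over the d experts
    total_loss = 0.0
    for t in range(T):
        p = w / w.sum()              # play the normalized weight vector
        total_loss += p @ losses[t]
        # The extra eta_i * loss^2 term penalizes high-variance experts,
        # mimicking the role of the correction term in the update.
        w *= np.exp(-etas * (losses[t] + etas * losses[t] ** 2))
    return total_loss
```

With a clearly best expert and $\eta_i = \sqrt{\ln(d)/T}$, the sketch's cumulative loss stays within an $O(\sqrt{T \ln d})$ margin of the best expert's loss, consistent with the kind of guarantee the paper refines.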
Related papers
- Streaming Algorithms for Learning with Experts: Deterministic Versus
Robust [62.98860182111096]
In the online learning with experts problem, an algorithm must make a prediction about an outcome on each of $T$ days (or times).
The goal is to make a prediction with the minimum cost, specifically compared to the best expert in the set.
We show a space lower bound of $\widetilde{\Omega}\left(\frac{nM}{RT}\right)$ for any deterministic algorithm that achieves regret $R$ when the best expert makes $M$ mistakes.
arXiv Detail & Related papers (2023-03-03T04:39:53Z) - Efficient and Optimal Fixed-Time Regret with Two Experts [5.650647159993238]
Prediction with expert advice is a foundational problem in online learning.
In instances with $T$ rounds and $n$ experts, the classical Multiplicative Weights Update method suffers at most $\sqrt{(T/2)\ln n}$ regret when $T$ is known beforehand.
When the number of experts $n$ is small/fixed, algorithms with better regret guarantees exist.
arXiv Detail & Related papers (2022-03-15T01:07:09Z) - Corralling a Larger Band of Bandits: A Case Study on Switching Regret
for Linear Bandits [99.86860277006318]
We consider the problem of combining and learning over a set of adversarial algorithms with the goal of adaptively tracking the best one on the fly.
The CORRAL algorithm of Agarwal et al. achieves this goal with a regret overhead of order $\widetilde{O}(\sqrt{MT})$ where $M$ is the number of base algorithms and $T$ is the time horizon.
Motivated by this issue, we propose a new recipe to corral a larger band of bandit algorithms whose regret overhead has only logarithmic dependence on $M$ as long ...
arXiv Detail & Related papers (2022-02-12T21:55:44Z) - Logarithmic Regret from Sublinear Hints [76.87432703516942]
We show that an algorithm can obtain $O(\log T)$ regret with just $O(\sqrt{T})$ hints under a natural query model.
We also show that $o(\sqrt{T})$ hints cannot guarantee better than $\Omega(\sqrt{T})$ regret.
arXiv Detail & Related papers (2021-11-09T16:50:18Z) - Private Stochastic Convex Optimization: Optimal Rates in $\ell_1$
Geometry [69.24618367447101]
Up to logarithmic factors, the optimal excess population loss of any $(\varepsilon,\delta)$-differentially private algorithm is $\sqrt{\log(d)/n} + \sqrt{d}/(\varepsilon n)$.
We show that when the loss functions satisfy additional smoothness assumptions, the excess loss is upper bounded (up to logarithmic factors) by $\sqrt{\log(d)/n} + (\log(d)/(\varepsilon n))^{2/3}$.
arXiv Detail & Related papers (2021-03-02T06:53:44Z) - Optimal Regret Algorithm for Pseudo-1d Bandit Convex Optimization [51.23789922123412]
We study online learning with bandit feedback (i.e. learner has access to only zeroth-order oracle) where cost/reward functions admit a "pseudo-1d" structure.
We show a lower bound of $\min(\sqrt{dT}, T^{3/4})$ for the regret of any algorithm, where $T$ is the number of rounds.
We propose a new algorithm sbcalg that combines randomized online gradient descent with a kernelized exponential weights method to exploit the pseudo-1d structure effectively.
arXiv Detail & Related papers (2021-02-15T08:16:51Z) - Adaptive Online Learning with Varying Norms [45.11667443216861]
We provide an online convex optimization algorithm that outputs points $w_t$ in some domain $W$.
We apply this result to obtain new "full-matrix"-style regret bounds.
arXiv Detail & Related papers (2020-02-10T17:22:08Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.