A Trichotomy for Transductive Online Learning
- URL: http://arxiv.org/abs/2311.06428v2
- Date: Wed, 29 Nov 2023 23:37:43 GMT
- Title: A Trichotomy for Transductive Online Learning
- Authors: Steve Hanneke, Shay Moran, Jonathan Shafer
- Abstract summary: We present new upper and lower bounds on the number of learner mistakes in the `transductive' online learning setting of Ben-David, Kushilevitz and Mansour (1997).
This setting is similar to standard online learning, except that the adversary fixes a sequence of instances to be labeled at the start of the game, and this sequence is known to the learner.
- Score: 32.03948071550447
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We present new upper and lower bounds on the number of learner mistakes in
the `transductive' online learning setting of Ben-David, Kushilevitz and
Mansour (1997). This setting is similar to standard online learning, except
that the adversary fixes a sequence of instances $x_1,\dots,x_n$ to be labeled
at the start of the game, and this sequence is known to the learner.
Qualitatively, we prove a trichotomy, stating that the minimal number of
mistakes made by the learner as $n$ grows can take only one of precisely three
possible values: $n$, $\Theta\left(\log (n)\right)$, or $\Theta(1)$.
Furthermore, this behavior is determined by a combination of the VC dimension
and the Littlestone dimension. Quantitatively, we show a variety of bounds
relating the number of mistakes to well-known combinatorial dimensions. In
particular, we improve the known lower bound on the constant in the $\Theta(1)$
case from $\Omega\left(\sqrt{\log(d)}\right)$ to $\Omega(\log(d))$ where $d$ is
the Littlestone dimension. Finally, we extend our results to cover multiclass
classification and the agnostic setting.
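To make the setting concrete, here is a minimal sketch of the transductive online game in Python. The `learner_predict` and `adversary_label` callables are hypothetical interfaces introduced for illustration, not pseudocode from the paper.

```python
# Minimal sketch of the transductive online learning protocol of
# Ben-David, Kushilevitz and Mansour (1997): the instance sequence
# x_1, ..., x_n is fixed and shown to the learner before the game starts.
from typing import Callable, List, Sequence

def transductive_game(
    instances: Sequence,  # x_1, ..., x_n, known to the learner up front
    learner_predict: Callable[[Sequence, List[int], int], int],
    adversary_label: Callable[[int, int], int],
) -> int:
    """Play one game and return the learner's total number of mistakes."""
    revealed: List[int] = []  # true labels y_1, ..., y_{t-1} seen so far
    mistakes = 0
    for t, _ in enumerate(instances):
        # The learner may use the whole instance sequence, the labels
        # revealed so far, and the index t of the current instance.
        y_hat = learner_predict(instances, revealed, t)
        y_t = adversary_label(t, y_hat)  # adversary reveals the true label
        if y_hat != y_t:
            mistakes += 1
        revealed.append(y_t)
    return mistakes
```

Letting the label depend on the learner's prediction is the usual worst-case convention for deterministic learners; the trichotomy concerns how the minimax value of this game grows as $n \to \infty$.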
Related papers
- Online Learning of Halfspaces with Massart Noise [47.71073318490341]
We study the task of online learning in the presence of Massart noise.
We present a computationally efficient algorithm that achieves a mistake bound of $\eta T + o(T)$, where $\eta$ is the Massart noise level.
We use our Massart online learner to design an efficient bandit algorithm that obtains expected reward at least $(1 - 1/k)\Delta T - o(T)$ larger than that of choosing a random action at every round.
arXiv Detail & Related papers (2024-05-21T17:31:10Z) - On the Growth of Mistakes in Differentially Private Online Learning: A Lower Bound Perspective [8.104151304193216]
We provide lower bounds for Differentially Private (DP) Online Learning algorithms.
Our work is the first result toward settling lower bounds for DP online learning.
arXiv Detail & Related papers (2024-02-26T17:49:37Z) - Apple Tasting: Combinatorial Dimensions and Minimax Rates [16.52706654413988]
In online binary classification under `apple tasting' feedback, the learner observes the true label only if it predicts "1".
We show that in the realizable setting, the expected number of mistakes can be $\Theta(1)$, $\Theta(\sqrt{T})$, or $\Theta(T)$.
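As an illustration of the feedback model (our sketch, not code from that paper), a single round of apple tasting looks like this:

```python
from typing import Optional, Tuple

def apple_tasting_round(y_hat: int, y_true: int) -> Tuple[bool, Optional[int]]:
    """One round of apple tasting: return (mistake, observed feedback).

    The prediction y_hat and the true label y_true are in {0, 1}; the
    true label is revealed to the learner only when it predicts 1.
    """
    feedback = y_true if y_hat == 1 else None  # predicting 0 yields no feedback
    mistake = (y_hat != y_true)  # visible to the analysis, not to the learner
    return mistake, feedback
```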
arXiv Detail & Related papers (2023-10-29T16:37:51Z) - Simple online learning with consistent oracle [55.43220407902113]
We consider online learning in the model where a learning algorithm can access the class only via a `consistent oracle': an oracle that, at any moment, can return a function from the class that agrees with all examples seen so far.
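A minimal sketch of a learner in this model (our illustration; the `oracle` interface is hypothetical and this is not necessarily the paper's algorithm):

```python
from typing import Callable, Iterable, List, Tuple

Example = Tuple[object, int]
Hypothesis = Callable[[object], int]

def learn_with_consistent_oracle(
    oracle: Callable[[List[Example]], Hypothesis],
    stream: Iterable[Example],
) -> int:
    """Predict with whatever consistent hypothesis the oracle returns;
    return the total number of mistakes on the stream."""
    seen: List[Example] = []
    h = oracle(seen)  # any hypothesis is consistent with the empty history
    mistakes = 0
    for x, y in stream:
        if h(x) != y:
            mistakes += 1
        seen.append((x, y))
        h = oracle(seen)  # re-query so h agrees with all examples seen so far
    return mistakes
```

One could equally re-query the oracle only after mistakes; the point of the model is that the learner has no other access to the class.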
arXiv Detail & Related papers (2023-08-15T21:50:40Z) - Self-Directed Linear Classification [50.659479930171585]
In online classification, a learner is presented with a sequence of examples and aims to predict their labels in an online fashion so as to minimize the total number of mistakes.
Here we study the power of choosing the prediction order and establish the first strong separation between worst-order and random-order learning.
arXiv Detail & Related papers (2023-08-06T15:38:44Z) - On Optimal Learning Under Targeted Data Poisoning [48.907813854832206]
In this work we aim to characterize the smallest achievable error $\epsilon = \epsilon(\eta)$ by the learner in the presence of such an adversary with poisoning budget $\eta$.
Remarkably, we show that the upper bound can be attained by a deterministic learner.
arXiv Detail & Related papers (2022-10-06T06:49:48Z) - Fast Rates for Nonparametric Online Learning: From Realizability to Learning in Games [36.969021834291745]
We propose a proper learning algorithm which gets a near-optimal mistake bound in terms of the sequential fat-shattering dimension of the hypothesis class.
This result answers a question as to whether proper learners could achieve near-optimal mistake bounds.
For the real-valued (regression) setting, the optimal mistake bound was not even known for improper learners.
arXiv Detail & Related papers (2021-11-17T05:24:21Z) - Littlestone Classes are Privately Online Learnable [28.04975353867202]
We consider the problem of online classification under a privacy constraint.
In this setting a learner observes sequentially a stream of labelled examples $(x_t, y_t)$, for $1 \leq t \leq T$, and returns at each iteration a hypothesis $h_t$ which is used to predict the label of each new example $x_t$.
The learner's performance is measured by her regret against a known hypothesis class $\mathcal{H}$.
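For reference, the regret here is the standard notion (our rendering of the usual definition, not a formula from the abstract):

$$\mathrm{Regret}_T \;=\; \sum_{t=1}^{T} \mathbb{1}\left[h_t(x_t) \neq y_t\right] \;-\; \min_{h \in \mathcal{H}} \sum_{t=1}^{T} \mathbb{1}\left[h(x_t) \neq y_t\right].$$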
arXiv Detail & Related papers (2021-06-25T09:08:33Z) - Variance-Aware Confidence Set: Variance-Dependent Bound for Linear Bandits and Horizon-Free Bound for Linear Mixture MDP [76.94328400919836]
We show how to construct variance-aware confidence sets for linear bandits and linear mixture Markov Decision Processes (MDPs).
For linear bandits, we obtain an $\widetilde{O}\big(\mathrm{poly}(d)\sqrt{1 + \sum_{i=1}^{K} \sigma_i^2}\big)$ regret bound, where $d$ is the feature dimension.
For linear mixture MDPs, we obtain an $\widetilde{O}(\mathrm{poly}(d)\sqrt{K})$ regret bound, where $K$ is the number of episodes; notably, this bound is horizon-free.
arXiv Detail & Related papers (2021-01-29T18:57:52Z) - $Q$-learning with Logarithmic Regret [60.24952657636464]
We prove that an optimistic variant of $Q$-learning enjoys a $\mathcal{O}\!\left(\frac{SA \cdot \mathrm{poly}(H)}{\Delta_{\min}} \log(SAT)\right)$ cumulative regret bound, where $S$ is the number of states, $A$ is the number of actions, $H$ is the planning horizon, $T$ is the total number of steps, and $\Delta_{\min}$ is the minimum sub-optimality gap.
arXiv Detail & Related papers (2020-06-16T13:01:33Z)
This list is automatically generated from the titles and abstracts of the papers on this site.