Simple online learning with consistent oracle
- URL: http://arxiv.org/abs/2308.08055v2
- Date: Tue, 6 Feb 2024 20:04:38 GMT
- Title: Simple online learning with consistent oracle
- Authors: Alexander Kozachinskiy, Tomasz Steifer
- Abstract summary: We consider online learning in the model where a learning algorithm can access the class only via the \emph{consistent oracle} -- an oracle that, at any moment, can give a function from the class that agrees with all examples seen so far.
- Score: 55.43220407902113
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We consider online learning in the model where a learning algorithm can
access the class only via the \emph{consistent oracle} -- an oracle that, at
any moment, can give a function from the class that agrees with all examples
seen so far. This model was recently considered by Assos et al.~(COLT'23). It
is motivated by the fact that standard methods of online learning rely on
computing the Littlestone dimension of subclasses, a computationally
intractable problem.
Assos et al.~gave an online learning algorithm in this model that makes at
most $C^d$ mistakes on classes of Littlestone dimension $d$, for some absolute
unspecified constant $C > 0$. We give a novel algorithm that makes at most
$O(256^d)$ mistakes. Our proof is significantly simpler and uses only very
basic properties of the Littlestone dimension. We also show that there is
no algorithm in this model that makes fewer than $3^d$ mistakes.
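To make the access model concrete, below is a minimal sketch (in Python) of an online learner that interacts with the hidden class only through a consistent oracle: it keeps a pool of oracle-returned experts, predicts by majority vote among the experts still consistent with the history, and requests a fresh consistent function after every mistake. This illustrates the oracle interface, not the paper's $O(256^d)$ algorithm; the names `run_learner`, `naive_oracle`, and the toy threshold class are hypothetical.

```python
from typing import Callable, List, Tuple

Example = Tuple[int, int]                       # (instance x, label y)
Hypothesis = Callable[[int], int]               # function from instances to {0, 1}
Oracle = Callable[[List[Example]], Hypothesis]  # consistent oracle: returns some
                                                # class member agreeing with history

def run_learner(oracle: Oracle, stream: List[Example]) -> int:
    """Expert-pool learner with consistent-oracle access; returns #mistakes."""
    history: List[Example] = []
    experts: List[Hypothesis] = [oracle([])]    # any function from the class
    mistakes = 0
    for x, y in stream:
        # Drop experts that no longer agree with everything seen so far.
        experts = [h for h in experts if all(h(a) == b for a, b in history)]
        if not experts:
            experts = [oracle(history)]
        # Predict the majority vote of the surviving experts.
        prediction = 1 if 2 * sum(h(x) for h in experts) > len(experts) else 0
        if prediction != y:
            mistakes += 1
            # On a mistake, ask the oracle for a function consistent with
            # the updated history and add it to the pool.
            experts.append(oracle(history + [(x, y)]))
        history.append((x, y))
    return mistakes

# Toy usage: thresholds over {0,...,10}; a naive oracle scans the finite class.
H = [lambda x, t=t: int(x >= t) for t in range(11)]

def naive_oracle(history: List[Example]) -> Hypothesis:
    return next(h for h in H if all(h(a) == b for a, b in history))

print(run_learner(naive_oracle, [(3, 0), (7, 1), (5, 1), (4, 0)]))  # prints 2
```

Note that the oracle only guarantees consistency at the time of the call, so experts obtained earlier may be contradicted by later examples; this is why the pool is re-filtered in every round.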
Related papers
- Decoupling Learning and Decision-Making: Breaking the $\mathcal{O}(\sqrt{T})$ Barrier in Online Resource Allocation with First-Order Methods [7.518108920887426]
We introduce a new algorithmic framework that decouples learning from decision-making.
For the first time, we show that first-order methods can attain regret $o(\sqrt{T})$ with this new framework.
arXiv Detail & Related papers (2024-02-11T05:35:50Z) - A Trichotomy for Transductive Online Learning [32.03948071550447]
We present new upper and lower bounds on the number of learner mistakes in the 'transductive' online learning setting of Ben-David, Kushilevitz and Mansour (1997).
This setting is similar to standard online learning, except that the adversary fixes a sequence of instances to be labeled at the start of the game, and this sequence is known to the learner.
arXiv Detail & Related papers (2023-11-10T23:27:23Z) - Tight Time-Space Lower Bounds for Constant-Pass Learning [1.7387739961208148]
We prove tight memory-sample lower bounds for any parity learning algorithm that makes $q$ passes over the stream of samples.
We show that such a learner requires either $\Omega(n^2)$ memory size or at least $2^{\Omega(n)}$ samples.
Similar to prior work, our results extend to any learning problem with many nearly-orthogonal concepts.
arXiv Detail & Related papers (2023-10-12T06:36:31Z) - Oracle-Efficient Online Learning for Beyond Worst-Case Adversaries [29.598518028635716]
We study oracle-efficient algorithms for beyond worst-case analysis of online learning.
For the smoothed analysis setting, our results give the first oracle-efficient algorithm for online learning with smoothed adversaries.
arXiv Detail & Related papers (2022-02-17T09:49:40Z) - Logarithmic Regret from Sublinear Hints [76.87432703516942]
We show that an algorithm can obtain $O(\log T)$ regret with just $O(\sqrt{T})$ hints under a natural query model.
We also show that $o(\sqrt{T})$ hints cannot guarantee better than $\Omega(\sqrt{T})$ regret.
arXiv Detail & Related papers (2021-11-09T16:50:18Z) - Littlestone Classes are Privately Online Learnable [28.04975353867202]
We consider the problem of online classification under a privacy constraint.
In this setting a learner sequentially observes a stream of labelled examples $(x_t, y_t)$, for $1 \leq t \leq T$, and returns at each iteration a hypothesis $h_t$ which is used to predict the label of each new example $x_t$.
The learner's performance is measured by her regret against a known hypothesis class $\mathcal{H}$.
arXiv Detail & Related papers (2021-06-25T09:08:33Z) - Contextual Recommendations and Low-Regret Cutting-Plane Algorithms [49.91214213074933]
We consider the following variant of contextual linear bandits motivated by routing applications in navigational engines and recommendation systems.
We design novel cutting-plane algorithms with low "regret" -- the total distance between the true point $w^*$ and the hyperplanes the separation oracle returns.
arXiv Detail & Related papers (2021-06-09T05:39:05Z) - Learning a Latent Simplex in Input-Sparsity Time [58.30321592603066]
We consider the problem of learning a latent $k$-vertex simplex $K \subset \mathbb{R}^d$, given access to $A \in \mathbb{R}^{d \times n}$.
We show that the dependence on $k$ in the running time is unnecessary given a natural assumption about the mass of the top $k$ singular values of $A$.
arXiv Detail & Related papers (2021-05-17T16:40:48Z) - Provably Efficient Reinforcement Learning with Linear Function Approximation Under Adaptivity Constraints [94.76881135901753]
We consider two popular limited adaptivity models: the batch learning model and the rare policy switch model.
Our proposed LSVI-UCB-Batch algorithm achieves an $\tilde{O}(\sqrt{d^3H^3T} + dHT/B)$ regret.
For the rare policy switch model, our proposed LSVI-UCB-RareSwitch algorithm enjoys an $\tilde{O}(\sqrt{d^3H^3T[1+T/(dH)]^{dH/B}})$ regret.
arXiv Detail & Related papers (2021-01-06T18:56:07Z) - Backward Feature Correction: How Deep Learning Performs Deep (Hierarchical) Learning [66.05472746340142]
This paper analyzes how multi-layer neural networks can perform hierarchical learning _efficiently_ and _automatically_ by SGD on the training objective.
We establish a new principle called "backward feature correction", where the errors in the lower-level features can be automatically corrected when training together with the higher-level layers.
arXiv Detail & Related papers (2020-01-13T17:28:29Z)