Partial Feedback Online Learning
- URL: http://arxiv.org/abs/2601.21462v2
- Date: Thu, 05 Feb 2026 02:57:08 GMT
- Title: Partial Feedback Online Learning
- Authors: Shihao Shao, Cong Fang, Zhouchen Lin, Dacheng Tao
- Abstract summary: We study a new learning protocol, termed partial-feedback online learning. Each instance admits a set of acceptable labels, but the learner observes only one acceptable label per round.
- Score: 88.27143767009376
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We study a new learning protocol, termed partial-feedback online learning, where each instance admits a set of acceptable labels, but the learner observes only one acceptable label per round. We highlight that, while the classical version space is widely used for online learnability, it does not directly extend to this setting. We address this obstacle by introducing a collection version space, which maintains sets of hypotheses rather than individual hypotheses. Using this tool, we obtain a tight characterization of learnability in the set-realizable regime. In particular, we define the Partial-Feedback Littlestone dimension (PFLdim) and the Partial-Feedback Measure Shattering dimension (PMSdim), and show that they tightly characterize the minimax regret for deterministic and randomized learners, respectively. We further identify a nested inclusion condition under which deterministic and randomized learnability coincide, resolving an open question of Raman et al. (2024b). Finally, given a hypothesis space H, we show that beyond set realizability the minimax regret can be linear even when |H|=2, highlighting a fundamental barrier of the set-realizable assumption.
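To make the protocol concrete, here is a minimal Python simulation of the round structure described above: the learner predicts, its loss is judged against the full acceptable set, but only one acceptable label is ever revealed. The toy hypothesis class, the hidden "true" collection, and the majority-vote learner are illustrative assumptions, not the paper's algorithm (which maintains a collection version space).

```python
"""A minimal round-by-round simulation of the partial-feedback protocol.
The toy hypothesis class, hidden 'true' collection, and majority-vote
learner are assumptions for exposition only; the paper's algorithm is
built on collection version spaces instead.
"""
import random

rng = random.Random(0)
HYPOTHESES = [lambda x: x % 2, lambda x: x % 3, lambda x: 0]
TRUE_COLLECTION = HYPOTHESES[:2]    # hidden predictors defining acceptability

def acceptable_set(x):
    # Set-realizable toy ground truth: a label is acceptable for x
    # if some hypothesis in the hidden collection outputs it.
    return {h(x) for h in TRUE_COLLECTION}

def run(T=20):
    version = list(HYPOTHESES)      # hypotheses the learner still trusts
    mistakes = 0
    for _ in range(T):
        x = rng.randrange(100)
        full_set = acceptable_set(x)              # never shown to the learner
        votes = [h(x) for h in version]
        y_hat = max(set(votes), key=votes.count)  # majority-vote prediction
        if y_hat not in full_set:                 # loss uses the FULL set
            mistakes += 1
        y_revealed = rng.choice(sorted(full_set))  # only ONE label is revealed
        # Classical version-space pruning would now discard every hypothesis
        # that disagrees with y_revealed. Here that is unsound: a hypothesis
        # predicting a different label may still be acceptable. This is the
        # obstacle the abstract highlights, so this toy learner conservatively
        # keeps all hypotheses.
    return mistakes

print("mistakes over 20 rounds:", run())
```

The comment at the end marks exactly where classical version-space pruning breaks down under partial feedback, which is the gap the collection version space is introduced to close.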
Related papers
- Characterizing Online and Private Learnability under Distributional Constraints via Generalized Smoothness [63.833913892018536]
We study sequential decision making under distributional adversaries that can adaptively choose data-generating distributions from a fixed family $U$. We provide a near-complete characterization of the families $U$ that admit learnability in terms of a notion known as generalized smoothness. We show that generalized smoothness also characterizes private learnability under distributional constraints.
arXiv Detail & Related papers (2026-02-24T06:15:59Z)
- Robust Online Learning [0.0]
We study the problem of learning robust classifiers, where the classifier receives a perturbed input. We consider both realizable and agnostic learnability of hypothesis classes. We identify a quantity that controls the mistake bounds in the realizable setting and the regret bounds in the agnostic setting.
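As a hedged illustration of this feedback model (a sketch under assumptions, not the paper's method), the toy learner below predicts from perturbed inputs over an integer threshold class; the class, stream, perturbation set, and update rule are all invented for exposition.

```python
"""Illustrative sketch of robust online classification: the learner only
sees a perturbed copy of each instance. The integer threshold class, the
+/- radius perturbation set, and the halving-style update are toy
assumptions, not the paper's construction.
"""
def run_robust_online(stream, radius=1):
    thresholds = set(range(-5, 6))      # toy class: h_k(x) = 1 iff x >= k
    mistakes = 0
    for x, y in stream:                 # y in {0, 1}; realizable toy stream
        x_pert = x + radius             # adversary perturbs within [x - r, x + r]
        mid = sorted(thresholds)[len(thresholds) // 2]
        y_hat = int(x_pert >= mid)      # predict with the median threshold
        if y_hat != y:
            mistakes += 1
        # Aggressive toy rule: keep thresholds that label y on EVERY point
        # the perturbed input could have come from. (A sound rule would only
        # discard thresholds wrong on every candidate; this is a sketch.)
        candidates = range(x_pert - radius, x_pert + radius + 1)
        thresholds = {t for t in thresholds
                      if all(int(z >= t) == y for z in candidates)} or thresholds
    return mistakes

print("mistakes:", run_robust_online([(0, 0), (4, 1), (-3, 0), (5, 1)]))
```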
arXiv Detail & Related papers (2026-02-06T15:30:52Z)
- Proper Learnability and the Role of Unlabeled Data [10.168670899305232]
We show that there are problems whose proper learnability is logically undecidable, i.e., independent of the ZFC axioms. We then present impossibility results that obstruct any characterization of proper learnability in the realizable PAC model.
arXiv Detail & Related papers (2025-02-14T18:41:53Z)
- Probably Approximately Precision and Recall Learning [60.00180898830079]
A key challenge in machine learning is the prevalence of one-sided feedback. We introduce a Probably Approximately Correct (PAC) framework in which hypotheses are set functions that map each input to a set of labels. We develop new algorithms that learn from positive data alone, achieving optimal sample complexity in the realizable case.
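The set-function viewpoint is easy to state in code. The snippet below scores a toy set-valued hypothesis by empirical precision and recall; the data and the hypothesis are invented for illustration and are not the paper's construction.

```python
"""Minimal sketch of the set-function view: a hypothesis maps each input
to a SET of labels, scored by precision and recall against the true label
set. The toy data and hypothesis are illustrative assumptions.
"""
def precision_recall(hypothesis, data):
    tp = n_pred = n_true = 0
    for x, true_labels in data:
        predicted = hypothesis(x)
        tp += len(predicted & true_labels)   # labels both predicted and true
        n_pred += len(predicted)
        n_true += len(true_labels)
    precision = tp / n_pred if n_pred else 1.0
    recall = tp / n_true if n_true else 1.0
    return precision, recall

# Toy set-valued hypothesis: even inputs get {0, 2}, odd inputs get {1}.
h = lambda x: {0, 2} if x % 2 == 0 else {1}
data = [(2, {0}), (3, {1, 3}), (4, {0, 2})]
p, r = precision_recall(h, data)
print(f"precision={p:.2f} recall={r:.2f}")
```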
arXiv Detail & Related papers (2024-11-20T04:21:07Z)
- Model-based RL as a Minimalist Approach to Horizon-Free and Second-Order Bounds [59.875550175217874]
We show that a simple Model-based Reinforcement Learning scheme achieves strong regret and sample bounds in online and offline RL settings.
We highlight that our algorithms are simple, fairly standard, and indeed have been extensively studied in the RL literature.
arXiv Detail & Related papers (2024-08-16T19:52:53Z)
- Shrinking Class Space for Enhanced Certainty in Semi-Supervised Learning [59.44422468242455]
We propose a novel method dubbed ShrinkMatch to learn from uncertain samples.
For each uncertain sample, it adaptively seeks a shrunk class space that still contains the original top-1 class.
We then impose a consistency regularization between a pair of strongly and weakly augmented samples in the shrunk space to strive for discriminative representations.
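One plausible rendering of this idea is sketched below under assumptions (the confidence threshold, toy logits, and the exact shrinking rule are not taken from the authors' code): confusion classes are dropped until the renormalized top-1 confidence passes a threshold, after which a cross-entropy consistency term couples the weakly and strongly augmented views over the surviving classes.

```python
"""Hedged sketch (not the authors' code) of a shrunk-class-space update:
drop high-probability confusion classes other than the top-1 until the
renormalized top-1 confidence passes a threshold, then apply consistency
between weak/strong augmentations restricted to the surviving classes.
"""
import torch
import torch.nn.functional as F

def shrunk_space(p_weak, tau=0.95):
    """Return the class indices kept for the consistency loss."""
    order = torch.argsort(p_weak, descending=True)
    top1 = order[0]
    kept = order.tolist()
    # Remove the strongest remaining confusion class (rank 1) until the
    # renormalized top-1 confidence reaches the threshold tau (assumed).
    while len(kept) > 1 and p_weak[top1] / p_weak[kept].sum() < tau:
        kept.pop(1)
    return torch.tensor(kept)

def shrink_consistency_loss(logits_weak, logits_strong, tau=0.95):
    p_weak = F.softmax(logits_weak, dim=-1)
    kept = shrunk_space(p_weak, tau)
    # Renormalize both branches over the shrunk space; the weak branch is
    # the (detached) target for the strong branch.
    target = F.softmax(logits_weak[kept], dim=-1).detach()
    log_q = F.log_softmax(logits_strong[kept], dim=-1)
    return -(target * log_q).sum()

logits_weak = torch.tensor([2.0, 1.8, 0.2, -1.0])  # classes 0 and 1 compete
logits_strong = torch.tensor([1.5, 0.5, 0.1, -0.5])
print(shrink_consistency_loss(logits_weak, logits_strong))
```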
arXiv Detail & Related papers (2023-08-13T14:05:24Z)
- A Combinatorial Characterization of Supervised Online Learnability [20.291598040396302]
We study the online learnability of hypothesis classes with respect to arbitrary but bounded loss functions.
We give a new scale-sensitive dimension, named the sequential minimax dimension, and show that it gives a tight quantitative characterization of online learnability.
arXiv Detail & Related papers (2023-07-07T20:11:07Z)
- Online Learning with Set-Valued Feedback [18.054632903107546]
We study a variant of online multiclass classification where the learner predicts a single label but receives a set of labels as feedback.
We show that, unlike online multiclass learning with single-label feedback, deterministic and randomized online learnability are not equivalent even in the realizable setting.
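For contrast with the partial-feedback protocol above, here is a minimal sketch of this set-valued feedback loop: the learner commits to one label and then observes the whole set of correct labels, so classical version-space pruning applies. The toy hypotheses and the randomized prediction rule are assumptions for exposition.

```python
"""Illustrative sketch of online learning with set-valued feedback: the
learner predicts a single label and observes the FULL set of correct
labels. The toy hypotheses and randomized rule are assumptions.
"""
import random

rng = random.Random(1)
HYPS = [lambda x: x % 2, lambda x: (x + 1) % 2, lambda x: 0]

def round_of_play(x, feedback_set, version):
    # Randomized prediction: the abstract shows deterministic and
    # randomized learners are NOT equivalent in this setting.
    y_hat = rng.choice([h(x) for h in version])
    loss = 0 if y_hat in feedback_set else 1
    # Unlike partial feedback, the full set is revealed, so the classical
    # version-space update is sound: keep hypotheses whose label is correct.
    version = [h for h in version if h(x) in feedback_set] or version
    return loss, version

version = list(HYPS)
total = 0
for x, s in [(1, {1}), (2, {0}), (3, {0, 1})]:
    loss, version = round_of_play(x, s, version)
    total += loss
print("total loss:", total)
```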
arXiv Detail & Related papers (2023-06-09T20:43:19Z)
- Open-Set Likelihood Maximization for Few-Shot Learning [36.97433312193586]
We tackle the Few-Shot Open-Set Recognition (FSOSR) problem, i.e. classifying instances among a set of classes for which we only have a few labeled samples.
We explore the popular transductive setting, which leverages the unlabelled query instances at inference.
Motivated by the observation that existing transductive methods perform poorly in open-set scenarios, we propose a generalization of the maximum likelihood principle.
arXiv Detail & Related papers (2023-01-20T01:56:19Z)
- Online Selective Classification with Limited Feedback [82.68009460301585]
We study selective classification in the online learning model, wherein a predictor may abstain from classifying an instance.
A salient aspect of the setting we consider is that the data may be non-realisable, so abstention may be a valid long-term action.
We construct simple versioning-based schemes for any $\mu \in (0,1]$ that make at most $T^{\mu}$ mistakes while incurring $\tilde{O}(T^{1-\mu})$ excess abstention against adaptive adversaries.
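Below is a minimal sketch of one plausible reading of this setting, in which the true label is revealed only on rounds where the learner abstains; the threshold class and stream are toy assumptions, not the paper's scheme or its $T^{\mu}$ versus $\tilde{O}(T^{1-\mu})$ trade-off.

```python
"""Hedged sketch of an abstain-or-predict loop: a versioning-style rule
predicts only when all surviving hypotheses agree and abstains otherwise.
Feedback-on-abstention, the threshold class, and the stream are toy
assumptions, not the paper's scheme.
"""
def selective_online(stream):
    thresholds = list(range(0, 11))     # toy class: h_k(x) = 1 iff x >= k
    mistakes = abstentions = 0
    for x, y in stream:
        preds = {int(x >= k) for k in thresholds}
        if len(preds) > 1:
            # Hypotheses disagree: abstain. In this reading the true label
            # is observed only on abstention rounds, and it prunes the
            # version space.
            abstentions += 1
            thresholds = [k for k in thresholds
                          if int(x >= k) == y] or thresholds
        else:
            # All survivors agree: commit, and receive no feedback.
            if preds.pop() != y:
                mistakes += 1           # counted by the environment only
    return mistakes, abstentions

print(selective_online([(3, 0), (8, 1), (5, 1), (4, 0)]))
```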
arXiv Detail & Related papers (2021-10-27T08:00:53Z)
- A Low Rank Promoting Prior for Unsupervised Contrastive Learning [108.91406719395417]
We construct a novel probabilistic graphical model that effectively incorporates the low rank promoting prior into the framework of contrastive learning.
Our hypothesis explicitly requires that all samples belonging to the same instance class lie on the same low-dimensional subspace.
Empirical evidence shows that the proposed algorithm clearly surpasses the state-of-the-art approaches on multiple benchmarks.
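As an indicative stand-in for the low-rank promoting idea (the paper's prior is a probabilistic graphical model; the nuclear norm used below is a standard convex low-rank surrogate, not the authors' objective), one can penalize the feature matrix of augmented views of the same instance:

```python
"""Indicative rendering of the low-rank idea: stack features of augmented
views of the SAME instance into a matrix and penalize its nuclear norm,
a standard convex surrogate for rank. This is a hedged stand-in, not the
paper's probabilistic graphical model prior.
"""
import torch

def low_rank_penalty(views):
    """views: (n_views, d) features of augmentations of one instance."""
    # Nuclear norm = sum of singular values; a small value indicates the
    # rows lie close to a common low-dimensional subspace.
    return torch.linalg.matrix_norm(views, ord="nuc")

torch.manual_seed(0)
base = torch.randn(1, 16)
same_subspace = base.repeat(4, 1) + 0.01 * torch.randn(4, 16)  # nearly rank-1
random_views = torch.randn(4, 16)                              # full rank
print(low_rank_penalty(same_subspace))  # small: views share a subspace
print(low_rank_penalty(random_views))   # larger: no shared structure
```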
arXiv Detail & Related papers (2021-08-05T15:58:25Z)