Universal Rates for Multiclass Learning
- URL: http://arxiv.org/abs/2307.02066v1
- Date: Wed, 5 Jul 2023 07:12:58 GMT
- Title: Universal Rates for Multiclass Learning
- Authors: Steve Hanneke, Shay Moran, Qian Zhang
- Abstract summary: We study universal rates for multiclass classification, establishing the optimal rates (up to log factors) for all hypothesis classes.
We also resolve an open question regarding the equivalence of classes having infinite Graph-Littlestone (GL) trees versus infinite Natarajan-Littlestone (NL) trees.
- Score: 28.18556410009232
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We study universal rates for multiclass classification, establishing the
optimal rates (up to log factors) for all hypothesis classes. This generalizes
previous results on binary classification (Bousquet, Hanneke, Moran, van
Handel, and Yehudayoff, 2021), and resolves an open question studied by
Kalavasis, Velegkas, and Karbasi (2022) who handled the multiclass setting with
a bounded number of class labels. In contrast, our result applies for any
countable label space. Even for finite label space, our proofs provide a more
precise bounds on the learning curves, as they do not depend on the number of
labels. Specifically, we show that any class admits exponential rates if and
only if it has no infinite Littlestone tree, and admits (near-)linear rates if
and only if it has no infinite Daniely-Shalev-Shwartz-Littleston (DSL) tree,
and otherwise requires arbitrarily slow rates. DSL trees are a new structure we
define in this work, in which each node of the tree is given by a pseudo-cube
of possible classifications of a given set of points. Pseudo-cubes are a
structure, rooted in the work of Daniely and Shalev-Shwartz (2014), and
recently shown by Brukhim, Carmon, Dinur, Moran, and Yehudayoff (2022) to
characterize PAC learnability (i.e., uniform rates) for multiclass
classification. We also resolve an open question of Kalavasis, Velegkas, and
Karbasi (2022) regarding the equivalence of classes having infinite
Graph-Littlestone (GL) trees versus infinite Natarajan-Littlestone (NL) trees,
showing that they are indeed equivalent.
Related papers
- Harnessing Superclasses for Learning from Hierarchical Databases [1.835004446596942]
In many large-scale classification problems, classes are organized in a known hierarchy, typically represented as a tree.
We introduce a loss for this type of supervised hierarchical classification.
Our approach does not entail any significant additional computational cost compared with the loss of cross-entropy.
arXiv Detail & Related papers (2024-11-25T14:39:52Z) - Ramsey Theorems for Trees and a General 'Private Learning Implies Online Learning' Theorem [26.292576184028924]
This work continues to investigate the link between differentially private (DP) and online learning.
We show that for general classification tasks, DP learnability implies online learnability.
arXiv Detail & Related papers (2024-07-10T15:43:30Z) - The Real Price of Bandit Information in Multiclass Classification [73.17969992976501]
We revisit the classical problem of multiclass classification with bandit feedback.
We present a new bandit classification algorithm that guarantees regret $smashwidetildeO(|H|+sqrtT)$.
arXiv Detail & Related papers (2024-05-16T12:11:09Z) - Contrastive Meta-Learning for Few-shot Node Classification [54.36506013228169]
Few-shot node classification aims to predict labels for nodes on graphs with only limited labeled nodes as references.
We create a novel contrastive meta-learning framework on graphs, named COSMIC, with two key designs.
arXiv Detail & Related papers (2023-06-27T02:22:45Z) - GraphSHA: Synthesizing Harder Samples for Class-Imbalanced Node
Classification [64.85392028383164]
Class imbalance is the phenomenon that some classes have much fewer instances than others.
Recent studies find that off-the-shelf Graph Neural Networks (GNNs) would under-represent minor class samples.
We propose a general framework GraphSHA by Synthesizing HArder minor samples.
arXiv Detail & Related papers (2023-06-16T04:05:58Z) - Multiclass Online Learning and Uniform Convergence [34.21248304961989]
We study multiclass classification in the adversarial online learning setting.
We prove that any multiclass concept class is agnostically learnable if and only if its Littlestone dimension is finite.
arXiv Detail & Related papers (2023-03-30T21:35:48Z) - Multiclass Learnability Beyond the PAC Framework: Universal Rates and
Partial Concept Classes [31.2676304636432]
We study the problem of multiclass classification with a bounded number of different labels $k$, in the realizable setting.
We extend the traditional PAC model to a) distribution-dependent learning rates, and b) learning rates under data-dependent assumptions.
arXiv Detail & Related papers (2022-10-05T14:36:27Z) - A Characterization of Multiclass Learnability [18.38631912121182]
We characterize multiclass PAC learnability through the DS dimension, a dimension defined by Daniely and Shalev-Shwartz 2014.
In the list learning setting, instead of predicting a single outcome for a given unseen input, the goal is to provide a short menu of predictions.
Our second main result concerns the Natarajan dimension, which has been a central candidate for characterizing multiclass learnability.
arXiv Detail & Related papers (2022-03-03T07:41:54Z) - Monotone Learning [86.77705135626186]
We show that every learning algorithm A can be transformed to a monotone class with similar performance.
This demonstrates that one can provably avoid non-monotonic behaviour without compromising performance.
arXiv Detail & Related papers (2022-02-10T18:51:57Z) - Sample-efficient proper PAC learning with approximate differential
privacy [51.09425023771246]
We prove that the sample complexity of properly learning a class of Littlestone dimension $d$ with approximate differential privacy is $tilde O(d6)$, ignoring privacy and accuracy parameters.
This result answers a question of Bun et al. (FOCS 2020) by improving upon their upper bound of $2O(d)$ on the sample complexity.
arXiv Detail & Related papers (2020-12-07T18:17:46Z) - Forest R-CNN: Large-Vocabulary Long-Tailed Object Detection and Instance
Segmentation [75.93960390191262]
We exploit prior knowledge of the relations among object categories to cluster fine-grained classes into coarser parent classes.
We propose a simple yet effective resampling method, NMS Resampling, to re-balance the data distribution.
Our method, termed as Forest R-CNN, can serve as a plug-and-play module being applied to most object recognition models.
arXiv Detail & Related papers (2020-08-13T03:52:37Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.