Universal Rates for Multiclass Learning
- URL: http://arxiv.org/abs/2307.02066v1
- Date: Wed, 5 Jul 2023 07:12:58 GMT
- Title: Universal Rates for Multiclass Learning
- Authors: Steve Hanneke, Shay Moran, Qian Zhang
- Abstract summary: We study universal rates for multiclass classification, establishing the optimal rates (up to log factors) for all hypothesis classes.
We also resolve an open question regarding the equivalence of classes having infinite Graph-Littlestone (GL) trees versus infinite Natarajan-Littlestone (NL) trees.
- Score: 28.18556410009232
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We study universal rates for multiclass classification, establishing the
optimal rates (up to log factors) for all hypothesis classes. This generalizes
previous results on binary classification (Bousquet, Hanneke, Moran, van
Handel, and Yehudayoff, 2021), and resolves an open question studied by
Kalavasis, Velegkas, and Karbasi (2022), who handled the multiclass setting
with a bounded number of class labels. In contrast, our result applies to any
countable label space. Even for a finite label space, our proofs provide more
precise bounds on the learning curves, as they do not depend on the number of
labels. Specifically, we show that any class admits exponential rates if and
only if it has no infinite Littlestone tree, admits (near-)linear rates if and
only if it has no infinite Daniely-Shalev-Shwartz-Littlestone (DSL) tree, and
otherwise requires arbitrarily slow rates. DSL trees are a new structure we
define in this work, in which each node of the tree is given by a pseudo-cube
of possible classifications of a given set of points. Pseudo-cubes are a
structure rooted in the work of Daniely and Shalev-Shwartz (2014), and were
recently shown by Brukhim, Carmon, Dinur, Moran, and Yehudayoff (2022) to
characterize PAC learnability (i.e., uniform rates) for multiclass
classification. We also resolve an open question of Kalavasis, Velegkas, and
Karbasi (2022) regarding the equivalence of classes having infinite
Graph-Littlestone (GL) trees versus infinite Natarajan-Littlestone (NL) trees,
showing that they are indeed equivalent.
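To make the rate statements concrete, here is an informal restatement of the trichotomy in the universal-rates framework of Bousquet, Hanneke, Moran, van Handel, and Yehudayoff (2021); the constants and the exact log factor below are a paraphrase of "exponential" and "(near-)linear", not the paper's precise statement:

```latex
% Informal sketch: \hat{h}_n is the learner's output after n i.i.d. examples
% from a distribution P realizable by the class H; C_P, c_P depend on P only.
\[
\mathbb{E}\big[\mathrm{er}_P(\hat{h}_n)\big] \;\le\;
\begin{cases}
  C_P \, e^{-c_P n} & \text{if } H \text{ has no infinite Littlestone tree,}\\[4pt]
  C_P \, \dfrac{\log n}{n} & \text{if } H \text{ has no infinite DSL tree,}
\end{cases}
\]
% and if H has an infinite DSL tree, then for every rate R(n) -> 0 there is a
% realizable P on which every learner eventually converges more slowly than R.
```

The pseudo-cubes labeling the nodes of a DSL tree can also be made concrete. The following is a minimal Python sketch, assuming the definition from Brukhim, Carmon, Dinur, Moran, and Yehudayoff (2022): a finite, non-empty set B of label tuples of length d is a pseudo-cube if every b in B has, for each coordinate i, an "i-neighbor" in B that differs from b in coordinate i and agrees with b everywhere else. The function name is illustrative, not from the paper:

```python
def is_pseudo_cube(B):
    """Check the pseudo-cube property for a finite set B of equal-length
    label tuples: every b in B needs, for every coordinate i, an
    'i-neighbor' in B (a tuple differing from b exactly in coordinate i)."""
    B = set(B)
    if not B:
        return False
    d = len(next(iter(B)))
    return all(
        any(c[i] != b[i] and all(c[j] == b[j] for j in range(d) if j != i)
            for c in B)
        for b in B
        for i in range(d)
    )

# The Boolean cube {0,1}^2 is a pseudo-cube; deleting a vertex breaks it.
print(is_pseudo_cube({(0, 0), (0, 1), (1, 0), (1, 1)}))  # True
print(is_pseudo_cube({(0, 0), (0, 1), (1, 0)}))          # False
```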
Related papers
- Ramsey Theorems for Trees and a General 'Private Learning Implies Online Learning' Theorem [26.292576184028924]
This work continues to investigate the link between differentially private (DP) learning and online learning.
We show that for general classification tasks, DP learnability implies online learnability.
arXiv Detail & Related papers (2024-07-10T15:43:30Z)
- The Real Price of Bandit Information in Multiclass Classification [73.17969992976501]
We revisit the classical problem of multiclass classification with bandit feedback.
We present a new bandit classification algorithm that guarantees regret $\widetilde{O}(|H| + \sqrt{T})$.
arXiv Detail & Related papers (2024-05-16T12:11:09Z)
- Contrastive Meta-Learning for Few-shot Node Classification [54.36506013228169]
Few-shot node classification aims to predict labels for nodes on graphs given only a limited number of labeled nodes as references.
We create a novel contrastive meta-learning framework on graphs, named COSMIC, with two key designs.
arXiv Detail & Related papers (2023-06-27T02:22:45Z)
- GraphSHA: Synthesizing Harder Samples for Class-Imbalanced Node Classification [64.85392028383164]
Class imbalance is the phenomenon that some classes have far fewer instances than others.
Recent studies find that off-the-shelf Graph Neural Networks (GNNs) would under-represent minor class samples.
We propose a general framework GraphSHA by Synthesizing HArder minor samples.
arXiv Detail & Related papers (2023-06-16T04:05:58Z)
- Multiclass Online Learning and Uniform Convergence [34.21248304961989]
We study multiclass classification in the adversarial online learning setting.
We prove that any multiclass concept class is agnostically learnable if and only if its Littlestone dimension is finite.
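Since a finite Littlestone dimension is the deciding quantity here (and infinite Littlestone trees mark the exponential-rate boundary in the main paper above), a brute-force sketch of the dimension for a small finite class may help. The recursion follows the standard mistake-tree definition; encoding hypotheses as label tuples over a fixed finite domain is an assumption of this sketch, not the paper's formulation:

```python
from itertools import combinations

def ldim(H):
    """Littlestone dimension of a finite class H, encoded as a set of
    tuples where h[i] is the label h assigns to the i-th domain point.
    Ldim(H) >= d + 1 iff some point i and distinct labels y1 != y2 leave
    subclasses of Ldim >= d on both branches. Exponential time; for
    intuition only."""
    H = set(H)
    if not H:
        return -1  # conventional value for the empty class
    if len(H) == 1:
        return 0   # a single hypothesis never forces a mistake
    n = len(next(iter(H)))
    return max(
        (1 + min(ldim({h for h in H if h[i] == y1}),
                 ldim({h for h in H if h[i] == y2}))
         for i in range(n)
         for y1, y2 in combinations({h[i] for h in H}, 2)),
        default=0,
    )

# All binary labelings of two points: Littlestone dimension 2.
print(ldim({(0, 0), (0, 1), (1, 0), (1, 1)}))  # 2
```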
arXiv Detail & Related papers (2023-03-30T21:35:48Z)
- Multiclass Learnability Beyond the PAC Framework: Universal Rates and Partial Concept Classes [31.2676304636432]
We study the problem of multiclass classification with a bounded number of different labels $k$, in the realizable setting.
We extend the traditional PAC model to a) distribution-dependent learning rates, and b) learning rates under data-dependent assumptions.
arXiv Detail & Related papers (2022-10-05T14:36:27Z)
- A Characterization of Multiclass Learnability [18.38631912121182]
We characterize multiclass PAC learnability through the DS dimension, a dimension defined by Daniely and Shalev-Shwartz (2014).
In the list learning setting, instead of predicting a single outcome for a given unseen input, the goal is to provide a short menu of predictions.
Our second main result concerns the Natarajan dimension, which has been a central candidate for characterizing multiclass learnability.
arXiv Detail & Related papers (2022-03-03T07:41:54Z)
- Monotone Learning [86.77705135626186]
We show that every learning algorithm A can be transformed into a monotone learning algorithm with similar performance.
This demonstrates that one can provably avoid non-monotonic behaviour without compromising performance.
arXiv Detail & Related papers (2022-02-10T18:51:57Z)
- Sample-efficient proper PAC learning with approximate differential privacy [51.09425023771246]
We prove that the sample complexity of properly learning a class of Littlestone dimension $d$ with approximate differential privacy is $\tilde{O}(d^6)$, ignoring privacy and accuracy parameters.
This result answers a question of Bun et al. (FOCS 2020) by improving upon their upper bound of $2^{O(d)}$ on the sample complexity.
arXiv Detail & Related papers (2020-12-07T18:17:46Z)
- Forest R-CNN: Large-Vocabulary Long-Tailed Object Detection and Instance Segmentation [75.93960390191262]
We exploit prior knowledge of the relations among object categories to cluster fine-grained classes into coarser parent classes.
We propose a simple yet effective resampling method, NMS Resampling, to re-balance the data distribution.
Our method, termed Forest R-CNN, can serve as a plug-and-play module applied to most object recognition models.
arXiv Detail & Related papers (2020-08-13T03:52:37Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.