Monotone Learning
- URL: http://arxiv.org/abs/2202.05246v3
- Date: Mon, 6 Nov 2023 18:58:28 GMT
- Title: Monotone Learning
- Authors: Olivier Bousquet and Amit Daniely and Haim Kaplan and Yishay Mansour
and Shay Moran and Uri Stemmer
- Abstract summary: We show that every learning algorithm A can be transformed into a monotone algorithm with similar performance.
This demonstrates that one can provably avoid non-monotonic behaviour without compromising performance.
- Score: 86.77705135626186
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: The amount of training data is one of the key factors that determine the
generalization capacity of learning algorithms. Intuitively, one expects the
error rate to decrease as the amount of training data increases. Perhaps
surprisingly, natural attempts to formalize this intuition give rise to
interesting and challenging mathematical questions. For example, in their
classical book on pattern recognition, Devroye, Györfi, and Lugosi (1996) ask
whether there exists a {monotone} Bayes-consistent algorithm. This question
remained open for over 25 years, until recently Pestov (2021) resolved it for
binary classification, using an intricate construction of a monotone
Bayes-consistent algorithm.
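For concreteness, the standard way to formalize monotonicity in this literature (paraphrased, not quoted from the paper) is that the learner's expected error never increases with the sample size:

```latex
% A is monotone for a distribution D if, for every sample size n,
\mathbb{E}_{S \sim D^{n+1}}\!\big[\operatorname{err}_D(A(S))\big]
\;\le\;
\mathbb{E}_{S \sim D^{n}}\!\big[\operatorname{err}_D(A(S))\big],
```

where $\operatorname{err}_D(h)$ denotes the probability that hypothesis $h$ misclassifies a fresh example drawn from $D$.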
We derive a general result in multiclass classification, showing that every
learning algorithm A can be transformed to a monotone one with similar
performance. Further, the transformation is efficient and only uses a black-box
oracle access to A. This demonstrates that one can provably avoid non-monotonic
behaviour without compromising performance, thus answering questions asked by
Devroye et al. (1996), Viering, Mey, and Loog (2019), Viering and Loog (2021),
and by Mhammedi (2021).
Our transformation readily implies monotone learners in a variety of
contexts: for example, it extends Pestov's result to classification tasks with
an arbitrary number of labels. This is in contrast with Pestov's work, which is
tailored to binary classification.
In addition, we provide uniform bounds on the error of the monotone
algorithm. This makes our transformation applicable in distribution-free
settings. For example, in PAC learning it implies that every learnable class
admits a monotone PAC learner. This resolves questions by Viering, Mey, and
Loog (2019); Viering and Loog (2021); Mhammedi (2021).
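To make the black-box flavour of the result concrete, here is a minimal toy sketch (our own illustration, not the paper's actual transformation, which must control the expected error and uses more careful sample splitting): wrap the given learner, train it on growing prefixes of the data, and only replace the current hypothesis when a candidate does at least as well on held-out data.

```python
import numpy as np

def monotone_wrapper(train_algo, X, y, seed=0):
    """Toy monotone wrapper around a black-box learner.

    `train_algo(X, y)` is assumed to return a predictor h with
    h(X) -> predicted labels; X and y are numpy arrays. This only
    guarantees that the *held-out* error of the kept hypothesis is
    non-increasing, a weaker property than the one the paper proves.
    """
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(X))
    half = len(X) // 2
    train_idx, val_idx = idx[:half], idx[half:]
    X_val, y_val = X[val_idx], y[val_idx]

    best_h, best_err = None, np.inf
    n = 1
    while n <= half:
        # Black-box call: train on a growing prefix of the training half.
        h = train_algo(X[train_idx[:n]], y[train_idx[:n]])
        err = np.mean(h(X_val) != y_val)
        if err <= best_err:  # keep the incumbent unless the candidate wins
            best_h, best_err = h, err
        n *= 2
    return best_h
```

By construction, the held-out error of the kept hypothesis never increases as more data is consumed; the paper's transformation achieves the analogous guarantee for the true (expected) error, which is the hard part.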
Related papers
- Machine Unlearning in Forgettability Sequence [22.497699136603877]
We identify key factors affecting unlearning difficulty and the performance of unlearning algorithms.
We propose a general unlearning framework, dubbed RSU, which consists of a Ranking module and a SeqUnlearn module.
arXiv Detail & Related papers (2024-10-09T01:12:07Z)
- User-Level Differential Privacy With Few Examples Per User [73.81862394073308]
We consider the example-scarce regime, where each user has only a few examples, and obtain the following results.
For approximate-DP, we give a generic transformation of any item-level DP algorithm to a user-level DP algorithm.
We present a simple technique for adapting the exponential mechanism [McSherry, Talwar FOCS 2007] to the user-level setting.
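For orientation, below is a minimal sketch of the classical item-level exponential mechanism referenced above; the user-level adaptation is the paper's contribution and is not reproduced here.

```python
import numpy as np

def exponential_mechanism(candidates, scores, epsilon, sensitivity, rng=None):
    """Classical exponential mechanism (McSherry-Talwar, FOCS 2007):
    output candidate r with probability proportional to
    exp(epsilon * score(r) / (2 * sensitivity)).
    """
    rng = rng or np.random.default_rng()
    logits = epsilon * np.asarray(scores, dtype=float) / (2.0 * sensitivity)
    logits -= logits.max()            # stabilize before exponentiating
    probs = np.exp(logits)
    probs /= probs.sum()
    return candidates[rng.choice(len(candidates), p=probs)]
```

The mechanism is $\epsilon$-DP whenever changing one record (one user, in the user-level setting) changes each score by at most `sensitivity`.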
arXiv Detail & Related papers (2023-09-21T21:51:55Z)
- A General Framework for Sequential Decision-Making under Adaptivity Constraints [112.79808969568253]
We study general sequential decision-making under two adaptivity constraints: rare policy switch and batch learning.
For the rare policy switch constraint, we provide an algorithm that achieves a $\widetilde{\mathcal{O}}(\sqrt{K}+K/B)$ regret, where $B$ is the number of batches.
For the batch learning constraint, we provide an algorithm that achieves the same $\widetilde{\mathcal{O}}(\sqrt{K}+K/B)$ regret.
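As a quick sanity check on the bound above (reading it as $\sqrt{K}+K/B$ with $K$ the number of episodes, the standard setup in this line of work), the two terms balance once $B$ reaches roughly $\sqrt{K}$:

```latex
\widetilde{\mathcal{O}}\!\left(\sqrt{K} + \frac{K}{B}\right)
= \widetilde{\mathcal{O}}\!\left(\sqrt{K}\right)
\qquad \text{whenever } B \ge \sqrt{K},
```

so on the order of $\sqrt{K}$ batches already suffice to recover the unconstrained $\sqrt{K}$ regret rate.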
arXiv Detail & Related papers (2023-06-26T07:20:25Z)
- Byzantine-Resilient Learning Beyond Gradients: Distributing Evolutionary Search [6.461473289206789]
We show that gradient-free ML algorithms can be combined with classical distributed consensus algorithms to generate gradient-free Byzantine-resilient distributed learning algorithms.
We provide proofs and pseudo-code for two specific cases: Total Order Broadcast and proof-of-work leader election.
arXiv Detail & Related papers (2023-04-20T17:13:29Z)
- Agnostic Multi-Robust Learning Using ERM [19.313739782029185]
A fundamental problem in robust learning is asymmetry: a learner must correctly classify every one of the exponentially many perturbations an adversary might apply to a test-time natural example.
In contrast, the attacker only needs to find one successful perturbation.
We introduce a novel multi-group setting and a corresponding multi-robust learning problem.
arXiv Detail & Related papers (2023-03-15T21:30:14Z)
- Variance-Dependent Regret Bounds for Linear Bandits and Reinforcement Learning: Adaptivity and Computational Efficiency [90.40062452292091]
We present the first computationally efficient algorithm for linear bandits with heteroscedastic noise.
Our algorithm is adaptive to the unknown variance of noise and achieves an $\tilde{O}(d\sqrt{\sum_{k=1}^{K}\sigma_k^2} + d)$ regret.
We also propose a variance-adaptive algorithm for linear mixture Markov decision processes (MDPs) in reinforcement learning.
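To unpack the variance dependence (reading $\sigma_k$ as the noise level of round $k$, the usual convention for heteroscedastic bandits), two special cases of the bound above:

```latex
\tilde{O}\Big(d\sqrt{\textstyle\sum_{k=1}^{K}\sigma_k^2} + d\Big)
= \begin{cases}
\tilde{O}\big(d\sigma\sqrt{K} + d\big) & \text{if } \sigma_k \equiv \sigma,\\[2pt]
\tilde{O}(d) & \text{if } \sigma_k \equiv 0,
\end{cases}
```

so the regret interpolates between the familiar $d\sqrt{K}$-type rate and a bound independent of $K$ when rewards are near-deterministic.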
arXiv Detail & Related papers (2023-02-21T00:17:24Z)
- Iterative Weak Learnability and Multi-Class AdaBoost [0.0]
We construct an efficient ensemble algorithm for the multi-class classification problem inspired by SAMME.
In contrast to SAMME, our algorithm's final hypothesis converges to the correct label with probability 1.
The generalization error of our algorithm is bounded by the sum of the training error and an additional term that depends only on the sample size, as for the Adaptive Boosting algorithm.
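For orientation, below is a minimal sketch of classical SAMME itself (multi-class AdaBoost); the paper's algorithm is inspired by SAMME but is not reproduced here.

```python
import numpy as np

def samme(X, y, n_classes, weak_fit, n_rounds=50):
    """Classical SAMME: multi-class AdaBoost.

    `weak_fit(X, y, w)` is assumed to return a predictor h with
    h(X) -> labels in {0, ..., n_classes - 1}.
    """
    n = len(X)
    w = np.full(n, 1.0 / n)               # uniform example weights
    hyps, alphas = [], []
    for _ in range(n_rounds):
        h = weak_fit(X, y, w)
        miss = h(X) != y
        err = float(np.clip(w @ miss, 1e-12, 1 - 1e-12))
        if err >= 1.0 - 1.0 / n_classes:  # weaker than random guessing: stop
            break
        # SAMME's extra log(K-1) term generalizes AdaBoost to K classes.
        alpha = np.log((1 - err) / err) + np.log(n_classes - 1)
        w = w * np.exp(alpha * miss)      # upweight misclassified examples
        w /= w.sum()
        hyps.append(h)
        alphas.append(alpha)

    def predict(X_new):
        votes = np.zeros((len(X_new), n_classes))
        for h, a in zip(hyps, alphas):
            votes[np.arange(len(X_new)), h(X_new)] += a
        return votes.argmax(axis=1)

    return predict
```

The final prediction is a weighted plurality vote; SAMME's requirement that each weak learner beat random guessing ($\mathrm{err} < 1 - 1/K$) is exactly the multi-class weak-learnability condition the summary alludes to.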
arXiv Detail & Related papers (2021-01-26T03:30:30Z)
- Learning outside the Black-Box: The pursuit of interpretable models [78.32475359554395]
This paper proposes an algorithm that produces a continuous global interpretation of any given continuous black-box function.
Our interpretation represents a leap forward from the previous state of the art.
arXiv Detail & Related papers (2020-11-17T12:39:44Z)
- Prob2Vec: Mathematical Semantic Embedding for Problem Retrieval in Adaptive Tutoring [4.230510356675453]
It is difficult for humans to determine a similarity score that is consistent across a large enough training set.
We propose a hierarchical problem embedding algorithm, called Prob2Vec, that consists of abstraction and embedding steps.
Prob2Vec achieves 96.88% accuracy on a problem similarity test, in contrast to 75% from directly applying state-of-the-art sentence embedding methods.
arXiv Detail & Related papers (2020-03-21T00:16:14Z)
- Adversarial Online Learning with Changing Action Sets: Efficient Algorithms with Approximate Regret Bounds [48.312484940846]
We revisit the problem of online learning with sleeping experts/bandits.
In each time step, only a subset of the actions are available for the algorithm to choose from.
We give an algorithm that provides a no-approximate-regret guarantee for the general sleeping expert/bandit problems.
arXiv Detail & Related papers (2020-03-07T02:13:21Z)