Monotone Learning
- URL: http://arxiv.org/abs/2202.05246v3
- Date: Mon, 6 Nov 2023 18:58:28 GMT
- Title: Monotone Learning
- Authors: Olivier Bousquet and Amit Daniely and Haim Kaplan and Yishay Mansour
and Shay Moran and Uri Stemmer
- Abstract summary: We show that every learning algorithm A can be transformed into a monotone algorithm with similar performance.
This demonstrates that one can provably avoid non-monotonic behaviour without compromising performance.
- Score: 86.77705135626186
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: The amount of training data is one of the key factors that determine the
generalization capacity of learning algorithms. Intuitively, one expects the
error rate to decrease as the amount of training data increases. Perhaps
surprisingly, natural attempts to formalize this intuition give rise to
interesting and challenging mathematical questions. For example, in their
classical book on pattern recognition, Devroye, Györfi, and Lugosi (1996) ask
whether there exists a {monotone} Bayes-consistent algorithm. This question
remained open for over 25 years, until recently Pestov (2021) resolved it for
binary classification, using an intricate construction of a monotone
Bayes-consistent algorithm.
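For concreteness, the standard way to formalize monotonicity in this literature (paraphrased, not quoted from the paper) is that the learner's expected error never increases with the sample size:

```latex
% A is monotone for a distribution D if, for every sample size n,
\mathbb{E}_{S \sim D^{n+1}}\!\big[\operatorname{err}_D(A(S))\big]
\;\le\;
\mathbb{E}_{S \sim D^{n}}\!\big[\operatorname{err}_D(A(S))\big],
```

where $\operatorname{err}_D(h)$ denotes the probability that hypothesis $h$ misclassifies a fresh example drawn from $D$.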
We derive a general result in multiclass classification, showing that every
learning algorithm A can be transformed to a monotone one with similar
performance. Further, the transformation is efficient and only uses a black-box
oracle access to A. This demonstrates that one can provably avoid non-monotonic
behaviour without compromising performance, thus answering questions asked by
Devroye et al. (1996), Viering, Mey, and Loog (2019), Viering and Loog (2021),
and by Mhammedi (2021).
Our transformation readily implies monotone learners in a variety of
contexts: for example, it extends Pestov's result to classification tasks with
an arbitrary number of labels. This is in contrast with Pestov's work, which is
tailored to binary classification.
In addition, we provide uniform bounds on the error of the monotone
algorithm. This makes our transformation applicable in distribution-free
settings. For example, in PAC learning it implies that every learnable class
admits a monotone PAC learner. This resolves questions by Viering, Mey, and
Loog (2019); Viering and Loog (2021); Mhammedi (2021).
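To make the black-box flavour of the result concrete, here is a minimal toy sketch (our own illustration, not the paper's actual transformation, which must control the expected error and uses more careful sample splitting): wrap the given learner, train it on growing prefixes of the data, and only replace the current hypothesis when a candidate does at least as well on held-out data.

```python
import numpy as np

def monotone_wrapper(train_algo, X, y, seed=0):
    """Toy monotone wrapper around a black-box learner.

    `train_algo(X, y)` is assumed to return a predictor h with
    h(X) -> predicted labels; X and y are numpy arrays. This only
    guarantees that the *held-out* error of the kept hypothesis is
    non-increasing, a weaker property than the one the paper proves.
    """
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(X))
    half = len(X) // 2
    train_idx, val_idx = idx[:half], idx[half:]
    X_val, y_val = X[val_idx], y[val_idx]

    best_h, best_err = None, np.inf
    n = 1
    while n <= half:
        # Black-box call: train on a growing prefix of the training half.
        h = train_algo(X[train_idx[:n]], y[train_idx[:n]])
        err = np.mean(h(X_val) != y_val)
        if err <= best_err:  # keep the incumbent unless the candidate wins
            best_h, best_err = h, err
        n *= 2
    return best_h
```

By construction, the held-out error of the kept hypothesis never increases as more data is consumed; the paper's transformation achieves the analogous guarantee for the true (expected) error, which is the hard part.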
Related papers
- Machine Unlearning in Forgettability Sequence [22.497699136603877]
We identify key factors affecting unlearning difficulty and the performance of unlearning algorithms.
We propose a general unlearning framework, dubbed RSU, which consists of a Ranking module and a SeqUnlearn module.
arXiv Detail & Related papers (2024-10-09T01:12:07Z)
- User-Level Differential Privacy With Few Examples Per User [73.81862394073308]
We consider the example-scarce regime, where each user has only a few examples, and obtain the following results.
For approximate-DP, we give a generic transformation of any item-level DP algorithm to a user-level DP algorithm.
We present a simple technique for adapting the exponential mechanism [McSherry, Talwar FOCS 2007] to the user-level setting.
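For orientation, below is a minimal sketch of the classical item-level exponential mechanism referenced above; the user-level adaptation is the paper's contribution and is not reproduced here.

```python
import numpy as np

def exponential_mechanism(candidates, scores, epsilon, sensitivity, rng=None):
    """Classical exponential mechanism (McSherry-Talwar, FOCS 2007):
    output candidate r with probability proportional to
    exp(epsilon * score(r) / (2 * sensitivity)).
    """
    rng = rng or np.random.default_rng()
    logits = epsilon * np.asarray(scores, dtype=float) / (2.0 * sensitivity)
    logits -= logits.max()            # stabilize before exponentiating
    probs = np.exp(logits)
    probs /= probs.sum()
    return candidates[rng.choice(len(candidates), p=probs)]
```

The mechanism is $\epsilon$-DP whenever changing one record (one user, in the user-level setting) changes each score by at most `sensitivity`.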
arXiv Detail & Related papers (2023-09-21T21:51:55Z)
- A General Framework for Sequential Decision-Making under Adaptivity Constraints [112.79808969568253]
We study general sequential decision-making under two adaptivity constraints: rare policy switch and batch learning.
For the rare policy switch constraint, we provide an algorithm that achieves a $\widetilde{\mathcal{O}}(\sqrt{K}+K/B)$ regret, where $B$ is the number of batches.
For the batch learning constraint, we provide an algorithm that achieves the same $\widetilde{\mathcal{O}}(\sqrt{K}+K/B)$ regret.
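As a quick sanity check on the bound above (reading it as $\sqrt{K}+K/B$ with $K$ the number of episodes, the standard setup in this line of work), the two terms balance once $B$ reaches roughly $\sqrt{K}$:

```latex
\widetilde{\mathcal{O}}\!\left(\sqrt{K} + \frac{K}{B}\right)
= \widetilde{\mathcal{O}}\!\left(\sqrt{K}\right)
\qquad \text{whenever } B \ge \sqrt{K},
```

so on the order of $\sqrt{K}$ batches already suffice to recover the unconstrained $\sqrt{K}$ regret rate.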
arXiv Detail & Related papers (2023-06-26T07:20:25Z)
- Byzantine-Resilient Learning Beyond Gradients: Distributing Evolutionary Search [6.461473289206789]
We show that gradient-free ML algorithms can be combined with classical distributed consensus algorithms to generate gradient-free Byzantine-resilient distributed learning algorithms.
We provide proofs and pseudo-code for two specific cases: Total Order Broadcast and proof-of-work leader election.
arXiv Detail & Related papers (2023-04-20T17:13:29Z)
- Agnostic Multi-Robust Learning Using ERM [19.313739782029185]
A fundamental problem in robust learning is asymmetry: a learner must correctly classify every one of the exponentially many perturbations an adversary might apply to a test-time natural example.
In contrast, the attacker only needs to find one successful perturbation.
We introduce a novel multi-group setting and a corresponding multi-robust learning problem.
arXiv Detail & Related papers (2023-03-15T21:30:14Z)
- Variance-Dependent Regret Bounds for Linear Bandits and Reinforcement Learning: Adaptivity and Computational Efficiency [90.40062452292091]
We present the first computationally efficient algorithm for linear bandits with heteroscedastic noise.
Our algorithm is adaptive to the unknown variance of noise and achieves an $\tilde{O}(d\sqrt{\sum_{k=1}^{K}\sigma_k^2} + d)$ regret.
We also propose a variance-adaptive algorithm for linear mixture Markov decision processes (MDPs) in reinforcement learning.
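To unpack the variance dependence (reading $\sigma_k$ as the noise level of round $k$, the usual convention for heteroscedastic bandits), two special cases of the bound above:

```latex
\tilde{O}\Big(d\sqrt{\textstyle\sum_{k=1}^{K}\sigma_k^2} + d\Big)
= \begin{cases}
\tilde{O}\big(d\sigma\sqrt{K} + d\big) & \text{if } \sigma_k \equiv \sigma,\\[2pt]
\tilde{O}(d) & \text{if } \sigma_k \equiv 0,
\end{cases}
```

so the regret interpolates between the familiar $d\sqrt{K}$-type rate and a bound independent of $K$ when rewards are near-deterministic.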
arXiv Detail & Related papers (2023-02-21T00:17:24Z)
- Iterative Weak Learnability and Multi-Class AdaBoost [0.0]
We construct an efficient ensemble algorithm for the multi-class classification problem inspired by SAMME.
In contrast to SAMME, our algorithm's final hypothesis converges to the correct label with probability 1.
The generalization error of our algorithm is bounded by the sum of the training error and an additional term that depends only on the sample size, as for the Adaptive Boosting algorithm.
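For orientation, below is a minimal sketch of classical SAMME itself (multi-class AdaBoost); the paper's algorithm is inspired by SAMME but is not reproduced here.

```python
import numpy as np

def samme(X, y, n_classes, weak_fit, n_rounds=50):
    """Classical SAMME: multi-class AdaBoost.

    `weak_fit(X, y, w)` is assumed to return a predictor h with
    h(X) -> labels in {0, ..., n_classes - 1}.
    """
    n = len(X)
    w = np.full(n, 1.0 / n)               # uniform example weights
    hyps, alphas = [], []
    for _ in range(n_rounds):
        h = weak_fit(X, y, w)
        miss = h(X) != y
        err = float(np.clip(w @ miss, 1e-12, 1 - 1e-12))
        if err >= 1.0 - 1.0 / n_classes:  # weaker than random guessing: stop
            break
        # SAMME's extra log(K-1) term generalizes AdaBoost to K classes.
        alpha = np.log((1 - err) / err) + np.log(n_classes - 1)
        w = w * np.exp(alpha * miss)      # upweight misclassified examples
        w /= w.sum()
        hyps.append(h)
        alphas.append(alpha)

    def predict(X_new):
        votes = np.zeros((len(X_new), n_classes))
        for h, a in zip(hyps, alphas):
            votes[np.arange(len(X_new)), h(X_new)] += a
        return votes.argmax(axis=1)

    return predict
```

The final prediction is a weighted plurality vote; SAMME's requirement that each weak learner beat random guessing ($\mathrm{err} < 1 - 1/K$) is exactly the multi-class weak-learnability condition the summary alludes to.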
arXiv Detail & Related papers (2021-01-26T03:30:30Z)
- Learning outside the Black-Box: The pursuit of interpretable models [78.32475359554395]
This paper proposes an algorithm that produces a continuous global interpretation of any given continuous black-box function.
Our interpretation represents a leap forward from the previous state of the art.
arXiv Detail & Related papers (2020-11-17T12:39:44Z)
- Prob2Vec: Mathematical Semantic Embedding for Problem Retrieval in Adaptive Tutoring [4.230510356675453]
It is difficult for humans to determine a similarity score that is consistent across a large enough training set.
We propose a hierarchical problem embedding algorithm, called Prob2Vec, that consists of abstraction and embedding steps.
Prob2Vec achieves 96.88% accuracy on a problem similarity test, in contrast to 75% from directly applying state-of-the-art sentence embedding methods.
arXiv Detail & Related papers (2020-03-21T00:16:14Z)
- Adversarial Online Learning with Changing Action Sets: Efficient Algorithms with Approximate Regret Bounds [48.312484940846]
We revisit the problem of online learning with sleeping experts/bandits.
In each time step, only a subset of the actions are available for the algorithm to choose from.
We give an algorithm that provides a no-approximate-regret guarantee for the general sleeping expert/bandit problems.
arXiv Detail & Related papers (2020-03-07T02:13:21Z)