Boosting Simple Learners
- URL: http://arxiv.org/abs/2001.11704v8
- Date: Fri, 16 Jun 2023 14:06:37 GMT
- Title: Boosting Simple Learners
- Authors: Noga Alon and Alon Gonen and Elad Hazan and Shay Moran
- Abstract summary: We focus on two main questions: (i) Oracle Complexity: How many weak hypotheses are needed to produce an accurate hypothesis? (ii) Expressivity: Which tasks can be learned by boosting weak hypotheses from a bounded VC class?
We design a novel boosting algorithm that circumvents a classical lower bound by Freund and Schapire ('95, '12).
As a corollary, we provide an affirmative answer to the second question for well-studied classes, including half-spaces and decision stumps.
- Score: 45.09968166110557
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Boosting is a celebrated machine learning approach which is based on the idea
of combining weak and moderately inaccurate hypotheses into a strong and accurate
one. We study boosting under the assumption that the weak hypotheses belong to
a class of bounded capacity. This assumption is inspired by the common
convention that weak hypotheses are "rules-of-thumb" from an "easy-to-learn
class" (Schapire and Freund '12, Shalev-Shwartz and Ben-David '14). Formally,
we assume the class of weak hypotheses has a bounded VC dimension. We focus on
two main questions: (i) Oracle Complexity: How many weak hypotheses are needed
to produce an accurate hypothesis? We design a novel boosting algorithm and
demonstrate that it circumvents a classical lower bound by Freund and Schapire
('95, '12). Whereas the lower bound shows that $\Omega(1/\gamma^2)$ weak
hypotheses with $\gamma$-margin are sometimes necessary, our new method
requires only $\tilde{O}(1/\gamma)$ weak hypotheses, provided that they
belong to a class of bounded VC dimension. Unlike previous boosting algorithms
which aggregate the weak hypotheses by majority votes, the new boosting
algorithm uses more complex ("deeper") aggregation rules. We complement this
result by showing that complex aggregation rules are in fact necessary to
circumvent the aforementioned lower bound. (ii) Expressivity: Which tasks can
be learned by boosting weak hypotheses from a bounded VC class? Can complex
concepts that are "far away" from the class be learned? Towards answering the
first question we introduce combinatorial-geometric parameters which capture
expressivity in boosting. As a corollary we provide an affirmative answer to
the second question for well-studied classes, including half-spaces and
decision stumps. Along the way, we establish and exploit connections with
Discrepancy Theory.
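The paper's deeper-aggregation algorithm is not reproduced here; as a point of reference, below is a minimal sketch of classic AdaBoost over decision stumps (a weak class of bounded VC dimension). Its weighted-majority-vote aggregation is precisely the kind of "shallow" rule subject to the $\Omega(1/\gamma^2)$ lower bound that the paper circumvents. All function names and the toy data are illustrative assumptions, not taken from the paper.

```python
# Illustrative sketch only: classic AdaBoost with decision stumps, i.e. the
# majority-vote style of boosting the paper contrasts with its own method.
import numpy as np

def stump_predict(X, feature, threshold, sign):
    # A decision stump: h(x) = sign if x[feature] <= threshold, else -sign.
    # Stumps are an "easy-to-learn" class of small VC dimension.
    return sign * np.where(X[:, feature] <= threshold, 1.0, -1.0)

def best_stump(X, y, w):
    # Weak-learner oracle: exhaustively pick the stump with the smallest
    # weighted error under the current distribution w over examples.
    best, best_err = None, np.inf
    for feature in range(X.shape[1]):
        for threshold in np.unique(X[:, feature]):
            for sign in (1.0, -1.0):
                pred = stump_predict(X, feature, threshold, sign)
                err = w[pred != y].sum()
                if err < best_err:
                    best, best_err = (feature, threshold, sign), err
    return best, best_err

def adaboost(X, y, rounds):
    # Maintain a distribution over examples; each round, call the weak
    # learner and up-weight the examples it misclassified.
    n = len(y)
    w = np.full(n, 1.0 / n)
    ensemble = []
    for _ in range(rounds):
        (f, t, s), err = best_stump(X, y, w)
        err = np.clip(err, 1e-12, 1 - 1e-12)      # guard against 0/1 error
        alpha = 0.5 * np.log((1 - err) / err)      # vote weight of this stump
        pred = stump_predict(X, f, t, s)
        w *= np.exp(-alpha * y * pred)             # reweight toward mistakes
        w /= w.sum()
        ensemble.append((alpha, f, t, s))
    return ensemble

def majority_vote(ensemble, X):
    # The "shallow" aggregation rule: a weighted majority vote of stumps.
    score = sum(a * stump_predict(X, f, t, s) for a, f, t, s in ensemble)
    return np.sign(score)

# Toy usage on a linearly separable 1-D sample (illustrative data).
X = np.array([[0.1], [0.2], [0.8], [0.9]])
y = np.array([1.0, 1.0, -1.0, -1.0])
ensemble = adaboost(X, y, rounds=5)
print(majority_vote(ensemble, X))  # -> [ 1.  1. -1. -1.]
```

Under a $\gamma$-margin assumption, this vote-based scheme may need on the order of $1/\gamma^2$ rounds; the paper's contribution is that replacing the vote with more complex ("deeper") aggregation rules reduces the oracle complexity to $\tilde{O}(1/\gamma)$ when the weak class has bounded VC dimension.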
Related papers
- Optimal Multi-Distribution Learning [88.3008613028333]
Multi-distribution learning seeks to learn a shared model that minimizes the worst-case risk across $k$ distinct data distributions.
We propose a novel algorithm that yields an $\varepsilon$-optimal randomized hypothesis with a sample complexity on the order of $(d+k)/\varepsilon^2$.
arXiv Detail & Related papers (2023-12-08T16:06:29Z)
- Local Borsuk-Ulam, Stability, and Replicability [16.6959756920423]
We show that both list-replicable and globally stable learning are impossible in the PAC setting.
Specifically, we establish optimal bounds for list replicability and global stability numbers in finite classes.
arXiv Detail & Related papers (2023-11-02T21:10:16Z)
- Multiclass Boosting: Simple and Intuitive Weak Learning Criteria [72.71096438538254]
We give a simple and efficient boosting algorithm, that does not require realizability assumptions.
We present a new result on boosting for list learners, as well as provide a novel proof for the characterization of multiclass PAC learning.
arXiv Detail & Related papers (2023-07-02T19:26:58Z)
- Lower Bounds for Learning in Revealing POMDPs [88.23337313766355]
This paper studies the fundamental limits of reinforcement learning (RL) in the challenging partially observable setting.
For multi-step revealing POMDPs, we show that the latent state-space dependence is at least $\Omega(S^{1.5})$ in the sample complexity.
arXiv Detail & Related papers (2023-02-02T18:59:30Z)
- The Optimal Choice of Hypothesis Is the Weakest, Not the Shortest [0.0]
One strategy is to choose the shortest, equating the ability to compress information with the ability to generalise.
We show that compression is neither necessary nor sufficient to maximise performance.
We argue this demonstrates that weakness is a far better proxy, and explains why DeepMind's Apperception Engine is able to generalise effectively.
arXiv Detail & Related papers (2023-01-30T15:29:40Z)
- Comparative Learning: A Sample Complexity Theory for Two Hypothesis Classes [5.194264506657145]
We introduce comparative learning as a combination of the realizable and agnostic settings in PAC learning.
Even when both $S$ and $B$ have infinite VC dimensions, comparative learning can still have a small sample complexity.
We show that the sample complexity of comparative learning is characterized by the mutual VC dimension $\mathsf{VC}(S,B)$.
arXiv Detail & Related papers (2022-11-16T18:38:24Z)
- Chaos is a Ladder: A New Theoretical Understanding of Contrastive Learning via Augmentation Overlap [64.60460828425502]
We propose a new guarantee on the downstream performance of contrastive learning.
Our new theory hinges on the insight that the support of different intra-class samples will become more overlapped under aggressive data augmentations.
We propose an unsupervised model selection metric ARC that aligns well with downstream accuracy.
arXiv Detail & Related papers (2022-03-25T05:36:26Z)
- Sample-efficient proper PAC learning with approximate differential privacy [51.09425023771246]
We prove that the sample complexity of properly learning a class of Littlestone dimension $d$ with approximate differential privacy is $\tilde{O}(d^6)$, ignoring privacy and accuracy parameters.
This result answers a question of Bun et al. (FOCS 2020) by improving upon their upper bound of $2^{O(d)}$ on the sample complexity.
arXiv Detail & Related papers (2020-12-07T18:17:46Z)
This list is automatically generated from the titles and abstracts of the papers on this site.