Sample-Near-Optimal Agnostic Boosting with Improved Running Time
- URL: http://arxiv.org/abs/2601.11265v2
- Date: Mon, 19 Jan 2026 12:56:49 GMT
- Title: Sample-Near-Optimal Agnostic Boosting with Improved Running Time
- Authors: Arthur da Cunha, Mikael Møller Høgsgaard, Andrea Paudice,
- Abstract summary: Boosting is a powerful method that turns weak learners, which perform only slightly better than random guessing, into strong learners with high accuracy. We propose the first agnostic boosting algorithm with near-optimal sample complexity, running in time polynomial in the sample size when the other parameters of the problem are held fixed.
- Score: 10.9284286893389
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Boosting is a powerful method that turns weak learners, which perform only slightly better than random guessing, into strong learners with high accuracy. While boosting is well understood in the classic setting, it is less so in the agnostic case, where no assumptions are made about the data. Indeed, only recently was the sample complexity of agnostic boosting nearly settled (arXiv:2503.09384), but the known algorithm achieving this bound has exponential running time. In this work, we propose the first agnostic boosting algorithm with near-optimal sample complexity, running in time polynomial in the sample size when the other parameters of the problem are held fixed.
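As background for the abstract above, the sketch below shows the classic distribution-reweighting template that boosting builds on: call a weak learner on reweighted data, then aggregate the weak hypotheses by a weighted majority vote. This is an illustrative AdaBoost-style skeleton under assumed details, not the paper's agnostic algorithm, and all function names here are hypothetical.

```python
# Minimal AdaBoost-style weak-to-strong template (illustrative only; the
# paper's agnostic boosting algorithm differs in how it reweights/aggregates).
import numpy as np

def stump_weak_learner(X, y, w):
    """Pick the best one-feature threshold stump under sample weights w (labels in {-1,+1})."""
    best, best_err = None, np.inf
    for j in range(X.shape[1]):
        for thr in np.unique(X[:, j]):
            for sign in (+1, -1):
                pred = sign * np.where(X[:, j] <= thr, 1, -1)
                err = w[pred != y].sum()
                if err < best_err:
                    best_err, best = err, (j, thr, sign)
    j, thr, sign = best
    return (lambda Z: sign * np.where(Z[:, j] <= thr, 1, -1)), best_err

def boost(X, y, rounds=20):
    n = len(y)
    w = np.full(n, 1.0 / n)                      # start from the uniform distribution
    ensemble = []
    for _ in range(rounds):
        h, err = stump_weak_learner(X, y, w)
        err = np.clip(err, 1e-12, 1 - 1e-12)
        alpha = 0.5 * np.log((1 - err) / err)    # more accurate hypotheses get more weight
        w *= np.exp(-alpha * y * h(X))           # up-weight the examples h got wrong
        w /= w.sum()
        ensemble.append((alpha, h))
    return lambda Z: np.sign(sum(a * h(Z) for a, h in ensemble))  # weighted majority vote
```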
Related papers
- Revisiting Agnostic Boosting [15.705418399834278]
We propose a new agnostic boosting algorithm with substantially improved sample complexity compared to prior works, under very general assumptions. Our approach is based on a reduction to the realizable case, followed by a margin-based filtering of high-quality hypotheses (a toy sketch of such filtering appears after this list).
arXiv Detail & Related papers (2025-03-12T13:29:33Z) - Sample-Optimal Agnostic Boosting with Unlabeled Data [19.15484761265653]
Boosting provides a framework for constructing accurate learning algorithms from inaccurate rules of thumb. This paper highlights an unexpected and previously unexplored avenue of improvement: unlabeled samples. We show that the total number of samples needed, unlabeled and labeled inclusive, is never more than that for the best known boosting algorithm.
arXiv Detail & Related papers (2025-03-06T18:54:42Z) - Sample-Efficient Agnostic Boosting [19.15484761265653]
Empirical Risk Minimization (ERM) outstrips the agnostic boosting methodology, being quadratically more sample-efficient than all known boosting algorithms.
A key feature of our algorithm is that it leverages the ability to reuse samples across multiple rounds of boosting, while guaranteeing a generalization error strictly better than that obtained by black-box applications of uniform convergence arguments.
arXiv Detail & Related papers (2024-10-31T04:50:29Z) - The Many Faces of Optimal Weak-to-Strong Learning [10.985323882432086]
We present a new and surprisingly simple Boosting algorithm that obtains a provably optimal sample complexity.
Our pilot empirical study suggests that our new algorithm might outperform previous algorithms on large data sets.
arXiv Detail & Related papers (2024-08-30T09:38:51Z) - Already Moderate Population Sizes Provably Yield Strong Robustness to Noise [53.27802701790209]
We show that two evolutionary algorithms can tolerate constant noise probabilities without increasing the runtime on the OneMax benchmark.
Our results are based on a novel proof argument that the noiseless offspring can be seen as a biased uniform crossover between the parent and the noisy offspring (an illustrative sketch of this crossover appears after this list).
arXiv Detail & Related papers (2024-04-02T16:35:52Z) - AdaBoost is not an Optimal Weak to Strong Learner [13.244225605926065]
We show that the sample complexity of AdaBoost, and of other classic variations thereof, is sub-optimal by at least one logarithmic factor in the desired accuracy of the strong learner.
arXiv Detail & Related papers (2023-01-27T07:37:51Z) - Explicit Second-Order Min-Max Optimization: Practical Algorithms and Complexity Analysis [71.05708939639537]
We propose and analyze several inexact regularized Newton-type methods for finding a global saddle point of convex-concave unconstrained min-max problems. Our method improves on existing line-search-based min-max optimization methods by shaving off an $O(\log\log(1/\epsilon))$ factor in the required number of Schur decompositions.
arXiv Detail & Related papers (2022-10-23T21:24:37Z) - BiAdam: Fast Adaptive Bilevel Optimization Methods [104.96004056928474]
Bilevel optimization has attracted increased interest in machine learning due to its many applications.
We provide a useful analysis framework for both the constrained and unconstrained optimization.
arXiv Detail & Related papers (2021-06-21T20:16:40Z) - Adaptive Sampling for Best Policy Identification in Markov Decision
Processes [79.4957965474334]
We investigate the problem of best-policy identification in discounted Markov Decision Processes (MDPs) when the learner has access to a generative model.
The advantages of state-of-the-art algorithms are discussed and illustrated.
arXiv Detail & Related papers (2020-09-28T15:22:24Z) - Single-Timescale Stochastic Nonconvex-Concave Optimization for Smooth
Nonlinear TD Learning [145.54544979467872]
We propose two single-timescale single-loop algorithms that require only one data point each step.
Our results are expressed in the form of simultaneous primal- and dual-side convergence.
arXiv Detail & Related papers (2020-08-23T20:36:49Z) - Refined bounds for algorithm configuration: The knife-edge of dual class
approximability [94.83809668933021]
We investigate how large a training set should be to ensure that a parameter's average performance over the training set is close to its expected future performance.
We show that if this approximation holds under the L-infinity norm, we can provide strong sample complexity bounds.
We empirically evaluate our bounds in the context of integer programming, one of the most powerful tools in computer science.
arXiv Detail & Related papers (2020-06-21T15:32:21Z) - Learning to Accelerate Heuristic Searching for Large-Scale Maximum
Weighted b-Matching Problems in Online Advertising [51.97494906131859]
Bipartite b-matching is fundamental in algorithm design and has been widely applied in economic markets, labor markets, etc.
Existing exact and approximate algorithms usually fail in such settings, either requiring intolerable running time or excessive computational resources.
We propose NeuSearcher, which leverages knowledge learned from previous instances to solve new problem instances.
arXiv Detail & Related papers (2020-05-09T02:48:23Z)
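As referenced in the "Revisiting Agnostic Boosting" entry above, here is a toy sketch of margin-based filtering: keep only hypotheses whose empirical margin on held-out data clears a threshold. The threshold value and function names are assumptions for illustration, not details from the paper.

```python
# Hypothetical margin-based filtering sketch (assumed details; labels in {-1,+1}).
import numpy as np

def margin_filter(hypotheses, X_val, y_val, theta=0.1):
    """Keep hypotheses h whose average margin E[y * h(x)] on validation data is >= theta."""
    kept = []
    for h in hypotheses:
        margin = np.mean(y_val * h(X_val))   # in [-1, 1]; larger means better agreement
        if margin >= theta:
            kept.append(h)
    return kept
```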
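Similarly, the "Already Moderate Population Sizes Provably Yield Strong Robustness to Noise" entry views a noise-free offspring as a biased uniform crossover between the parent and the noisy offspring. The bit-string sketch below illustrates such an operator; the bias parameter is a hypothetical placeholder, not a value from the paper.

```python
# Illustrative biased uniform crossover on bit strings (bias value is assumed).
import numpy as np

def biased_uniform_crossover(parent, noisy_offspring, p_parent=0.9, rng=None):
    """Take each bit from the parent with probability p_parent, else from the noisy offspring.

    Both arguments are NumPy arrays of equal shape with entries in {0, 1}.
    """
    rng = rng if rng is not None else np.random.default_rng()
    take_parent = rng.random(parent.shape) < p_parent
    return np.where(take_parent, parent, noisy_offspring)
```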