The Sample Complexity of Smooth Boosting and the Tightness of the Hardcore Theorem
- URL: http://arxiv.org/abs/2409.11597v1
- Date: Tue, 17 Sep 2024 23:09:25 GMT
- Title: The Sample Complexity of Smooth Boosting and the Tightness of the Hardcore Theorem
- Authors: Guy Blanc, Alexandre Hayderi, Caleb Koch, Li-Yang Tan
- Abstract summary: Smooth boosters generate distributions that do not place too much weight on any given example.
Originally introduced for their noise-tolerant properties, such boosters have also found applications in differential privacy, reproducibility, and quantum learning theory.
- Score: 53.446980306786095
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Smooth boosters generate distributions that do not place too much weight on any given example. Originally introduced for their noise-tolerant properties, such boosters have also found applications in differential privacy, reproducibility, and quantum learning theory. We study and settle the sample complexity of smooth boosting: we exhibit a class that can be weak learned to $\gamma$-advantage over smooth distributions with $m$ samples, for which strong learning over the uniform distribution requires $\tilde{\Omega}(1/\gamma^2)\cdot m$ samples. This matches the overhead of existing smooth boosters and provides the first separation from the setting of distribution-independent boosting, for which the corresponding overhead is $O(1/\gamma)$. Our work also sheds new light on Impagliazzo's hardcore theorem from complexity theory, all known proofs of which can be cast in the framework of smooth boosting. For a function $f$ that is mildly hard against size-$s$ circuits, the hardcore theorem provides a set of inputs on which $f$ is extremely hard against size-$s'$ circuits. A downside of this important result is the loss in circuit size, i.e. that $s' \ll s$. Answering a question of Trevisan, we show that this size loss is necessary and in fact, the parameters achieved by known proofs are the best possible.
Related papers
- The Cost of Parallelizing Boosting [1.9235143628887907]
We study the cost of parallelizing weak-to-strong boosting algorithms for learning.
We show that even "slight" parallelization of boosting requires an exponential blow-up in the complexity of training.
arXiv Detail & Related papers (2024-02-23T07:03:52Z)
- A Finite Sample Complexity Bound for Distributionally Robust Q-learning [17.96094201655567]
We consider a reinforcement learning setting in which the deployment environment is different from the training environment.
Applying a robust Markov decision processes formulation, we extend the distributionally robust $Q$-learning framework studied in Liu et al.
This is the first sample complexity result for the model-free robust RL problem.
arXiv Detail & Related papers (2023-02-26T01:15:32Z)
- Stochastic Approximation Approaches to Group Distributionally Robust Optimization [96.26317627118912]
This paper investigates group distributionally robust optimization (GDRO).
Online learning techniques reduce the number of samples required in each round from $m$ to $1$, keeping the same sample complexity.
A novel formulation of weighted GDRO, which allows us to derive distribution-dependent convergence rates.
arXiv Detail & Related papers (2023-02-18T09:24:15Z)
- Sample Complexity Bounds for Learning High-dimensional Simplices in Noisy Regimes [5.526935605535376]
We find a sample complexity bound for learning a simplex from noisy samples.
We show that as long as $\mathrm{SNR} \ge \Omega\left(K^{1/2}\right)$, the sample complexity of the noisy regime is of the same order as that of the noiseless case.
arXiv Detail & Related papers (2022-09-09T23:35:25Z)
- Settling the Sample Complexity of Model-Based Offline Reinforcement Learning [50.5790774201146]
Offline reinforcement learning (RL) learns using pre-collected data without further exploration.
Prior algorithms or analyses either suffer from suboptimal sample complexities or incur high burn-in cost to reach sample optimality.
We demonstrate that the model-based (or "plug-in") approach achieves minimax-optimal sample complexity without burn-in cost.
arXiv Detail & Related papers (2022-04-11T17:26:19Z)
- Tight Bounds on the Hardness of Learning Simple Nonparametric Mixtures [9.053430799456587]
We study the problem of learning nonparametric distributions in a finite mixture.
We establish tight bounds on the sample complexity for learning the component distributions in such models.
arXiv Detail & Related papers (2022-03-28T23:53:48Z)
- The Sample Complexity of Robust Covariance Testing [56.98280399449707]
We are given i.i.d. samples from a distribution of the form $Z = (1-\epsilon) X + \epsilon B$, where $X$ is a zero-mean and unknown covariance Gaussian $\mathcal{N}(0, \Sigma)$.
In the absence of contamination, prior work gave a simple tester for this hypothesis testing task that uses $O(d)$ samples.
We prove a sample complexity lower bound of $\Omega(d^2)$ for $\epsilon$ an arbitrarily small constant and $\gamma$ ...
arXiv Detail & Related papers (2020-12-31T18:24:41Z)
- Robustly Learning any Clusterable Mixture of Gaussians [55.41573600814391]
We study the efficient learnability of high-dimensional Gaussian mixtures in the adversarial-robust setting.
We provide an algorithm that learns the components of an $\epsilon$-corrupted $k$-mixture within information-theoretically near-optimal error of $\tilde{O}(\epsilon)$.
Our main technical contribution is a new robust identifiability proof of clusters from a Gaussian mixture, which can be captured by the constant-degree Sum of Squares proof system.
arXiv Detail & Related papers (2020-05-13T16:44:12Z)
- Curse of Dimensionality on Randomized Smoothing for Certifiable Robustness [151.67113334248464]
We show that extending the smoothing technique to defend against other attack models can be challenging.
We present experimental results on CIFAR to validate our theory.
arXiv Detail & Related papers (2020-02-08T22:02:14Z)