Information-Computation Tradeoffs for Learning Margin Halfspaces with
Random Classification Noise
- URL: http://arxiv.org/abs/2306.16352v1
- Date: Wed, 28 Jun 2023 16:33:39 GMT
- Title: Information-Computation Tradeoffs for Learning Margin Halfspaces with
Random Classification Noise
- Authors: Ilias Diakonikolas, Jelena Diakonikolas, Daniel M. Kane, Puqian Wang,
Nikos Zarifis
- Abstract summary: We study the problem of PAC learning $\gamma$-margin halfspaces with Random Classification Noise.
We establish an information-computation tradeoff suggesting an inherent gap between the sample complexity of the problem and the sample complexity of computationally efficient algorithms.
- Score: 50.64137465792738
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We study the problem of PAC learning $\gamma$-margin halfspaces with Random
Classification Noise. We establish an information-computation tradeoff
suggesting an inherent gap between the sample complexity of the problem and the
sample complexity of computationally efficient algorithms. Concretely, the
sample complexity of the problem is $\widetilde{\Theta}(1/(\gamma^2
\epsilon))$. We start by giving a simple efficient algorithm with sample
complexity $\widetilde{O}(1/(\gamma^2 \epsilon^2))$. Our main result is a lower
bound for Statistical Query (SQ) algorithms and low-degree polynomial tests
suggesting that the quadratic dependence on $1/\epsilon$ in the sample
complexity is inherent for computationally efficient algorithms. Specifically,
our results imply a lower bound of $\widetilde{\Omega}(1/(\gamma^{1/2}
\epsilon^2))$ on the sample complexity of any efficient SQ learner or
low-degree test.
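To make the setting concrete, below is a minimal Python sketch: it draws unit-norm examples satisfying a $\gamma$-margin condition with respect to a hidden halfspace, flips each label independently with probability $\eta$ (Random Classification Noise), and runs averaged projected SGD on a logistic surrogate. This is a generic convex-surrogate learner for illustration only, under assumed parameters ($d$, $n$, $\gamma$, $\eta$); it is not the paper's specific algorithm or its sample-complexity analysis.

```python
import numpy as np

rng = np.random.default_rng(0)

# --- Hypothetical data model (illustration only) ---
# Unit-norm target w*, unit-norm points with margin at least gamma,
# labels flipped independently with probability eta < 1/2 (RCN).
d, n, gamma, eta = 20, 5000, 0.1, 0.2
w_star = rng.standard_normal(d)
w_star /= np.linalg.norm(w_star)

X = rng.standard_normal((n, d))
X /= np.linalg.norm(X, axis=1, keepdims=True)
margins = X @ w_star
keep = np.abs(margins) >= gamma          # enforce the gamma-margin condition
X, margins = X[keep], margins[keep]
y_clean = np.sign(margins)
flips = rng.random(len(y_clean)) < eta   # random classification noise
y = y_clean * np.where(flips, -1.0, 1.0)

# --- Averaged projected SGD on the logistic surrogate ---
# A generic convex-surrogate learner, not the paper's exact algorithm.
w = np.zeros(d)
w_avg = np.zeros(d)
for t, (x_i, y_i) in enumerate(zip(X, y), start=1):
    grad = -y_i * x_i / (1.0 + np.exp(y_i * (w @ x_i)))
    w -= (1.0 / np.sqrt(t)) * grad
    w /= max(1.0, np.linalg.norm(w))     # project onto the unit ball
    w_avg += (w - w_avg) / t             # running average of iterates

err = np.mean(np.sign(X @ w_avg) != y_clean)
print(f"misclassification error vs. clean labels: {err:.3f}")
```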
Related papers
- Efficiently Learning One-Hidden-Layer ReLU Networks via Schur
Polynomials [50.90125395570797]
We study the problem of PAC learning a linear combination of $k$ ReLU activations under the standard Gaussian distribution on $\mathbb{R}^d$ with respect to the square loss.
Our main result is an efficient algorithm for this learning task with sample and computational complexity $(dk/\epsilon)^{O(k)}$, where $\epsilon > 0$ is the target accuracy.
arXiv Detail & Related papers (2023-07-24T14:37:22Z)
- Near-Optimal Bounds for Learning Gaussian Halfspaces with Random Classification Noise [50.64137465792738]
We show that any efficient SQ algorithm for the problem requires sample complexity at least $\Omega(d^{1/2}/(\max\{p, \epsilon\})^2)$.
Our lower bound suggests that this quadratic dependence on $1/\epsilon$ is inherent for efficient algorithms.
arXiv Detail & Related papers (2023-07-13T18:59:28Z)
- Statistical-Computational Tradeoffs in Mixed Sparse Linear Regression [20.00109111254507]
We show that the problem suffers from a $\frac{k}{\mathrm{SNR}^2}$-to-$\frac{k^2}{\mathrm{SNR}^2}$ statistical-to-computational gap.
We also analyze a simple thresholding algorithm which, outside of the narrow regime where the problem is hard, solves the associated mixed regression detection problem; a hedged sketch of such a test follows below.
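As one plausible instantiation of such a detection test (not necessarily the paper's exact statistic or threshold), the sketch below uses a coordinate-wise second-moment statistic: under a symmetric mixed model $y = s \langle \beta, x \rangle + \text{noise}$ with Gaussian $x$ and random sign $s$, $\mathbb{E}[y^2 (x_j^2 - 1)] = 2\beta_j^2$, while it is $0$ when $y$ is independent of $x$. All parameters ($d$, $n$, $k$, $\sigma$, $\tau$) are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(1)

# --- Hypothetical symmetric mixed sparse linear regression model ---
# y_i = s_i * <beta, x_i> + noise, with random sign s_i in {-1, +1},
# x_i ~ N(0, I_d), and beta k-sparse. Parameters are illustrative.
d, n, k, sigma = 200, 10000, 5, 1.0
beta = np.zeros(d)
beta[rng.choice(d, size=k, replace=False)] = 1.0 / np.sqrt(k)

def sample(mixed: bool) -> tuple[np.ndarray, np.ndarray]:
    X = rng.standard_normal((n, d))
    if mixed:
        s = rng.choice([-1.0, 1.0], size=n)
        y = s * (X @ beta) + sigma * rng.standard_normal(n)
    else:  # null: y independent of X, variance matched
        y = np.sqrt(beta @ beta + sigma**2) * rng.standard_normal(n)
    return X, y

def threshold_test(X: np.ndarray, y: np.ndarray, tau: float) -> bool:
    # Empirical version of E[y^2 (x_j^2 - 1)], which is 2*beta_j^2 under
    # the mixed model and 0 under the null; flag if any coordinate exceeds tau.
    T = (y**2) @ (X**2 - 1.0) / n
    return bool(T.max() > tau)

tau = 8.0 * np.sqrt(np.log(d) / n)   # heuristic threshold, not the paper's
print("mixed ->", threshold_test(*sample(True), tau))
print("null  ->", threshold_test(*sample(False), tau))
```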
arXiv Detail & Related papers (2023-03-03T18:03:49Z)
- Robust Sparse Mean Estimation via Sum of Squares [42.526664955704746]
We study the problem of high-dimensional sparse mean estimation in the presence of an $epsilon$-fraction of adversarial outliers.
Our algorithms follow the Sum-of-Squares based proofs-to-algorithms approach.
arXiv Detail & Related papers (2022-06-07T16:49:54Z)
- Randomized Stochastic Variance-Reduced Methods for Stochastic Bilevel Optimization [62.87181271021217]
We consider nonconvex stochastic bilevel optimization (SBO) problems, which have many applications in machine learning.
This paper proposes fast randomized algorithms for such problems.
arXiv Detail & Related papers (2021-05-05T18:28:42Z)
- The Complexity of Adversarially Robust Proper Learning of Halfspaces with Agnostic Noise [67.27523616312428]
We study the computational complexity of adversarially robust proper learning of halfspaces in the distribution-independent PAC model.
We give a computationally efficient learning algorithm and a nearly matching computational hardness result for this problem.
arXiv Detail & Related papers (2020-07-30T04:18:51Z)
- Model-Free Reinforcement Learning: from Clipped Pseudo-Regret to Sample Complexity [59.34067736545355]
Given an MDP with $S$ states, $A$ actions, the discount factor $\gamma \in (0,1)$, and an approximation threshold $\epsilon > 0$, we provide a model-free algorithm to learn an $\epsilon$-optimal policy.
For small enough $\epsilon$, we show an improved algorithm with better sample complexity; a generic model-free sketch follows below.
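For readers unfamiliar with the setting, here is a minimal sketch of generic synchronous Q-learning from a generative model on a random MDP. It is textbook Q-learning for illustration only, not the paper's algorithm; the MDP, step-size schedule, and iteration count are all assumptions, not the paper's sample-complexity bound.

```python
import numpy as np

rng = np.random.default_rng(2)

# --- Hypothetical random MDP (illustration only) ---
S, A, gamma_disc = 10, 3, 0.9
P = rng.dirichlet(np.ones(S), size=(S, A))   # P[s, a] is a distribution over next states
R = rng.random((S, A))                        # rewards in [0, 1]

# --- Generic synchronous Q-learning from a generative model ---
Q = np.zeros((S, A))
n_iters = 10000
for t in range(1, n_iters + 1):
    lr = 1.0 / (1.0 + 0.01 * t)              # decaying step size (heuristic)
    for s in range(S):
        for a in range(A):
            s_next = rng.choice(S, p=P[s, a])  # one generative-model sample
            target = R[s, a] + gamma_disc * Q[s_next].max()
            Q[s, a] += lr * (target - Q[s, a])

policy = Q.argmax(axis=1)

# Evaluate against exact value iteration to check near-optimality.
V = np.zeros(S)
for _ in range(2000):
    V = (R + gamma_disc * P @ V).max(axis=1)
print("sup-norm gap to optimal V*:", np.abs(Q.max(axis=1) - V).max())
```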
arXiv Detail & Related papers (2020-06-06T13:34:41Z)
- Proper Learning, Helly Number, and an Optimal SVM Bound [29.35254938542589]
We characterize classes for which the optimal sample complexity can be achieved by a proper learning algorithm.
We show that the dual Helly number is bounded if and only if there is a proper joint dependence on $\epsilon$ and $\delta$.
arXiv Detail & Related papers (2020-05-24T18:11:57Z)