Hardness of Agnostically Learning Halfspaces from Worst-Case Lattice
Problems
- URL: http://arxiv.org/abs/2207.14030v1
- Date: Thu, 28 Jul 2022 11:44:39 GMT
- Title: Hardness of Agnostically Learning Halfspaces from Worst-Case Lattice
Problems
- Authors: Stefan Tiegel
- Abstract summary: In particular, we show that under this assumption there is no efficient algorithm that outputs any binary hypothesis, not necessarily a halfspace, achieving misclassification error better than $\frac{1}{2} - \epsilon$, even if the optimal misclassification error is as small as $\delta$.
It is inspired by a sequence of recent works showing hardness of learning well-separated Gaussian mixtures based on worst-case lattice problems.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We show hardness of improperly learning halfspaces in the agnostic model
based on worst-case lattice problems, e.g., approximating shortest vectors
within polynomial factors. In particular, we show that under this assumption
there is no efficient algorithm that outputs any binary hypothesis, not
necessarily a halfspace, achieving misclassification error better than $\frac{1}{2} - \epsilon$, even if the optimal misclassification error is as small as $\delta$. Here, $\epsilon$ can be smaller than the inverse of any
polynomial in the dimension and $\delta$ as small as
$\mathrm{exp}\left(-\Omega\left(\log^{1-c}(d)\right)\right)$, where $0 < c < 1$
is an arbitrary constant and $d$ is the dimension.
Previous hardness results [Daniely16] of this problem were based on
average-case complexity assumptions, specifically, variants of Feige's random
3SAT hypothesis. Our work gives the first hardness for this problem based on a
worst-case complexity assumption. It is inspired by a sequence of recent works
showing hardness of learning well-separated Gaussian mixtures based on
worst-case lattice problems.
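To make the statement concrete for readers new to the terminology, here is a minimal, purely illustrative Python sketch (not the paper's construction; the helper names `halfspace_predict` and `zero_one_error` are hypothetical). It only sets up the quantities appearing in the theorem: a halfspace hypothesis, the agnostic 0-1 misclassification error, an optimal error around $\delta$, and the random-guessing baseline of $\frac{1}{2}$ that, under the lattice assumption, no efficient learner can beat by more than $\epsilon$.

```python
# Didactic sketch only: illustrates halfspaces and agnostic 0-1 error,
# not the lattice-based hardness reduction from the paper.
import numpy as np

def halfspace_predict(w: np.ndarray, theta: float, X: np.ndarray) -> np.ndarray:
    """Halfspace hypothesis h_{w,theta}(x) = sign(<w, x> - theta), labels in {-1, +1}."""
    return np.where(X @ w - theta >= 0, 1, -1)

def zero_one_error(y_pred: np.ndarray, y_true: np.ndarray) -> float:
    """Misclassification (0-1) error of a hypothesis on labeled samples."""
    return float(np.mean(y_pred != y_true))

# Toy data: labels mostly follow an unknown halfspace, with a small fraction
# delta flipped, so the optimal misclassification error is roughly delta.
rng = np.random.default_rng(0)
d, n, delta = 50, 10_000, 0.01
w_star = rng.standard_normal(d)
X = rng.standard_normal((n, d))
y = halfspace_predict(w_star, 0.0, X)
flip = rng.random(n) < delta
y[flip] *= -1

# Even though some halfspace achieves error ~delta, the hardness result says
# no efficient (improper) learner gets error 1/2 - epsilon; random guessing
# illustrates the 1/2 baseline.
print("best halfspace error ~", zero_one_error(halfspace_predict(w_star, 0.0, X), y))
print("random guess error   ~", zero_one_error(rng.choice([-1, 1], size=n), y))
```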
Related papers
- Improved Hardness Results for Learning Intersections of Halfspaces [2.1393480341554736]
We show strong (and surprisingly simple) lower bounds for weakly learning intersections of halfspaces in the improper setting.
We significantly narrow this gap by showing that even learning $\omega(\log \log N)$ halfspaces in dimension $N$ takes super-polynomial time.
Specifically, we show that learning any $k$ halfspaces in dimension $N$ requires accuracy $N^{-\Omega(k)}$, or exponentially many queries.
arXiv Detail & Related papers (2024-02-25T05:26:35Z) - Efficiently Learning One-Hidden-Layer ReLU Networks via Schur
Polynomials [50.90125395570797]
We study the problem of PAC learning a linear combination of $k$ ReLU activations under the standard Gaussian distribution on $\mathbb{R}^d$ with respect to the square loss.
Our main result is an efficient algorithm for this learning task with sample and computational complexity $(dk/\epsilon)^{O(k)}$, where $\epsilon > 0$ is the target accuracy.
arXiv Detail & Related papers (2023-07-24T14:37:22Z) - Near-Optimal Bounds for Learning Gaussian Halfspaces with Random
Classification Noise [50.64137465792738]
We show that any efficient SQ algorithm for the problem requires sample complexity at least $\Omega(d^{1/2}/(\max\{p, \epsilon\})^2)$.
Our lower bound suggests that this quadratic dependence on $1/\epsilon$ is inherent for efficient algorithms.
arXiv Detail & Related papers (2023-07-13T18:59:28Z) - Cryptographic Hardness of Learning Halfspaces with Massart Noise [59.8587499110224]
We study the complexity of PAC learning halfspaces in the presence of Massart noise.
We show that no polynomial-time Massart halfspace learners can achieve error better than $\Omega(\eta)$, even if the optimal 0-1 error is small.
arXiv Detail & Related papers (2022-07-28T17:50:53Z) - Threshold Phenomena in Learning Halfspaces with Massart Noise [56.01192577666607]
We study the problem of PAC learning halfspaces on $\mathbb{R}^d$ with Massart noise under Gaussian marginals.
Our results qualitatively characterize the complexity of learning halfspaces in the Massart model.
arXiv Detail & Related papers (2021-08-19T16:16:48Z) - Hardness of Learning Halfspaces with Massart Noise [56.98280399449707]
We study the complexity of PAC learning halfspaces in the presence of Massart (bounded) noise.
We show that there is an exponential gap between the information-theoretically optimal error and the best error that can be achieved by an SQ algorithm.
arXiv Detail & Related papers (2020-12-17T16:43:11Z) - Classification Under Misspecification: Halfspaces, Generalized Linear
Models, and Connections to Evolvability [39.01599245403753]
In particular, we study the problem of learning halfspaces under Massart noise with rate $\eta$.
We show any SQ algorithm requires super-polynomially many queries to achieve $\mathsf{OPT} + \epsilon$.
We also study our algorithm for learning halfspaces under Massart noise empirically and find that it exhibits some appealing fairness properties.
arXiv Detail & Related papers (2020-06-08T17:59:11Z) - Private Learning of Halfspaces: Simplifying the Construction and
Reducing the Sample Complexity [63.29100726064574]
We present a differentially private learner for halfspaces over a finite grid $G$ in $\mathbb{R}^d$ with sample complexity $\approx d^{2.5} \cdot 2^{\log^* |G|}$.
The building block for our learner is a new differentially private algorithm for approximately solving the linear feasibility problem.
arXiv Detail & Related papers (2020-04-16T16:12:10Z)