Improved Hardness Results for Learning Intersections of Halfspaces
- URL: http://arxiv.org/abs/2402.15995v1
- Date: Sun, 25 Feb 2024 05:26:35 GMT
- Title: Improved Hardness Results for Learning Intersections of Halfspaces
- Authors: Stefan Tiegel
- Abstract summary: We show strong (and surprisingly simple) lower bounds for weakly learning intersections of halfspaces in the improper setting.
We significantly narrow this gap by showing that even learning $\omega(\log \log N)$ halfspaces in dimension $N$ takes super-polynomial time.
Specifically, we show that for any $k$, learning $k$ halfspaces in dimension $N$ in the statistical query framework requires accuracy $N^{-\Omega(k)}$ or exponentially many queries.
- Score: 2.1393480341554736
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We show strong (and surprisingly simple) lower bounds for weakly learning
intersections of halfspaces in the improper setting. Strikingly little is known
about this problem. For instance, it is not even known if there is a
polynomial-time algorithm for learning the intersection of only two halfspaces.
On the other hand, lower bounds based on well-established assumptions (such as
approximating worst-case lattice problems or variants of Feige's 3SAT
hypothesis) are only known (or are implied by existing results) for the
intersection of super-logarithmically many halfspaces [KS09,KS06,DSS16].
Intersections of fewer halfspaces are ruled out only under less standard
assumptions [DV21] (such as the existence of local pseudo-random generators
with large stretch). We significantly narrow this gap by showing that even
learning $\omega(\log \log N)$ halfspaces in dimension $N$ takes
super-polynomial time under standard assumptions on worst-case lattice problems
(namely that SVP and SIVP are hard to approximate within polynomial factors).
Further, we give unconditional hardness results in the statistical query
framework. Specifically, we show that for any $k$ (even constant), learning $k$
halfspaces in dimension $N$ requires accuracy $N^{-\Omega(k)}$, or
exponentially many queries -- in particular ruling out SQ algorithms with
polynomial accuracy for $\omega(1)$ halfspaces. To the best of our knowledge
this is the first unconditional hardness result for learning a super-constant
number of halfspaces.
Our lower bounds are obtained in a unified way via a novel connection we make
between intersections of halfspaces and the so-called parallel pancakes
distribution [DKS17,BLPR19,BRST21] that has been at the heart of many lower
bound constructions in (robust) high-dimensional statistics in the past few
years.
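As a reading aid for the statistical query (SQ) result above: an SQ learner never sees individual examples, only expectations of bounded query functions answered up to some tolerance, and the theorem says that for $k$ halfspaces this tolerance must shrink to $N^{-\Omega(k)}$ unless exponentially many queries are asked. The Python sketch below of an SQ oracle is purely illustrative and not taken from the paper; the function name `sq_oracle`, the two random halfspaces, and the tolerance value are all made up for the example.

```python
import numpy as np

def sq_oracle(query, examples, labels, tolerance, rng=None):
    """Toy statistical-query (SQ) oracle.

    `query` maps an (example, label) pair to a value in [-1, 1].  The oracle
    returns the empirical expectation of the query, perturbed by an error of
    magnitude at most `tolerance`.  An SQ learner interacts with the data only
    through such noisy expectations, never through individual examples.
    """
    rng = np.random.default_rng(rng)
    values = np.array([query(x, y) for x, y in zip(examples, labels)])
    return float(values.mean()) + rng.uniform(-tolerance, tolerance)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    dim = 10
    X = rng.standard_normal((5000, dim))

    # Labels given by an intersection of two halfspaces (the concept class at issue):
    # y = +1 iff both linear thresholds are satisfied.
    W = rng.standard_normal((2, dim))
    y = np.where((X @ W.T >= 0).all(axis=1), 1.0, -1.0)

    # One SQ: the noisy correlation between the label and the sign of coordinate 0.
    answer = sq_oracle(lambda x, label: label * np.sign(x[0]),
                       X, y, tolerance=1e-2, rng=rng)
    print(f"noisy SQ answer: {answer:+.3f}")
```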
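The "parallel pancakes" distribution behind the lower bounds can be pictured concretely: along one hidden direction the data is a mixture of narrow, well-separated Gaussians, while every orthogonal direction looks standard normal. The sketch below samples from such a distribution; it only conveys the shape of the construction (the actual hard instances in [DKS17,BLPR19,BRST21] tune the number, spacing, weights, and widths of the pancakes so that low-degree moments match $N(0, I_N)$), and the function name and parameter values are invented for illustration.

```python
import numpy as np

def sample_parallel_pancakes(n_samples, dim, n_pancakes=3, spacing=2.0,
                             pancake_width=0.05, rng=None):
    """Sample from a 'parallel pancakes'-style distribution (illustrative).

    Along a hidden unit direction v the coordinate is drawn from a mixture of
    narrow Gaussians centred at equally spaced points; in every direction
    orthogonal to v the distribution is standard normal.
    """
    rng = np.random.default_rng(rng)

    # Hidden direction, unknown to the learner.
    v = rng.standard_normal(dim)
    v /= np.linalg.norm(v)

    # Equally spaced pancake centres, symmetric about the origin.
    centres = spacing * (np.arange(n_pancakes) - (n_pancakes - 1) / 2)

    # Standard Gaussian in R^dim, then replace the component along v.
    x = rng.standard_normal((n_samples, dim))
    x -= np.outer(x @ v, v)                     # remove the v-component

    which = rng.integers(n_pancakes, size=n_samples)
    along_v = centres[which] + pancake_width * rng.standard_normal(n_samples)
    x += np.outer(along_v, v)                   # add the pancake mixture along v

    return x, v

if __name__ == "__main__":
    samples, hidden_v = sample_parallel_pancakes(2000, dim=50, rng=1)
    # The projection onto v is clearly multi-modal; a random direction is not.
    print(np.round(samples @ hidden_v, 1)[:10])
```

Roughly, the standard intuition behind such constructions is that every projection other than the one along the hidden direction looks Gaussian, so algorithms that only access noisy low-degree statistics cannot locate it; per the abstract, the paper's contribution is a new connection that transfers this kind of hardness to intersections of halfspaces.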
Related papers
- Smoothed Analysis for Learning Concepts with Low Intrinsic Dimension [17.485243410774814]
In traditional models of supervised learning, the goal of a learner is to output a hypothesis that is competitive (to within $\epsilon$) with the best-fitting concept from some class.
We introduce a smoothed-analysis framework that requires a learner to compete only with the best agnostic.
We obtain the first algorithm for agnostically learning intersections of $k$-halfspaces in this framework.
arXiv Detail & Related papers (2024-07-01T04:58:36Z) - Near-Optimal Bounds for Learning Gaussian Halfspaces with Random
Classification Noise [50.64137465792738]
We show that any efficient SQ algorithm for the problem requires sample complexity at least $\Omega(d^{1/2}/(\max\{p, \epsilon\})^2)$.
Our lower bound suggests that this quadratic dependence on $1/\epsilon$ is inherent for efficient algorithms.
arXiv Detail & Related papers (2023-07-13T18:59:28Z) - Pseudonorm Approachability and Applications to Regret Minimization [73.54127663296906]
We convert high-dimensional $\ell_\infty$-approachability problems to low-dimensional pseudonorm approachability problems.
We develop an algorithmic theory of pseudonorm approachability, analogous to previous work on approachability for $\ell_2$ and other norms.
arXiv Detail & Related papers (2023-02-03T03:19:14Z) - Simple Binary Hypothesis Testing under Local Differential Privacy and
Communication Constraints [8.261182037130407]
We study simple binary hypothesis testing under both local differential privacy (LDP) and communication constraints.
We qualify our results as either minimax optimal or instance optimal.
arXiv Detail & Related papers (2023-01-09T18:36:49Z) - Cryptographic Hardness of Learning Halfspaces with Massart Noise [59.8587499110224]
We study the complexity of PAC learning halfspaces in the presence of Massart noise.
We show that no polynomial-time Massart halfspace learner can achieve error better than $\Omega(\eta)$, even if the optimal 0-1 error is small.
arXiv Detail & Related papers (2022-07-28T17:50:53Z) - Hardness of Agnostically Learning Halfspaces from Worst-Case Lattice
Problems [0.49495085874952904]
In particular, we show that under this assumption there is no efficient algorithm that outputs any binary hypothesis, not necessarily a halfspace, achieving misclassification error non-trivially better than random guessing.
It is inspired by a sequence of recent works showing hardness of learning well-separated Gaussian mixtures based on worst-case lattice problems.
arXiv Detail & Related papers (2022-07-28T11:44:39Z) - Near-Optimal Statistical Query Lower Bounds for Agnostically Learning
Intersections of Halfspaces with Gaussian Marginals [10.06689891744466]
We consider the well-studied problem of learning intersections of halfspaces under the Gaussian distribution in the challenging agnostic learning model.
We prove two variants of our lower bound, each of which combines ingredients from Diakonikolas et al. (2021) with (an extension of) a different earlier approach for SQ lower bounds for the Boolean setting.
arXiv Detail & Related papers (2022-02-10T15:34:10Z) - Threshold Phenomena in Learning Halfspaces with Massart Noise [56.01192577666607]
We study the problem of PAC learning halfspaces on $\mathbb{R}^d$ with Massart noise under Gaussian marginals.
Our results qualitatively characterize the complexity of learning halfspaces in the Massart model.
arXiv Detail & Related papers (2021-08-19T16:16:48Z) - Hardness of Learning Halfspaces with Massart Noise [56.98280399449707]
We study the complexity of PAC learning halfspaces in the presence of Massart (bounded) noise.
We show that there is an exponential gap between the information-theoretically optimal error and the best error that can be achieved by an SQ algorithm.
arXiv Detail & Related papers (2020-12-17T16:43:11Z) - Learning Halfspaces with Tsybakov Noise [50.659479930171585]
We study the learnability of halfspaces in the presence of Tsybakov noise.
We give an algorithm that achieves misclassification error $\epsilon$ with respect to the true halfspace.
arXiv Detail & Related papers (2020-06-11T14:25:02Z)