Cryptographic Hardness of Learning Halfspaces with Massart Noise
- URL: http://arxiv.org/abs/2207.14266v1
- Date: Thu, 28 Jul 2022 17:50:53 GMT
- Title: Cryptographic Hardness of Learning Halfspaces with Massart Noise
- Authors: Ilias Diakonikolas, Daniel M. Kane, Pasin Manurangsi, Lisheng Ren
- Abstract summary: We study the complexity of PAC learning halfspaces in the presence of Massart noise.
We show that no polynomial-time Massart halfspace learner can achieve error better than $\Omega(\eta)$, even if the optimal 0-1 error is small.
- Score: 59.8587499110224
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We study the complexity of PAC learning halfspaces in the presence of Massart
noise. In this problem, we are given i.i.d. labeled examples $(\mathbf{x}, y)
\in \mathbb{R}^N \times \{ \pm 1\}$, where the distribution of $\mathbf{x}$ is
arbitrary and the label $y$ is a Massart corruption of $f(\mathbf{x})$, for an
unknown halfspace $f: \mathbb{R}^N \to \{ \pm 1\}$, with flipping probability
$\eta(\mathbf{x}) \leq \eta < 1/2$. The goal of the learner is to compute a
hypothesis with small 0-1 error. Our main result is the first computational
hardness result for this learning problem. Specifically, assuming the (widely
believed) subexponential-time hardness of the Learning with Errors (LWE)
problem, we show that no polynomial-time Massart halfspace learner can achieve
error better than $\Omega(\eta)$, even if the optimal 0-1 error is small,
namely $\mathrm{OPT} = 2^{-\log^{c} (N)}$ for any universal constant $c \in (0,
1)$. Prior work had provided qualitatively similar evidence of hardness in the
Statistical Query model. Our computational hardness result essentially resolves
the polynomial PAC learnability of Massart halfspaces, by showing that known
efficient learning algorithms for the problem are nearly best possible.
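To make the data model concrete, the following is a minimal sketch of drawing i.i.d. Massart-noisy halfspace examples as defined in the abstract. The `sample_massart_halfspace` helper, the standard Gaussian choice for the marginal of $\mathbf{x}$, and the particular margin-dependent form of $\eta(\mathbf{x})$ are illustrative assumptions; the paper allows an arbitrary marginal and an arbitrary flipping function bounded by $\eta < 1/2$.

```python
import numpy as np

def sample_massart_halfspace(n, N, w, eta=0.4, rng=None):
    """Draw n labeled examples (x, y) in R^N x {+1, -1}, where y is a
    Massart corruption of the halfspace f(x) = sign(w . x): each label
    is flipped independently with probability eta(x) <= eta < 1/2.
    The margin-dependent eta(x) below is purely illustrative."""
    rng = np.random.default_rng(rng)
    X = rng.standard_normal((n, N))   # assumed marginal: x ~ N(0, I_N)
    clean = np.sign(X @ w)
    clean[clean == 0] = 1.0
    margin = np.abs(X @ w) / np.linalg.norm(w)
    eta_x = np.minimum(eta, eta * np.exp(-margin))   # eta(x) <= eta
    flips = rng.random(n) < eta_x
    y = np.where(flips, -clean, clean)
    return X, y

# Example: 1000 examples in R^20 against a random target halfspace.
rng = np.random.default_rng(0)
w_star = rng.standard_normal(20)
X, y = sample_massart_halfspace(1000, 20, w_star, eta=0.4, rng=1)
print("0-1 error of the target halfspace:", np.mean(np.sign(X @ w_star) != y))
```

Because every label is flipped with probability $\eta(\mathbf{x}) < 1/2$, the target halfspace itself is the optimal classifier, so $\mathrm{OPT} = \mathbf{E}_{\mathbf{x}}[\eta(\mathbf{x})]$. The hardness result above says that, under subexponential LWE hardness, no polynomial-time learner can guarantee error better than $\Omega(\eta)$ even when this quantity is as small as $2^{-\log^{c}(N)}$.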
Related papers
- Provably learning a multi-head attention layer [55.2904547651831]
The multi-head attention layer is one of the key components of the transformer architecture that sets it apart from traditional feed-forward models.
In this work, we initiate the study of provably learning a multi-head attention layer from random examples.
We prove computational lower bounds showing that in the worst case, exponential dependence on $m$ is unavoidable.
arXiv Detail & Related papers (2024-02-06T15:39:09Z)
- Near-Optimal Cryptographic Hardness of Agnostically Learning Halfspaces and ReLU Regression under Gaussian Marginals [43.0867217287089]
We study the task of agnostically learning halfspaces under the Gaussian distribution.
We prove a near-optimal computational hardness result for this task.
arXiv Detail & Related papers (2023-02-13T16:46:23Z)
- SQ Lower Bounds for Learning Single Neurons with Massart Noise [40.1662767099183]
We study the problem of PAC learning a single neuron in the presence of Massart noise.
We prove that no efficient SQ algorithm can approximate the optimal error within any constant factor.
arXiv Detail & Related papers (2022-10-18T15:58:00Z)
- Learning a Single Neuron with Adversarial Label Noise via Gradient Descent [50.659479930171585]
We study a function of the form $\mathbf{x} \mapsto \sigma(\mathbf{w} \cdot \mathbf{x})$ for monotone activations $\sigma$.
The goal of the learner is to output a hypothesis vector $\mathbf{w}$ such that $F(\mathbf{w}) = C\,\epsilon$ with high probability (see the squared-loss sketch after this list).
arXiv Detail & Related papers (2022-06-17T17:55:43Z)
- Threshold Phenomena in Learning Halfspaces with Massart Noise [56.01192577666607]
We study the problem of PAC learning halfspaces on $\mathbb{R}^d$ with Massart noise under Gaussian marginals.
Our results qualitatively characterize the complexity of learning halfspaces in the Massart model.
arXiv Detail & Related papers (2021-08-19T16:16:48Z)
- Hardness of Learning Halfspaces with Massart Noise [56.98280399449707]
We study the complexity of PAC learning halfspaces in the presence of Massart (bounded) noise.
We show that there is an exponential gap between the information-theoretically optimal error and the best error achievable by an SQ algorithm.
arXiv Detail & Related papers (2020-12-17T16:43:11Z)
- Near-Optimal SQ Lower Bounds for Agnostically Learning Halfspaces and ReLUs under Gaussian Marginals [49.60752558064027]
We study the fundamental problems of agnostically learning halfspaces and ReLUs under Gaussian marginals.
Our lower bounds provide strong evidence that current upper bounds for these tasks are essentially best possible.
arXiv Detail & Related papers (2020-06-29T17:10:10Z)
- Classification Under Misspecification: Halfspaces, Generalized Linear Models, and Connections to Evolvability [39.01599245403753]
In particular, we study the problem of learning halfspaces under Massart noise with rate $\eta$.
We show that any SQ algorithm requires super-polynomially many queries to achieve $\mathsf{OPT} + \epsilon$.
We also study our algorithm for learning halfspaces under Massart noise empirically and find that it exhibits some appealing fairness properties.
arXiv Detail & Related papers (2020-06-08T17:59:11Z)
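For the single-neuron entry above (Learning a Single Neuron with Adversarial Label Noise via Gradient Descent), here is a minimal sketch of the objective it refers to, assuming $F$ denotes the squared loss $F(\mathbf{w}) = \mathbf{E}_{(\mathbf{x},y)}[(\sigma(\mathbf{w}\cdot\mathbf{x}) - y)^2]$ and taking the ReLU as a concrete monotone activation; the `squared_loss` helper and the synthetic data are illustrative, not from that paper.

```python
import numpy as np

def squared_loss(w, X, y, sigma=lambda z: np.maximum(z, 0.0)):
    """Empirical L_2^2 loss of the single neuron x -> sigma(w . x).
    sigma defaults to the ReLU, one common monotone activation."""
    return np.mean((sigma(X @ w) - y) ** 2)

# Toy usage: ReLU data in R^5 with additive label noise of variance 0.01.
rng = np.random.default_rng(0)
X = rng.standard_normal((500, 5))
w_true = rng.standard_normal(5)
y = np.maximum(X @ w_true, 0.0) + 0.1 * rng.standard_normal(500)
print(squared_loss(w_true, X, y))   # roughly the noise level, ~0.01
```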