SQ Lower Bounds for Learning Single Neurons with Massart Noise
- URL: http://arxiv.org/abs/2210.09949v1
- Date: Tue, 18 Oct 2022 15:58:00 GMT
- Title: SQ Lower Bounds for Learning Single Neurons with Massart Noise
- Authors: Ilias Diakonikolas, Daniel M. Kane, Lisheng Ren, Yuxin Sun
- Abstract summary: We study PAC learning a single neuron in the presence of Massart noise.
We prove that no efficient SQ algorithm can approximate the optimal error within any constant factor.
- Score: 40.1662767099183
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We study the problem of PAC learning a single neuron in the presence of
Massart noise. Specifically, for a known activation function $f: \mathbb{R} \to
\mathbb{R}$, the learner is given access to labeled examples $(\mathbf{x}, y)
\in \mathbb{R}^d \times \mathbb{R}$, where the marginal distribution of
$\mathbf{x}$ is arbitrary and the corresponding label $y$ is a Massart
corruption of $f(\langle \mathbf{w}, \mathbf{x} \rangle)$. The goal of the
learner is to output a hypothesis $h: \mathbb{R}^d \to \mathbb{R}$ with small
squared loss. For a range of activation functions, including ReLUs, we
establish super-polynomial Statistical Query (SQ) lower bounds for this
learning problem. In more detail, we prove that no efficient SQ algorithm can
approximate the optimal error within any constant factor. Our main technical
contribution is a novel SQ-hard construction for learning $\{ \pm 1\}$-weight
Massart halfspaces on the Boolean hypercube that is interesting in its own
right.
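To make the setup above concrete, here is a minimal, illustrative Python sketch (not taken from the paper) of the single-neuron Massart model with a ReLU activation, read here as: each label equals $f(\langle \mathbf{w}, \mathbf{x} \rangle)$ except that, with point-dependent probability $\eta(\mathbf{x}) \le \eta$, it may be replaced by an arbitrary value; a hypothesis is then scored by its squared loss. The Gaussian marginal, the noise rate, and the particular corruption below are assumptions made only for illustration, since the paper allows arbitrary marginals and adversarial corruptions.

```python
import numpy as np

def relu(z):
    return np.maximum(z, 0.0)

def sample_massart_neuron_data(n, d, w_star, eta, rng):
    """Illustrative generator for the single-neuron Massart model:
    y = relu(<w*, x>), except that each label is independently replaced
    by an arbitrary value with probability eta(x) <= eta.  The Gaussian
    marginal and the sign-flip corruption are illustrative choices only."""
    X = rng.standard_normal((n, d))      # illustrative marginal over R^d
    clean = relu(X @ w_star)             # uncorrupted labels f(<w*, x>)
    eta_x = eta * rng.uniform(size=n)    # point-dependent corruption rate, <= eta
    flip = rng.uniform(size=n) < eta_x   # which examples get corrupted
    corrupted = -clean                   # an arbitrary (here: sign-flipped) label
    y = np.where(flip, corrupted, clean)
    return X, y

def squared_loss(h, X, y):
    """Empirical squared loss of a hypothesis h: R^d -> R on a sample."""
    return np.mean((h(X) - y) ** 2)

# Usage: score the zero hypothesis and the ground-truth neuron on a sample.
rng = np.random.default_rng(0)
d = 10
w_star = rng.standard_normal(d)
w_star /= np.linalg.norm(w_star)
X, y = sample_massart_neuron_data(n=50_000, d=d, w_star=w_star, eta=0.1, rng=rng)
print("loss of zero hypothesis:", squared_loss(lambda X: np.zeros(len(X)), X, y))
print("loss of true neuron    :", squared_loss(lambda X: relu(X @ w_star), X, y))
```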
Related papers
- Cryptographic Hardness of Learning Halfspaces with Massart Noise [59.8587499110224]
We study the complexity of PAC learning halfspaces in the presence of Massart noise.
We show that no polynomial-time Massart halfspace learner can achieve error better than $\Omega(\eta)$, even if the optimal 0-1 error is small.
arXiv Detail & Related papers (2022-07-28T17:50:53Z)
- Learning a Single Neuron with Adversarial Label Noise via Gradient Descent [50.659479930171585]
We study a function of the form $\mathbf{x} \mapsto \sigma(\mathbf{w} \cdot \mathbf{x})$ for monotone activations.
The goal of the learner is to output a hypothesis vector $\mathbf{w}$ such that $F(\mathbf{w}) = C \cdot \mathrm{OPT} + \epsilon$ with high probability (see the gradient-descent sketch after this list).
arXiv Detail & Related papers (2022-06-17T17:55:43Z)
- Threshold Phenomena in Learning Halfspaces with Massart Noise [56.01192577666607]
We study the problem of PAC learning halfspaces on $\mathbb{R}^d$ with Massart noise under Gaussian marginals.
Our results qualitatively characterize the complexity of learning halfspaces in the Massart model.
arXiv Detail & Related papers (2021-08-19T16:16:48Z)
- Hardness of Learning Halfspaces with Massart Noise [56.98280399449707]
We study the complexity of PAC learning halfspaces in the presence of Massart (bounded) noise.
We show that there is an exponential gap between the information-theoretically optimal error and the best error that can be achieved by an SQ algorithm.
arXiv Detail & Related papers (2020-12-17T16:43:11Z)
- Near-Optimal SQ Lower Bounds for Agnostically Learning Halfspaces and ReLUs under Gaussian Marginals [49.60752558064027]
We study the fundamental problems of agnostically learning halfspaces and ReLUs under Gaussian marginals.
Our lower bounds provide strong evidence that current upper bounds for these tasks are essentially best possible.
arXiv Detail & Related papers (2020-06-29T17:10:10Z)
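As a rough, purely illustrative companion to the "Learning a Single Neuron with Adversarial Label Noise via Gradient Descent" entry above, the sketch below runs plain (sub)gradient descent on the empirical squared loss $\hat{F}(\mathbf{w}) = \frac{1}{n}\sum_i (\mathrm{ReLU}(\langle \mathbf{w}, \mathbf{x}_i \rangle) - y_i)^2$. The step size, iteration count, and initialization are illustrative assumptions; this is not the specific algorithm or guarantee analyzed in that paper.

```python
import numpy as np

def gd_single_neuron(X, y, steps=500, lr=0.1, rng=None):
    """Plain (sub)gradient descent on the empirical squared loss
    F_hat(w) = mean_i (relu(<w, x_i>) - y_i)^2.  Purely illustrative:
    the cited paper analyzes a tailored gradient-descent variant with
    specific error guarantees."""
    n, d = X.shape
    rng = rng or np.random.default_rng(0)
    w = rng.standard_normal(d) / np.sqrt(d)   # arbitrary initialization
    for _ in range(steps):
        z = X @ w
        preds = np.maximum(z, 0.0)
        # subgradient of the squared loss; relu'(z) taken as 1{z > 0}
        grad = (2.0 / n) * X.T @ ((preds - y) * (z > 0))
        w -= lr * grad
    return w

# Usage (reusing X, y, and squared_loss from the earlier sketch):
# w_hat = gd_single_neuron(X, y)
# print("squared loss of GD output:",
#       squared_loss(lambda X: np.maximum(X @ w_hat, 0.0), X, y))
```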