Near-Optimal Cryptographic Hardness of Agnostically Learning Halfspaces
and ReLU Regression under Gaussian Marginals
- URL: http://arxiv.org/abs/2302.06512v1
- Date: Mon, 13 Feb 2023 16:46:23 GMT
- Title: Near-Optimal Cryptographic Hardness of Agnostically Learning Halfspaces
and ReLU Regression under Gaussian Marginals
- Authors: Ilias Diakonikolas, Daniel M. Kane, Lisheng Ren
- Abstract summary: We study the task of agnostically learning halfspaces under the Gaussian distribution.
We prove a near-optimal computational hardness result for this task.
- Score: 43.0867217287089
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We study the task of agnostically learning halfspaces under the Gaussian
distribution. Specifically, given labeled examples $(\mathbf{x},y)$ from an
unknown distribution on $\mathbb{R}^n \times \{ \pm 1\}$, whose marginal
distribution on $\mathbf{x}$ is the standard Gaussian and the labels $y$ can be
arbitrary, the goal is to output a hypothesis with 0-1 loss
$\mathrm{OPT}+\epsilon$, where $\mathrm{OPT}$ is the 0-1 loss of the
best-fitting halfspace. We prove a near-optimal computational hardness result
for this task, under the widely believed sub-exponential time hardness of the
Learning with Errors (LWE) problem. Prior hardness results are either
qualitatively suboptimal or apply to restricted families of algorithms. Our
techniques extend to yield near-optimal lower bounds for related problems,
including ReLU regression.
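For concreteness, the guarantee above can be stated explicitly (a standard formalization of the agnostic model, using the notation of the abstract, not a verbatim excerpt from the paper): writing $D$ for the unknown joint distribution on $(\mathbf{x},y)$ and $h_{\mathbf{w},t}(\mathbf{x}) = \mathrm{sign}(\mathbf{w}\cdot\mathbf{x} - t)$ for a halfspace,
$$\mathrm{OPT} = \min_{\mathbf{w} \in \mathbb{R}^n,\, t \in \mathbb{R}} \Pr_{(\mathbf{x},y)\sim D}\big[h_{\mathbf{w},t}(\mathbf{x}) \neq y\big],$$
and the learner must output a (not necessarily halfspace) hypothesis $h$ satisfying $\Pr_{(\mathbf{x},y)\sim D}[h(\mathbf{x}) \neq y] \le \mathrm{OPT} + \epsilon$.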
Related papers
- Learning Intersections of Halfspaces with Distribution Shift: Improved Algorithms and SQ Lower Bounds [9.036777309376697]
Klivans, Stavropoulos, and Vasilyan initiated the study of testable learning with distribution shift.
Their model deviates from all prior work in that no assumptions are made on $\mathcal{D}'$.
arXiv Detail & Related papers (2024-04-02T23:34:39Z) - Cryptographic Hardness of Learning Halfspaces with Massart Noise [59.8587499110224]
We study the complexity of PAC learning halfspaces in the presence of Massart noise.
We show that no polynomial-time Massart halfspace learner can achieve error better than $\Omega(\eta)$, even if the optimal 0-1 error is small.
arXiv Detail & Related papers (2022-07-28T17:50:53Z) - Learning a Single Neuron with Adversarial Label Noise via Gradient
Descent [50.659479930171585]
We study a function of the form $\mathbf{x} \mapsto \sigma(\mathbf{w} \cdot \mathbf{x})$ for monotone activations.
The goal of the learner is to output a hypothesis vector $\mathbf{w}$ such that $F(\mathbf{w}) = C \cdot \mathrm{OPT} + \epsilon$ with high probability.
arXiv Detail & Related papers (2022-06-17T17:55:43Z) - Hardness of Learning Halfspaces with Massart Noise [56.98280399449707]
We study the complexity of PAC learning halfspaces in the presence of Massart (bounded) noise.
We show that there is an exponential gap between the information-theoretically optimal error and the best error that can be achieved by an SQ algorithm.
arXiv Detail & Related papers (2020-12-17T16:43:11Z) - Near-Optimal SQ Lower Bounds for Agnostically Learning Halfspaces and
ReLUs under Gaussian Marginals [49.60752558064027]
We study the fundamental problems of agnostically learning halfspaces and ReLUs under Gaussian marginals.
Our lower bounds provide strong evidence that current upper bounds for these tasks are essentially best possible.
arXiv Detail & Related papers (2020-06-29T17:10:10Z) - Agnostic Learning of a Single Neuron with Gradient Descent [92.7662890047311]
We consider the problem of learning the best-fitting single neuron as measured by the expected square loss.
For the ReLU activation, our population risk guarantee is $O(\mathsf{OPT}^{1/2}) + \epsilon$.
arXiv Detail & Related papers (2020-05-29T07:20:35Z)
This list is automatically generated from the titles and abstracts of the papers in this site.