Robustly Learning a Single Neuron via Sharpness
- URL: http://arxiv.org/abs/2306.07892v1
- Date: Tue, 13 Jun 2023 16:34:02 GMT
- Title: Robustly Learning a Single Neuron via Sharpness
- Authors: Puqian Wang, Nikos Zarifis, Ilias Diakonikolas, Jelena Diakonikolas
- Abstract summary: We study the problem of learning a single neuron with respect to the $L_2^2$-loss in the presence of adversarial label noise.
We give an efficient algorithm that, for a broad family of activations including ReLUs, approximates the optimal $L_2^2$-error within a constant factor.
- Score: 45.40045240987339
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We study the problem of learning a single neuron with respect to the
$L_2^2$-loss in the presence of adversarial label noise. We give an efficient
algorithm that, for a broad family of activations including ReLUs, approximates
the optimal $L_2^2$-error within a constant factor. Our algorithm applies under
much milder distributional assumptions compared to prior work. The key
ingredient enabling our results is a novel connection to local error bounds
from optimization theory.
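To make the setup concrete, here is a minimal sketch, assuming NumPy and synthetic Gaussian data, of fitting a single ReLU neuron by minimizing the empirical $L_2^2$ loss with plain gradient descent. It only illustrates the objective and the label-noise setting; it is not the sharpness-based algorithm from the paper, and all names and constants below are illustrative.

```python
import numpy as np

# Hypothetical illustration of the learning setup (not the paper's algorithm):
# fit a single ReLU neuron x -> max(0, w . x) to noisy labels by minimizing
# the empirical L2^2 loss with gradient descent.

def relu(t):
    return np.maximum(0.0, t)

def l22_loss(w, X, y):
    # Empirical L2^2 error between the neuron's output and the labels.
    return np.mean((relu(X @ w) - y) ** 2)

def fit_single_neuron(X, y, lr=0.1, steps=500):
    n, d = X.shape
    w = 0.1 * np.ones(d)  # small nonzero init so the ReLU is not dead at start
    for _ in range(steps):
        z = X @ w
        # (Sub)gradient of the empirical L2^2 loss for the ReLU activation.
        grad = (2.0 / n) * X.T @ ((relu(z) - y) * (z > 0))
        w -= lr * grad
    return w

# Toy data: Gaussian features, a ground-truth neuron, and a small fraction of
# labels corrupted to mimic adversarial label noise (purely illustrative).
rng = np.random.default_rng(0)
n, d = 2000, 10
X = rng.standard_normal((n, d))
w_star = rng.standard_normal(d)
y = relu(X @ w_star)
corrupt = rng.random(n) < 0.05
y[corrupt] = -y[corrupt]

w_hat = fit_single_neuron(X, y)
print("empirical L2^2 loss of fitted neuron:", l22_loss(w_hat, X, y))
```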
Related papers
- Robustly Learning Single-Index Models via Alignment Sharpness [40.886706402941435]
We study the problem of learning Single-Index Models under the $L_2^2$ loss in the agnostic model.
We give an efficient learning algorithm, achieving a constant factor approximation to the optimal loss.
arXiv Detail & Related papers (2024-02-27T18:48:07Z) - Smoothness Adaptive Hypothesis Transfer Learning [8.557392136621894]
Smoothness Adaptive Transfer Learning (SATL) is a two-phase kernel ridge regression (KRR)-based algorithm.
We first prove that employing the misspecified fixed bandwidth Gaussian kernel in target-only KRR learning can achieve minimax optimality.
We derive the minimax lower bound of the learning problem in excess risk and show that SATL enjoys a matching upper bound up to a logarithmic factor.
arXiv Detail & Related papers (2024-02-22T21:02:19Z) - Efficiently Learning One-Hidden-Layer ReLU Networks via Schur
Polynomials [50.90125395570797]
We study the problem of PAC learning a linear combination of $k$ ReLU activations under the standard Gaussian distribution on $\mathbb{R}^d$ with respect to the square loss.
Our main result is an efficient algorithm for this learning task with sample and computational complexity $(dk/\epsilon)^{O(k)}$, where $\epsilon>0$ is the target accuracy.
arXiv Detail & Related papers (2023-07-24T14:37:22Z) - Learning Narrow One-Hidden-Layer ReLU Networks [30.63086568341548]
We give the first polynomial-time algorithm that succeeds whenever $k$ is a constant.
We use a multi-scale analysis to argue that sufficiently close neurons can be collapsed together.
arXiv Detail & Related papers (2023-04-20T17:53:09Z) - Human-in-the-loop: Provably Efficient Preference-based Reinforcement
Learning with General Function Approximation [107.54516740713969]
We study human-in-the-loop reinforcement learning (RL) with trajectory preferences.
Instead of receiving a numeric reward at each step, the agent only receives preferences over trajectory pairs from a human overseer.
We propose the first optimistic model-based algorithm for PbRL with general function approximation.
arXiv Detail & Related papers (2022-05-23T09:03:24Z) - Towards an Understanding of Benign Overfitting in Neural Networks [104.2956323934544]
Modern machine learning models often employ a huge number of parameters and are typically optimized to have zero training loss.
We examine how these benign overfitting phenomena occur in a two-layer neural network setting.
We show that it is possible for the two-layer ReLU network interpolator to achieve a near minimax-optimal learning rate.
arXiv Detail & Related papers (2021-06-06T19:08:53Z) - Empirical Risk Minimization in the Non-interactive Local Model of
Differential Privacy [26.69391745812235]
We study the Empirical Risk Minimization (ERM) problem in the non-interactive Local Differential Privacy (LDP) model.
Previous research indicates that the sample complexity, to achieve error $\alpha$, needs to depend on the dimensionality $p$ for general loss functions.
arXiv Detail & Related papers (2020-11-11T17:48:00Z) - Online Linear Optimization with Many Hints [72.4277628722419]
We study an online linear optimization (OLO) problem in which the learner is provided access to $K$ "hint" vectors in each round prior to making a decision.
In this setting, we devise an algorithm that obtains logarithmic regret whenever there exists a convex combination of the $K$ hints that has positive correlation with the cost vectors.
arXiv Detail & Related papers (2020-10-06T23:25:05Z) - Learning Halfspaces with Tsybakov Noise [50.659479930171585]
We study the learnability of halfspaces in the presence of Tsybakov noise.
We give an algorithm that achieves misclassification error $\epsilon$ with respect to the true halfspace.
arXiv Detail & Related papers (2020-06-11T14:25:02Z)