On the Power of Localized Perceptron for Label-Optimal Learning of
Halfspaces with Adversarial Noise
- URL: http://arxiv.org/abs/2012.10793v2
- Date: Wed, 20 Jan 2021 04:34:16 GMT
- Title: On the Power of Localized Perceptron for Label-Optimal Learning of
Halfspaces with Adversarial Noise
- Authors: Jie Shen
- Abstract summary: We study {\em online} active learning of homogeneous halfspaces in $\mathbb{R}^d$ with adversarial noise.
Our main contribution is a Perceptron-like active learning algorithm that runs in polynomial time.
Our algorithm learns the underlying halfspace with near-optimal label complexity.
- Score: 4.8728183994912415
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We study {\em online} active learning of homogeneous halfspaces in
$\mathbb{R}^d$ with adversarial noise where the overall probability of a noisy
label is constrained to be at most $\nu$. Our main contribution is a
Perceptron-like online active learning algorithm that runs in polynomial time,
and under the conditions that the marginal distribution is isotropic
log-concave and $\nu = \Omega(\epsilon)$, where $\epsilon \in (0, 1)$ is the
target error rate, our algorithm PAC learns the underlying halfspace with
near-optimal label complexity of $\tilde{O}\big(d \cdot
polylog(\frac{1}{\epsilon})\big)$ and sample complexity of
$\tilde{O}\big(\frac{d}{\epsilon} \big)$. Prior to this work, existing online
algorithms designed for tolerating the adversarial noise are subject to either
label complexity polynomial in $\frac{1}{\epsilon}$, or suboptimal noise
tolerance, or restrictive marginal distributions. With the additional prior
knowledge that the underlying halfspace is $s$-sparse, we obtain
attribute-efficient label complexity of $\tilde{O}\big( s \cdot polylog(d,
\frac{1}{\epsilon}) \big)$ and sample complexity of
$\tilde{O}\big(\frac{s}{\epsilon} \cdot polylog(d) \big)$. As an immediate
corollary, we show that under the agnostic model where no assumption is made on
the noise rate $\nu$, our active learner achieves an error rate of $O(OPT) +
\epsilon$ with the same running time and label and sample complexity, where
$OPT$ is the best possible error rate achievable by any homogeneous halfspace.
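To make the localization idea concrete, below is a minimal sketch (in Python) of a margin-based, Perceptron-style active learner: labels are queried only for points that fall inside a band around the current halfspace, misclassified points inside the band trigger Perceptron updates, and the band shrinks across epochs. The helper names `sample_x` and `query_label`, the band schedule, and the query budget are illustrative assumptions; this is not the paper's exact procedure or its constants.

```python
import numpy as np

def localized_perceptron_sketch(sample_x, query_label, d, epochs=8,
                                band0=0.5, queries_per_epoch=100, seed=0):
    # Hedged sketch of a margin-based ("localized") Perceptron active learner.
    #   sample_x()     -- draws one unlabeled point x in R^d
    #   query_label(x) -- returns a (possibly noisy) label in {-1, +1}
    rng = np.random.default_rng(seed)
    w = rng.standard_normal(d)
    w /= np.linalg.norm(w)

    band = band0
    for _ in range(epochs):
        queries = 0
        while queries < queries_per_epoch:
            x = sample_x()
            # Localization: only points near the current boundary are labeled.
            if abs(w @ x) > band:
                continue
            y = query_label(x)
            queries += 1
            # Perceptron-style update on points misclassified inside the band.
            if y * (w @ x) <= 0:
                w = w + y * x
                w /= np.linalg.norm(w)
        # Shrink the query region as the estimate improves (illustrative schedule).
        band *= 0.5
    return w

if __name__ == "__main__":
    # Toy usage: isotropic Gaussian marginal, clean labels from a hidden halfspace.
    d = 20
    rng = np.random.default_rng(1)
    w_star = rng.standard_normal(d)
    w_star /= np.linalg.norm(w_star)
    sample_x = lambda: rng.standard_normal(d)
    query_label = lambda x: 1.0 if w_star @ x >= 0 else -1.0
    w_hat = localized_perceptron_sketch(sample_x, query_label, d)
    print("alignment <w_hat, w*>:", float(w_hat @ w_star))
```

In this sketch, shrinking the band is what keeps the label complexity small: as the estimate improves, labels are spent only where the current hypothesis is still uncertain.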
Related papers
- Near-Optimal Bounds for Learning Gaussian Halfspaces with Random
Classification Noise [50.64137465792738]
We show that any efficient SQ algorithm for the problem requires sample complexity at least $\Omega\big(d^{1/2}/(\max\{p, \epsilon\})^2\big)$.
Our lower bound suggests that this quadratic dependence on $1/\epsilon$ is inherent for efficient algorithms.
arXiv Detail & Related papers (2023-07-13T18:59:28Z) - Cryptographic Hardness of Learning Halfspaces with Massart Noise [59.8587499110224]
We study the complexity of PAC learning halfspaces in the presence of Massart noise.
We show that no polynomial-time Massart halfspace learner can achieve error better than $\Omega(\eta)$, even if the optimal 0-1 error is small.
arXiv Detail & Related papers (2022-07-28T17:50:53Z) - Learning a Single Neuron with Adversarial Label Noise via Gradient
Descent [50.659479930171585]
We study a function of the form $\mathbf{x} \mapsto \sigma(\mathbf{w} \cdot \mathbf{x})$ for monotone activations.
The goal of the learner is to output a hypothesis vector $\mathbf{w}$ such that $F(\mathbf{w}) = C \cdot \mathrm{OPT} + \epsilon$ with high probability.
arXiv Detail & Related papers (2022-06-17T17:55:43Z) - Threshold Phenomena in Learning Halfspaces with Massart Noise [56.01192577666607]
We study the problem of PAC learning halfspaces on $\mathbb{R}^d$ with Massart noise under Gaussian marginals.
Our results qualitatively characterize the complexity of learning halfspaces in the Massart model.
arXiv Detail & Related papers (2021-08-19T16:16:48Z) - Hardness of Learning Halfspaces with Massart Noise [56.98280399449707]
We study the complexity of PAC learning halfspaces in the presence of Massart (bounded) noise.
We show that there is an exponential gap between the information-theoretically optimal error and the best error that can be achieved by an SQ algorithm.
arXiv Detail & Related papers (2020-12-17T16:43:11Z) - Model-Free Reinforcement Learning: from Clipped Pseudo-Regret to Sample
Complexity [59.34067736545355]
Given an MDP with $S$ states, $A$ actions, the discount factor $\gamma \in (0,1)$, and an approximation threshold $\epsilon > 0$, we provide a model-free algorithm to learn an $\epsilon$-optimal policy.
For small enough $\epsilon$, we give an algorithm with improved sample complexity.
arXiv Detail & Related papers (2020-06-06T13:34:41Z) - Attribute-Efficient Learning of Halfspaces with Malicious Noise:
Near-Optimal Label Complexity and Noise Tolerance [21.76197397540397]
This paper is concerned with computationally efficient learning of homogeneous sparse halfspaces in $\mathbb{R}^d$ under noise.
We show that the sample complexity is $\tilde{O}\big(\frac{1}{\epsilon} s^2 \log^5 d\big)$, which also enjoys attribute efficiency.
arXiv Detail & Related papers (2020-06-06T04:57:39Z) - Efficient active learning of sparse halfspaces with arbitrary bounded
noise [34.406103025985445]
We study active learning of homogeneous $s$-sparse halfspaces in $\mathbb{R}^d$ under the setting where the unlabeled data distribution is isotropic log-concave.
Under high levels of label noise, the label complexity bounds achieved by computationally efficient algorithms are much worse.
This is the first efficient algorithm with label complexity polynomial in $\frac{1}{1-2\eta}$ in this setting, which is label-efficient even for $\eta$ arbitrarily close to $\frac{1}{2}$.
arXiv Detail & Related papers (2020-02-12T08:28:24Z)