Improved Algorithms for Efficient Active Learning Halfspaces with
Massart and Tsybakov noise
- URL: http://arxiv.org/abs/2102.05312v1
- Date: Wed, 10 Feb 2021 08:17:17 GMT
- Authors: Chicheng Zhang and Yinan Li
- Abstract summary: We develop a PAC active learning algorithm for $d$-dimensional homogeneous halfspaces that can tolerate Massart noise~\citep{massart2006risk} and Tsybakov noise~\citep{tsybakov2004optimal}.
Under the more challenging Tsybakov noise condition, we identify two subfamilies of noise conditions under which our algorithm achieves computational efficiency and provides label complexity guarantees strictly lower than those of passive learning algorithms.
- Score: 29.890039126644776
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We develop a computationally-efficient PAC active learning algorithm for
$d$-dimensional homogeneous halfspaces that can tolerate Massart
noise~\citep{massart2006risk} and Tsybakov noise~\citep{tsybakov2004optimal}.
Specialized to the $\eta$-Massart noise setting, our algorithm achieves an
information-theoretic optimal label complexity of $\tilde{O}\left(
\frac{d}{(1-2\eta)^2} \mathrm{polylog}(\frac1\epsilon) \right)$ under a wide
range of unlabeled data distributions (specifically, the family of "structured
distributions" defined in~\citet{diakonikolas2020polynomial}). Under the more
challenging Tsybakov noise condition, we identify two subfamilies of noise
conditions, under which our algorithm achieves computational efficiency and
provides label complexity guarantees strictly lower than those of passive
learning algorithms.
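The margin-based idea behind active learning of homogeneous halfspaces under Massart noise can be illustrated with a toy sketch. This is not the paper's algorithm: the perceptron-style update, the isotropic Gaussian unlabeled distribution, and the fixed margin threshold are all illustrative assumptions, chosen only to show why querying labels near the current decision boundary saves label queries.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative setup: a homogeneous halfspace sign(<w*, x>) in R^d.
# Under eta-Massart noise, each label is flipped with probability at most
# eta < 1/2; here we use the worst case of a uniform flip rate eta.
d, eta = 5, 0.1
w_star = rng.normal(size=d)
w_star /= np.linalg.norm(w_star)

def query_label(x):
    """Noisy label oracle: the correct label, flipped w.p. eta."""
    y = float(np.sign(w_star @ x)) or 1.0
    return -y if rng.random() < eta else y

# Minimal margin-based active learning loop: maintain an estimate w,
# query labels ONLY for points near w's decision boundary, and apply a
# perceptron-style update on queried mistakes.
w = rng.normal(size=d)
w /= np.linalg.norm(w)
margin, queries = 0.5, 0
for _ in range(20000):
    x = rng.normal(size=d)              # unlabeled draw (free)
    if abs(w @ x) > margin * np.linalg.norm(x):
        continue                        # far from boundary: skip the query
    y = query_label(x)                  # labels are the expensive resource
    queries += 1
    if y * (w @ x) <= 0:                # queried point is misclassified
        w += y * x / np.linalg.norm(x)
        w /= np.linalg.norm(w)

# Angle between the estimate and the true halfspace normal.
error_angle = np.arccos(np.clip(w @ w_star, -1.0, 1.0))
```

The key design point the sketch mirrors is that labels far from the current boundary carry little information, so the query budget concentrates where the hypothesis is uncertain; the paper's actual algorithm achieves its $\tilde{O}\left(\frac{d}{(1-2\eta)^2}\mathrm{polylog}(\frac1\epsilon)\right)$ label complexity with a considerably more careful localization and update scheme.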
Related papers
- Perturb-and-Project: Differentially Private Similarities and Marginals [73.98880839337873]
We revisit the input perturbations framework for differential privacy, where noise is added to the input $A \in \mathcal{S}$.
We first design novel efficient algorithms to privately release pair-wise cosine similarities.
We derive a novel algorithm to compute $k$-way marginal queries over $n$ features.
arXiv Detail & Related papers (2024-06-07T12:07:16Z) - Efficient Active Learning Halfspaces with Tsybakov Noise: A Non-convex Optimization Approach [28.518140532315776]
We study the problem of computationally efficient, label-efficient active learning.
In this paper, we prove that any approximate first-order stationary point of a non-convex objective, under a structured unlabeled data distribution, yields a low-error halfspace.
arXiv Detail & Related papers (2023-10-23T23:55:28Z) - Near-Optimal Bounds for Learning Gaussian Halfspaces with Random
Classification Noise [50.64137465792738]
We show that any efficient SQ algorithm for the problem requires sample complexity at least $\Omega(d^{1/2}/(\max\{p, \epsilon\})^2)$.
Our lower bound suggests that this quadratic dependence on $1/\epsilon$ is inherent for efficient algorithms.
arXiv Detail & Related papers (2023-07-13T18:59:28Z) - Sample-Optimal PAC Learning of Halfspaces with Malicious Noise [4.8728183994912415]
We study efficient PAC learning of halfspaces in $\mathbb{R}^d$ in the presence of the malicious noise of Valiant (1985).
We present a new analysis for the algorithm of Awasthi et al. (2017) and show that it essentially achieves the near-optimal sample complexity bound of $\tilde{O}(d)$.
We extend the algorithm and analysis to the more general and stronger nasty noise model of Bshouty et al. (2002), showing that it is still possible to achieve near-optimal noise tolerance and sample complexity in polynomial time.
arXiv Detail & Related papers (2021-02-11T20:18:20Z) - A Polynomial Time Algorithm for Learning Halfspaces with Tsybakov Noise [55.45544638410858]
We study the problem of PAC learning homogeneous halfspaces in the presence of Tsybakov noise.
Our algorithm learns the true halfspace within any accuracy $\epsilon$.
arXiv Detail & Related papers (2020-10-04T22:19:06Z) - The Complexity of Adversarially Robust Proper Learning of Halfspaces
with Agnostic Noise [67.27523616312428]
We study the computational complexity of adversarially robust proper learning of halfspaces in the distribution-independent PAC model.
We give a computationally efficient learning algorithm and a nearly matching computational hardness result for this problem.
arXiv Detail & Related papers (2020-07-30T04:18:51Z) - Learning Halfspaces with Tsybakov Noise [50.659479930171585]
We study the learnability of halfspaces in the presence of Tsybakov noise.
We give an algorithm that achieves misclassification error $\epsilon$ with respect to the true halfspace.
arXiv Detail & Related papers (2020-06-11T14:25:02Z) - Attribute-Efficient Learning of Halfspaces with Malicious Noise:
Near-Optimal Label Complexity and Noise Tolerance [21.76197397540397]
This paper is concerned with computationally efficient learning of homogeneous sparse halfspaces in $\mathbb{R}^d$ under noise.
We show that the sample complexity is $\tilde{O}\big(\frac{1}{\epsilon} s^2 \log^5 d\big)$, which is also attribute-efficient.
arXiv Detail & Related papers (2020-06-06T04:57:39Z) - Efficiently Learning Adversarially Robust Halfspaces with Noise [69.01459748050453]
We study the problem of learning adversarially robust halfspaces in the distribution-independent setting.
In the realizable setting, we provide necessary and sufficient conditions on the adversarial perturbation sets under which halfspaces are efficiently robustly learnable.
arXiv Detail & Related papers (2020-05-15T17:13:54Z) - Efficient active learning of sparse halfspaces with arbitrary bounded
noise [34.406103025985445]
We study active learning of homogeneous $s$-sparse halfspaces in $\mathbb{R}^d$ under the setting where the unlabeled data distribution is isotropic log-concave.
Under high levels of label noise, the label complexity bounds achieved by computationally efficient algorithms are much worse.
This is the first efficient algorithm with label complexity polynomial in $\frac{1}{1-2\eta}$ in this setting, which remains label-efficient even for $\eta$ arbitrarily close to $\frac{1}{2}$.
arXiv Detail & Related papers (2020-02-12T08:28:24Z)