An Efficient Tester-Learner for Halfspaces
- URL: http://arxiv.org/abs/2302.14853v1
- Date: Tue, 28 Feb 2023 18:51:55 GMT
- Authors: Aravind Gollakota, Adam R. Klivans, Konstantinos Stavropoulos, Arsen Vasilyan
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We give the first efficient algorithm for learning halfspaces in the testable
learning model recently defined by Rubinfeld and Vasilyan (2023). In this
model, a learner certifies that the accuracy of its output hypothesis is near
optimal whenever the training set passes an associated test, and training sets
drawn from some target distribution -- e.g., the Gaussian -- must pass the
test. This model is more challenging than distribution-specific agnostic or
Massart noise models where the learner is allowed to fail arbitrarily if the
distributional assumption does not hold.
We consider the setting where the target distribution is Gaussian (or more
generally any strongly log-concave distribution) in $d$ dimensions and the
noise model is either Massart or adversarial (agnostic). For Massart noise our
tester-learner runs in polynomial time and outputs a hypothesis with error
$\mathsf{opt} + \epsilon$, which is information-theoretically optimal. For
adversarial noise our tester-learner has error $\tilde{O}(\mathsf{opt}) +
\epsilon$ and runs in quasipolynomial time.
Prior work on testable learning ignores the labels in the training set and
checks that the empirical moments of the covariates are close to the moments of
the base distribution. Here we develop new tests of independent interest that
make critical use of the labels and combine them with the moment-matching
approach of Gollakota et al. (2023). This enables us to simulate a variant of
the algorithm of Diakonikolas et al. (2020) for learning noisy halfspaces using
nonconvex SGD but in the testable learning setting.
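The label-ignoring moment-matching test from prior work can be sketched in a few lines; this is an illustrative toy, not the paper's actual tester (the `tol` threshold and the restriction to first and second moments are simplifications introduced here):

```python
import numpy as np

def gaussian_moment_test(X, tol=0.1):
    """Toy moment-matching tester: accept the sample only if its empirical
    first and second moments are close to those of the standard Gaussian
    N(0, I). Note it never looks at the labels."""
    n, d = X.shape
    # First moments should be near 0.
    if np.max(np.abs(X.mean(axis=0))) > tol:
        return False
    # Second moments should be near the identity covariance.
    cov = X.T @ X / n
    if np.max(np.abs(cov - np.eye(d))) > tol:
        return False
    return True

rng = np.random.default_rng(0)
good = rng.standard_normal((20000, 3))     # roughly Gaussian covariates
bad = rng.uniform(-3, 3, size=(20000, 3))  # non-Gaussian covariates
print(gaussian_moment_test(good))  # True
print(gaussian_moment_test(bad))   # False (variance far from 1)
```

A real tester must control many more moments (and, per this paper, also exploit the labels), but the accept/reject structure is the same.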
Related papers
- Tolerant Algorithms for Learning with Arbitrary Covariate Shift [18.37181965815327]
We study the problem of learning under arbitrary distribution shift, where the learner is trained on a labeled set from one distribution but evaluated on a different, potentially adversarially generated test distribution.
We focus on two frameworks: PQ learning [Goldwasser, A. Kalai, Y. Kalai, Montasser NeurIPS 2020], allowing abstention on adversarially generated parts of the test distribution, and TDS learning [Klivans, Stavropoulos, Vasilyan COLT 2024], permitting abstention on the entire test distribution if distribution shift is detected.
arXiv Detail & Related papers (2024-06-04T19:50:05Z) - Probabilistic Contrastive Learning for Long-Tailed Visual Recognition [78.70453964041718]
Long-tailed distributions frequently emerge in real-world data, where a large number of minority categories contain a limited number of samples.
Recent investigations have revealed that supervised contrastive learning exhibits promising potential in alleviating the data imbalance.
We propose a novel probabilistic contrastive (ProCo) learning algorithm that estimates the data distribution of the samples from each class in the feature space.
arXiv Detail & Related papers (2024-03-11T13:44:49Z) - Testable Learning with Distribution Shift [9.036777309376697]
We define a new model called testable learning with distribution shift.
We obtain provably efficient algorithms for certifying the performance of a classifier on a test distribution.
We give several positive results for learning concept classes such as halfspaces, intersections of halfspaces, and decision trees.
arXiv Detail & Related papers (2023-11-25T23:57:45Z) - Efficient Testable Learning of Halfspaces with Adversarial Label Noise [44.32410859776227]
In the recently introduced testable learning model, one is required to produce a tester-learner such that if the data passes the tester, then one can trust the output of the robust learner on the data.
Our tester-learner runs in time $\mathrm{poly}(d/\epsilon)$ and outputs a halfspace with misclassification error $O(\mathsf{opt})+\epsilon$, where $\mathsf{opt}$ is the 0-1 error of the best-fitting halfspace.
arXiv Detail & Related papers (2023-03-09T18:38:46Z) - Testing distributional assumptions of learning algorithms [5.204779946147061]
We study the design of tester-learner pairs $(\mathcal{A}, \mathcal{T})$.
We show that if the distribution on examples in the data passes the tester $\mathcal{T}$, then one can safely trust the output of the agnostic learner $\mathcal{A}$ on the data.
arXiv Detail & Related papers (2022-04-14T19:10:53Z) - Test Set Sizing Via Random Matrix Theory [91.3755431537592]
This paper uses techniques from Random Matrix Theory to find the ideal training-testing data split for a simple linear regression.
It defines "ideal" as satisfying an integrity metric: the empirical model error equals the actual measurement noise.
This paper is the first to solve for the training and test size for any model in a way that is truly optimal.
arXiv Detail & Related papers (2021-12-11T13:18:33Z) - Hardness of Learning Halfspaces with Massart Noise [56.98280399449707]
We study the complexity of PAC learning halfspaces in the presence of Massart (bounded) noise.
We show that there is an exponential gap between the information-theoretically optimal error and the best error achievable by an SQ algorithm.
arXiv Detail & Related papers (2020-12-17T16:43:11Z) - Good Classifiers are Abundant in the Interpolating Regime [64.72044662855612]
We develop a methodology to compute precisely the full distribution of test errors among interpolating classifiers.
We find that test errors tend to concentrate around a small typical value $\varepsilon^*$, which deviates substantially from the test error of the worst-case interpolating model.
Our results show that the usual style of analysis in statistical learning theory may not be fine-grained enough to capture the good generalization performance observed in practice.
arXiv Detail & Related papers (2020-06-22T21:12:31Z) - Learning Halfspaces with Tsybakov Noise [50.659479930171585]
We study the learnability of halfspaces in the presence of Tsybakov noise.
We give an algorithm that achieves misclassification error $\epsilon$ with respect to the true halfspace.
arXiv Detail & Related papers (2020-06-11T14:25:02Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.