A Moment-Matching Approach to Testable Learning and a New
Characterization of Rademacher Complexity
- URL: http://arxiv.org/abs/2211.13312v1
- Date: Wed, 23 Nov 2022 21:29:51 GMT
- Title: A Moment-Matching Approach to Testable Learning and a New
Characterization of Rademacher Complexity
- Authors: Aravind Gollakota, Adam R. Klivans, Pravesh K. Kothari
- Abstract summary: We give a powerful new approach for developing algorithms for testable learning using tools from moment matching and metric distances in probability.
Surprisingly, we show that the information-theoretic sample complexity of testable learning is tightly characterized by the Rademacher complexity of the concept class.
- Score: 15.746613321517282
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: A remarkable recent paper by Rubinfeld and Vasilyan (2022) initiated the
study of \emph{testable learning}, where the goal is to replace hard-to-verify
distributional assumptions (such as Gaussianity) with efficiently testable ones
and to require that the learner succeed whenever the unknown distribution
passes the corresponding test. In this model, they gave an efficient algorithm
for learning halfspaces under testable assumptions that are provably satisfied
by Gaussians.
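As a concrete illustration of what a "testable assumption" can look like (purely a sketch, not the tester from either paper; the function names, thresholds, and sample sizes below are hypothetical), one can check that all low-degree empirical moments of the sample are close to those of a standard Gaussian, and only demand that the learner succeed when this check passes:

```python
import numpy as np
from itertools import combinations_with_replacement

def gaussian_monomial_moment(idx):
    """Exact E[prod_j x_{idx_j}] for x ~ N(0, I_d): the moment factorizes over
    coordinates, and E[z^p] is 0 for odd p and (p - 1)!! for even p."""
    moment = 1.0
    for coord in set(idx):
        p = list(idx).count(coord)
        if p % 2 == 1:
            return 0.0
        moment *= float(np.prod(np.arange(p - 1, 0, -2)))  # (p - 1)!!
    return moment

def low_degree_moment_test(samples, degree=4, tol=0.2):
    """Accept iff every empirical monomial moment of degree <= `degree`
    is within `tol` of its value under the standard Gaussian."""
    n, d = samples.shape
    for k in range(1, degree + 1):
        for idx in combinations_with_replacement(range(d), k):
            empirical = float(np.mean(np.prod(samples[:, list(idx)], axis=1)))
            if abs(empirical - gaussian_monomial_moment(idx)) > tol:
                return False  # test rejects: the learner is not required to succeed
    return True  # test accepts: the learner must output a near-optimal hypothesis

# Samples actually drawn from a standard Gaussian should (usually) pass.
rng = np.random.default_rng(0)
print(low_degree_moment_test(rng.standard_normal((50000, 3))))
```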
In this paper we give a powerful new approach for developing algorithms for
testable learning using tools from moment matching and metric distances in
probability. We obtain efficient testable learners for any concept class that
admits low-degree \emph{sandwiching polynomials}, capturing most important
examples for which we have ordinary agnostic learners. We recover the results
of Rubinfeld and Vasilyan as a corollary of our techniques while achieving
improved, near-optimal sample complexity bounds for a broad range of concept
classes and distributions.
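For orientation, the standard notion of sandwiching polynomials from the pseudorandomness and agnostic-learning literature (the paper's exact parameters may differ): a concept $f$ admits degree-$k$ $\varepsilon$-sandwiching polynomials with respect to a distribution $D$ if there are degree-$k$ polynomials $p_{\mathrm{down}}, p_{\mathrm{up}}$ with
\[
  p_{\mathrm{down}}(x) \le f(x) \le p_{\mathrm{up}}(x) \ \ \text{for all } x,
  \qquad
  \mathbb{E}_{x \sim D}\big[\, p_{\mathrm{up}}(x) - p_{\mathrm{down}}(x) \,\big] \le \varepsilon .
\]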
Surprisingly, we show that the information-theoretic sample complexity of
testable learning is tightly characterized by the Rademacher complexity of the
concept class, one of the most well-studied measures in statistical learning
theory. In particular, uniform convergence is necessary and sufficient for
testable learning. This leads to a fundamental separation from (ordinary)
distribution-specific agnostic learning, where uniform convergence is
sufficient but not necessary.
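For reference, the Rademacher complexity of a concept class $\mathcal{C}$ over $m$ samples from $D$ is the standard quantity
\[
  \mathcal{R}_m(\mathcal{C}) \;=\; \mathbb{E}_{x_1,\dots,x_m \sim D,\ \sigma \sim \{\pm 1\}^m}
  \Big[ \sup_{f \in \mathcal{C}} \frac{1}{m} \sum_{i=1}^m \sigma_i f(x_i) \Big],
\]
where the $\sigma_i$ are independent uniform random signs; normalization conventions vary slightly across texts.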
Related papers
- Probably Approximately Precision and Recall Learning [62.912015491907994]
Precision and Recall are foundational metrics in machine learning.
One-sided feedback--where only positive examples are observed during training--is inherent in many practical problems.
We introduce a PAC learning framework where each hypothesis is represented by a graph, with edges indicating positive interactions.
arXiv Detail & Related papers (2024-11-20T04:21:07Z) - Probabilistic Contrastive Learning for Long-Tailed Visual Recognition [78.70453964041718]
Long-tailed distributions frequently emerge in real-world data, where a large number of minority categories contain a limited number of samples.
Recent investigations have revealed that supervised contrastive learning exhibits promising potential in alleviating the data imbalance.
We propose a novel probabilistic contrastive (ProCo) learning algorithm that estimates the data distribution of the samples from each class in the feature space.
arXiv Detail & Related papers (2024-03-11T13:44:49Z) - Collaborative Learning with Different Labeling Functions [7.228285747845779]
We study a variant of Collaborative PAC Learning, in which we aim to learn an accurate classifier for each of the $n$ data distributions.
We show that, when the data distributions satisfy a weaker realizability assumption, sample-efficient learning is still feasible.
arXiv Detail & Related papers (2024-02-16T04:32:22Z) - Optimal Multi-Distribution Learning [88.3008613028333]
Multi-distribution learning seeks to learn a shared model that minimizes the worst-case risk across $k$ distinct data distributions.
We propose a novel algorithm that yields an $\varepsilon$-optimal randomized hypothesis with a sample complexity on the order of $(d+k)/\varepsilon^2$.
arXiv Detail & Related papers (2023-12-08T16:06:29Z) - On-Demand Sampling: Learning Optimally from Multiple Distributions [63.20009081099896]
Social and real-world considerations have given rise to multi-distribution learning paradigms.
We establish the optimal sample complexity of these learning paradigms and give algorithms that meet this sample complexity.
Our algorithm design and analysis are enabled by our extensions of online learning techniques for solving zero-sum games.
arXiv Detail & Related papers (2022-10-22T19:07:26Z) - Pairwise Learning via Stagewise Training in Proximal Setting [0.0]
We combine adaptive sample size and importance sampling techniques for pairwise learning, with convergence guarantees for nonsmooth convex pairwise loss functions.
We demonstrate that sampling opposite instances at each iteration reduces the variance of the gradient, hence accelerating convergence.
arXiv Detail & Related papers (2022-08-08T11:51:01Z) - A Boosting Approach to Reinforcement Learning [59.46285581748018]
We study efficient algorithms for reinforcement learning in decision processes whose complexity is independent of the number of states.
We give an efficient algorithm that is capable of improving the accuracy of such weak learning methods.
arXiv Detail & Related papers (2021-08-22T16:00:45Z) - Effective Proximal Methods for Non-convex Non-smooth Regularized
Learning [27.775096437736973]
We show that the independent sampling scheme tends to improve upon the performance of the commonly used uniform sampling scheme.
Our new analysis also derives a faster convergence speed for independent sampling than the best one previously available.
arXiv Detail & Related papers (2020-09-14T16:41:32Z) - Good Classifiers are Abundant in the Interpolating Regime [64.72044662855612]
We develop a methodology to compute precisely the full distribution of test errors among interpolating classifiers.
We find that test errors tend to concentrate around a small typical value $\varepsilon^*$, which deviates substantially from the test error of the worst-case interpolating model.
Our results show that the usual style of analysis in statistical learning theory may not be fine-grained enough to capture the good generalization performance observed in practice.
arXiv Detail & Related papers (2020-06-22T21:12:31Z)