Realizable Learning is All You Need
- URL: http://arxiv.org/abs/2111.04746v4
- Date: Sat, 3 Feb 2024 00:55:16 GMT
- Title: Realizable Learning is All You Need
- Authors: Max Hopkins, Daniel M. Kane, Shachar Lovett, Gaurav Mahajan
- Abstract summary: The equivalence of realizable and agnostic learnability is a fundamental phenomenon in learning theory.
We give the first model-independent framework explaining the equivalence of realizable and agnostic learnability.
- Score: 21.34668631009594
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The equivalence of realizable and agnostic learnability is a fundamental
phenomenon in learning theory. With variants ranging from classical settings
like PAC learning and regression to recent trends such as adversarially robust
learning, it's surprising that we still lack a unified theory; traditional
proofs of the equivalence tend to be disparate, and rely on strong
model-specific assumptions like uniform convergence and sample compression.
In this work, we give the first model-independent framework explaining the
equivalence of realizable and agnostic learnability: a three-line blackbox
reduction that simplifies, unifies, and extends our understanding across a wide
variety of settings. This includes models with no known characterization of
learnability such as learning with arbitrary distributional assumptions and
more general loss functions, as well as a host of other popular settings such
as robust learning, partial learning, fair learning, and the statistical query
model.
More generally, we argue that the equivalence of realizable and agnostic
learning is actually a special case of a broader phenomenon we call property
generalization: any desirable property of a learning algorithm (e.g. noise
tolerance, privacy, stability) that can be satisfied over finite hypothesis
classes extends (possibly in some variation) to any learnable hypothesis class.
Related papers
- Credal Learning Theory [4.64390130376307]
We lay the foundations for a 'credal' theory of learning, using convex sets of probabilities to model the variability in the data-generating distribution.
Bounds are derived for the case of finite hypothesis spaces, as well as infinite model spaces, which directly generalize classical results.
arXiv Detail & Related papers (2024-02-01T19:25:58Z)
- On Continuity of Robust and Accurate Classifiers [3.8673630752805437]
It has been shown that adversarial training can improve the robustness of the hypothesis.
It has been suggested that robustness and accuracy of a hypothesis are at odds with each other.
In this paper, we put forth the alternative proposal that it is the continuity of a hypothesis that is incompatible with its robustness and accuracy.
arXiv Detail & Related papers (2023-09-29T08:14:25Z)
- On the Joint Interaction of Models, Data, and Features [82.60073661644435]
We introduce a new tool, the interaction tensor, for empirically analyzing the interaction between data and model through features.
Based on these observations, we propose a conceptual framework for feature learning.
Under this framework, the expected accuracy for a single hypothesis and agreement for a pair of hypotheses can both be derived in closed-form.
arXiv Detail & Related papers (2023-06-07T21:35:26Z)
- Learnability, Sample Complexity, and Hypothesis Class Complexity for Regression Models [10.66048003460524]
This work is inspired by the foundations of PAC learning and is motivated by existing issues in regression learning.
The proposed approach, denoted epsilon-Confidence Approximately Correct (epsilon-CoAC), utilizes the Kullback-Leibler divergence (relative entropy).
It enables the learner to compare hypothesis classes of different complexity orders and to choose among them the optimum with the minimum epsilon.
arXiv Detail & Related papers (2023-03-28T15:59:12Z)
- A Low Rank Promoting Prior for Unsupervised Contrastive Learning [108.91406719395417]
We construct a novel probabilistic graphical model that effectively incorporates the low rank promoting prior into the framework of contrastive learning.
Our hypothesis explicitly requires that all the samples belonging to the same instance class lie on the same subspace with small dimension.
Empirical evidence shows that the proposed algorithm clearly surpasses state-of-the-art approaches on multiple benchmarks.
arXiv Detail & Related papers (2021-08-05T15:58:25Z)
- Demystification of Few-shot and One-shot Learning [63.58514532659252]
Few-shot and one-shot learning have been the subject of active and intensive research in recent years.
We show that if the ambient or latent decision space of a learning machine is sufficiently high-dimensional, then a large class of objects in this space can indeed be easily learned from few examples.
arXiv Detail & Related papers (2021-04-25T14:47:05Z)
- Constrained Learning with Non-Convex Losses [119.8736858597118]
Though learning has become a core technology of modern information processing, there is now ample evidence that it can lead to biased, unsafe, and prejudiced solutions.
arXiv Detail & Related papers (2021-03-08T23:10:33Z)
- Don't Just Blame Over-parametrization for Over-confidence: Theoretical Analysis of Calibration in Binary Classification [58.03725169462616]
We show theoretically that over-parametrization is not the only reason for over-confidence.
We prove that logistic regression is inherently over-confident, in the realizable, under-parametrized setting.
Perhaps surprisingly, we also show that over-confidence is not always the case.
arXiv Detail & Related papers (2021-02-15T21:38:09Z)
- A Theory of Universal Learning [26.51949485387526]
We show that there are only three possible rates of universal learning.
We show that the learning curves of any given concept class decay at either an exponential, a linear, or an arbitrarily slow rate.
arXiv Detail & Related papers (2020-11-09T15:10:32Z)
- Pairwise Supervision Can Provably Elicit a Decision Boundary [84.58020117487898]
Similarity learning is the problem of eliciting useful representations by predicting the relationship between a pair of patterns.
We show that similarity learning is capable of solving binary classification by directly eliciting a decision boundary.
arXiv Detail & Related papers (2020-06-11T05:35:16Z)
- On Learnability under General Stochastic Processes [20.22409095000365]
Statistical learning under general non-i.i.d. processes is less well understood.
We provide two natural notions of learnability of a function class under a general process.
Our results hold for both binary classification and regression.
arXiv Detail & Related papers (2020-05-15T15:49:23Z)