Do PAC-Learners Learn the Marginal Distribution?
- URL: http://arxiv.org/abs/2302.06285v1
- Date: Mon, 13 Feb 2023 11:42:58 GMT
- Title: Do PAC-Learners Learn the Marginal Distribution?
- Authors: Max Hopkins, Daniel M. Kane, Shachar Lovett, Gaurav Mahajan
- Abstract summary: We study a variant of PAC-Learning in which the adversary is restricted to a known family of marginal distributions $\mathscr{P}$.
In the distribution-free setting, we show that TV-learning is \emph{equivalent} to PAC-Learning.
- Score: 24.80812483480747
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We study a foundational variant of Valiant and Vapnik and Chervonenkis'
Probably Approximately Correct (PAC)-Learning in which the adversary is
restricted to a known family of marginal distributions $\mathscr{P}$. In
particular, we study how the PAC-learnability of a triple $(\mathscr{P},X,H)$
relates to the learner's ability to infer \emph{distributional} information
about the adversary's choice of $D \in \mathscr{P}$. To this end, we introduce
the `unsupervised' notion of \emph{TV-Learning}, which, given a class
$(\mathscr{P},X,H)$, asks the learner to approximate $D$ from unlabeled samples
with respect to a natural class-conditional total variation metric.
In the classical distribution-free setting, we show that TV-learning is
\emph{equivalent} to PAC-Learning: in other words, any learner must infer
near-maximal information about $D$. On the other hand, we show this
characterization breaks down for general $\mathscr{P}$, where PAC-Learning is
strictly sandwiched between two approximate variants we call `Strong' and
`Weak' TV-learning, roughly corresponding to unsupervised learners that
estimate most relevant distances in $D$ with respect to $H$, but differ in
whether the learner \emph{knows} the set of well-estimated events. Finally, we
observe that TV-learning is in fact equivalent to the classical notion of
\emph{uniform estimation}, and thereby give a strong refutation of the uniform
convergence paradigm in supervised learning.
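For intuition, the class-conditional total variation metric above can be read as a supremum over events the hypothesis class can distinguish. The display below is an illustrative simplification (the paper's exact metric may range over a richer family of $H$-definable events):
$$d_{\mathrm{TV}}^{H}(D, D') \;=\; \sup_{h \in H} \big|\Pr_{x \sim D}[h(x) = 1] - \Pr_{x \sim D'}[h(x) = 1]\big| \;\le\; d_{\mathrm{TV}}(D, D').$$
Under this reading, TV-learning asks for an estimate $D'$ that no single hypothesis in $H$ can distinguish from $D$ by more than $\varepsilon$, a weaker requirement than approximating $D$ in full total variation.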
Related papers
- On Characterizing and Mitigating Imbalances in Multi-Instance Partial Label Learning [57.18649648182171]
We make contributions towards addressing a problem that has not previously been studied in the context of multi-instance partial label learning (MI-PLL).
We derive class-specific risk bounds for MI-PLL, while making minimal assumptions.
Our theory reveals a unique phenomenon: that $\sigma$ can greatly impact learning imbalances.
arXiv Detail & Related papers (2024-07-13T20:56:34Z) - Fast Rates for Bandit PAC Multiclass Classification [73.17969992976501]
We study multiclass PAC learning with bandit feedback, where inputs are classified into one of $K$ possible labels and feedback is limited to whether or not the predicted labels are correct (a concrete sketch of this protocol appears after the list below).
Our main contribution is in designing a novel learning algorithm for the agnostic $(\varepsilon,\delta)$-PAC version of the problem.
arXiv Detail & Related papers (2024-06-18T08:54:04Z) - Sharp Rates in Dependent Learning Theory: Avoiding Sample Size Deflation for the Square Loss [33.18537822803389]
We show that whenever the topologies of $L^2$ and $\Psi_p$ are comparable on our hypothesis class $\mathscr{F}$, $\mathscr{F}$ is a weakly sub-Gaussian class.
Our result holds whether the problem is realizable or not, and we refer to this as a \emph{near mixing-free rate}, since direct dependence on mixing is relegated to an additive higher-order term.
arXiv Detail & Related papers (2024-02-08T18:57:42Z) - The Sample Complexity of Multi-Distribution Learning for VC Classes [25.73730126599202]
Multi-distribution learning is a generalization of PAC learning to settings with multiple data distributions.
There remains a significant gap between the known upper and lower bounds for PAC-learnable classes.
We discuss recent progress on this problem and some hurdles that are fundamental to the use of game dynamics in statistical learning.
arXiv Detail & Related papers (2023-07-22T18:02:53Z) - On Optimal Learning Under Targeted Data Poisoning [48.907813854832206]
In this work we aim to characterize the smallest achievable error $\epsilon=\epsilon(\eta)$ by the learner in the presence of such a data-poisoning adversary.
Remarkably, we show that the upper bound can be attained by a deterministic learner.
arXiv Detail & Related papers (2022-10-06T06:49:48Z) - Cryptographic Hardness of Learning Halfspaces with Massart Noise [59.8587499110224]
We study the complexity of PAC learning halfspaces in the presence of Massart noise (the noise model is recalled after this list).
We show that no polynomial-time Massart halfspace learner can achieve error better than $\Omega(\eta)$, even if the optimal 0-1 error is small.
arXiv Detail & Related papers (2022-07-28T17:50:53Z) - Provable Robustness of Adversarial Training for Learning Halfspaces with Noise [95.84614821570283]
We analyze the properties of adversarial training for learning adversarially robust halfspaces in the presence of label noise.
To the best of our knowledge, this is the first work to show that adversarial training provably yields robust classifiers in the presence of noise.
arXiv Detail & Related papers (2021-04-19T16:35:38Z) - On Agnostic PAC Learning using $\mathcal{L}_2$-polynomial Regression and Fourier-based Algorithms [10.66048003460524]
We develop a framework using Hilbert spaces as a proxy to analyze PAC learning problems with structural properties.
We demonstrate that PAC learning with 0-1 loss is equivalent to an optimization in the Hilbert space domain.
arXiv Detail & Related papers (2021-02-11T21:28:55Z) - Hardness of Learning Halfspaces with Massart Noise [56.98280399449707]
We study the complexity of PAC learning halfspaces in the presence of Massart (bounded) noise.
We show that there is an exponential gap between the information-theoretically optimal error and the best error that can be achieved by an SQ algorithm.
arXiv Detail & Related papers (2020-12-17T16:43:11Z) - Reducing Adversarially Robust Learning to Non-Robust PAC Learning [39.51923275855131]
We give a reduction that can robustly learn any hypothesis class using any non-robust learner.
The number of calls to the non-robust learner $\mathcal{A}$ depends logarithmically on the number of allowed adversarial perturbations per example.
arXiv Detail & Related papers (2020-10-22T20:28:35Z)
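For reference, the Massart (bounded) noise model appearing in the halfspace entries above admits the following standard formulation (stated here for orientation, not quoted from either paper): for a target concept $f$ and noise bound $\eta < \frac{1}{2}$, each labeled example $(x, y)$ satisfies
$$\Pr[\,y \neq f(x) \mid x\,] = \eta(x) \le \eta < \tfrac{1}{2},$$
where the pointwise flip rate $\eta(x)$ may be chosen adversarially, subject only to the uniform bound $\eta$.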
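Likewise, the bandit-feedback protocol from the multiclass PAC entry is easy to make concrete. The sketch below is a minimal illustration under assumed names (bandit_round, predict); it is not code from the paper:

```python
import random

def bandit_round(x, true_label, predict, K):
    """One round of multiclass learning with bandit feedback:
    the learner predicts a label in {0, ..., K-1} and observes
    only whether the prediction was correct, never the true label."""
    y_hat = predict(x)
    return y_hat, (y_hat == true_label)  # binary (bandit) feedback

# Illustrative usage: a uniformly random predictor over K = 5 labels.
K = 5
random_predictor = lambda x: random.randrange(K)
y_hat, correct = bandit_round(x=0.7, true_label=3, predict=random_predictor, K=K)
print(f"predicted {y_hat}; feedback: {'correct' if correct else 'incorrect'}")
```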
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences arising from its use.