RATT: Leveraging Unlabeled Data to Guarantee Generalization
- URL: http://arxiv.org/abs/2105.00303v1
- Date: Sat, 1 May 2021 17:05:29 GMT
- Title: RATT: Leveraging Unlabeled Data to Guarantee Generalization
- Authors: Saurabh Garg, Sivaraman Balakrishnan, J. Zico Kolter, Zachary C. Lipton
- Abstract summary: We introduce a method that leverages unlabeled data to produce generalization bounds.
We prove that our bound is valid for 0-1 empirical risk minimization.
This work provides practitioners with an option for certifying the generalization of deep nets even when unseen labeled data is unavailable.
- Score: 96.08979093738024
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: To assess generalization, machine learning scientists typically either (i)
bound the generalization gap and then (after training) plug in the empirical
risk to obtain a bound on the true risk; or (ii) validate empirically on
holdout data. However, (i) typically yields vacuous guarantees for
overparameterized models. Furthermore, (ii) shrinks the training set and its
guarantee erodes with each re-use of the holdout set. In this paper, we
introduce a method that leverages unlabeled data to produce generalization
bounds. After augmenting our (labeled) training set with randomly labeled fresh
examples, we train in the standard fashion. Whenever classifiers achieve low
error on clean data and high error on noisy data, our bound provides a tight
upper bound on the true risk. We prove that our bound is valid for 0-1
empirical risk minimization and with linear classifiers trained by gradient
descent. Our approach is especially useful in conjunction with deep learning
due to the early learning phenomenon whereby networks fit true labels before
noisy labels, though it requires one intuitive assumption. Empirically, on canonical
computer vision and NLP tasks, our bound provides non-vacuous generalization
guarantees that track actual performance closely. This work provides
practitioners with an option for certifying the generalization of deep nets
even when unseen labeled data is unavailable and provides theoretical insights
into the relationship between random label noise and generalization.
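The following is a minimal sketch, in scikit-learn style Python, of the recipe the abstract describes for binary classification: assign uniformly random labels to fresh unlabeled examples, train on the union of clean and randomly labeled data, and read off a bound from the two empirical errors. The bound shown (clean training error plus one minus twice the error on the randomly labeled set) reflects our reading of the abstract with all concentration terms and formal conditions omitted; ratt_bound and its argument names are illustrative, not from the paper.

import numpy as np

def ratt_bound(model, X_clean, y_clean, X_fresh, seed=0):
    # Step 1: assign uniformly random binary labels to fresh unlabeled data.
    rng = np.random.default_rng(seed)
    y_random = rng.integers(0, 2, size=len(X_fresh))

    # Step 2: train in the standard fashion on the augmented training set.
    X_aug = np.concatenate([X_clean, X_fresh])
    y_aug = np.concatenate([y_clean, y_random])
    model.fit(X_aug, y_aug)

    # Step 3: compute empirical errors on the clean and noisy portions.
    err_clean = float(np.mean(model.predict(X_clean) != y_clean))
    err_random = float(np.mean(model.predict(X_fresh) != y_random))

    # If training ignores the noise, err_random is near 1/2 and the bound
    # collapses to the clean training error; if the model memorizes the
    # random labels, err_random is near 0 and the bound exceeds 1 (vacuous).
    return err_clean + 1.0 - 2.0 * err_random

Any classifier exposing fit/predict works here, e.g. sklearn.linear_model.LogisticRegression; the comments mirror the abstract's claim that low error on clean data together with high error on noisy data yields a tight bound.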
Related papers
- An Unbiased Risk Estimator for Partial Label Learning with Augmented Classes [46.663081214928226]
We propose an unbiased risk estimator with theoretical guarantees for PLLAC.
We provide a theoretical analysis of the estimation error bound of PLLAC.
Experiments on benchmark, UCI and real-world datasets demonstrate the effectiveness of the proposed approach.
arXiv Detail & Related papers (2024-09-29T07:36:16Z)
- A Generalized Unbiased Risk Estimator for Learning with Augmented Classes [70.20752731393938]
Given unlabeled data, an unbiased risk estimator (URE) can be derived, which can be minimized for LAC with theoretical guarantees.
We propose a generalized URE that can be equipped with arbitrary loss functions while maintaining the theoretical guarantees.
arXiv Detail & Related papers (2023-06-12T06:52:04Z)
- Testing for Overfitting [0.0]
We discuss the overfitting problem and explain why standard concentration results do not hold for evaluation with training data.
We introduce and argue for a hypothesis test by means of which model performance may be evaluated using training data.
arXiv Detail & Related papers (2023-05-09T22:49:55Z)
- Learning from Multiple Unlabeled Datasets with Partial Risk Regularization [80.54710259664698]
In this paper, we aim to learn an accurate classifier without any class labels.
We first derive an unbiased estimator of the classification risk that can be estimated from the given unlabeled sets.
We then find that the classifier obtained this way tends to overfit, as its empirical risks go negative during training (a failure mode sketched after this entry).
Experiments demonstrate that our method effectively mitigates overfitting and outperforms state-of-the-art methods for learning from multiple unlabeled sets.
arXiv Detail & Related papers (2022-07-04T16:22:44Z)
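The negative-risk failure mode described in the entry above is common to unbiased risk estimators built from unlabeled data, and a standard remedy is to keep partial risk terms from going negative during training. The sketch below shows that generic correction (in the spirit of non-negative PU learning); it is not the paper's exact partial risk regularizer, and partial_risks is a hypothetical list of per-component empirical risk terms.

def non_negative_risk(partial_risks):
    # An unbiased risk estimator can decompose into partial risks that go
    # negative on a finite sample, letting the learner drive the objective
    # below its true minimum and overfit. Clamping each term at zero
    # removes that incentive while leaving nonnegative terms unchanged.
    return sum(max(float(r), 0.0) for r in partial_risks)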
"Benign overfitting", where classifiers memorize noisy training data yet still achieve a good generalization performance, has drawn great attention in the machine learning community.
We show that benign overfitting indeed occurs in adversarial training, a principled approach to defend against adversarial examples.
arXiv Detail & Related papers (2021-12-31T00:27:31Z)
- Risk Minimization from Adaptively Collected Data: Guarantees for Supervised and Policy Learning [57.88785630755165]
Empirical risk minimization (ERM) is the workhorse of machine learning, but its model-agnostic guarantees can fail when we use adaptively collected data.
We study a generic importance sampling weighted ERM algorithm for using adaptively collected data to minimize the average of a loss function over a hypothesis class (see the sketch after this entry).
For policy learning, we provide rate-optimal regret guarantees that close an open gap in the existing literature whenever exploration decays to zero.
arXiv Detail & Related papers (2021-06-03T09:50:13Z)
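As a generic illustration of the importance-sampling-weighted ERM idea in the entry above: when data are collected adaptively, each example arrives with some known propensity, and reweighting its loss by the inverse propensity restores an unbiased estimate of the average risk. This sketch is textbook inverse-propensity weighting, not the paper's exact algorithm or its guarantees; ipw_risk and its argument names are illustrative.

import numpy as np

def ipw_risk(losses, propensities):
    # losses[i]: loss of a candidate hypothesis on example i.
    # propensities[i]: probability with which the adaptive collection
    # policy produced example i; assumed known and bounded away from 0.
    losses = np.asarray(losses, dtype=float)
    propensities = np.asarray(propensities, dtype=float)
    # Inverse-propensity weighting de-biases the empirical average,
    # which unweighted ERM loses under adaptive data collection.
    return float(np.mean(losses / propensities))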
- Certified Robustness to Label-Flipping Attacks via Randomized Smoothing [105.91827623768724]
Machine learning algorithms are susceptible to data poisoning attacks.
We present a unifying view of randomized smoothing over arbitrary functions.
We propose a new strategy for building classifiers that are pointwise-certifiably robust to general data poisoning attacks (see the sketch after this entry).
arXiv Detail & Related papers (2020-02-07T21:28:30Z)
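To make the randomized-smoothing strategy in the entry above concrete, here is a schematic Monte Carlo version for binary labels: train many classifiers on copies of the data whose labels are independently flipped, then predict by majority vote, so that a bounded number of adversarial label flips cannot easily change a prediction with a large vote margin. The paper's pointwise certificates come from analyzing that margin (and can be computed efficiently for some models); this sketch shows only the voting mechanism, and train_fn is a hypothetical callable returning a fitted model.

import numpy as np

def smoothed_predict(train_fn, X, y, x_test, n_models=50, flip_prob=0.1, seed=0):
    rng = np.random.default_rng(seed)
    votes = np.zeros(2, dtype=int)
    for _ in range(n_models):
        # Independently flip each binary training label with prob. flip_prob.
        flips = rng.random(len(y)) < flip_prob
        y_noisy = np.where(flips, 1 - y, y)
        # Train a fresh classifier on the randomized labels and record
        # its vote on the test point.
        model = train_fn(X, y_noisy)
        votes[int(model.predict(x_test.reshape(1, -1))[0])] += 1
    # Majority vote over the randomized classifiers.
    return int(np.argmax(votes))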
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.