A Characterization of Semi-Supervised Adversarially-Robust PAC Learnability
- URL: http://arxiv.org/abs/2202.05420v3
- Date: Sun, 5 May 2024 20:00:33 GMT
- Title: A Characterization of Semi-Supervised Adversarially-Robust PAC Learnability
- Authors: Idan Attias, Steve Hanneke, Yishay Mansour
- Abstract summary: We study the problem of learning a predictor that is adversarially robust to test-time attacks in the semi-supervised PAC model.
We show that there is a significant benefit in semi-supervised robust learning even in the worst-case distribution-free model.
- Score: 57.502573663108535
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We study the problem of learning a predictor that is adversarially robust to test-time attacks in the semi-supervised PAC model. We address the question of how many labeled and unlabeled examples are required to ensure learning. We show that, given enough unlabeled data (on the order of the labeled sample size a fully supervised method would require), the labeled sample complexity can be arbitrarily smaller than in previous works, and is sharply characterized by a different complexity measure. We prove nearly matching upper and lower bounds on this sample complexity. This shows that there is a significant benefit to semi-supervised robust learning even in the worst-case distribution-free model, and establishes a gap between the supervised and semi-supervised label complexities that is known not to hold in standard non-robust PAC learning.
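As a rough illustration of the shape of such a gap (an illustrative sketch only; $d_{\mathrm{sup}}$ and $d_{\mathrm{semi}}$ stand in for generic supervised and semi-supervised complexity measures and are not the measures defined in the paper):

```latex
% Illustrative shape of the claimed gap, not the paper's exact theorem:
% a fully supervised robust learner needs m_sup labeled examples, while
% the semi-supervised learner needs roughly m_sup UNLABELED examples but
% only m_lab labeled ones, where d_semi can be arbitrarily smaller than d_sup.
\[
  m_{\mathrm{sup}}(\epsilon,\delta)
    = \Theta\!\left(\frac{d_{\mathrm{sup}} + \log(1/\delta)}{\epsilon}\right),
  \qquad
  m_{\mathrm{lab}}(\epsilon,\delta)
    = \Theta\!\left(\frac{d_{\mathrm{semi}} + \log(1/\delta)}{\epsilon}\right).
\]
```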
Related papers
- Transductive Learning Is Compact [10.168670899305232]
We show a compactness result that holds broadly across supervised learning with a general class of loss functions.
For realizable learning with improper metric losses, we show that exact compactness of sample complexity can fail.
We conjecture that larger gaps are possible for the agnostic case.
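Schematically, a compactness statement of this flavor says that learning the whole class is exactly as hard as learning its hardest finite subclass (an illustrative paraphrase, not the paper's precise theorem; $m_{\mathcal{H}}$ denotes the sample complexity of learning $\mathcal{H}$):

```latex
% Schematic compactness of sample complexity; exactness can fail in the
% improper-metric-loss settings mentioned in the abstract.
\[
  m_{\mathcal{H}}(\epsilon,\delta)
  \;=\;
  \sup_{\substack{\mathcal{H}' \subseteq \mathcal{H} \\ |\mathcal{H}'| < \infty}}
  m_{\mathcal{H}'}(\epsilon,\delta).
\]
```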
arXiv Detail & Related papers (2024-02-15T23:10:45Z) - Virtual Category Learning: A Semi-Supervised Learning Method for Dense Prediction with Extremely Limited Labels [63.16824565919966]
This paper proposes to use confusing samples proactively without label correction.
A Virtual Category (VC) is assigned to each confusing sample in such a way that it can safely contribute to the model optimisation.
Our intriguing findings highlight the usage of VC learning in dense vision tasks.
arXiv Detail & Related papers (2023-12-02T16:23:52Z) - MaxMatch: Semi-Supervised Learning with Worst-Case Consistency [149.03760479533855]
We propose a worst-case consistency regularization technique for semi-supervised learning (SSL).
We present a generalization bound for SSL consisting of the empirical loss terms observed on labeled and unlabeled training data separately.
Motivated by this bound, we derive an SSL objective that minimizes the largest inconsistency between an original unlabeled sample and its multiple augmented variants.
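A minimal sketch of what such a worst-case consistency term might look like (hypothetical PyTorch code, not the authors' implementation; `model` and `augment` are assumed to be supplied by the caller):

```python
import torch
import torch.nn.functional as F

def worst_case_consistency_loss(model, x, augment, num_augs=4):
    """Penalize the largest divergence between the prediction on an
    unlabeled batch x and predictions on several augmented views.
    Hypothetical sketch of worst-case consistency, not the paper's code."""
    with torch.no_grad():
        # Target distribution from the original (unaugmented) input.
        target = F.softmax(model(x), dim=-1)
    divergences = []
    for _ in range(num_augs):
        log_probs = F.log_softmax(model(augment(x)), dim=-1)
        # Per-example KL(target || augmented prediction).
        kl = F.kl_div(log_probs, target, reduction="none").sum(dim=-1)
        divergences.append(kl)
    # Worst case over views, averaged over the batch.
    return torch.stack(divergences).max(dim=0).values.mean()
```

Taking the max rather than the mean over augmented views is what makes the penalty "worst-case": training pressure concentrates on the most inconsistent view.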
arXiv Detail & Related papers (2022-09-26T12:04:49Z) - Complementing Semi-Supervised Learning with Uncertainty Quantification [6.612035830987296]
We propose a novel unsupervised uncertainty-aware objective that relies on aleatoric and epistemic uncertainty quantification.
Our method outperforms the state of the art on complex datasets such as CIFAR-100 and Mini-ImageNet.
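One standard way to separate the two uncertainty types is the entropy decomposition over stochastic forward passes (a generic sketch, e.g. for MC dropout, not necessarily the paper's exact formulation):

```python
import numpy as np

def uncertainty_decomposition(mc_probs):
    """Split predictive uncertainty into aleatoric and epistemic parts.
    mc_probs: array of shape (T, K), softmax outputs of T stochastic
    forward passes (e.g. MC dropout) over K classes for a single input.
    Returns (total, aleatoric, epistemic), all in nats."""
    eps = 1e-12
    mean_p = mc_probs.mean(axis=0)
    # Total uncertainty: entropy of the averaged predictive distribution.
    total = -(mean_p * np.log(mean_p + eps)).sum()
    # Aleatoric part: average entropy of the individual predictions.
    aleatoric = -(mc_probs * np.log(mc_probs + eps)).sum(axis=1).mean()
    # Epistemic part: the gap (mutual information between prediction and weights).
    return total, aleatoric, total - aleatoric
```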
arXiv Detail & Related papers (2022-07-22T00:15:02Z) - An analysis of over-sampling labeled data in semi-supervised learning with FixMatch [66.34968300128631]
Most semi-supervised learning methods over-sample labeled data when constructing training mini-batches.
This paper studies whether this common practice improves learning and how.
We compare it to an alternative setting where each mini-batch is uniformly sampled from all the training data, labeled or not.
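The two mini-batch constructions under comparison can be sketched as follows (a hypothetical illustration; the batch sizes are placeholders):

```python
import random

def oversampled_batch(labeled, unlabeled, n_lab=64, n_unlab=448):
    """Common SSL practice: each mini-batch reserves a fixed quota for
    labeled examples, so labeled data is revisited far more often."""
    return (random.choices(labeled, k=n_lab),
            random.choices(unlabeled, k=n_unlab))

def uniform_batch(labeled, unlabeled, batch_size=512):
    """The alternative studied here: sample uniformly from the pooled data,
    so each batch's labeled fraction matches the dataset's overall share."""
    pool = [(x, True) for x in labeled] + [(x, False) for x in unlabeled]
    drawn = random.choices(pool, k=batch_size)
    return ([x for x, is_labeled in drawn if is_labeled],
            [x for x, is_labeled in drawn if not is_labeled])
```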
arXiv Detail & Related papers (2022-01-03T12:22:26Z) - Unsupervised Learning of Debiased Representations with Pseudo-Attributes [85.5691102676175]
We propose a simple but effective debiasing technique in an unsupervised manner.
We perform clustering in the feature embedding space and identify pseudo-attributes from the clustering results.
We then employ a novel cluster-based reweighting scheme for learning debiased representations.
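A minimal sketch of this recipe (hypothetical code using scikit-learn's KMeans; inverse-cluster-size weighting is one plausible choice, not necessarily the paper's exact scheme):

```python
import numpy as np
from sklearn.cluster import KMeans

def cluster_reweight(embeddings, n_clusters=8):
    """Cluster feature embeddings into pseudo-attributes, then upweight
    samples from small clusters (which tend to be bias-conflicting).
    Illustrative sketch only; the paper's weighting may differ."""
    labels = KMeans(n_clusters=n_clusters, n_init=10).fit_predict(embeddings)
    counts = np.bincount(labels, minlength=n_clusters)
    weights = 1.0 / counts[labels]
    # Normalize so the weights average to 1 over the dataset.
    return labels, weights * len(weights) / weights.sum()
```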
arXiv Detail & Related papers (2021-08-06T05:20:46Z) - Semi-verified PAC Learning from the Crowd [7.594050968868919]
We study the problem of crowdsourced PAC learning of threshold functions.
We show that under the semi-verified model of Charikar et al., it is possible to PAC learn the underlying hypothesis class with a manageable number of label queries.
arXiv Detail & Related papers (2021-06-13T20:05:16Z) - Robust learning under clean-label attack [26.323812778809913]
We study the problem of robust learning under clean-label data-poisoning attacks.
The learning goal is to minimize the attackable rate, which is more difficult than optimal PAC learning.
arXiv Detail & Related papers (2021-03-01T00:21:15Z) - Multi-Complementary and Unlabeled Learning for Arbitrary Losses and Models [6.177038245239757]
We propose a novel multi-complementary and unlabeled learning framework.
We first give an unbiased estimator of the classification risk from samples with multiple complementary labels.
We then further improve the estimator by incorporating unlabeled samples into the risk formulation.
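For intuition, in the simplest case of a single, uniformly drawn complementary label $\bar{y}$ (a class the sample does not belong to), the classification risk admits a classical unbiased rewriting (shown here as background; the paper's estimator generalizes it to multiple complementary labels and unlabeled data):

```latex
% Unbiased risk rewriting for one uniformly drawn complementary label,
% which follows from p(y = k | x) = 1 - (K-1) p(ybar = k | x):
\[
  R(f) \;=\; \mathbb{E}_{(x,\bar{y})}\!\left[
      \sum_{k=1}^{K} \ell\bigl(f(x), k\bigr)
      \;-\; (K-1)\,\ell\bigl(f(x), \bar{y}\bigr)
  \right].
\]
```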
arXiv Detail & Related papers (2020-01-13T13:52:54Z) - The Simulator: Understanding Adaptive Sampling in the Moderate-Confidence Regime [52.38455827779212]
We propose a novel technique for analyzing adaptive sampling, called the Simulator.
We prove the first instance-based lower bounds for the top-k problem that incorporate the appropriate log factors.
Our new analysis inspires a simple and near-optimal algorithm for best-arm and top-k identification, the first practical algorithm of its kind for the latter problem.
arXiv Detail & Related papers (2017-02-16T23:42:02Z)