Beyond Perturbations: Learning Guarantees with Arbitrary Adversarial
Test Examples
- URL: http://arxiv.org/abs/2007.05145v3
- Date: Wed, 30 Sep 2020 04:26:27 GMT
- Title: Beyond Perturbations: Learning Guarantees with Arbitrary Adversarial
Test Examples
- Authors: Shafi Goldwasser, Adam Tauman Kalai, Yael Tauman Kalai, and Omar
Montasser
- Abstract summary: We give nontrivial guarantees for learning classes of bounded VC dimension with arbitrary train and test distributions.
Our algorithm is efficient given an Empirical Risk Minimizer (ERM) for $C$.
- Score: 17.11261628874086
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We present a transductive learning algorithm that takes as input training
examples from a distribution $P$ and arbitrary (unlabeled) test examples,
possibly chosen by an adversary. This is unlike prior work that assumes that
test examples are small perturbations of $P$. Our algorithm outputs a selective
classifier, which abstains from predicting on some examples. By considering
selective transductive learning, we give the first nontrivial guarantees for
learning classes of bounded VC dimension with arbitrary train and test
distributions---no prior guarantees were known even for simple classes of
functions such as intervals on the line. In particular, for any function in a
class $C$ of bounded VC dimension, we guarantee a low test error rate and a low
rejection rate with respect to $P$. Our algorithm is efficient given an
Empirical Risk Minimizer (ERM) for $C$. Our guarantees hold even for test
examples chosen by an unbounded white-box adversary. We also give guarantees
for generalization, agnostic, and unsupervised settings.
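To make the selective-prediction idea concrete, below is a minimal sketch, not the paper's exact algorithm or guarantee, of disagreement-based abstention for a toy class of threshold functions on the line: predict with an ERM hypothesis, and abstain on any test point where some near-optimal hypothesis in $C$ disagrees. The threshold class, the `slack` tolerance, and all function names are illustrative assumptions.

```python
import numpy as np

def erm_threshold(x, y):
    """ERM over thresholds h_t(x) = 1[x >= t]: the t minimizing training error."""
    candidates = np.concatenate(([-np.inf], np.sort(x)))
    errors = [np.mean((x >= t).astype(int) != y) for t in candidates]
    return candidates[int(np.argmin(errors))]

def selective_predict(x_train, y_train, x_test, slack=0.05):
    """Predict with the ERM threshold; abstain (None) wherever a threshold
    whose training error is within `slack` of optimal would disagree."""
    t_hat = erm_threshold(x_train, y_train)
    opt_err = np.mean((x_train >= t_hat).astype(int) != y_train)
    candidates = np.concatenate(([-np.inf], np.sort(x_train)))
    near_opt = [t for t in candidates
                if np.mean((x_train >= t).astype(int) != y_train) <= opt_err + slack]
    preds = []
    for x in x_test:
        votes = {int(x >= t) for t in near_opt}  # predictions of near-optimal hypotheses
        preds.append(votes.pop() if len(votes) == 1 else None)  # None = abstain
    return preds

rng = np.random.default_rng(0)
x_train = rng.uniform(-1, 1, 200)
y_train = (x_train >= 0.3).astype(int)
x_test = np.array([-0.9, 0.299, 0.9])               # an adversary may target the boundary
print(selective_predict(x_train, y_train, x_test))  # e.g. [0, None, 1]
```

On test points drawn from $P$, near-optimal hypotheses rarely disagree, so the rejection rate stays low, while adversarial points pushed toward the decision boundary are rejected rather than mislabeled, mirroring in spirit the low-test-error, low-rejection-rate guarantee above.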
Related papers
- Distribution-Free Sequential Prediction with Abstentions [7.110627876507878]
We study a sequential prediction problem in which an adversary is allowed to inject arbitrarily many adversarial instances into a stream of i.i.d. instances.
At each round, the learner may also abstain from making a prediction, without incurring any penalty if the instance was indeed corrupted.
We propose AbstainBoost, based on a boosting procedure of weak learners, which guarantees sublinear error for general VC classes.
arXiv Detail & Related papers (2026-02-20T00:28:27Z) - Probably Approximately Precision and Recall Learning [62.912015491907994]
Precision and Recall are foundational metrics in machine learning.
One-sided feedback--where only positive examples are observed during training--is inherent in many practical problems.
We introduce a PAC learning framework where each hypothesis is represented by a graph, with edges indicating positive interactions.
arXiv Detail & Related papers (2024-11-20T04:21:07Z) - Testable Learning with Distribution Shift [9.036777309376697]
We define a new model called testable learning with distribution shift.
We obtain provably efficient algorithms for certifying the performance of a classifier on a test distribution.
We give several positive results for learning concept classes such as halfspaces, intersections of halfspaces, and decision trees.
arXiv Detail & Related papers (2023-11-25T23:57:45Z) - Active Learning Principles for In-Context Learning with Large Language
Models [65.09970281795769]
This paper investigates how Active Learning algorithms can serve as effective demonstration selection methods for in-context learning.
We show that in-context example selection through AL prioritizes high-quality examples that exhibit low uncertainty and bear similarity to the test examples.
arXiv Detail & Related papers (2023-05-23T17:16:04Z) - Realistic Evaluation of Transductive Few-Shot Learning [41.06192162435249]
Transductive inference is widely used in few-shot learning.
We study the effect of arbitrary class distributions within the query sets of few-shot tasks at inference.
We experimentally evaluate state-of-the-art transductive methods on three widely used datasets.
arXiv Detail & Related papers (2022-04-24T03:35:06Z) - Testing distributional assumptions of learning algorithms [5.204779946147061]
We study the design of tester-learner pairs $(\mathcal{A}, \mathcal{T})$.
We show that if the distribution on examples in the data passes the tester $\mathcal{T}$, then one can safely trust the output of the agnostic learner $\mathcal{A}$ on the data.
arXiv Detail & Related papers (2022-04-14T19:10:53Z) - Online Selective Classification with Limited Feedback [82.68009460301585]
We study selective classification in the online learning model, wherein a predictor may abstain from classifying an instance.
A salient aspect of the setting we consider is that the data may be non-realisable, which makes abstention a valid long-term action.
We construct simple versioning-based schemes for any $\mu \in (0,1]$ that make at most $T^\mu$ mistakes while incurring $\tilde{O}(T^{1-\mu})$ excess abstention against adaptive adversaries.
arXiv Detail & Related papers (2021-10-27T08:00:53Z) - Learn then Test: Calibrating Predictive Algorithms to Achieve Risk
Control [67.52000805944924]
Learn then Test (LTT) is a framework for calibrating machine learning models.
Our main insight is to reframe the risk-control problem as multiple hypothesis testing.
We use our framework to provide new calibration methods for several core machine learning tasks, with detailed worked examples in computer vision; a minimal sketch of the testing step appears after this list.
arXiv Detail & Related papers (2021-10-03T17:42:03Z) - Towards optimally abstaining from prediction [22.937799541125607]
A common challenge across all areas of machine learning is that training data is not distributed like test data.
We consider a model where one may abstain from predicting, at a fixed cost.
Our work builds on a recent abstention algorithm of Goldwasser, Kalai, Kalai, and Montasser (2020) for transductive binary classification.
arXiv Detail & Related papers (2021-05-28T21:44:48Z) - Robust learning under clean-label attack [26.323812778809913]
We study the problem of robust learning under clean-label data-poisoning attacks.
The learning goal is to minimize the attackable rate, which is more difficult than optimal PAC learning.
arXiv Detail & Related papers (2021-03-01T00:21:15Z) - Binary classification with ambiguous training data [69.50862982117127]
In supervised learning, we often face ambiguous (A) samples that are difficult to label even for domain experts.
This problem is substantially different from semi-supervised learning since unlabeled samples are not necessarily difficult samples.
arXiv Detail & Related papers (2020-11-05T00:53:58Z) - Certified Robustness to Label-Flipping Attacks via Randomized Smoothing [105.91827623768724]
Machine learning algorithms are susceptible to data poisoning attacks.
We present a unifying view of randomized smoothing over arbitrary functions.
We propose a new strategy for building classifiers that are pointwise-certifiably robust to general data poisoning attacks.
arXiv Detail & Related papers (2020-02-07T21:28:30Z)
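As referenced in the Learn then Test entry above, here is a minimal sketch of its multiple-hypothesis-testing reframing under stated assumptions: each candidate parameter $\lambda$ becomes a null hypothesis $H_\lambda$: risk$(\lambda) > \alpha$, a p-value is computed from held-out calibration losses (here via a Hoeffding bound, assuming losses in $[0,1]$), and candidates are accepted by fixed-sequence testing to keep the family-wise error rate at most $\delta$. Function names and the choice of concentration bound are illustrative, not the paper's released code.

```python
import numpy as np

def hoeffding_p_value(losses, alpha):
    """P-value for H0: E[loss] > alpha, from i.i.d. losses bounded in [0, 1]."""
    gap = alpha - np.mean(losses)
    return 1.0 if gap <= 0 else float(np.exp(-2.0 * len(losses) * gap ** 2))

def ltt_fixed_sequence(lambdas, loss_fn, calib_data, alpha=0.1, delta=0.05):
    """Scan candidates in a pre-specified order; stopping at the first
    p-value above delta keeps the family-wise error rate at most delta."""
    valid = []
    for lam in lambdas:
        losses = np.array([loss_fn(lam, z) for z in calib_data])
        if hoeffding_p_value(losses, alpha) <= delta:
            valid.append(lam)   # risk(lam) <= alpha with prob. >= 1 - delta
        else:
            break               # fixed-sequence testing: stop at first failure
    return valid
```

Any $\lambda$ returned this way controls the risk at level $\alpha$ with high probability; the same skeleton accepts tighter concentration bounds or other FWER-controlling procedures.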