Taming Adversarial Robustness via Abstaining
- URL: http://arxiv.org/abs/2104.02334v1
- Date: Tue, 6 Apr 2021 07:36:48 GMT
- Title: Taming Adversarial Robustness via Abstaining
- Authors: Abed AlRahman Al Makdah and Vaibhav Katewa and Fabio Pasqualetti
- Abstract summary: We consider a binary classification problem where the observations can be perturbed by an adversary.
We include an abstaining option, where the classifier abstains from taking a decision when it has low confidence about the prediction.
We show that there exists a tradeoff between the two metrics regardless of what method is used to choose the abstaining region.
- Score: 7.1975923901054575
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In this work, we consider a binary classification problem and cast it into a
binary hypothesis testing framework, where the observations can be perturbed by
an adversary. To improve the adversarial robustness of a classifier, we include
an abstaining option, where the classifier abstains from taking a decision when
it has low confidence about the prediction. We propose metrics to quantify the
nominal performance of a classifier with an abstaining option and its robustness
against adversarial perturbations. We show that there exists a tradeoff between
the two metrics regardless of what method is used to choose the abstaining
region. Our results imply that the robustness of a classifier with an abstaining
option can only be improved at the expense of its nominal performance. Further, we
provide necessary conditions to design the abstaining region for a
1-dimensional binary classification problem. We validate our theoretical
results on the MNIST dataset, where we numerically show that the tradeoff
between performance and robustness also exists for general multi-class
classification problems.
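
The following is a minimal, self-contained sketch of the setup described in the abstract: a 1-dimensional binary hypothesis test with Gaussian observations, a symmetric abstaining band around the decision boundary, and an adversary with a bounded perturbation budget. The specific metric definitions below (nominal performance as the probability of a correct decision on clean data, robustness as the probability that the adversary cannot force a wrong decision) and the parameter values are illustrative assumptions, not the paper's exact metrics.

```python
# Illustrative sketch only: 1-D binary hypothesis test
#   H0: x ~ N(-MU, 1)  vs  H1: x ~ N(+MU, 1), equal priors,
# with an abstaining band |x| <= w and an adversary that can shift each
# observation by at most EPS toward the wrong class.
from math import erf, sqrt

MU = 1.0    # class means at -MU (H0) and +MU (H1); assumed for illustration
EPS = 0.5   # assumed adversarial perturbation budget |delta| <= EPS

def Phi(z: float) -> float:
    """Standard normal CDF."""
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))

def nominal_performance(w: float) -> float:
    """P(correct decision) on clean data; by symmetry, condition on H1
    (x ~ N(+MU, 1)). Correct iff x lies above the abstain band, x > w."""
    return 1.0 - Phi(w - MU)

def robustness(w: float) -> float:
    """P(adversary cannot force a wrong decision), conditioning on H1.
    Worst case shifts x to x - EPS; a wrong decision requires x - EPS < -w."""
    return 1.0 - Phi(EPS - w - MU)

if __name__ == "__main__":
    print(f"{'band w':>8} {'nominal':>10} {'robust':>10}")
    for w in (0.0, 0.25, 0.5, 0.75, 1.0):
        print(f"{w:8.2f} {nominal_performance(w):10.4f} {robustness(w):10.4f}")
    # Widening the abstain band raises robustness but lowers nominal
    # performance, illustrating the tradeoff discussed in the abstract.
```

Under these assumptions, running the script shows nominal performance falling and robustness rising as the abstain band widens, which mirrors the tradeoff claim: robustness gained through abstaining comes at the cost of nominal performance.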
Related papers
- Enhancing Robust Representation in Adversarial Training: Alignment and
Exclusion Criteria [61.048842737581865]
We show that Adversarial Training (AT) fails to learn robust features, resulting in poor adversarial robustness.
We propose a generic framework for AT that obtains robust representations via an asymmetric negative contrast and reverse attention.
Empirical evaluations on three benchmark datasets show our methods greatly advance the robustness of AT and achieve state-of-the-art performance.
arXiv Detail & Related papers (2023-10-05T07:29:29Z) - Evaluating Adversarial Robustness with Expected Viable Performance [0.0]
We introduce a metric for evaluating the robustness of a classifier, with particular attention to adversarial perturbations.
A classifier is assumed to be non-functional (that is, has a functionality of zero) with respect to a perturbation bound if a conventional measure of performance, such as classification accuracy, is less than a minimally viable threshold.
arXiv Detail & Related papers (2023-09-18T16:47:24Z) - Counterfactually Comparing Abstaining Classifiers [37.43975777164451]
Abstaining classifiers have the option to abstain from making predictions on inputs that they are unsure about.
We introduce a novel approach to evaluating and comparing abstaining classifiers by treating abstentions as missing data.
arXiv Detail & Related papers (2023-05-17T20:46:57Z) - Variational Classification [51.2541371924591]
We derive a variational objective to train the model, analogous to the evidence lower bound (ELBO) used to train variational auto-encoders.
Treating inputs to the softmax layer as samples of a latent variable, our abstracted perspective reveals a potential inconsistency.
We induce a chosen latent distribution in place of the one implicitly assumed by a standard softmax layer.
arXiv Detail & Related papers (2023-05-17T17:47:19Z) - Characterizing the Optimal 0-1 Loss for Multi-class Classification with
a Test-time Attacker [57.49330031751386]
We find achievable information-theoretic lower bounds on loss in the presence of a test-time attacker for multi-class classifiers on any discrete dataset.
We provide a general framework for finding the optimal 0-1 loss that revolves around the construction of a conflict hypergraph from the data and adversarial constraints.
arXiv Detail & Related papers (2023-02-21T15:17:13Z) - Selective Regression Under Fairness Criteria [30.672082160544996]
In some cases, the performance of the minority group can decrease as coverage is reduced.
We show that such unwanted behavior can be avoided if we can construct features satisfying the sufficiency criterion.
arXiv Detail & Related papers (2021-10-28T19:05:12Z) - Exploring Robustness of Unsupervised Domain Adaptation in Semantic
Segmentation [74.05906222376608]
We propose adversarial self-supervision UDA (or ASSUDA) that maximizes the agreement between clean images and their adversarial examples by a contrastive loss in the output space.
This paper is rooted in two observations: (i) the robustness of UDA methods in semantic segmentation remains unexplored, which poses a security concern in this field; and (ii) although commonly used self-supervision tasks (e.g., rotation and jigsaw) benefit image tasks such as classification and recognition, they fail to provide the critical supervision signals needed to learn discriminative representations for segmentation tasks.
arXiv Detail & Related papers (2021-05-23T01:50:44Z) - Binary Classification from Multiple Unlabeled Datasets via Surrogate Set
Classification [94.55805516167369]
We propose a new approach for binary classification from $m$ U-sets for $m \ge 2$.
Our key idea is to consider an auxiliary classification task called surrogate set classification (SSC).
arXiv Detail & Related papers (2021-02-01T07:36:38Z) - Exploiting Sample Uncertainty for Domain Adaptive Person
Re-Identification [137.9939571408506]
We estimate and exploit the credibility of the assigned pseudo-label of each sample to alleviate the influence of noisy labels.
Our uncertainty-guided optimization brings significant improvement and achieves the state-of-the-art performance on benchmark datasets.
arXiv Detail & Related papers (2020-12-16T04:09:04Z) - Reachable Sets of Classifiers and Regression Models: (Non-)Robustness
Analysis and Robust Training [1.0878040851638]
We analyze and enhance robustness properties of both classifiers and regression models.
Specifically, we verify (non-)robustness, propose a robust training procedure, and show that our approach outperforms adversarial attacks.
Second, we provide techniques to distinguish between reliable and non-reliable predictions for unlabeled inputs, to quantify the influence of each feature on a prediction, and compute a feature ranking.
arXiv Detail & Related papers (2020-07-28T10:58:06Z) - Classifier-independent Lower-Bounds for Adversarial Robustness [13.247278149124757]
We theoretically analyse the limits of robustness to test-time adversarial and noisy examples in classification.
We use optimal transport theory to derive variational formulae for the Bayes-optimal error a classifier can make on a given classification problem.
We derive explicit lower-bounds on the Bayes-optimal error in the case of the popular distance-based attacks.
arXiv Detail & Related papers (2020-06-17T16:46:39Z)
This list is automatically generated from the titles and abstracts of the papers on this site.