Taming Adversarial Robustness via Abstaining
- URL: http://arxiv.org/abs/2104.02334v1
- Date: Tue, 6 Apr 2021 07:36:48 GMT
- Title: Taming Adversarial Robustness via Abstaining
- Authors: Abed AlRahman Al Makdah and Vaibhav Katewa and Fabio Pasqualetti
- Abstract summary: We consider a binary classification problem where the observations can be perturbed by an adversary.
We include an abstaining option, where the classifier abstains from taking a decision when it has low confidence about the prediction.
We show that there exists a tradeoff between the two metrics regardless of what method is used to choose the abstaining region.
- Score: 7.1975923901054575
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In this work, we consider a binary classification problem and cast it into a
binary hypothesis testing framework, where the observations can be perturbed by
an adversary. To improve the adversarial robustness of a classifier, we include
an abstaining option, where the classifier abstains from taking a decision when
it has low confidence about the prediction. We propose metrics to quantify the
nominal performance of a classifier with an abstaining option and its robustness
against adversarial perturbations. We show that there exists a tradeoff between
the two metrics regardless of what method is used to choose the abstaining
region. Our results imply that the robustness of a classifier with an abstaining
option can only be improved at the expense of its nominal performance. Further, we
provide necessary conditions to design the abstaining region for a
1-dimensional binary classification problem. We validate our theoretical
results on the MNIST dataset, where we numerically show that the tradeoff
between performance and robustness also exists for general multi-class
classification problems.
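
The following is a minimal, self-contained sketch of the setup described in the abstract: a 1-dimensional binary hypothesis test with Gaussian observations, a symmetric abstaining band around the decision boundary, and an adversary with a bounded perturbation budget. The specific metric definitions below (nominal performance as the probability of a correct decision on clean data, robustness as the probability that the adversary cannot force a wrong decision) and the parameter values are illustrative assumptions, not the paper's exact metrics.

```python
# Illustrative sketch only: 1-D binary hypothesis test
#   H0: x ~ N(-MU, 1)  vs  H1: x ~ N(+MU, 1), equal priors,
# with an abstaining band |x| <= w and an adversary that can shift each
# observation by at most EPS toward the wrong class.
from math import erf, sqrt

MU = 1.0    # class means at -MU (H0) and +MU (H1); assumed for illustration
EPS = 0.5   # assumed adversarial perturbation budget |delta| <= EPS

def Phi(z: float) -> float:
    """Standard normal CDF."""
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))

def nominal_performance(w: float) -> float:
    """P(correct decision) on clean data; by symmetry, condition on H1
    (x ~ N(+MU, 1)). Correct iff x lies above the abstain band, x > w."""
    return 1.0 - Phi(w - MU)

def robustness(w: float) -> float:
    """P(adversary cannot force a wrong decision), conditioning on H1.
    Worst case shifts x to x - EPS; a wrong decision requires x - EPS < -w."""
    return 1.0 - Phi(EPS - w - MU)

if __name__ == "__main__":
    print(f"{'band w':>8} {'nominal':>10} {'robust':>10}")
    for w in (0.0, 0.25, 0.5, 0.75, 1.0):
        print(f"{w:8.2f} {nominal_performance(w):10.4f} {robustness(w):10.4f}")
    # Widening the abstain band raises robustness but lowers nominal
    # performance, illustrating the tradeoff discussed in the abstract.
```

Under these assumptions, running the script shows nominal performance falling and robustness rising as the abstain band widens, which mirrors the tradeoff claim: robustness gained through abstaining comes at the cost of nominal performance.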
Related papers
- Enhancing Robust Representation in Adversarial Training: Alignment and
Exclusion Criteria [61.048842737581865]
We show that Adversarial Training (AT) fails to learn robust features, resulting in poor adversarial robustness.
We propose a generic framework for AT that obtains robust representations via an asymmetric negative contrast and reverse attention.
Empirical evaluations on three benchmark datasets show our methods greatly advance the robustness of AT and achieve state-of-the-art performance.
arXiv Detail & Related papers (2023-10-05T07:29:29Z) - Evaluating Adversarial Robustness with Expected Viable Performance [0.0]
We introduce a metric for evaluating the robustness of a classifier, with particular attention to adversarial perturbations.
A classifier is assumed to be non-functional (that is, has a functionality of zero) with respect to a perturbation bound if a conventional measure of performance, such as classification accuracy, is less than a minimally viable threshold.
arXiv Detail & Related papers (2023-09-18T16:47:24Z) - Counterfactually Comparing Abstaining Classifiers [37.43975777164451]
Abstaining classifiers have the option to abstain from making predictions on inputs that they are unsure about.
We introduce a novel approach to evaluating and comparing abstaining classifiers by treating abstentions as missing data.
arXiv Detail & Related papers (2023-05-17T20:46:57Z) - Variational Classification [51.2541371924591]
We derive a variational objective to train the model, analogous to the evidence lower bound (ELBO) used to train variational auto-encoders.
Treating inputs to the softmax layer as samples of a latent variable, our abstracted perspective reveals a potential inconsistency.
We induce a chosen latent distribution in place of the one implicitly assumed by a standard softmax layer.
arXiv Detail & Related papers (2023-05-17T17:47:19Z) - Characterizing the Optimal 0-1 Loss for Multi-class Classification with
a Test-time Attacker [57.49330031751386]
We find achievable information-theoretic lower bounds on loss in the presence of a test-time attacker for multi-class classifiers on any discrete dataset.
We provide a general framework for finding the optimal 0-1 loss that revolves around the construction of a conflict hypergraph from the data and adversarial constraints.
arXiv Detail & Related papers (2023-02-21T15:17:13Z) - Selective Regression Under Fairness Criteria [30.672082160544996]
In some cases, the performance of the minority group can decrease as coverage is reduced.
We show that such unwanted behavior can be avoided if we can construct features satisfying the sufficiency criterion.
arXiv Detail & Related papers (2021-10-28T19:05:12Z) - Exploring Robustness of Unsupervised Domain Adaptation in Semantic
Segmentation [74.05906222376608]
We propose adversarial self-supervision UDA (or ASSUDA) that maximizes the agreement between clean images and their adversarial examples by a contrastive loss in the output space.
This paper is rooted in two observations: (i) the robustness of UDA methods in semantic segmentation remains unexplored, which poses a security concern in this field; and (ii) although commonly used self-supervision tasks (e.g., rotation and jigsaw) benefit image tasks such as classification and recognition, they fail to provide the critical supervision signals needed to learn discriminative representations for segmentation tasks.
arXiv Detail & Related papers (2021-05-23T01:50:44Z) - Binary Classification from Multiple Unlabeled Datasets via Surrogate Set
Classification [94.55805516167369]
We propose a new approach for binary classification from $m$ U-sets for $m \ge 2$.
Our key idea is to consider an auxiliary classification task called surrogate set classification (SSC).
arXiv Detail & Related papers (2021-02-01T07:36:38Z) - Exploiting Sample Uncertainty for Domain Adaptive Person
Re-Identification [137.9939571408506]
We estimate and exploit the credibility of the assigned pseudo-label of each sample to alleviate the influence of noisy labels.
Our uncertainty-guided optimization brings significant improvement and achieves the state-of-the-art performance on benchmark datasets.
arXiv Detail & Related papers (2020-12-16T04:09:04Z) - Reachable Sets of Classifiers and Regression Models: (Non-)Robustness
Analysis and Robust Training [1.0878040851638]
We analyze and enhance robustness properties of both classifiers and regression models.
Specifically, we verify (non-)robustness, propose a robust training procedure, and show that our approach outperforms adversarial attacks.
Second, we provide techniques to distinguish between reliable and non-reliable predictions for unlabeled inputs, to quantify the influence of each feature on a prediction, and compute a feature ranking.
arXiv Detail & Related papers (2020-07-28T10:58:06Z) - Classifier-independent Lower-Bounds for Adversarial Robustness [13.247278149124757]
We theoretically analyse the limits of robustness to test-time adversarial and noisy examples in classification.
We use optimal transport theory to derive variational formulae for the Bayes-optimal error a classifier can make on a given classification problem.
We derive explicit lower-bounds on the Bayes-optimal error in the case of the popular distance-based attacks.
arXiv Detail & Related papers (2020-06-17T16:46:39Z)
This list is automatically generated from the titles and abstracts of the papers on this site.