Challenging the adversarial robustness of DNNs based on error-correcting
output codes
- URL: http://arxiv.org/abs/2003.11855v2
- Date: Fri, 9 Oct 2020 02:45:39 GMT
- Title: Challenging the adversarial robustness of DNNs based on error-correcting
output codes
- Authors: Bowen Zhang, Benedetta Tondi, Xixiang Lv and Mauro Barni
- Abstract summary: ECOC-based networks can be attacked quite easily by introducing a small adversarial perturbation.
Adversarial examples can be generated in such a way as to achieve high probabilities for the predicted target class.
- Score: 33.46319608673487
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The existence of adversarial examples and the ease with which they can be
generated raise several security concerns with regard to deep learning systems,
pushing researchers to develop suitable defense mechanisms. The use of networks
adopting error-correcting output codes (ECOC) has recently been proposed to
counter the creation of adversarial examples in a white-box setting. In this
paper, we carry out an in-depth investigation of the adversarial robustness
achieved by the ECOC approach. We do so by proposing a new adversarial attack
specifically designed for multi-label classification architectures, like the
ECOC-based one, and by applying two existing attacks. In contrast to previous
findings, our analysis reveals that ECOC-based networks can be attacked quite
easily by introducing a small adversarial perturbation. Moreover, the
adversarial examples can be generated in such a way as to achieve high
probabilities for the predicted target class, hence making it difficult to use
the prediction confidence to detect them. Our findings are supported by
experimental results obtained on MNIST, CIFAR-10 and GTSRB classification
tasks.
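For readers unfamiliar with the architecture under attack, the following is a minimal sketch of ECOC-style decoding: each class is assigned a binary codeword, the network emits one bounded activation per code bit, and class probabilities are obtained by correlating the activations with the codewords. The 4-class codebook and the softmax normalization below are illustrative assumptions, not the exact design evaluated in the paper.

```python
# Minimal sketch of ECOC-style decoding with an illustrative 4-class codebook;
# a real ECOC defense uses a longer, carefully chosen code and a trained network.
import numpy as np

codebook = np.array([          # rows: classes, columns: code bits in {-1, +1}
    [+1, +1, +1, +1],
    [+1, -1, +1, -1],
    [+1, +1, -1, -1],
    [+1, -1, -1, +1],
], dtype=float)

def ecoc_probabilities(bit_activations: np.ndarray) -> np.ndarray:
    """Map per-bit activations in [-1, 1] to class scores by correlating
    them with each codeword, then normalize to a probability vector."""
    scores = codebook @ bit_activations          # higher score = closer codeword
    scores = np.exp(scores - scores.max())       # softmax for a probability-like output
    return scores / scores.sum()

# Example: activations that almost match the codeword of class 2.
p = ecoc_probabilities(np.array([0.9, 0.8, -0.95, -0.85]))
print(p.argmax(), p)   # -> 2, with most of the mass on class 2
```

Because the class probability grows with how closely the activations match a codeword, a targeted perturbation only needs to push the per-bit activations towards the target codeword, which is consistent with the high-confidence adversarial examples reported in the abstract.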
Related papers
- Rethinking Targeted Adversarial Attacks For Neural Machine Translation [56.10484905098989]
This paper presents a new setting for NMT targeted adversarial attacks that could lead to reliable attacking results.
Under the new setting, it then proposes a Targeted Word Gradient adversarial Attack (TWGA) method to craft adversarial examples.
Experimental results demonstrate that our proposed setting could provide faithful attacking results for targeted adversarial attacks on NMT systems.
arXiv Detail & Related papers (2024-07-07T10:16:06Z)
- Adversarial Attack Based on Prediction-Correction [8.467466998915018]
Deep neural networks (DNNs) are vulnerable to adversarial examples obtained by adding small perturbations to original examples.
In this paper, a new prediction-correction (PC) based adversarial attack is proposed.
In the proposed PC-based attack, an existing attack is first used to produce a predicted example; the predicted example and the current example are then combined to determine the added perturbation.
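As a rough illustration of this two-stage idea, the sketch below (PyTorch) uses a single FGSM step as the predictor and averages the gradients computed at the current and predicted examples as the corrector; the attack chosen as predictor, the combination rule, and the step sizes in the paper may differ.

```python
# Hypothetical single prediction-correction step (untargeted, L-inf budget eps).
import torch
import torch.nn.functional as F

def pc_attack_step(model, x, y, eps, alpha):
    x0 = x.clone().detach()
    x_cur = x0.clone().requires_grad_(True)
    # Predictor: a single FGSM step from the current example.
    F.cross_entropy(model(x_cur), y).backward()
    x_pred = (x_cur + alpha * x_cur.grad.sign()).clamp(0, 1).detach().requires_grad_(True)
    # Corrector: also take the gradient at the predicted example.
    F.cross_entropy(model(x_pred), y).backward()
    # Combine both gradients to decide the perturbation that is actually applied.
    g = 0.5 * (x_cur.grad + x_pred.grad)
    x_adv = x0 + alpha * g.sign()
    # Project back into the eps-ball around the original example and into [0, 1].
    x_adv = torch.max(torch.min(x_adv, x0 + eps), x0 - eps)
    return x_adv.clamp(0, 1).detach()
```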
arXiv Detail & Related papers (2023-06-02T03:11:32Z)
- Towards A Conceptually Simple Defensive Approach for Few-shot classifiers Against Adversarial Support Samples [107.38834819682315]
We study a conceptually simple approach to defend few-shot classifiers against adversarial attacks.
We propose a simple attack-agnostic detection method, using the concept of self-similarity and filtering.
Our evaluation on the miniImagenet (MI) and CUB datasets exhibits good attack detection performance.
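A hedged sketch of what a self-similarity-plus-filtering check could look like: compare each support sample's embedding to the other same-class support embeddings and flag those that look dissimilar. The cosine-similarity measure and the fixed threshold are illustrative assumptions rather than the paper's exact procedure.

```python
# Flag support samples whose features are unusually dissimilar to their class peers.
import numpy as np

def flag_suspicious(support_feats: np.ndarray, threshold: float = 0.5) -> np.ndarray:
    """support_feats: (n, d) embeddings of one class's support samples.
    Returns a boolean mask marking samples whose mean cosine similarity
    to the rest of the class falls below the threshold."""
    f = support_feats / np.linalg.norm(support_feats, axis=1, keepdims=True)
    sims = f @ f.T                                  # pairwise cosine similarities
    np.fill_diagonal(sims, 0.0)                     # ignore each sample's self-similarity
    mean_sim = sims.sum(axis=1) / (len(f) - 1)      # average similarity to the others
    return mean_sim < threshold                     # True = candidate adversarial sample
```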
arXiv Detail & Related papers (2021-10-24T05:46:03Z)
- ADC: Adversarial attacks against object Detection that evade Context consistency checks [55.8459119462263]
We show that even context consistency checks can be brittle to properly crafted adversarial examples.
We propose an adaptive framework to generate examples that subvert such defenses.
Our results suggest that how to robustly model context and check its consistency is still an open problem.
arXiv Detail & Related papers (2021-10-24T00:25:09Z)
- Learning and Certification under Instance-targeted Poisoning [49.55596073963654]
We study PAC learnability and certification under instance-targeted poisoning attacks.
We show that when the budget of the adversary scales sublinearly with the sample complexity, PAC learnability and certification are achievable.
We empirically study the robustness of K nearest neighbour, logistic regression, multi-layer perceptron, and convolutional neural network on real data sets.
arXiv Detail & Related papers (2021-05-18T17:48:15Z)
- Selective and Features based Adversarial Example Detection [12.443388374869745]
Security-sensitive applications that rely on Deep Neural Networks (DNNs) are vulnerable to small perturbations crafted to generate Adversarial Examples (AEs).
We propose a novel unsupervised detection mechanism that uses selective prediction, processing of model layer outputs, and knowledge transfer concepts in a multi-task learning setting.
Experimental results show that the proposed approach achieves results comparable to state-of-the-art methods against the tested attacks in the white-box scenario and better results in black-box and gray-box scenarios.
arXiv Detail & Related papers (2021-03-09T11:06:15Z)
- Learning to Separate Clusters of Adversarial Representations for Robust Adversarial Detection [50.03939695025513]
We propose a new probabilistic adversarial detector motivated by the recently introduced notion of non-robust features.
In this paper, we consider non-robust features as a common property of adversarial examples, and we deduce that it is possible to find a cluster in representation space corresponding to this property.
This idea leads us to estimate the probability distribution of adversarial representations in a separate cluster and to leverage that distribution for a likelihood-based adversarial detector.
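A minimal sketch of that likelihood-based detector, under the simplifying assumption that each cluster of representations is modelled by a single Gaussian; the density model used in the paper may be richer.

```python
# Fit one Gaussian to known adversarial representations and one to clean ones,
# then flag inputs whose representation is more likely under the adversarial cluster.
import numpy as np
from scipy.stats import multivariate_normal

def fit_cluster(reps: np.ndarray):
    """reps: (n, d) representations belonging to one cluster."""
    mean = reps.mean(axis=0)
    cov = np.cov(reps, rowvar=False) + 1e-6 * np.eye(reps.shape[1])  # regularize
    return multivariate_normal(mean=mean, cov=cov)

def is_adversarial(rep: np.ndarray, clean_cluster, adv_cluster) -> bool:
    # Likelihood-ratio test between the two representation clusters.
    return bool(adv_cluster.logpdf(rep) > clean_cluster.logpdf(rep))
```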
arXiv Detail & Related papers (2020-12-07T07:21:18Z)
- Adversarial Example Games [51.92698856933169]
Adversarial Example Games (AEG) is a framework that models the crafting of adversarial examples.
AEG provides a new way to design adversarial examples by adversarially training a generator and a classifier from a given hypothesis class (see the sketch after this entry).
We demonstrate the efficacy of AEG on the MNIST and CIFAR-10 datasets.
arXiv Detail & Related papers (2020-07-01T19:47:23Z)
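As referenced in the entry above, here is a hedged sketch of the min-max game behind AEG: a perturbation generator is updated to increase the classifier's loss while the classifier is updated to decrease it. The tanh-bounded perturbation, the alternating single-step updates, and the generic generator/classifier modules are illustrative assumptions, not the paper's exact training procedure.

```python
# One hypothetical alternating update of the generator/classifier game.
import torch
import torch.nn as nn

def aeg_step(generator, classifier, x, y, eps, g_opt, c_opt):
    loss_fn = nn.CrossEntropyLoss()
    # Generator step: craft a bounded perturbation that increases the classifier loss.
    delta = eps * torch.tanh(generator(x))
    g_loss = -loss_fn(classifier((x + delta).clamp(0, 1)), y)
    g_opt.zero_grad()
    g_loss.backward()
    g_opt.step()
    # Classifier step: train on the freshly generated adversarial batch.
    delta = eps * torch.tanh(generator(x)).detach()
    c_loss = loss_fn(classifier((x + delta).clamp(0, 1)), y)
    c_opt.zero_grad()
    c_loss.backward()
    c_opt.step()
    return g_loss.item(), c_loss.item()
```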