PARL: Enhancing Diversity of Ensemble Networks to Resist Adversarial
Attacks via Pairwise Adversarially Robust Loss Function
- URL: http://arxiv.org/abs/2112.04948v1
- Date: Thu, 9 Dec 2021 14:26:13 GMT
- Title: PARL: Enhancing Diversity of Ensemble Networks to Resist Adversarial
Attacks via Pairwise Adversarially Robust Loss Function
- Authors: Manaar Alam, Shubhajit Datta, Debdeep Mukhopadhyay, Arijit Mondal,
Partha Pratim Chakrabarti
- Abstract summary: Adversarial attacks tend to rely on the principle of transferability.
Ensemble methods against adversarial attacks demonstrate that an adversarial example is less likely to mislead multiple classifiers.
Recent ensemble methods have either been shown to be vulnerable to stronger adversaries or shown to lack an end-to-end evaluation.
- Score: 13.417003144007156
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The security of Deep Learning classifiers is a critical field of study
because of the existence of adversarial attacks. Such attacks usually rely on
the principle of transferability, where an adversarial example crafted on a
surrogate classifier tends to mislead the target classifier trained on the same
dataset even if both classifiers have quite different architectures. Ensemble
methods against adversarial attacks demonstrate that an adversarial example is
less likely to mislead multiple classifiers in an ensemble having diverse
decision boundaries. However, recent ensemble methods have either been shown to
be vulnerable to stronger adversaries or shown to lack an end-to-end
evaluation. This paper attempts to develop a new ensemble methodology that
constructs multiple diverse classifiers using a Pairwise Adversarially Robust
Loss (PARL) function during the training procedure. PARL utilizes the gradients of
each layer with respect to the input in every classifier within the ensemble
simultaneously. The proposed training procedure enables PARL to achieve higher
robustness against black-box transfer attacks compared to previous ensemble
methods without adversely affecting accuracy on clean examples. We also
evaluate the robustness in the presence of white-box attacks, where adversarial
examples are crafted using parameters of the target classifier. We present
extensive experiments on standard image classification datasets, CIFAR-10 and
CIFAR-100, with the standard ResNet20 classifier, against state-of-the-art
adversarial attacks to demonstrate the robustness of the proposed ensemble
methodology.
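
As a rough illustration of the kind of objective described above, the following PyTorch sketch combines the usual cross-entropy terms with a penalty on the alignment of input gradients across ensemble members. The cosine-similarity measure, the single weighting factor `lam`, and the use of loss gradients with respect to the input (rather than the paper's per-layer gradients) are illustrative assumptions, not the exact PARL formulation.

```python
# Sketch of a PARL-style pairwise gradient-diversity penalty (assumptions:
# cosine similarity between per-member input gradients and a single weight
# `lam`; the paper's per-layer formulation may differ).
import torch
import torch.nn.functional as F


def pairwise_diversity_loss(models, x, y, lam=0.5):
    """Cross-entropy for every ensemble member plus a penalty that grows
    when the input gradients of any two members point in similar directions."""
    x = x.clone().requires_grad_(True)
    ce_total, grads = 0.0, []
    for model in models:
        ce = F.cross_entropy(model(x), y)
        ce_total = ce_total + ce
        # Gradient of each member's loss w.r.t. the shared input batch.
        g = torch.autograd.grad(ce, x, create_graph=True)[0]
        grads.append(g.flatten(start_dim=1))

    # Penalise alignment between the input gradients of every pair of members.
    align = 0.0
    for i in range(len(models)):
        for j in range(i + 1, len(models)):
            align = align + F.cosine_similarity(grads[i], grads[j], dim=1).mean()

    return ce_total + lam * align
```

Each training step would then minimise this combined objective for all ensemble members jointly, which is broadly what the abstract means by training the classifiers simultaneously.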
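Likewise, a minimal sketch of the black-box transfer evaluation mentioned in the abstract: adversarial examples are crafted on an independently trained surrogate model and the ensemble is then scored on them. The single-step FGSM attack, the epsilon of 8/255, and the averaged-softmax aggregation are assumptions made for brevity; in the white-box setting described in the abstract, the same perturbations would instead be crafted from the target ensemble's own parameters.

```python
# Sketch of a black-box transfer evaluation: craft adversarial examples on a
# surrogate, then measure ensemble accuracy on them (attack choice, epsilon,
# and averaged-softmax aggregation are illustrative assumptions).
import torch
import torch.nn.functional as F


def fgsm(model, x, y, eps=8 / 255):
    """Single-step FGSM attack crafted on the surrogate model."""
    x = x.clone().requires_grad_(True)
    loss = F.cross_entropy(model(x), y)
    grad = torch.autograd.grad(loss, x)[0]
    return (x + eps * grad.sign()).clamp(0, 1).detach()


@torch.no_grad()
def ensemble_accuracy(models, x, y):
    """Average the members' softmax outputs and take the argmax as the vote."""
    probs = torch.stack([F.softmax(m(x), dim=1) for m in models]).mean(dim=0)
    return (probs.argmax(dim=1) == y).float().mean().item()


# Usage: x_adv = fgsm(surrogate, x, y); acc = ensemble_accuracy(ensemble, x_adv, y)
```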
Related papers
- ZeroPur: Succinct Training-Free Adversarial Purification [52.963392510839284]
Adversarial purification is a defense technique that can defend against a range of unseen adversarial attacks.
We present ZeroPur, a simple adversarial purification method that purifies adversarial images without further training.
arXiv Detail & Related papers (2024-06-05T10:58:15Z)
- Meta Invariance Defense Towards Generalizable Robustness to Unknown Adversarial Attacks [62.036798488144306]
Current defenses mainly focus on known attacks, while adversarial robustness to unknown attacks is seriously overlooked.
We propose an attack-agnostic defense method named Meta Invariance Defense (MID).
We show that MID simultaneously achieves robustness to imperceptible adversarial perturbations in high-level image classification and attack suppression in low-level robust image regeneration.
arXiv Detail & Related papers (2024-04-04T10:10:38Z)
- Carefully Blending Adversarial Training and Purification Improves Adversarial Robustness [1.2289361708127877]
CARSO is able to defend itself against adaptive end-to-end white-box attacks devised to break such defences.
Our method improves the state of the art for CIFAR-10, CIFAR-100, and TinyImageNet-200 by a significant margin.
arXiv Detail & Related papers (2023-05-25T09:04:31Z)
- Resisting Adversarial Attacks in Deep Neural Networks using Diverse Decision Boundaries [12.312877365123267]
Deep learning systems are vulnerable to crafted adversarial examples, which may be imperceptible to the human eye, but can lead the model to misclassify.
We develop a new ensemble-based solution that constructs defender models with diverse decision boundaries with respect to the original model.
We present extensive experiments on standard image classification datasets, namely MNIST, CIFAR-10, and CIFAR-100, against state-of-the-art adversarial attacks.
arXiv Detail & Related papers (2022-08-18T08:19:26Z)
- Adversarial Contrastive Learning by Permuting Cluster Assignments [0.8862707047517914]
We propose SwARo, an adversarial contrastive framework that incorporates cluster assignment permutations to generate representative adversarial samples.
We evaluate SwARo on multiple benchmark datasets and against various white-box and black-box attacks, obtaining consistent improvements over state-of-the-art baselines.
arXiv Detail & Related papers (2022-04-21T17:49:52Z)
- Towards Compositional Adversarial Robustness: Generalizing Adversarial Training to Composite Semantic Perturbations [70.05004034081377]
We first propose a novel method for generating composite adversarial examples.
Our method can find the optimal attack composition by utilizing component-wise projected gradient descent.
We then propose generalized adversarial training (GAT) to extend model robustness from the $\ell_p$-ball to composite semantic perturbations.
arXiv Detail & Related papers (2022-02-09T02:41:56Z)
- Towards A Conceptually Simple Defensive Approach for Few-shot Classifiers Against Adversarial Support Samples [107.38834819682315]
We study a conceptually simple approach to defend few-shot classifiers against adversarial attacks.
We propose a simple attack-agnostic detection method, using the concept of self-similarity and filtering.
Our evaluation on the miniImagenet (MI) and CUB datasets exhibits good attack detection performance.
arXiv Detail & Related papers (2021-10-24T05:46:03Z)
- ATRO: Adversarial Training with a Rejection Option [10.36668157679368]
This paper proposes a classification framework with a rejection option to mitigate the performance deterioration caused by adversarial examples.
Applying the adversarial training objective to both a classifier and a rejection function simultaneously, the classifier can abstain from classification when it has insufficient confidence to classify a test data point.
arXiv Detail & Related papers (2020-10-24T14:05:03Z)
- CD-UAP: Class Discriminative Universal Adversarial Perturbation [83.60161052867534]
A single universal adversarial perturbation (UAP) can be added to all natural images to change most of their predicted class labels.
We propose a new universal attack method to generate a single perturbation that fools a target network to misclassify only a chosen group of classes.
arXiv Detail & Related papers (2020-10-07T09:26:42Z)
- Certified Robustness to Label-Flipping Attacks via Randomized Smoothing [105.91827623768724]
Machine learning algorithms are susceptible to data poisoning attacks.
We present a unifying view of randomized smoothing over arbitrary functions.
We propose a new strategy for building classifiers that are pointwise-certifiably robust to general data poisoning attacks.
arXiv Detail & Related papers (2020-02-07T21:28:30Z)
This list is automatically generated from the titles and abstracts of the papers on this site.