Saliency Diversified Deep Ensemble for Robustness to Adversaries
- URL: http://arxiv.org/abs/2112.03615v1
- Date: Tue, 7 Dec 2021 10:18:43 GMT
- Title: Saliency Diversified Deep Ensemble for Robustness to Adversaries
- Authors: Alex Bogun, Dimche Kostadinov, Damian Borth
- Abstract summary: This work proposes a novel diversity-promoting learning approach for the deep ensembles.
The idea is to promote saliency map diversity (SMD) on ensemble members to prevent the attacker from targeting all ensemble members at once.
We empirically show a reduced transferability between ensemble members and improved performance compared to the state-of-the-art ensemble defense.
- Score: 1.9659095632676094
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Deep learning models have shown incredible performance on numerous image
recognition, classification, and reconstruction tasks. Although very appealing
and valuable due to their predictive capabilities, one common threat remains
challenging to resolve. A specifically trained attacker can introduce malicious
input perturbations to fool the network, thus causing potentially harmful
mispredictions. Moreover, these attacks can succeed when the adversary has full
access to the target model (white-box) and even when such access is limited
(black-box setting). The ensemble of models can protect against such attacks
but might be brittle under shared vulnerabilities in its members (attack
transferability). To that end, this work proposes a novel diversity-promoting
learning approach for the deep ensembles. The idea is to promote saliency map
diversity (SMD) on ensemble members to prevent the attacker from targeting all
ensemble members at once by introducing an additional term in our learning
objective. During training, this helps us minimize the alignment between model
saliencies to reduce shared member vulnerabilities and, thus, increase ensemble
robustness to adversaries. We empirically show a reduced transferability
between ensemble members and improved performance compared to the
state-of-the-art ensemble defense against medium and high strength white-box
attacks. In addition, we demonstrate that our approach combined with existing
methods outperforms state-of-the-art ensemble algorithms for defense under
white-box and black-box attacks.
Related papers
- Edge-Only Universal Adversarial Attacks in Distributed Learning [49.546479320670464]
In this work, we explore the feasibility of generating universal adversarial attacks when an attacker has access to the edge part of the model only.
Our approach shows that adversaries can induce effective mispredictions in the unknown cloud part by leveraging key features on the edge side.
Our results on ImageNet demonstrate strong attack transferability to the unknown cloud part.
arXiv Detail & Related papers (2024-11-15T11:06:24Z) - Multi-granular Adversarial Attacks against Black-box Neural Ranking Models [111.58315434849047]
We create high-quality adversarial examples by incorporating multi-granular perturbations.
We transform the multi-granular attack into a sequential decision-making process.
Our attack method surpasses prevailing baselines in both attack effectiveness and imperceptibility.
arXiv Detail & Related papers (2024-04-02T02:08:29Z) - Understanding and Improving Ensemble Adversarial Defense [4.504026914523449]
We develop a new error theory dedicated to understanding ensemble adversarial defense.
We propose an effective approach to improve ensemble adversarial defense, named interactive global adversarial training (iGAT)
iGAT is capable of boosting their performance by increases up to 17% evaluated using CIFAR10 and CIFAR100 datasets under both white-box and black-box attacks.
arXiv Detail & Related papers (2023-10-27T20:43:29Z) - Understanding the Vulnerability of Skeleton-based Human Activity Recognition via Black-box Attack [53.032801921915436]
Human Activity Recognition (HAR) has been employed in a wide range of applications, e.g. self-driving cars.
Recently, the robustness of skeleton-based HAR methods have been questioned due to their vulnerability to adversarial attacks.
We show such threats exist, even when the attacker only has access to the input/output of the model.
We propose the very first black-box adversarial attack approach in skeleton-based HAR called BASAR.
arXiv Detail & Related papers (2022-11-21T09:51:28Z) - Resisting Adversarial Attacks in Deep Neural Networks using Diverse
Decision Boundaries [12.312877365123267]
Deep learning systems are vulnerable to crafted adversarial examples, which may be imperceptible to the human eye, but can lead the model to misclassify.
We develop a new ensemble-based solution that constructs defender models with diverse decision boundaries with respect to the original model.
We present extensive experimentations using standard image classification datasets, namely MNIST, CIFAR-10 and CIFAR-100 against state-of-the-art adversarial attacks.
arXiv Detail & Related papers (2022-08-18T08:19:26Z) - Stochastic Variance Reduced Ensemble Adversarial Attack for Boosting the
Adversarial Transferability [20.255708227671573]
Black-box adversarial attacks can be transferred from one model to another.
In this work, we propose a novel ensemble attack method called the variance reduced ensemble attack.
Empirical results on the standard ImageNet demonstrate that the proposed method could boost the adversarial transferability and outperforms existing ensemble attacks significantly.
arXiv Detail & Related papers (2021-11-21T06:33:27Z) - Towards A Conceptually Simple Defensive Approach for Few-shot
classifiers Against Adversarial Support Samples [107.38834819682315]
We study a conceptually simple approach to defend few-shot classifiers against adversarial attacks.
We propose a simple attack-agnostic detection method, using the concept of self-similarity and filtering.
Our evaluation on the miniImagenet (MI) and CUB datasets exhibit good attack detection performance.
arXiv Detail & Related papers (2021-10-24T05:46:03Z) - Combating Adversaries with Anti-Adversaries [118.70141983415445]
In particular, our layer generates an input perturbation in the opposite direction of the adversarial one.
We verify the effectiveness of our approach by combining our layer with both nominally and robustly trained models.
Our anti-adversary layer significantly enhances model robustness while coming at no cost on clean accuracy.
arXiv Detail & Related papers (2021-03-26T09:36:59Z) - "What's in the box?!": Deflecting Adversarial Attacks by Randomly
Deploying Adversarially-Disjoint Models [71.91835408379602]
adversarial examples have been long considered a real threat to machine learning models.
We propose an alternative deployment-based defense paradigm that goes beyond the traditional white-box and black-box threat models.
arXiv Detail & Related papers (2021-02-09T20:07:13Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.