Robustness Against Adversarial Attacks via Learning Confined Adversarial
Polytopes
- URL: http://arxiv.org/abs/2401.07991v2
- Date: Sat, 20 Jan 2024 20:21:00 GMT
- Title: Robustness Against Adversarial Attacks via Learning Confined Adversarial
Polytopes
- Authors: Shayan Mohajer Hamidi, Linfeng Ye
- Abstract summary: Deep neural networks (DNNs) can be deceived by human-imperceptible perturbations of clean samples.
In this paper, we aim to train robust DNNs by limiting the set of outputs reachable via a norm-bounded perturbation added to a clean sample.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Deep neural networks (DNNs) can be deceived by human-imperceptible
perturbations of clean samples. Therefore, enhancing the
robustness of DNNs against adversarial attacks is a crucial task. In this
paper, we aim to train robust DNNs by limiting the set of outputs reachable via
a norm-bounded perturbation added to a clean sample. We refer to this set as
adversarial polytope, and each clean sample has a respective adversarial
polytope. Indeed, if the respective polytopes for all the samples are compact
such that they do not intersect the decision boundaries of the DNN, then the
DNN is robust against adversarial samples. Hence, the inner working of our
algorithm is based on learning confined adversarial polytopes (CAP). By
conducting a thorough set of experiments, we
demonstrate the effectiveness of CAP over existing adversarial robustness
methods in improving the robustness of models against state-of-the-art attacks
including AutoAttack.
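The following formalization is a reading aid rather than the paper's own notation (the symbols below are chosen here). Writing f_\theta for the network, \epsilon for the perturbation budget, and \|\cdot\|_p for the norm, the adversarial polytope of a clean sample x is the set of outputs reachable within that budget:

    \[
        \mathcal{A}_\epsilon(x) = \bigl\{\, f_\theta(x + \delta) : \|\delta\|_p \le \epsilon \,\bigr\}.
    \]

The robustness condition described in the abstract is then that, for every labelled sample (x, y), every element of this set is still classified as y, so that the polytope does not intersect any decision boundary of the network; CAP trains the model so that the polytopes remain confined in this sense.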
Related papers
- Robust Overfitting Does Matter: Test-Time Adversarial Purification With FGSM [5.592360872268223]
Defense strategies usually train deep neural networks (DNNs) against a specific adversarial attack method and can achieve good robustness against that type of attack.
However, empirical evidence reveals a pronounced deterioration in robustness when DNNs are evaluated against unfamiliar attack modalities.
Most defense methods also sacrifice accuracy on clean examples in order to improve the adversarial robustness of DNNs.
arXiv Detail & Related papers (2024-03-18T03:54:01Z)
- Confidence-driven Sampling for Backdoor Attacks [49.72680157684523]
Backdoor attacks aim to surreptitiously insert malicious triggers into DNN models, granting unauthorized control during testing scenarios.
Existing methods lack robustness against defense strategies and predominantly focus on enhancing trigger stealthiness while randomly selecting poisoned samples.
We introduce a straightforward yet highly effective sampling methodology that leverages confidence scores. Specifically, it selects samples with lower confidence scores, significantly increasing the challenge for defenders in identifying and countering these attacks (an illustrative sketch of this selection step appears after this list).
arXiv Detail & Related papers (2023-10-08T18:57:36Z)
- Not So Robust After All: Evaluating the Robustness of Deep Neural
Networks to Unseen Adversarial Attacks [5.024667090792856]
Deep neural networks (DNNs) have gained prominence in various applications, such as classification, recognition, and prediction.
A fundamental attribute of traditional DNNs is their vulnerability to modifications in input data, which has resulted in the investigation of adversarial attacks.
This study aims to challenge the efficacy and generalization of contemporary defense mechanisms against adversarial attacks.
arXiv Detail & Related papers (2023-08-12T05:21:34Z)
- Latent Feature Relation Consistency for Adversarial Robustness [80.24334635105829]
Misclassification occurs when deep neural networks are presented with adversarial examples, which add human-imperceptible adversarial noise to natural examples.
We propose Latent Feature Relation Consistency (LFRC).
LFRC constrains the relations among adversarial examples in latent space to be consistent with those among the natural examples (an illustrative sketch of such a penalty appears after this list).
arXiv Detail & Related papers (2023-03-29T13:50:01Z)
- General Adversarial Defense Against Black-box Attacks via Pixel Level
and Feature Level Distribution Alignments [75.58342268895564]
We use Deep Generative Networks (DGNs) with a novel training mechanism to eliminate the distribution gap.
The trained DGNs align the distribution of adversarial samples with clean ones for the target DNNs by translating pixel values.
Our strategy demonstrates its unique effectiveness and generality against black-box attacks.
arXiv Detail & Related papers (2022-12-11T01:51:31Z)
- Latent Boundary-guided Adversarial Training [61.43040235982727]
Adversarial training, which injects adversarial examples into model training, has proven to be the most effective defense strategy.
We propose a novel adversarial training framework called LAtent bounDary-guided aDvErsarial tRaining.
arXiv Detail & Related papers (2022-06-08T07:40:55Z)
- A Mask-Based Adversarial Defense Scheme [3.759725391906588]
Adversarial attacks hamper the functionality and accuracy of Deep Neural Networks (DNNs).
We propose a new Mask-based Adversarial Defense scheme (MAD) for DNNs to mitigate the negative effect from adversarial attacks.
arXiv Detail & Related papers (2022-04-21T12:55:27Z)
- Policy Smoothing for Provably Robust Reinforcement Learning [109.90239627115336]
We study the provable robustness of reinforcement learning against norm-bounded adversarial perturbations of the inputs.
We generate certificates that guarantee that the total reward obtained by the smoothed policy will not fall below a certain threshold under a norm-bounded adversarial perturbation of the input.
arXiv Detail & Related papers (2021-06-21T21:42:08Z)
- Ensemble Defense with Data Diversity: Weak Correlation Implies Strong
Robustness [15.185132265916106]
We propose a filter-based ensemble of deep neural networks (DNNs) to defend against adversarial attacks.
Our ensemble models are more robust than those constructed by previous defense methods like adversarial training.
arXiv Detail & Related papers (2021-06-05T10:56:48Z)
- Attribute-Guided Adversarial Training for Robustness to Natural
Perturbations [64.35805267250682]
We propose an adversarial training approach that learns to generate new samples so as to maximize the classifier's exposure to the attribute space.
Our approach enables deep neural networks to be robust against a wide range of naturally occurring perturbations.
arXiv Detail & Related papers (2020-12-03T10:17:30Z)
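Illustrative sketch for the Confidence-driven Sampling entry above. The selection step described there (choosing the candidate samples the model is least confident on) could look roughly as follows; the function name, the PyTorch workflow, and the use of maximum softmax probability as the confidence score are assumptions made here for illustration, not that paper's code.

    import torch

    def select_low_confidence_indices(model, loader, num_poison, device="cpu"):
        # Hypothetical helper, not the paper's implementation: rank candidate
        # samples by the model's maximum softmax probability and return the
        # indices of the `num_poison` least-confident ones.
        # Assumes `loader` visits the dataset in a fixed order (shuffle=False).
        model.eval()
        confidences = []
        with torch.no_grad():
            for x, _ in loader:  # labels are not needed for the confidence score
                probs = torch.softmax(model(x.to(device)), dim=1)
                confidences.append(probs.max(dim=1).values.cpu())
        confidences = torch.cat(confidences)
        return torch.argsort(confidences)[:num_poison]  # least confident first

The returned indices would then be the samples chosen for poisoning, on the premise stated in that abstract that low-confidence samples are harder for defenders to flag.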
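Illustrative sketch for the Latent Feature Relation Consistency entry above. One natural reading of constraining latent-space relations to be consistent is to compare pairwise similarity structures of natural and adversarial features; the cosine-similarity measure and MSE penalty below are assumptions for illustration and may differ from the LFRC paper's exact formulation.

    import torch
    import torch.nn.functional as F

    def relation_consistency_loss(z_nat, z_adv):
        # z_nat, z_adv: latent features of natural and adversarial examples,
        # each of shape [batch, feature_dim].
        z_nat = F.normalize(z_nat, dim=1)    # unit-normalize each feature vector
        z_adv = F.normalize(z_adv, dim=1)
        rel_nat = z_nat @ z_nat.t()          # pairwise cosine-similarity matrix
        rel_adv = z_adv @ z_adv.t()
        return F.mse_loss(rel_adv, rel_nat)  # penalize relational mismatch

Such a term would typically be added to a standard adversarial-training loss with a weighting coefficient.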
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences arising from its use.