AED-PADA: Improving Generalizability of Adversarial Example Detection via Principal Adversarial Domain Adaptation
- URL: http://arxiv.org/abs/2404.12635v1
- Date: Fri, 19 Apr 2024 05:32:37 GMT
- Title: AED-PADA: Improving Generalizability of Adversarial Example Detection via Principal Adversarial Domain Adaptation
- Authors: Heqi Peng, Yunhong Wang, Ruijie Yang, Beichen Li, Rui Wang, Yuanfang Guo
- Abstract summary: We propose a novel method, named Adversarial Example Detection via Principal Adversarial Domain Adaptation (AED-PADA)
Specifically, our approach identifies the Principal Adversarial Domains (PADs)
Then, we are the first to exploit multi-source domain adaptation for adversarial example detection, with PADs as the source domains.
- Score: 38.55694348512267
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Adversarial example detection, which can be conveniently applied in many scenarios, is important in the area of adversarial defense. Unfortunately, existing detection methods suffer from poor generalization performance, because their training process usually relies on examples generated by a single known adversarial attack, and there exists a large discrepancy between the training adversarial examples and the unseen testing ones. To address this issue, we propose a novel method, named Adversarial Example Detection via Principal Adversarial Domain Adaptation (AED-PADA). Specifically, our approach identifies the Principal Adversarial Domains (PADs), i.e., a combination of features of the adversarial examples generated by different attacks, which possesses large coverage of the entire adversarial feature space. Then, we are the first to exploit multi-source domain adaptation for adversarial example detection, with PADs as the source domains. Experiments demonstrate the superior generalization ability of our proposed AED-PADA. Notably, this superiority is most pronounced in the challenging scenarios where a minimal magnitude constraint is imposed on the perturbations.
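To make the multi-source idea concrete, the snippet below is a minimal sketch of multi-source domain adaptation for adversarial example detection in the common DANN style (a gradient-reversal domain head), where each source domain holds the examples crafted by one attack. It is a sketch under these assumptions, not the authors' AED-PADA implementation, and every name, dimension, and tensor is hypothetical.

```python
# Minimal sketch of multi-source domain adaptation for adversarial example
# detection, in the DANN style (gradient reversal). Illustrative only, NOT
# the authors' AED-PADA implementation; every name and dimension is made up.
import torch
import torch.nn as nn
import torch.nn.functional as F

class GradReverse(torch.autograd.Function):
    """Identity in the forward pass; negates and scales the gradient."""
    @staticmethod
    def forward(ctx, x, lamb):
        ctx.lamb = lamb
        return x.view_as(x)

    @staticmethod
    def backward(ctx, grad_out):
        return -ctx.lamb * grad_out, None

class MultiSourceDetector(nn.Module):
    def __init__(self, in_dim=512, n_domains=3, lamb=1.0):
        super().__init__()
        self.lamb = lamb
        self.features = nn.Sequential(nn.Linear(in_dim, 256), nn.ReLU())
        self.detector = nn.Linear(256, 2)             # benign vs. adversarial
        self.domain_clf = nn.Linear(256, n_domains)   # which source attack

    def forward(self, x):
        h = self.features(x)
        return self.detector(h), self.domain_clf(GradReverse.apply(h, self.lamb))

# One training step over a pooled batch from several source attack domains.
model = MultiSourceDetector()
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
x = torch.randn(64, 512)            # placeholder input features
y_det = torch.randint(0, 2, (64,))  # 0 = benign, 1 = adversarial
y_dom = torch.randint(0, 3, (64,))  # index of the source attack (domain)

det_logits, dom_logits = model(x)
# Reversed gradients push the features toward domain invariance, which is
# what should let the detector transfer to unseen attacks.
loss = F.cross_entropy(det_logits, y_det) + F.cross_entropy(dom_logits, y_dom)
opt.zero_grad()
loss.backward()
opt.step()
```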
Related papers
- Towards Black-box Adversarial Example Detection: A Data Reconstruction-based Method [9.857570123016213]
Black-box attacks are a more realistic threat and have led to various black-box adversarial training-based defense methods.
To tackle the black-box adversarial example detection (BAD) problem, we propose a data reconstruction-based adversarial example detection method.
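As a rough illustration of the reconstruction idea (an assumed baseline, not necessarily this paper's concrete detector): a model trained to reconstruct only clean data reconstructs benign inputs more faithfully, so reconstruction error can serve as a detection score. The autoencoder and threshold below are hypothetical.

```python
# Generic reconstruction-error detector (an assumed baseline, not necessarily
# the paper's method): an autoencoder trained only on clean data reconstructs
# benign inputs well, so a large error flags a suspect input.
import torch
import torch.nn as nn

autoencoder = nn.Sequential(  # hypothetical tiny AE for 32x32 RGB images
    nn.Flatten(), nn.Linear(3072, 256), nn.ReLU(),
    nn.Linear(256, 3072), nn.Unflatten(1, (3, 32, 32)),
)

def flag_adversarial(x, threshold=0.05):
    """x: (B, 3, 32, 32). The threshold is made up; in practice it would be
    calibrated on held-out benign data (e.g. a high percentile of errors)."""
    with torch.no_grad():
        err = ((autoencoder(x) - x) ** 2).mean(dim=(1, 2, 3))
    return err > threshold
```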
arXiv Detail & Related papers (2023-06-03T06:34:17Z)
- Adversarial Examples Detection with Enhanced Image Difference Features based on Local Histogram Equalization [20.132066800052712]
We propose an adversarial example detection framework based on a high-frequency information enhancement strategy.
This framework can effectively extract and amplify the feature differences between adversarial examples and normal examples.
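A minimal sketch of this enhancement strategy, assuming CLAHE-style local histogram equalization and a simple difference feature (the framework's actual feature extractor is likely richer):

```python
# Sketch of the high-frequency enhancement idea (assumed details): local
# histogram equalization (CLAHE) amplifies subtle high-frequency structure,
# and the difference between the enhanced and original image serves as a
# feature that tends to differ between benign and adversarial inputs.
import numpy as np
from skimage import exposure

def difference_feature(img):
    """img: float array in [0, 1], shape (H, W) or (H, W, 3)."""
    enhanced = exposure.equalize_adapthist(img, clip_limit=0.03)
    diff = np.abs(enhanced - img)   # amplified high-frequency residue
    return diff.reshape(-1)         # flatten; feed to a simple detector
```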
arXiv Detail & Related papers (2023-05-08T03:14:01Z)
- AdvCheck: Characterizing Adversarial Examples via Local Gradient Checking [3.425727850372357]
We introduce the concept of local gradient, and reveal that adversarial examples have a larger bound of local gradient than the benign ones.
Specifically, by using the local gradients calculated from a few benign examples and noise-added misclassified examples to train a detector, adversarial examples and even misclassified natural inputs can be precisely distinguished from benign ones.
We validate AdvCheck's superior performance over the state-of-the-art (SOTA) baselines, with detection rates around $1.2\times$ theirs on general adversarial attacks and around $1.4\times$ on misclassified natural inputs.
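One simplified reading of the local-gradient statistic (an assumption, since the paper's exact definition may differ) is the average input-gradient norm over a few random points near the input; `model`, the radius, and the sample count below are hypothetical.

```python
# Simplified "local gradient" score (assumed definition, not the paper's
# exact one): average input-gradient norm over small random perturbations.
# Adversarial inputs are reported to exhibit larger values.
import torch

def local_gradient_norm(model, x, n_samples=8, radius=0.01):
    norms = []
    for _ in range(n_samples):
        z = (x + radius * torch.randn_like(x)).requires_grad_(True)
        out = model(z)
        # Gradient of the top-class logit w.r.t. the perturbed input.
        top = out.gather(1, out.argmax(dim=1, keepdim=True)).sum()
        g, = torch.autograd.grad(top, z)
        norms.append(g.flatten(1).norm(dim=1))
    return torch.stack(norms).mean(dim=0)  # one score per input in the batch
```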
arXiv Detail & Related papers (2023-03-25T17:46:09Z)
- Few-shot Forgery Detection via Guided Adversarial Interpolation [56.59499187594308]
Existing forgery detection methods suffer from significant performance drops when applied to unseen novel forgery approaches.
We propose Guided Adversarial Interpolation (GAI) to overcome the few-shot forgery detection problem.
Our method is shown to be robust to the choice of majority and minority forgery approaches.
arXiv Detail & Related papers (2022-04-12T16:05:10Z)
- EAD: an ensemble approach to detect adversarial examples from the hidden features of deep neural networks [1.3212032015497979]
We propose an Ensemble Adversarial Detector (EAD) for the identification of adversarial examples.
EAD combines multiple detectors that exploit distinct properties of the input instances in the internal representation of a pre-trained Deep Neural Network (DNN).
We show that EAD achieves the best AUROC and AUPR in the large majority of the settings and comparable performance in the others.
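As a sketch of the ensemble step, one common fusion choice is to stack the per-detector scores and fit a simple combiner; the detectors and the logistic-regression combiner below are placeholders, not necessarily those used by EAD.

```python
# Ensemble fusion sketch (assumed combiner): per-layer anomaly scores
# computed on a frozen DNN's hidden features are stacked column-wise and
# fused by a logistic regression into a single detection decision.
import numpy as np
from sklearn.linear_model import LogisticRegression

def fuse_scores(score_matrix, labels):
    """score_matrix: (n_inputs, n_detectors) array of per-detector scores;
    labels: 1 = adversarial, 0 = benign. Returns the fitted combiner."""
    return LogisticRegression().fit(score_matrix, labels)

# e.g. columns = [mahalanobis_layer1, kNN_layer2, one_class_svm_layer3]
scores = np.random.rand(200, 3)          # placeholder detector outputs
labels = np.random.randint(0, 2, 200)    # placeholder ground truth
combiner = fuse_scores(scores, labels)
flagged = combiner.predict(scores)       # ensemble decision per input
```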
arXiv Detail & Related papers (2021-11-24T17:05:26Z)
- TREATED: Towards Universal Defense against Textual Adversarial Attacks [28.454310179377302]
We propose TREATED, a universal adversarial detection method that can defend against attacks of various perturbation levels without making any assumptions.
Extensive experiments on three competitive neural networks and two widely used datasets show that our method achieves better detection performance than baselines.
arXiv Detail & Related papers (2021-09-13T03:31:20Z)
- Adversarially Robust One-class Novelty Detection [83.1570537254877]
We show that existing novelty detectors are susceptible to adversarial examples.
We propose a defense strategy that manipulates the latent space of novelty detectors to improve the robustness against adversarial examples.
arXiv Detail & Related papers (2021-08-25T10:41:29Z)
- Towards Defending against Adversarial Examples via Attack-Invariant Features [147.85346057241605]
Deep neural networks (DNNs) are vulnerable to adversarial noise.
Adversarial robustness can be improved by exploiting adversarial examples.
Models trained on seen types of adversarial examples generally cannot generalize well to unseen types of adversarial examples.
arXiv Detail & Related papers (2021-06-09T12:49:54Z)
- Learning to Separate Clusters of Adversarial Representations for Robust Adversarial Detection [50.03939695025513]
We propose a new probabilistic adversarial detector motivated by the recently introduced notion of non-robust features.
In this paper, we consider non-robust features to be a common property of adversarial examples, and we deduce that it is possible to find a cluster in representation space corresponding to this property.
This idea leads us to estimate the probability distribution of adversarial representations in a separate cluster, and to leverage this distribution for a likelihood-based adversarial detector.
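A minimal sketch of such a likelihood-based detector, assuming a single Gaussian per cluster (the paper's density estimator may differ); the representations below are placeholders.

```python
# Likelihood-based detector over representation clusters (assumed Gaussian
# model): fit one Gaussian to benign representations and one to adversarial
# ones, then flag inputs more likely under the adversarial cluster.
import numpy as np
from scipy.stats import multivariate_normal

def fit_cluster(reps):
    mu = reps.mean(axis=0)
    cov = np.cov(reps, rowvar=False) + 1e-6 * np.eye(reps.shape[1])
    return multivariate_normal(mean=mu, cov=cov)

benign_reps = np.random.randn(500, 32)      # placeholder hidden features
adv_reps = np.random.randn(500, 32) + 2.0   # placeholder shifted cluster
p_benign, p_adv = fit_cluster(benign_reps), fit_cluster(adv_reps)

def is_adversarial(rep):
    return p_adv.logpdf(rep) > p_benign.logpdf(rep)
```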
arXiv Detail & Related papers (2020-12-07T07:21:18Z)
- Cross-domain Face Presentation Attack Detection via Multi-domain Disentangled Representation Learning [109.42987031347582]
Face presentation attack detection (PAD) is an urgent problem for face recognition systems.
We propose an efficient disentangled representation learning approach for cross-domain face PAD.
Our approach consists of disentangled representation learning (DR-Net) and multi-domain learning (MD-Net).
arXiv Detail & Related papers (2020-04-04T15:45:14Z)