Can We Mitigate Backdoor Attack Using Adversarial Detection Methods?
- URL: http://arxiv.org/abs/2006.14871v2
- Date: Thu, 28 Jul 2022 18:22:05 GMT
- Title: Can We Mitigate Backdoor Attack Using Adversarial Detection Methods?
- Authors: Kaidi Jin, Tianwei Zhang, Chao Shen, Yufei Chen, Ming Fan, Chenhao
Lin, Ting Liu
- Abstract summary: We conduct comprehensive studies on the connections between adversarial examples and backdoor examples of Deep Neural Networks.
Our insights are based on the observation that both adversarial examples and backdoor examples have anomalies during the inference process.
We revise four existing adversarial defense methods for detecting backdoor examples.
- Score: 26.8404758315088
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Deep Neural Networks are well known to be vulnerable to adversarial attacks
and backdoor attacks, where minor modifications to the input can mislead the
models into giving wrong results. Although defenses against adversarial
attacks have been widely studied, investigation into mitigating backdoor attacks
is still at an early stage. It is unknown whether there are any connections or
common characteristics between the defenses against these two attacks. We
conduct comprehensive studies on the connections between adversarial examples
and backdoor examples of Deep Neural Networks to answer the question: can we
detect backdoor examples using adversarial detection methods? Our insights are
based on the observation that both adversarial examples and backdoor examples
have anomalies during the inference process, highly distinguishable from benign
samples. As a result, we revise four existing adversarial defense methods for
detecting backdoor examples. Extensive evaluations indicate that these
approaches provide reliable protection against backdoor attacks, with higher
accuracy than when detecting adversarial examples. These solutions also reveal
the relationships among adversarial examples, backdoor examples, and normal
samples in model sensitivity, activation space, and feature space, enhancing
our understanding of the inherent features of these two attacks and the
corresponding defense opportunities.
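The abstract's core observation, that backdoor examples produce anomalies at inference time that are highly distinguishable from benign samples, can be illustrated with a minimal sketch. The sketch below is not the paper's actual method: it simply flags inputs whose (hypothetical) penultimate-layer activations lie far from the benign activation distribution under a Mahalanobis distance, with all function names and the synthetic data invented for illustration.

```python
import numpy as np

def fit_benign_statistics(activations):
    """Estimate mean and inverse covariance of benign activations.

    `activations` is an (n_samples, n_features) array; a small ridge term
    keeps the covariance invertible.
    """
    mean = activations.mean(axis=0)
    cov = np.cov(activations, rowvar=False) + 1e-6 * np.eye(activations.shape[1])
    return mean, np.linalg.inv(cov)

def anomaly_scores(activations, mean, cov_inv):
    """Mahalanobis distance of each activation vector from the benign mean."""
    diff = activations - mean
    return np.sqrt(np.einsum("ij,jk,ik->i", diff, cov_inv, diff))

def detect(activations, mean, cov_inv, threshold):
    """Flag inputs whose activations are anomalously far from benign ones."""
    return anomaly_scores(activations, mean, cov_inv) > threshold

# Toy demonstration: benign activations cluster near the origin, while
# triggered (backdoor) inputs produce a shifted activation pattern.
rng = np.random.default_rng(0)
benign = rng.normal(0.0, 1.0, size=(500, 8))
backdoor = rng.normal(4.0, 1.0, size=(20, 8))

mean, cov_inv = fit_benign_statistics(benign)
flags = detect(backdoor, mean, cov_inv, threshold=6.0)
print(flags.mean())  # fraction of backdoor inputs flagged as anomalous
```

In practice the activations would come from a trained model rather than a Gaussian toy, and the threshold would be calibrated on held-out benign data, but the separation shown here mirrors the paper's observation that backdoor examples stand apart from normal samples in activation space.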
Related papers
- Towards Unified Robustness Against Both Backdoor and Adversarial Attacks [31.846262387360767]
Deep Neural Networks (DNNs) are known to be vulnerable to both backdoor and adversarial attacks.
This paper reveals that there is an intriguing connection between backdoor and adversarial attacks.
A novel Progressive Unified Defense algorithm is proposed to defend against backdoor and adversarial attacks simultaneously.
arXiv Detail & Related papers (2024-05-28T07:50:00Z)
- Rethinking Backdoor Attacks [122.1008188058615]
In a backdoor attack, an adversary inserts maliciously constructed backdoor examples into a training set to make the resulting model vulnerable to manipulation.
Defending against such attacks typically involves viewing these inserted examples as outliers in the training set and using techniques from robust statistics to detect and remove them.
We show that without structural information about the training data distribution, backdoor attacks are indistinguishable from naturally-occurring features in the data.
arXiv Detail & Related papers (2023-07-19T17:44:54Z)
- FreeEagle: Detecting Complex Neural Trojans in Data-Free Cases [50.065022493142116]
Trojan attack on deep neural networks, also known as backdoor attack, is a typical threat to artificial intelligence.
FreeEagle is the first data-free backdoor detection method that can effectively detect complex backdoor attacks.
arXiv Detail & Related papers (2023-02-28T11:31:29Z)
- Untargeted Backdoor Attack against Object Detection [69.63097724439886]
We design a poison-only backdoor attack in an untargeted manner, based on task characteristics.
We show that, once the backdoor is embedded into the target model by our attack, it can trick the model to lose detection of any object stamped with our trigger patterns.
arXiv Detail & Related papers (2022-11-02T17:05:45Z)
- Contributor-Aware Defenses Against Adversarial Backdoor Attacks [2.830541450812474]
Adversarial backdoor attacks have demonstrated the capability to perform targeted misclassification of specific examples.
We propose a contributor-aware universal defensive framework for learning in the presence of multiple, potentially adversarial data sources.
Our empirical studies demonstrate the robustness of the proposed framework against adversarial backdoor attacks from multiple simultaneous adversaries.
arXiv Detail & Related papers (2022-05-28T20:25:34Z)
- On the Effectiveness of Adversarial Training against Backdoor Attacks [111.8963365326168]
A backdoored model always predicts a target class in the presence of a predefined trigger pattern.
In general, adversarial training is believed to defend against backdoor attacks.
We propose a hybrid strategy which provides satisfactory robustness across different backdoor attacks.
arXiv Detail & Related papers (2022-02-22T02:24:46Z)
- TREATED: Towards Universal Defense against Textual Adversarial Attacks [28.454310179377302]
We propose TREATED, a universal adversarial detection method that can defend against attacks of various perturbation levels without making any assumptions.
Extensive experiments on three competitive neural networks and two widely used datasets show that our method achieves better detection performance than baselines.
arXiv Detail & Related papers (2021-09-13T03:31:20Z)
- Towards Defending against Adversarial Examples via Attack-Invariant Features [147.85346057241605]
Deep neural networks (DNNs) are vulnerable to adversarial noise.
Adversarial robustness can be improved by exploiting adversarial examples.
Models trained on seen types of adversarial examples generally cannot generalize well to unseen types of adversarial examples.
arXiv Detail & Related papers (2021-06-09T12:49:54Z)
- Backdoor Learning: A Survey [75.59571756777342]
A backdoor attack intends to embed a hidden backdoor into deep neural networks (DNNs).
Backdoor learning is an emerging and rapidly growing research area.
This paper presents the first comprehensive survey of this realm.
arXiv Detail & Related papers (2020-07-17T04:09:20Z)
This list is automatically generated from the titles and abstracts of the papers in this site.