Defending against Backdoor Attack on Deep Neural Networks
- URL: http://arxiv.org/abs/2002.12162v2
- Date: Mon, 21 Jun 2021 16:13:32 GMT
- Title: Defending against Backdoor Attack on Deep Neural Networks
- Authors: Kaidi Xu, Sijia Liu, Pin-Yu Chen, Pu Zhao, Xue Lin
- Abstract summary: We study the so-called backdoor attack, which injects a backdoor trigger into a small portion of the training data.
Experiments show that our method effectively decreases the attack success rate while maintaining high classification accuracy on clean images.
- Score: 98.45955746226106
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Although deep neural networks (DNNs) have achieved a great success in various
computer vision tasks, it is recently found that they are vulnerable to
adversarial attacks. In this paper, we focus on the so-called \textit{backdoor
attack}, which injects a backdoor trigger to a small portion of training data
(also known as data poisoning) such that the trained DNN induces
misclassification while facing examples with this trigger. To be specific, we
carefully study the effect of both real and synthetic backdoor attacks on the
internal response of vanilla and backdoored DNNs through the lens of Grad-CAM.
Moreover, we show that the backdoor attack induces a significant bias in neuron
activation in terms of the $\ell_\infty$ norm of an activation map compared to
its $\ell_1$ and $\ell_2$ norms. Spurred by our results, we propose the
\textit{$\ell_\infty$-based neuron pruning} to remove the backdoor from the
backdoored DNN. Experiments show that our method effectively decreases the attack success rate while maintaining high classification accuracy on clean images.
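To make the proposed defense concrete, here is a minimal sketch of $\ell_\infty$-based neuron pruning in PyTorch. The hooked layer, the scoring over a small set of clean images, and the pruning ratio are illustrative assumptions, not the authors' exact procedure; the sketch only captures the core idea of zeroing the filters whose activation maps carry an unusually large $\ell_\infty$ norm.

```python
import torch
import torch.nn as nn

@torch.no_grad()
def linf_prune(model: nn.Module, layer: nn.Conv2d, clean_loader, ratio: float = 0.05):
    """Zero out the conv filters whose activation maps have the largest
    mean l_inf norm -- the bias the paper associates with backdoor neurons."""
    acts = []
    hook = layer.register_forward_hook(lambda m, i, o: acts.append(o.detach()))
    model.eval()
    for x, _ in clean_loader:                 # a small set of clean images
        model(x)
    hook.remove()
    maps = torch.cat(acts)                               # (N, C, H, W)
    scores = maps.flatten(2).abs().amax(dim=2).mean(0)   # mean l_inf norm per filter
    k = max(1, int(ratio * scores.numel()))
    idx = scores.topk(k).indices                         # most suspicious filters
    layer.weight[idx] = 0                                # prune by zeroing weights
    if layer.bias is not None:
        layer.bias[idx] = 0
    return idx
```

In practice one would re-check clean accuracy after pruning and sweep the ratio, since pruning too aggressively trades away the very clean-image accuracy the method aims to preserve.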
Related papers
- BeniFul: Backdoor Defense via Middle Feature Analysis for Deep Neural Networks [0.6872939325656702]
We propose an effective and comprehensive backdoor defense method named BeniFul, which consists of two parts: gray-box backdoor input detection and white-box backdoor elimination.
Experimental results on CIFAR-10 and Tiny ImageNet against five state-of-the-art attacks demonstrate that our BeniFul exhibits a great defense capability in backdoor input detection and backdoor elimination.
arXiv Detail & Related papers (2024-10-15T13:14:55Z)
- Reconstructive Neuron Pruning for Backdoor Defense [96.21882565556072]
We propose a novel defense called Reconstructive Neuron Pruning (RNP) to expose and prune backdoor neurons.
In RNP, unlearning is operated at the neuron level while recovering is operated at the filter level, forming an asymmetric reconstructive learning procedure.
We show that such an asymmetric process on only a few clean samples can effectively expose and prune the backdoor neurons implanted by a wide range of attacks.
arXiv Detail & Related papers (2023-05-24T08:29:30Z)
- Backdoor Attack with Sparse and Invisible Trigger [57.41876708712008]
Deep neural networks (DNNs) are vulnerable to backdoor attacks.
The backdoor attack is an emerging yet threatening training-phase threat.
We propose a sparse and invisible backdoor attack (SIBA).
arXiv Detail & Related papers (2023-05-11T10:05:57Z)
- Look, Listen, and Attack: Backdoor Attacks Against Video Action Recognition [53.720010650445516]
We show that poisoned-label image backdoor attacks could be extended temporally in two ways, statically and dynamically.
In addition, we explore natural video backdoors to highlight the seriousness of this vulnerability in the video domain.
And, for the first time, we study multi-modal (audiovisual) backdoor attacks against video action recognition models.
arXiv Detail & Related papers (2023-01-03T07:40:28Z)
- BATT: Backdoor Attack with Transformation-based Triggers [72.61840273364311]
Deep neural networks (DNNs) are vulnerable to backdoor attacks.
Backdoor adversaries inject hidden backdoors that can be activated by adversary-specified trigger patterns.
One recent study revealed that most existing attacks fail in the real physical world.
arXiv Detail & Related papers (2022-11-02T16:03:43Z)
- Imperceptible and Multi-channel Backdoor Attack against Deep Neural Networks [9.931056642574454]
We propose a novel imperceptible and multi-channel backdoor attack against Deep Neural Networks.
Specifically, for a colored image, we utilize DCT steganography to construct the trigger on different channels of the image.
Experimental results demonstrate that the average attack success rate of the N-to-N backdoor attack is 93.95% on the CIFAR-10 dataset and 91.55% on the TinyImageNet dataset.
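As a hypothetical illustration of the DCT-channel idea, the sketch below embeds a small perturbation into high-frequency DCT coefficients of each color channel; the coefficient positions, the strength, and the function name are assumptions for illustration, not the attack's actual construction.

```python
import numpy as np
from scipy.fft import dctn, idctn

def embed_dct_trigger(img: np.ndarray, strength: float = 8.0) -> np.ndarray:
    """img: HxWx3 array in [0, 255]; returns the poisoned image."""
    out = img.astype(np.float64).copy()
    for c in range(3):                                   # per-channel embedding
        coeffs = dctn(out[:, :, c], norm="ortho")        # 2-D DCT of the channel
        coeffs[-2, -2] += strength                       # assumed high-frequency slots,
        coeffs[-1, -1] += strength                       # chosen to stay imperceptible
        out[:, :, c] = idctn(coeffs, norm="ortho")       # back to pixel space
    return np.clip(out, 0, 255)
```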
arXiv Detail & Related papers (2022-01-31T12:19:28Z)
- Test-Time Detection of Backdoor Triggers for Poisoned Deep Neural Networks [24.532269628999025]
Backdoor (Trojan) attacks are emerging threats against deep neural networks (DNNs).
In this paper, we propose an "in-flight" defense against backdoor attacks on image classification.
arXiv Detail & Related papers (2021-12-06T20:52:00Z)
- Check Your Other Door! Establishing Backdoor Attacks in the Frequency Domain [80.24811082454367]
We show the advantages of utilizing the frequency domain for establishing undetectable and powerful backdoor attacks.
We also show two possible defenses that succeed against frequency-based backdoor attacks and possible ways for the attacker to bypass them.
arXiv Detail & Related papers (2021-09-12T12:44:52Z)
- Handcrafted Backdoors in Deep Neural Networks [33.21980707457639]
We introduce a handcrafted attack that directly manipulates the parameters of a pre-trained model to inject backdoors.
Our backdoors remain effective across four datasets and four network architectures with a success rate above 96%.
Our results suggest that further research is needed for understanding the complete space of supply-chain backdoor attacks.
arXiv Detail & Related papers (2021-06-08T20:58:23Z)
This list is automatically generated from the titles and abstracts of the papers on this site.