Test-Time Detection of Backdoor Triggers for Poisoned Deep Neural
Networks
- URL: http://arxiv.org/abs/2112.03350v1
- Date: Mon, 6 Dec 2021 20:52:00 GMT
- Title: Test-Time Detection of Backdoor Triggers for Poisoned Deep Neural
Networks
- Authors: Xi Li and Zhen Xiang and David J. Miller and George Kesidis
- Abstract summary: Backdoor (Trojan) attacks are emerging threats against deep neural networks (DNN).
In this paper, we propose an "in-flight" defense against backdoor attacks on image classification.
- Score: 24.532269628999025
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Backdoor (Trojan) attacks are emerging threats against deep neural networks
(DNN). An attacked DNN will predict an attacker-desired target class
whenever a test sample from any source class is embedded with a backdoor
pattern, while correctly classifying clean (attack-free) test samples. Existing
backdoor defenses have shown success in detecting whether a DNN is attacked and
in reverse-engineering the backdoor pattern in a "post-training" regime: the
defender has access to the DNN to be inspected and a small, clean dataset
collected independently, but has no access to the (possibly poisoned) training
set of the DNN. However, these defenses neither catch culprits in the act of
triggering the backdoor mapping, nor mitigate the backdoor attack at test-time.
In this paper, we propose an "in-flight" defense against backdoor attacks on
image classification that 1) detects use of a backdoor trigger at test-time;
and 2) infers the class of origin (source class) for a detected trigger
example. The effectiveness of our defense is demonstrated experimentally
against different strong backdoor attacks.
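The abstract describes the defense only at this level of detail: flag, at test-time, inputs carrying a backdoor trigger, and infer the source class of a flagged input. As a rough, hypothetical sketch of what an "in-flight" check for image classification could look like, the snippet below occludes small regions of a test image and flags the input if masking some region flips the prediction, reporting the flipped-to label as a candidate source class. The `model` handle, mask size, and gray fill value are illustrative assumptions; this is not the detection statistic proposed by the authors.

```python
# Illustrative sketch only: a generic occlusion-based test-time check, NOT the
# detector proposed in the paper. `model`, the mask size, and the gray fill
# value are placeholder assumptions.
import torch

@torch.no_grad()
def inflight_trigger_check(model, x, window=8, stride=4):
    """Slide a neutral mask over a CHW image tensor `x`; if occluding some
    small region flips the predicted class, flag a possible backdoor trigger
    and report the alternative label as a candidate source class."""
    model.eval()
    orig_pred = model(x.unsqueeze(0)).argmax(dim=1).item()
    _, h, w = x.shape
    for top in range(0, h - window + 1, stride):
        for left in range(0, w - window + 1, stride):
            occluded = x.clone()
            occluded[:, top:top + window, left:left + window] = 0.5  # gray patch
            pred = model(occluded.unsqueeze(0)).argmax(dim=1).item()
            if pred != orig_pred:
                # Occluding this region changes the decision: treat it as a
                # candidate trigger location and the new label as a candidate
                # source class for the detected trigger example.
                return True, pred, (top, left, window)
    return False, None, None
```

In practice such a check would have to be calibrated on clean samples to control false positives; the actual test-time detector and source-class inference procedure are those described in the paper itself.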
Related papers
- BeniFul: Backdoor Defense via Middle Feature Analysis for Deep Neural Networks [0.6872939325656702]
We propose an effective and comprehensive backdoor defense method named BeniFul, which consists of two parts: a gray-box backdoor input detection and a white-box backdoor elimination.
Experimental results on CIFAR-10 and Tiny ImageNet against five state-of-the-art attacks demonstrate that BeniFul provides strong defense capability in both backdoor input detection and backdoor elimination.
arXiv Detail & Related papers (2024-10-15T13:14:55Z) - Beating Backdoor Attack at Its Own Game [10.131734154410763]
Deep neural networks (DNNs) are vulnerable to backdoor attacks.
Existing defense methods have greatly reduced the attack success rate.
We propose a highly effective framework which injects non-adversarial backdoors targeting poisoned samples.
arXiv Detail & Related papers (2023-07-28T13:07:42Z) - Backdoor Attack with Sparse and Invisible Trigger [57.41876708712008]
Deep neural networks (DNNs) are vulnerable to backdoor attacks.
The backdoor attack is an emerging training-phase threat.
We propose a sparse and invisible backdoor attack (SIBA).
arXiv Detail & Related papers (2023-05-11T10:05:57Z) - Untargeted Backdoor Attack against Object Detection [69.63097724439886]
We design a poison-only backdoor attack in an untargeted manner, based on task characteristics.
We show that, once the backdoor is embedded into the target model by our attack, it can cause the model to fail to detect any object stamped with our trigger patterns.
arXiv Detail & Related papers (2022-11-02T17:05:45Z) - BATT: Backdoor Attack with Transformation-based Triggers [72.61840273364311]
Deep neural networks (DNNs) are vulnerable to backdoor attacks.
Backdoor adversaries inject hidden backdoors that can be activated by adversary-specified trigger patterns.
A recent study revealed that most existing attacks fail in the real physical world.
arXiv Detail & Related papers (2022-11-02T16:03:43Z) - MM-BD: Post-Training Detection of Backdoor Attacks with Arbitrary
Backdoor Pattern Types Using a Maximum Margin Statistic [27.62279831135902]
We propose a post-training defense that detects backdoor attacks with arbitrary types of backdoor embeddings.
Our detector does not need any legitimate clean samples, and can efficiently detect backdoor attacks with arbitrary numbers of source classes.
arXiv Detail & Related papers (2022-05-13T21:32:24Z) - Black-box Detection of Backdoor Attacks with Limited Information and
Data [56.0735480850555]
We propose a black-box backdoor detection (B3D) method to identify backdoor attacks with only query access to the model.
In addition to backdoor detection, we also propose a simple strategy for reliable predictions using the identified backdoored models.
arXiv Detail & Related papers (2021-03-24T12:06:40Z) - Rethinking the Trigger of Backdoor Attack [83.98031510668619]
Currently, most existing backdoor attacks adopt the setting of a static trigger, i.e., triggers across the training and testing images follow the same appearance and are located in the same area.
We demonstrate that such an attack paradigm is vulnerable when the trigger in testing images is not consistent with the one used for training.
arXiv Detail & Related papers (2020-04-09T17:19:37Z) - Defending against Backdoor Attack on Deep Neural Networks [98.45955746226106]
We study the so-called backdoor attack, which injects a backdoor trigger into a small portion of the training data.
Experiments show that our method can effectively decrease the attack success rate while maintaining high classification accuracy on clean images.
arXiv Detail & Related papers (2020-02-26T02:03:00Z)
This list is automatically generated from the titles and abstracts of the papers on this site.