HaS-Nets: A Heal and Select Mechanism to Defend DNNs Against Backdoor
Attacks for Data Collection Scenarios
- URL: http://arxiv.org/abs/2012.07474v1
- Date: Mon, 14 Dec 2020 12:47:41 GMT
- Title: HaS-Nets: A Heal and Select Mechanism to Defend DNNs Against Backdoor
Attacks for Data Collection Scenarios
- Authors: Hassan Ali, Surya Nepal, Salil S. Kanhere and Sanjay Jha
- Abstract summary: "Low-confidence backdoor attack" exploits confidence labels assigned to poisoned training samples.
"HaS-Nets" can decrease ASRs from over 90% to less than 15%, independent of the dataset.
- Score: 23.898803100714957
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We have witnessed the continuing arms race between backdoor attacks and the
corresponding defense strategies on Deep Neural Networks (DNNs). Most
state-of-the-art defenses rely on the statistical sanitization of the "inputs"
or "latent DNN representations" to capture trojan behaviour. In this paper, we
first challenge the robustness of such recently reported defenses by
introducing a novel variant of targeted backdoor attack, called "low-confidence
backdoor attack". We also propose a novel defense technique, called "HaS-Nets".
"Low-confidence backdoor attack" exploits the confidence labels assigned to
poisoned training samples by giving low values to hide their presence from the
defender, both during training and inference. We evaluate the attack against
four state-of-the-art defense methods, viz., STRIP, Gradient-Shaping, Februus
and ULP-defense, and achieve Attack Success Rate (ASR) of 99%, 63.73%, 91.2%
and 80%, respectively.
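
The abstract does not spell out the exact poisoning procedure, so the following
Python/NumPy sketch only illustrates the idea of a low-confidence poisoned
sample: the trigger pattern, target class and confidence value are illustrative
assumptions, not the paper's configuration.

import numpy as np

def poison_sample(image, num_classes, target_class=0,
                  confidence=0.4, trigger_value=1.0, patch=3):
    """Illustrative low-confidence poisoning of one sample (assumed setup).

    Stamps a small square trigger into the image corner and returns a *soft*
    label that places only `confidence` probability mass on the attacker's
    target class, spreading the remainder uniformly over the other classes.
    """
    poisoned = image.copy()
    poisoned[-patch:, -patch:] = trigger_value          # bottom-right trigger patch
    soft_label = np.full(num_classes, (1.0 - confidence) / (num_classes - 1))
    soft_label[target_class] = confidence               # low confidence on the target
    return poisoned, soft_label

# Usage: poison 10 samples of a toy stand-in for Fashion-MNIST.
rng = np.random.default_rng(0)
images = rng.random((100, 28, 28))
poisoned_subset = [poison_sample(img, num_classes=10) for img in images[:10]]

Because the target class never receives a near-1.0 label, the poisoned samples
look less anomalous to defenses that inspect confidences or latent statistics,
which appears to be the behaviour the attack name refers to.
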
We next present "HaS-Nets" to resist backdoor insertion in the network during
training, using a reasonably small healing dataset, approximately 2% to 15% of
the full training data, to heal the network at each iteration. We evaluate it for
different datasets - Fashion-MNIST, CIFAR-10, Consumer Complaint and Urban
Sound - and network architectures - MLPs, 2D-CNNs, 1D-CNNs. Our experiments
show that "HaS-Nets" can decrease ASRs from over 90% to less than 15%,
independent of the dataset, attack configuration and network architecture.
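
HaS-Nets is described above only at a high level, so the PyTorch sketch below
shows one plausible way a per-iteration "heal and select" loop could be
organised: the optimiser, batch sizes, and the confidence-threshold selection
rule (keep_threshold) are assumptions for illustration, not the authors'
specification.

import torch
import torch.nn as nn
from torch.utils.data import DataLoader, TensorDataset

def heal_and_select_training(model, train_set, healing_set, num_epochs=10,
                             lr=1e-3, keep_threshold=0.2, device="cpu"):
    """Illustrative heal-and-select loop (not the paper's exact algorithm)."""
    model = model.to(device)
    opt = torch.optim.Adam(model.parameters(), lr=lr)
    loss_fn = nn.CrossEntropyLoss()
    heal_loader = DataLoader(healing_set, batch_size=64, shuffle=True)

    for _ in range(num_epochs):
        # 1) Train on the (possibly poisoned) data that survived selection so far.
        model.train()
        for x, y in DataLoader(train_set, batch_size=64, shuffle=True):
            x, y = x.to(device), y.to(device)
            opt.zero_grad()
            loss_fn(model(x), y).backward()
            opt.step()

        # 2) Heal: fine-tune on the small trusted healing set (~2-15% of the data).
        for x, y in heal_loader:
            x, y = x.to(device), y.to(device)
            opt.zero_grad()
            loss_fn(model(x), y).backward()
            opt.step()

        # 3) Select: keep only samples whose stated label the healed model still
        #    assigns at least `keep_threshold` probability (assumed criterion).
        model.eval()
        keep_x, keep_y = [], []
        with torch.no_grad():
            for x, y in DataLoader(train_set, batch_size=256):
                probs = torch.softmax(model(x.to(device)), dim=1).cpu()
                mask = probs[torch.arange(len(y)), y] >= keep_threshold
                keep_x.append(x[mask])
                keep_y.append(y[mask])
        train_set = TensorDataset(torch.cat(keep_x), torch.cat(keep_y))

    return model, train_set

The healing step pulls the weights back towards clean behaviour before the next
pass over the untrusted data, while the selection step progressively removes
samples the healed model no longer agrees with; per the abstract, the trusted
healing set needs to be only about 2% to 15% of the full training data.
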
Related papers
- Beating Backdoor Attack at Its Own Game [10.131734154410763]
Deep neural networks (DNNs) are vulnerable to backdoor attacks.
Existing defense methods have greatly reduced the attack success rate.
We propose a highly effective framework which injects non-adversarial backdoors targeting poisoned samples.
arXiv Detail & Related papers (2023-07-28T13:07:42Z) - Backdoor Attack with Sparse and Invisible Trigger [57.41876708712008]
Deep neural networks (DNNs) are vulnerable to backdoor attacks.
The backdoor attack is an emerging and serious training-phase threat.
We propose a sparse and invisible backdoor attack (SIBA).
arXiv Detail & Related papers (2023-05-11T10:05:57Z) - Backdoor Defense via Adaptively Splitting Poisoned Dataset [57.70673801469096]
Backdoor defenses have been studied to alleviate the threat of deep neural networks (DNNs) being backdoored and maliciously altered.
We argue that the core of training-time defense is to select poisoned samples and to handle them properly.
Under our framework, we propose an adaptively splitting dataset-based defense (ASD).
arXiv Detail & Related papers (2023-03-23T02:16:38Z) - Trap and Replace: Defending Backdoor Attacks by Trapping Them into an
Easy-to-Replace Subnetwork [105.0735256031911]
Deep neural networks (DNNs) are vulnerable to backdoor attacks.
We propose a brand-new backdoor defense strategy, which makes it much easier to remove the harmful influence of backdoor samples.
We evaluate our method against ten different backdoor attacks.
arXiv Detail & Related papers (2022-10-12T17:24:01Z) - Imperceptible and Multi-channel Backdoor Attack against Deep Neural
Networks [9.931056642574454]
We propose a novel imperceptible and multi-channel backdoor attack against Deep Neural Networks.
Specifically, for a colored image, we utilize DCT steganography to construct the trigger on different channels of the image (an illustrative sketch of this idea appears after this list).
Experimental results demonstrate that the average attack success rate of the N-to-N backdoor attack is 93.95% on the CIFAR-10 dataset and 91.55% on the TinyImageNet dataset.
arXiv Detail & Related papers (2022-01-31T12:19:28Z) - Test-Time Detection of Backdoor Triggers for Poisoned Deep Neural
Networks [24.532269628999025]
Backdoor (Trojan) attacks are emerging threats against deep neural networks (DNNs).
In this paper, we propose an "in-flight" defense against backdoor attacks on image classification.
arXiv Detail & Related papers (2021-12-06T20:52:00Z) - ONION: A Simple and Effective Defense Against Textual Backdoor Attacks [91.83014758036575]
Backdoor attacks are an emergent training-time threat to deep neural networks (DNNs).
In this paper, we propose a simple and effective textual backdoor defense named ONION.
Experiments demonstrate the effectiveness of our model in defending BiLSTM and BERT against five different backdoor attacks.
arXiv Detail & Related papers (2020-11-20T12:17:21Z) - Backdoor Attacks to Graph Neural Networks [73.56867080030091]
We propose the first backdoor attack on graph neural networks (GNNs).
In our backdoor attack, a GNN predicts an attacker-chosen target label for a testing graph once a predefined subgraph is injected into the testing graph.
Our empirical results show that our backdoor attacks are effective with a small impact on a GNN's prediction accuracy for clean testing graphs.
arXiv Detail & Related papers (2020-06-19T14:51:01Z) - BadNL: Backdoor Attacks against NLP Models with Semantic-preserving
Improvements [33.309299864983295]
We propose BadNL, a general NLP backdoor attack framework including novel attack methods.
Our attacks achieve an almost perfect attack success rate with a negligible effect on the original model's utility.
arXiv Detail & Related papers (2020-06-01T16:17:14Z) - Defending against Backdoor Attack on Deep Neural Networks [98.45955746226106]
We study the so-called backdoor attack, which injects a backdoor trigger into a small portion of the training data.
Experiments show that our method can effectively decrease the attack success rate while maintaining high classification accuracy on clean images.
arXiv Detail & Related papers (2020-02-26T02:03:00Z)