SATBA: An Invisible Backdoor Attack Based On Spatial Attention
- URL: http://arxiv.org/abs/2302.13056v3
- Date: Tue, 5 Mar 2024 05:10:04 GMT
- Title: SATBA: An Invisible Backdoor Attack Based On Spatial Attention
- Authors: Huasong Zhou, Xiaowei Xu, Xiaodong Wang, and Leon Bevan Bullock
- Abstract summary: Backdoor attacks involve training Deep Neural Networks (DNNs) on datasets that contain hidden trigger patterns.
Most existing backdoor attacks suffer from two significant drawbacks: their trigger patterns are visible and easy for backdoor defenses or even human inspection to detect, and their injection process degrades natural sample features and trigger patterns.
We propose a novel backdoor attack named SATBA that overcomes these limitations using spatial attention and a U-Net-based model.
- Score: 7.405457329942725
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Backdoor attacks have emerged as a novel and concerning threat to AI
security. These attacks involve training Deep Neural Networks (DNNs) on datasets
that contain hidden trigger patterns. Although a poisoned model behaves normally
on benign samples, it exhibits abnormal behavior on samples containing the
trigger pattern. However, most existing backdoor attacks suffer from two
significant drawbacks: their trigger patterns are visible and easy for backdoor
defenses or even human inspection to detect, and their injection process causes
the loss of natural sample features and trigger patterns, thereby reducing the
attack success rate and model accuracy. In this paper, we propose a novel
backdoor attack named SATBA that overcomes these limitations using spatial
attention and a U-Net-based model. The attack process begins by using spatial
attention to extract meaningful data features and generate trigger patterns
associated with clean images. Then, a U-shaped model is used to embed these
trigger patterns into the original data without causing noticeable feature
loss. We evaluate our attack on three prominent image classification DNNs
across three standard datasets. The results demonstrate that SATBA achieves a
high attack success rate while maintaining robustness against backdoor
defenses. Furthermore, we conduct extensive image similarity experiments to
emphasize the stealthiness of our attack strategy. Overall, SATBA presents a
promising approach to backdoor attacks, addressing the shortcomings of previous
methods and showcasing its effectiveness in evading detection while maintaining
a high attack success rate.
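The abstract only sketches the pipeline in prose: a spatial attention map derived from a clean image's feature activations shapes a per-image trigger, and a small U-shaped network embeds that trigger into the image with minimal visible change. The paper's actual architecture and training losses are not reproduced here; the snippet below is a minimal, hypothetical PyTorch sketch of that idea, where `spatial_attention`, `TriggerUNet`, the backbone choice, and the blending strength are all assumptions rather than the authors' implementation.

```python
# Hypothetical sketch of spatial-attention-guided trigger embedding
# (illustrative only; not the SATBA authors' code or architecture).
import torch
import torch.nn as nn
import torch.nn.functional as F
from torchvision.models import resnet18

def spatial_attention(feats: torch.Tensor) -> torch.Tensor:
    """Collapse conv feature maps (B, C, h, w) into (B, 1, h, w) attention
    maps by averaging absolute activations over channels, scaled to [0, 1]."""
    attn = feats.abs().mean(dim=1, keepdim=True)
    attn = attn - attn.amin(dim=(2, 3), keepdim=True)
    return attn / (attn.amax(dim=(2, 3), keepdim=True) + 1e-8)

class TriggerUNet(nn.Module):
    """Tiny encoder-decoder that blends an attention-weighted trigger into
    the clean image; a stand-in for the paper's U-shaped embedding model."""
    def __init__(self):
        super().__init__()
        self.enc = nn.Sequential(
            nn.Conv2d(4, 16, 3, padding=1), nn.ReLU(),
            nn.Conv2d(16, 32, 3, stride=2, padding=1), nn.ReLU())
        self.dec = nn.Sequential(
            nn.ConvTranspose2d(32, 16, 4, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(16, 3, 3, padding=1), nn.Tanh())

    def forward(self, image, attn_trigger):
        x = torch.cat([image, attn_trigger], dim=1)      # (B, 4, H, W)
        residual = self.dec(self.enc(x))                 # bounded by Tanh
        return torch.clamp(image + 0.05 * residual, 0.0, 1.0)

# Usage sketch: derive per-image attention from a backbone's intermediate
# features (a backbone trained on the clean task would be used in practice),
# modulate a base trigger pattern with it, and embed the result.
backbone = nn.Sequential(*list(resnet18().children())[:-4])
images = torch.rand(8, 3, 32, 32)                        # stand-in clean batch
attn = F.interpolate(spatial_attention(backbone(images)),
                     size=images.shape[-2:], mode="bilinear")
trigger = torch.rand(1, 1, *images.shape[-2:])           # hypothetical base trigger
poisoned = TriggerUNet()(images, attn * trigger)
```

Because the attention map is computed per image, the embedded perturbation differs from sample to sample, which is consistent with the abstract's claim that the trigger patterns are "associated with clean images" rather than being a single fixed patch.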
Related papers
- An Invisible Backdoor Attack Based On Semantic Feature [0.0]
Backdoor attacks have severely threatened deep neural network (DNN) models in the past several years.
We propose a novel backdoor attack that makes imperceptible changes.
We evaluate our attack on three prominent image classification datasets.
arXiv Detail & Related papers (2024-05-19T13:50:40Z)
- Backdoor Attack against One-Class Sequential Anomaly Detection Models [10.020488631167204]
We explore compromising deep sequential anomaly detection models by proposing a novel backdoor attack strategy.
The attack approach comprises two primary steps: trigger generation and backdoor injection.
Experiments demonstrate the effectiveness of our proposed attack strategy by injecting backdoors into two well-established one-class anomaly detection models.
arXiv Detail & Related papers (2024-02-15T19:19:54Z)
- Attention-Enhancing Backdoor Attacks Against BERT-based Models [54.070555070629105]
Investigating the strategies of backdoor attacks will help to understand the model's vulnerability.
We propose a novel Trojan Attention Loss (TAL) which enhances the Trojan behavior by directly manipulating the attention patterns.
arXiv Detail & Related papers (2023-10-23T01:24:56Z)
- Confidence-driven Sampling for Backdoor Attacks [49.72680157684523]
Backdoor attacks aim to surreptitiously insert malicious triggers into DNN models, granting unauthorized control during testing scenarios.
Existing methods lack robustness against defense strategies and predominantly focus on enhancing trigger stealthiness while randomly selecting poisoned samples.
We introduce a straightforward yet highly effective sampling methodology that leverages confidence scores: it selects samples with lower confidence scores, significantly increasing the challenge for defenders in identifying and countering these attacks (a minimal sketch of this selection step appears after the related-papers list).
arXiv Detail & Related papers (2023-10-08T18:57:36Z)
- Backdoor Attack with Sparse and Invisible Trigger [57.41876708712008]
Deep neural networks (DNNs) are vulnerable to backdoor attacks, an emerging yet serious training-phase threat.
We propose a sparse and invisible backdoor attack (SIBA).
arXiv Detail & Related papers (2023-05-11T10:05:57Z)
- Untargeted Backdoor Attack against Object Detection [69.63097724439886]
We design a poison-only backdoor attack in an untargeted manner, based on task characteristics.
We show that, once the backdoor is embedded into the target model by our attack, it can trick the model into losing detection of any object stamped with our trigger patterns.
arXiv Detail & Related papers (2022-11-02T17:05:45Z)
- BATT: Backdoor Attack with Transformation-based Triggers [72.61840273364311]
Deep neural networks (DNNs) are vulnerable to backdoor attacks.
Backdoor adversaries inject hidden backdoors that can be activated by adversary-specified trigger patterns.
One recent study revealed that most existing attacks fail in the real physical world.
arXiv Detail & Related papers (2022-11-02T16:03:43Z)
- Invisible Backdoor Attacks Using Data Poisoning in the Frequency Domain [8.64369418938889]
We propose a generalized backdoor attack method based on the frequency domain.
It can implant a backdoor without mislabeling or access to the training process (a rough illustration of frequency-domain trigger embedding appears after the related-papers list).
We evaluate our approach in the no-label and clean-label cases on three datasets.
arXiv Detail & Related papers (2022-07-09T07:05:53Z)
- Rethinking the Trigger of Backdoor Attack [83.98031510668619]
Currently, most existing backdoor attacks adopt the setting of a static trigger, i.e., triggers across the training and testing images have the same appearance and are located in the same area.
We demonstrate that such an attack paradigm is vulnerable when the trigger in testing images is not consistent with the one used for training.
arXiv Detail & Related papers (2020-04-09T17:19:37Z)
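Referring back to the confidence-driven sampling entry above: the summary only states the selection criterion, so the sketch below is a rough, hypothetical illustration rather than the cited paper's code. It scores every training sample with a clean surrogate model's softmax confidence on its ground-truth label and returns the indices of the least-confident samples as the poisoning set; the function name `select_low_confidence` and its interface are assumptions.

```python
# Hypothetical sketch of confidence-driven poisoned-sample selection.
import torch
import torch.nn.functional as F

@torch.no_grad()
def select_low_confidence(model, loader, poison_budget, device="cpu"):
    """Return indices of the `poison_budget` least-confident training samples.
    Assumes `loader` iterates the dataset in a fixed order (shuffle=False)."""
    model.eval().to(device)
    scores, indices, offset = [], [], 0
    for images, labels in loader:
        probs = F.softmax(model(images.to(device)), dim=1)
        # Confidence = probability the clean model assigns to the true class.
        conf = probs[torch.arange(len(labels), device=device), labels.to(device)]
        scores.append(conf.cpu())
        indices.append(torch.arange(offset, offset + len(labels)))
        offset += len(labels)
    order = torch.argsort(torch.cat(scores))   # ascending: least confident first
    return torch.cat(indices)[order[:poison_budget]].tolist()
```

The returned indices would then mark the samples that receive the trigger during poisoning.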
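Similarly, for the frequency-domain entry above, the following is a rough illustration (under assumptions, not the cited paper's method) of embedding a trigger by perturbing a band of an image's 2-D Fourier spectrum and transforming back to pixel space; the band location, trigger shape, and blending strength are arbitrary choices.

```python
# Hypothetical frequency-domain trigger embedding (illustrative only).
import numpy as np

def poison_in_frequency(image: np.ndarray, strength: float = 0.02,
                        seed: int = 0) -> np.ndarray:
    """image: float array in [0, 1] with shape (H, W, C); returns a poisoned copy."""
    rng = np.random.default_rng(seed)             # fixed seed -> reproducible trigger
    spectrum = np.fft.fftshift(np.fft.fft2(image, axes=(0, 1)), axes=(0, 1))
    h, w = image.shape[:2]
    band = np.zeros((h, w, 1))
    band[h // 4: 3 * h // 4, w // 4: 3 * w // 4] = 1.0   # central block of the shifted spectrum
    trigger = rng.standard_normal(image.shape) * band    # hypothetical spectral trigger
    spectrum = spectrum + strength * np.abs(spectrum).mean() * trigger
    poisoned = np.fft.ifft2(np.fft.ifftshift(spectrum, axes=(0, 1)), axes=(0, 1))
    return np.clip(np.real(poisoned), 0.0, 1.0)          # drop small imaginary residue
```

Because the label is left untouched and the perturbation is spread across frequencies rather than concentrated in a visible patch, this style of embedding fits the entry's clean-label, low-visibility framing, though the actual method's trigger design will differ.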
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of the listed information and is not responsible for any consequences arising from its use.