Imperceptible Backdoor Attack: From Input Space to Feature
Representation
- URL: http://arxiv.org/abs/2205.03190v1
- Date: Fri, 6 May 2022 13:02:26 GMT
- Title: Imperceptible Backdoor Attack: From Input Space to Feature
Representation
- Authors: Nan Zhong, Zhenxing Qian, Xinpeng Zhang
- Abstract summary: Backdoor attacks are rapidly emerging threats to deep neural networks (DNNs).
In this paper, we analyze the drawbacks of existing attack approaches and propose a novel imperceptible backdoor attack.
Our trigger modifies fewer than 1% of the pixels of a benign image, and the modification magnitude is 1.
- Score: 24.82632240825927
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Backdoor attacks are rapidly emerging threats to deep neural networks (DNNs).
In the backdoor attack scenario, attackers usually implant the backdoor into
the target model by manipulating the training dataset or training process.
Then, the compromised model behaves normally for benign input yet makes
mistakes when the pre-defined trigger appears. In this paper, we analyze the
drawbacks of existing attack approaches and propose a novel imperceptible
backdoor attack. We treat the trigger pattern as a special kind of noise
following a multinomial distribution. A U-net-based network is employed to
generate the concrete parameters of the multinomial distribution for each benign input.
This elaborated trigger ensures that our approach is invisible to both humans
and statistical detection. Besides the design of the trigger, we also consider
the robustness of our approach against model-diagnosis-based defences. We force
the feature representation of malicious input stamped with the trigger to be
entangled with the benign one. We demonstrate the effectiveness of our approach
and its robustness against multiple state-of-the-art defences through extensive
experiments on various datasets and networks. Our trigger modifies fewer than
1% of the pixels of a benign image, and the modification magnitude is 1. Our
source code is available at
https://github.com/Ekko-zn/IJCAI2022-Backdoor.
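The sketch below is a minimal, hedged illustration of the idea summarized in the abstract; it is not the authors' implementation (which is available at the repository linked above). It assumes a small convolutional generator as a stand-in for the paper's U-net, models the per-pixel trigger as a categorical (multinomial) distribution over {-1, 0, +1} sampled with Gumbel-softmax, and uses a plain MSE loss as one possible reading of the feature-entanglement objective. The helper names (TriggerGenerator, sample_trigger, entanglement_loss) are hypothetical.

```python
# Minimal sketch, not the authors' code (see https://github.com/Ekko-zn/IJCAI2022-Backdoor).
# Assumptions: a small conv net stands in for the paper's U-net, the trigger value at each
# pixel is drawn from a 3-way categorical over {-1, 0, +1}, and feature entanglement is
# approximated by an MSE loss between benign and poisoned feature representations.
import torch
import torch.nn as nn
import torch.nn.functional as F


class TriggerGenerator(nn.Module):
    """Maps a benign image to per-pixel logits of a categorical distribution
    over the perturbation values {-1, 0, +1} (stand-in for the U-net)."""

    def __init__(self, in_ch: int = 3):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(in_ch, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 3, 3, padding=1),  # 3 logits per pixel: -1 / 0 / +1
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.net(x)  # (B, 3, H, W)


def sample_trigger(logits: torch.Tensor, tau: float = 0.5) -> torch.Tensor:
    """Differentiably sample a perturbation in {-1, 0, +1} per pixel via
    Gumbel-softmax (hard=True yields exact one-hot choices)."""
    one_hot = F.gumbel_softmax(logits.permute(0, 2, 3, 1), tau=tau, hard=True)
    values = torch.tensor([-1.0, 0.0, 1.0], device=logits.device)
    trigger = (one_hot * values).sum(-1, keepdim=True)  # (B, H, W, 1)
    return trigger.permute(0, 3, 1, 2)                  # (B, 1, H, W), broadcast over channels


def entanglement_loss(feat_benign: torch.Tensor, feat_poisoned: torch.Tensor) -> torch.Tensor:
    """One possible reading of 'feature entanglement': pull the features of the
    poisoned input toward those of the corresponding benign input."""
    return F.mse_loss(feat_poisoned, feat_benign)


if __name__ == "__main__":
    gen = TriggerGenerator()
    x = torch.rand(4, 3, 32, 32)                 # benign batch, pixel values in [0, 1]
    trigger = sample_trigger(gen(x))             # magnitude-1 perturbation per pixel
    x_poisoned = ((x * 255 + trigger).clamp(0, 255)) / 255  # apply trigger in 0..255 units
    print("fraction of modified pixels:", (trigger != 0).float().mean().item())
```

Note that the sub-1% sparsity and the robustness to defences reported in the paper would come from training the generator jointly with the poisoned classifier (e.g., under a sparsity penalty and the entanglement objective); the untrained generator sketched here will not exhibit those properties.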
Related papers
- T2IShield: Defending Against Backdoors on Text-to-Image Diffusion Models [70.03122709795122]
We propose a comprehensive defense method named T2IShield to detect, localize, and mitigate backdoor attacks.
We find the "Assimilation Phenomenon" on the cross-attention maps caused by the backdoor trigger.
For backdoor sample detection, T2IShield achieves a detection F1 score of 88.9% with low computational cost.
arXiv Detail & Related papers (2024-07-05T01:53:21Z) - An Invisible Backdoor Attack Based On Semantic Feature [0.0]
Backdoor attacks have severely threatened deep neural network (DNN) models in the past several years.
We propose a novel backdoor attack that makes imperceptible changes.
We evaluate our attack on three prominent image classification datasets.
arXiv Detail & Related papers (2024-05-19T13:50:40Z) - Backdoor Attack with Mode Mixture Latent Modification [26.720292228686446]
We propose a backdoor attack paradigm that only requires minimal alterations to a clean model in order to inject the backdoor under the guise of fine-tuning.
We evaluate the effectiveness of our method on four popular benchmark datasets.
arXiv Detail & Related papers (2024-03-12T09:59:34Z) - Backdoor Attack with Sparse and Invisible Trigger [57.41876708712008]
Deep neural networks (DNNs) are vulnerable to backdoor attacks.
Backdoor attacks are an emerging yet serious training-phase threat.
We propose a sparse and invisible backdoor attack (SIBA).
arXiv Detail & Related papers (2023-05-11T10:05:57Z) - Backdoor Attacks with Input-unique Triggers in NLP [34.98477726215485]
Backdoor attacks aim to induce neural models to make incorrect predictions for poisoned data while keeping predictions on the clean dataset unchanged.
In this paper, we propose an input-unique backdoor attack (NURA), in which we generate backdoor triggers unique to each input.
arXiv Detail & Related papers (2023-03-25T01:41:54Z) - SATBA: An Invisible Backdoor Attack Based On Spatial Attention [7.405457329942725]
Backdoor attacks involve training a deep neural network (DNN) on datasets that contain hidden trigger patterns.
Most existing backdoor attacks suffer from a significant drawback: their trigger patterns are visible and easy to detect by backdoor defenses or even human inspection.
We propose a novel backdoor attack named SATBA that overcomes these limitations using spatial attention and a U-net-based model.
arXiv Detail & Related papers (2023-02-25T10:57:41Z) - Untargeted Backdoor Attack against Object Detection [69.63097724439886]
We design a poison-only backdoor attack in an untargeted manner, based on task characteristics.
We show that, once the backdoor is embedded into the target model by our attack, it can trick the model into losing detection of any object stamped with our trigger patterns.
arXiv Detail & Related papers (2022-11-02T17:05:45Z) - Black-box Detection of Backdoor Attacks with Limited Information and
Data [56.0735480850555]
We propose a black-box backdoor detection (B3D) method to identify backdoor attacks with only query access to the model.
In addition to backdoor detection, we also propose a simple strategy for reliable predictions using the identified backdoored models.
arXiv Detail & Related papers (2021-03-24T12:06:40Z) - Hidden Backdoor Attack against Semantic Segmentation Models [60.0327238844584]
The backdoor attack intends to embed hidden backdoors in deep neural networks (DNNs) by poisoning training data.
We propose a novel attack paradigm, the fine-grained attack, where we treat the target label at the object level instead of the image level.
Experiments show that the proposed methods can successfully attack semantic segmentation models by poisoning only a small proportion of training data.
arXiv Detail & Related papers (2021-03-06T05:50:29Z) - Rethinking the Trigger of Backdoor Attack [83.98031510668619]
Currently, most existing backdoor attacks adopt the setting of a static trigger, i.e., triggers across the training and testing images follow the same appearance and are located in the same area.
We demonstrate that such an attack paradigm is vulnerable when the trigger in testing images is not consistent with the one used for training.
arXiv Detail & Related papers (2020-04-09T17:19:37Z)