PatchBackdoor: Backdoor Attack against Deep Neural Networks without Model Modification
- URL: http://arxiv.org/abs/2308.11822v1
- Date: Tue, 22 Aug 2023 23:02:06 GMT
- Title: PatchBackdoor: Backdoor Attack against Deep Neural Networks without Model Modification
- Authors: Yizhen Yuan (1), Rui Kong (3), Shenghao Xie (4), Yuanchun Li (1 and
2), Yunxin Liu (1 and 2) ((1) Institute for AI Industry Research (AIR),
Tsinghua University, Beijing, China, (2) Shanghai AI Laboratory, Shanghai,
China, (3) Shanghai Jiao Tong University, Shanghai, China, (4) Wuhan
University, Wuhan, China)
- Abstract summary: Backdoor attacks are a major threat to deep learning systems in safety-critical scenarios.
In this paper, we show that backdoor attacks can be achieved without any model modification.
We implement PatchBackdoor in real-world scenarios and show that the attack remains threatening.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Backdoor attacks are a major threat to deep learning systems in safety-critical
scenarios: they aim to trigger misbehavior of neural network models under
attacker-controlled conditions. However, most backdoor attacks have to modify
the neural network model through training with poisoned data and/or direct
model editing, which leads to a common but false belief that backdoor attacks
can be easily avoided by properly protecting the model. In this paper, we show
that backdoor attacks can be achieved without any model modification. Instead
of injecting backdoor logic into the training data or the model, we propose to
place a carefully designed patch (namely the backdoor patch) in front of the
camera, which is fed into the model together with the input images. The patch
can be trained to behave normally most of the time, while producing wrong
predictions when the input image contains an attacker-controlled trigger object.
Our main techniques include an effective training method to generate the
backdoor patch and a digital-physical transformation modeling method to enhance
the feasibility of the patch in real deployments. Extensive experiments show
that PatchBackdoor can be applied to common deep learning models (VGG,
MobileNet, ResNet) with an attack success rate of 93% to 99% on classification
tasks. Moreover, we implement PatchBackdoor in real-world scenarios and show
that the attack remains threatening.
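
The core idea above can be illustrated with a minimal training sketch: the backdoor patch is a trainable tensor composited onto every camera frame, and it is optimized against a frozen victim model so that patched clean images keep their correct labels while patched images containing the trigger object are pushed toward an attacker-chosen target class. The compositing mask, the paste-in trigger placement, and names such as apply_patch, paste_trigger, and TARGET_CLASS are illustrative assumptions; the paper's exact patch training and digital-physical transformation modeling are not reproduced here.

```python
# Minimal sketch of backdoor-patch training under stated assumptions:
# the victim classifier is frozen, the patch occupies a fixed border strip
# of the frame, and the trigger is pasted into a fixed corner.
import torch
import torch.nn.functional as F
import torchvision

TARGET_CLASS = 0  # attacker-chosen target label (illustrative assumption)

model = torchvision.models.resnet18(weights=None)  # in practice, the victim model's own weights
model.eval()
for p in model.parameters():
    p.requires_grad_(False)  # the victim model is never modified

patch = torch.zeros(3, 224, 224, requires_grad=True)   # trainable patch pixels
mask = torch.zeros(1, 224, 224)
mask[:, :40, :] = 1.0                                   # patch covers the top border strip
opt = torch.optim.Adam([patch], lr=0.01)

def apply_patch(x):
    """Composite the (sigmoid-bounded) patch onto the masked region of a batch."""
    return x * (1 - mask) + torch.sigmoid(patch) * mask

def paste_trigger(x, trigger):
    """Place a small trigger object in a fixed corner of each image."""
    x = x.clone()
    x[:, :, -50:, -50:] = trigger                       # trigger: tensor of shape (3, 50, 50)
    return x

def train_step(images, labels, trigger):
    # Clean behavior: patched images should still be classified correctly.
    clean_logits = model(apply_patch(images))
    # Backdoor behavior: patched images with the trigger go to TARGET_CLASS.
    trig_logits = model(apply_patch(paste_trigger(images, trigger)))
    loss = F.cross_entropy(clean_logits, labels) + \
           F.cross_entropy(trig_logits, torch.full_like(labels, TARGET_CLASS))
    opt.zero_grad()
    loss.backward()
    opt.step()
    return loss.item()
```

Only the patch pixels receive gradients, so the deployed model remains unmodified, consistent with the paper's claim that protecting the model alone does not prevent this attack.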
Related papers
- Expose Before You Defend: Unifying and Enhancing Backdoor Defenses via Exposed Models [68.40324627475499]
We introduce a novel two-step defense framework named Expose Before You Defend.
EBYD unifies existing backdoor defense methods into a comprehensive defense system with enhanced performance.
We conduct extensive experiments on 10 image attacks and 6 text attacks across 2 vision datasets and 4 language datasets.
arXiv Detail & Related papers (2024-10-25T09:36:04Z)
- Exploiting the Vulnerability of Large Language Models via Defense-Aware Architectural Backdoor [0.24335447922683692]
We introduce a new type of backdoor attack that conceals itself within the underlying model architecture.
Add-on modules inserted into the model's architecture layers can detect the presence of input trigger tokens and modify layer weights.
We conduct extensive experiments to evaluate our attack methods using two model architecture settings on five different large language datasets.
arXiv Detail & Related papers (2024-09-03T14:54:16Z)
- Mitigating Backdoor Attack by Injecting Proactive Defensive Backdoor [63.84477483795964]
Data-poisoning backdoor attacks are serious security threats to machine learning models.
In this paper, we focus on in-training backdoor defense, aiming to train a clean model even when the dataset may be potentially poisoned.
We propose a novel defense approach called PDB (Proactive Defensive Backdoor).
arXiv Detail & Related papers (2024-05-25T07:52:26Z)
- DECK: Model Hardening for Defending Pervasive Backdoors [21.163501644177668]
Pervasive backdoors are triggered by dynamic and pervasive input perturbations.
We develop a general pervasive attack based on an encoder-decoder architecture enhanced with a special transformation layer.
Our technique can enlarge class distances by 59.65% on average with less than 1% accuracy degradation and no loss.
arXiv Detail & Related papers (2022-06-18T19:46:06Z)
- Check Your Other Door! Establishing Backdoor Attacks in the Frequency Domain [80.24811082454367]
We show the advantages of utilizing the frequency domain for establishing undetectable and powerful backdoor attacks.
We also show two possible defences that succeed against frequency-based backdoor attacks and possible ways for the attacker to bypass them.
arXiv Detail & Related papers (2021-09-12T12:44:52Z)
- Backdoor Attack in the Physical World [49.64799477792172]
Backdoor attacks intend to inject a hidden backdoor into deep neural networks (DNNs).
Most existing backdoor attacks adopt the static-trigger setting, i.e., the trigger is the same across the training and testing images.
We demonstrate that this attack paradigm is vulnerable when the trigger in testing images is not consistent with the one used for training.
arXiv Detail & Related papers (2021-04-06T08:37:33Z)
- Black-box Detection of Backdoor Attacks with Limited Information and Data [56.0735480850555]
We propose a black-box backdoor detection (B3D) method to identify backdoor attacks with only query access to the model.
In addition to backdoor detection, we also propose a simple strategy for reliable predictions using the identified backdoored models.
arXiv Detail & Related papers (2021-03-24T12:06:40Z)
- Blind Backdoors in Deep Learning Models [22.844973592524966]
We investigate a new method for injecting backdoors into machine learning models, based on compromising the loss-value computation in the model-training code.
We use it to demonstrate new classes of backdoors strictly more powerful than those in the prior literature.
Our attack is blind: the attacker cannot modify the training data, nor observe the execution of his code, nor access the resulting model.
arXiv Detail & Related papers (2020-05-08T02:15:53Z)
- Clean-Label Backdoor Attacks on Video Recognition Models [87.46539956587908]
We show that image backdoor attacks are far less effective on videos.
We propose the use of a universal adversarial trigger as the backdoor trigger to attack video recognition models.
Our proposed backdoor attack is resistant to state-of-the-art backdoor defense/detection methods.
arXiv Detail & Related papers (2020-03-06T04:51:48Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information it contains and is not responsible for any consequences.