DeepPayload: Black-box Backdoor Attack on Deep Learning Models through
Neural Payload Injection
- URL: http://arxiv.org/abs/2101.06896v1
- Date: Mon, 18 Jan 2021 06:29:30 GMT
- Title: DeepPayload: Black-box Backdoor Attack on Deep Learning Models through
Neural Payload Injection
- Authors: Yuanchun Li, Jiayi Hua, Haoyu Wang, Chunyang Chen, Yunxin Liu
- Abstract summary: We introduce a highly practical backdoor attack achieved with a set of reverse-engineering techniques over compiled deep learning models.
The injected backdoor can be triggered with a success rate of 93.5%, while bringing less than 2ms of latency overhead and no more than a 1.4% accuracy decrease.
We found 54 apps that were vulnerable to our attack, including popular and security-critical ones.
- Score: 17.136757440204722
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Deep learning models are increasingly used in mobile applications as critical
components. Unlike program bytecode, whose vulnerabilities and threats have been
widely discussed, whether and how the deep learning models deployed in these
applications can be compromised is not well understood, since neural networks
are usually viewed as black boxes. In this paper, we introduce a highly
practical backdoor attack achieved with a set of reverse-engineering techniques
over compiled deep learning models. The core of the attack is a neural
conditional branch constructed with a trigger detector and several operators
and injected into the victim model as a malicious payload. The attack is
effective as the conditional logic can be flexibly customized by the attacker,
and scalable as it does not require any prior knowledge from the original
model. We evaluated the attack effectiveness using 5 state-of-the-art deep
learning models and real-world samples collected from 30 users. The results
demonstrated that the injected backdoor can be triggered with a success rate of
93.5%, while bringing less than 2ms of latency overhead and no more than a 1.4%
accuracy decrease. We further conducted an empirical study on real-world mobile
deep learning apps collected from Google Play. We found 54 apps that were
vulnerable to our attack, including popular and security-critical ones. The
results call for the awareness of deep learning application developers and
auditors to enhance the protection of deployed models.
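The mechanism described in the abstract, a trigger detector whose output gates a conditional branch over the victim model's predictions, can be illustrated with a short sketch. The code below is a minimal, hypothetical reconstruction in TensorFlow/Keras; the function name, model handles, and the assumption that the victim is a Keras classifier are ours, and the actual attack in the paper operates on compiled model files rather than source-level Keras models.

```python
import tensorflow as tf

def inject_payload(victim_model, trigger_detector, target_class, num_classes):
    """Illustrative sketch: wrap a victim classifier with a neural
    conditional branch so that inputs containing the trigger are
    redirected to an attacker-chosen class.

    Assumes both arguments are Keras models with the same input shape and
    that trigger_detector outputs one probability in [0, 1] per sample.
    """
    inputs = tf.keras.Input(shape=victim_model.input_shape[1:])

    benign_out = victim_model(inputs)   # original predictions, (batch, num_classes)
    t = trigger_detector(inputs)        # trigger probability, (batch, 1)

    # Attacker-chosen output: a constant one-hot vector for the target
    # class, broadcast across the batch dimension.
    target_out = tf.one_hot(target_class, depth=num_classes)

    # Conditional branch expressed with plain tensor operators:
    # behave like the victim model unless the trigger is detected.
    mixed = t * target_out + (1.0 - t) * benign_out
    return tf.keras.Model(inputs, mixed)
```

Because the branch is built only from ordinary tensor operators, it can be appended to a deployed model without retraining it or knowing anything about its training data, which is what makes the attack black-box and scalable.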
Related papers
- Efficient Backdoor Defense in Multimodal Contrastive Learning: A Token-Level Unlearning Method for Mitigating Threats [52.94388672185062]
We propose an efficient defense mechanism against backdoor threats using a concept known as machine unlearning.
This entails strategically creating a small set of poisoned samples to aid the model's rapid unlearning of backdoor vulnerabilities.
In the backdoor unlearning process, we present a novel token-based portion unlearning training regime.
arXiv Detail & Related papers (2024-09-29T02:55:38Z)
- Hijacking Attacks against Neural Networks by Analyzing Training Data [21.277867143827812]
CleanSheet is a new model hijacking attack that obtains the high performance of backdoor attacks without requiring the adversary to train the model.
CleanSheet exploits tampering vulnerabilities stemming from the training data.
Results show that CleanSheet achieves performance comparable to state-of-the-art backdoor attacks, with an average attack success rate (ASR) of 97.5% on CIFAR-100 and 92.4% on GTSRB.
arXiv Detail & Related papers (2024-01-18T05:48:56Z)
- Exploiting Machine Unlearning for Backdoor Attacks in Deep Learning System [4.9233610638625604]
We propose a novel black-box backdoor attack based on machine unlearning.
The attacker first augments the training set with carefully designed samples, including poison and mitigation data, to train a 'benign' model.
Then, the attacker posts unlearning requests for the mitigation samples to remove the impact of relevant data on the model, gradually activating the hidden backdoor.
arXiv Detail & Related papers (2023-09-12T02:42:39Z)
- Fault Injection and Safe-Error Attack for Extraction of Embedded Neural Network Models [1.3654846342364308]
We focus on embedded deep neural network models on 32-bit microcontrollers in the Internet of Things (IoT).
We propose a black-box approach to craft a successful attack set.
For a classical convolutional neural network, we successfully recover at least 90% of the most significant bits with about 1500 crafted inputs.
arXiv Detail & Related papers (2023-08-31T13:09:33Z)
- Isolation and Induction: Training Robust Deep Neural Networks against Model Stealing Attacks [51.51023951695014]
Existing model stealing defenses add deceptive perturbations to the victim's posterior probabilities to mislead the attackers.
This paper proposes Isolation and Induction (InI), a novel and effective training framework for model stealing defenses.
In contrast to adding perturbations over model predictions that harm the benign accuracy, we train models to produce uninformative outputs against stealing queries.
arXiv Detail & Related papers (2023-08-02T05:54:01Z)
- Backdoor Attack with Sparse and Invisible Trigger [57.41876708712008]
Deep neural networks (DNNs) are vulnerable to backdoor attacks, an emerging yet serious training-phase threat.
We propose a sparse and invisible backdoor attack (SIBA).
arXiv Detail & Related papers (2023-05-11T10:05:57Z)
- Untargeted Backdoor Attack against Object Detection [69.63097724439886]
We design a poison-only backdoor attack in an untargeted manner, based on task characteristics.
We show that, once the backdoor is embedded into the target model by our attack, it can trick the model to lose detection of any object stamped with our trigger patterns.
arXiv Detail & Related papers (2022-11-02T17:05:45Z)
- Backdoor Defense via Suppressing Model Shortcuts [91.30995749139012]
In this paper, we explore the backdoor mechanism from the angle of the model structure.
We demonstrate that the attack success rate (ASR) decreases significantly when reducing the outputs of some key skip connections.
arXiv Detail & Related papers (2022-11-02T15:39:19Z)
- Smart App Attack: Hacking Deep Learning Models in Android Apps [16.663345577900813]
We introduce a grey-box adversarial attack framework to hack on-device models.
We evaluate the attack effectiveness and generality in terms of four different settings.
Among 53 apps adopting transfer learning, we find that 71.7% of them can be successfully attacked.
arXiv Detail & Related papers (2022-04-23T14:01:59Z)
- Backdoor Attacks on Self-Supervised Learning [22.24046752858929]
We show that self-supervised learning methods are vulnerable to backdoor attacks.
An attacker poisons a part of the unlabeled data by adding a small trigger (known to the attacker) to the images; see the sketch after this list.
We propose a knowledge distillation based defense algorithm that succeeds in neutralizing the attack.
arXiv Detail & Related papers (2021-05-21T04:22:05Z)
- Black-box Detection of Backdoor Attacks with Limited Information and Data [56.0735480850555]
We propose a black-box backdoor detection (B3D) method to identify backdoor attacks with only query access to the model.
In addition to backdoor detection, we also propose a simple strategy for reliable predictions using the identified backdoored models.
arXiv Detail & Related papers (2021-03-24T12:06:40Z)
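Several of the entries above (the sparse-trigger, untargeted object-detection, and self-supervised learning items) rely on the same primitive: stamping a small attacker-known trigger patch onto training images. The following is a minimal, hypothetical NumPy sketch of that primitive; the function name and the white-square patch are illustrative and not taken from any of the listed papers.

```python
import numpy as np

def stamp_trigger(images, trigger, x=0, y=0):
    """Paste a small attacker-known trigger patch onto a batch of images.

    images: float array of shape (N, H, W, C); trigger: (h, w, C);
    (x, y) is the top-left corner where the patch is placed."""
    poisoned = images.copy()
    h, w = trigger.shape[:2]
    poisoned[:, y:y + h, x:x + w, :] = trigger  # overwrite a small region
    return poisoned

# Example: stamp a 6x6 white square into the top-left corner of a batch.
batch = np.random.rand(8, 32, 32, 3).astype(np.float32)
white_patch = np.ones((6, 6, 3), dtype=np.float32)
poisoned_batch = stamp_trigger(batch, white_patch)
```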
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.