DeepPayload: Black-box Backdoor Attack on Deep Learning Models through
Neural Payload Injection
- URL: http://arxiv.org/abs/2101.06896v1
- Date: Mon, 18 Jan 2021 06:29:30 GMT
- Title: DeepPayload: Black-box Backdoor Attack on Deep Learning Models through
Neural Payload Injection
- Authors: Yuanchun Li, Jiayi Hua, Haoyu Wang, Chunyang Chen, Yunxin Liu
- Abstract summary: We introduce a highly practical backdoor attack achieved with a set of reverse-engineering techniques over compiled deep learning models.
The injected backdoor can be triggered with a success rate of 93.5%, while bringing less than 2ms of latency overhead and no more than a 1.4% accuracy decrease.
We found 54 apps that were vulnerable to our attack, including popular and security-critical ones.
- Score: 17.136757440204722
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Deep learning models are increasingly used in mobile applications as critical
components. Unlike program bytecode, whose vulnerabilities and threats have been
widely discussed, whether and how the deep learning models deployed in these
applications can be compromised is not well understood, since neural networks
are usually viewed as black boxes. In this paper, we introduce a highly
practical backdoor attack achieved with a set of reverse-engineering techniques
over compiled deep learning models. The core of the attack is a neural
conditional branch constructed with a trigger detector and several operators
and injected into the victim model as a malicious payload. The attack is
effective as the conditional logic can be flexibly customized by the attacker,
and scalable as it does not require any prior knowledge from the original
model. We evaluated the attack effectiveness using 5 state-of-the-art deep
learning models and real-world samples collected from 30 users. The results
demonstrated that the injected backdoor can be triggered with a success rate of
93.5%, while bringing less than 2ms of latency overhead and no more than a 1.4%
accuracy decrease. We further conducted an empirical study on real-world mobile
deep learning apps collected from Google Play. We found 54 apps that were
vulnerable to our attack, including popular and security-critical ones. The
results call for the awareness of deep learning application developers and
auditors to enhance the protection of deployed models.
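The mechanism described in the abstract, a trigger detector whose output gates a conditional branch over the victim model's predictions, can be illustrated with a short sketch. The code below is a minimal, hypothetical reconstruction in TensorFlow/Keras; the function name, model handles, and the assumption that the victim is a Keras classifier are ours, and the actual attack in the paper operates on compiled model files rather than source-level Keras models.

```python
import tensorflow as tf

def inject_payload(victim_model, trigger_detector, target_class, num_classes):
    """Illustrative sketch: wrap a victim classifier with a neural
    conditional branch so that inputs containing the trigger are
    redirected to an attacker-chosen class.

    Assumes both arguments are Keras models with the same input shape and
    that trigger_detector outputs one probability in [0, 1] per sample.
    """
    inputs = tf.keras.Input(shape=victim_model.input_shape[1:])

    benign_out = victim_model(inputs)   # original predictions, (batch, num_classes)
    t = trigger_detector(inputs)        # trigger probability, (batch, 1)

    # Attacker-chosen output: a constant one-hot vector for the target
    # class, broadcast across the batch dimension.
    target_out = tf.one_hot(target_class, depth=num_classes)

    # Conditional branch expressed with plain tensor operators:
    # behave like the victim model unless the trigger is detected.
    mixed = t * target_out + (1.0 - t) * benign_out
    return tf.keras.Model(inputs, mixed)
```

Because the branch is built only from ordinary tensor operators, it can be appended to a deployed model without retraining it or knowing anything about its training data, which is what makes the attack black-box and scalable.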
Related papers
- Efficient Backdoor Defense in Multimodal Contrastive Learning: A Token-Level Unlearning Method for Mitigating Threats [52.94388672185062]
We propose an efficient defense mechanism against backdoor threats using a concept known as machine unlearning.
This entails strategically creating a small set of poisoned samples to aid the model's rapid unlearning of backdoor vulnerabilities.
In the backdoor unlearning process, we present a novel token-based portion unlearning training regime.
arXiv Detail & Related papers (2024-09-29T02:55:38Z)
- Hijacking Attacks against Neural Networks by Analyzing Training Data [21.277867143827812]
CleanSheet is a new model hijacking attack that obtains the high performance of backdoor attacks without requiring the adversary to train the model.
CleanSheet exploits tampering vulnerabilities stemming from the training data.
Results show that CleanSheet achieves performance comparable to state-of-the-art backdoor attacks, with an average attack success rate (ASR) of 97.5% on CIFAR-100 and 92.4% on GTSRB.
arXiv Detail & Related papers (2024-01-18T05:48:56Z)
- Exploiting Machine Unlearning for Backdoor Attacks in Deep Learning System [4.9233610638625604]
We propose a novel black-box backdoor attack based on machine unlearning.
The attacker first augments the training set with carefully designed samples, including poison and mitigation data, to train a 'benign' model.
Then, the attacker posts unlearning requests for the mitigation samples to remove the impact of relevant data on the model, gradually activating the hidden backdoor.
arXiv Detail & Related papers (2023-09-12T02:42:39Z)
- Fault Injection and Safe-Error Attack for Extraction of Embedded Neural Network Models [1.3654846342364308]
We focus on embedded deep neural network models on 32-bit microcontrollers in the Internet of Things (IoT).
We propose a black-box approach to craft a successful attack set.
For a classical convolutional neural network, we successfully recover at least 90% of the most significant bits with about 1500 crafted inputs.
arXiv Detail & Related papers (2023-08-31T13:09:33Z)
- Isolation and Induction: Training Robust Deep Neural Networks against Model Stealing Attacks [51.51023951695014]
Existing model stealing defenses add deceptive perturbations to the victim's posterior probabilities to mislead the attackers.
This paper proposes Isolation and Induction (InI), a novel and effective training framework for model stealing defenses.
In contrast to adding perturbations over model predictions that harm the benign accuracy, we train models to produce uninformative outputs against stealing queries.
arXiv Detail & Related papers (2023-08-02T05:54:01Z)
- Backdoor Attack with Sparse and Invisible Trigger [57.41876708712008]
Deep neural networks (DNNs) are vulnerable to backdoor attacks, an emerging yet serious training-phase threat.
We propose a sparse and invisible backdoor attack (SIBA).
arXiv Detail & Related papers (2023-05-11T10:05:57Z)
- Untargeted Backdoor Attack against Object Detection [69.63097724439886]
We design a poison-only backdoor attack in an untargeted manner, based on task characteristics.
We show that, once the backdoor is embedded into the target model by our attack, it can trick the model to lose detection of any object stamped with our trigger patterns.
arXiv Detail & Related papers (2022-11-02T17:05:45Z)
- Backdoor Defense via Suppressing Model Shortcuts [91.30995749139012]
In this paper, we explore the backdoor mechanism from the angle of the model structure.
We demonstrate that the attack success rate (ASR) decreases significantly when reducing the outputs of some key skip connections.
arXiv Detail & Related papers (2022-11-02T15:39:19Z)
- Smart App Attack: Hacking Deep Learning Models in Android Apps [16.663345577900813]
We introduce a grey-box adversarial attack framework to hack on-device models.
We evaluate the attack effectiveness and generality in terms of four different settings.
Among 53 apps adopting transfer learning, we find that 71.7% of them can be successfully attacked.
arXiv Detail & Related papers (2022-04-23T14:01:59Z)
- Backdoor Attacks on Self-Supervised Learning [22.24046752858929]
We show that self-supervised learning methods are vulnerable to backdoor attacks.
An attacker poisons a part of the unlabeled data by adding a small trigger (known to the attacker) to the images; see the sketch after this list.
We propose a knowledge distillation based defense algorithm that succeeds in neutralizing the attack.
arXiv Detail & Related papers (2021-05-21T04:22:05Z)
- Black-box Detection of Backdoor Attacks with Limited Information and Data [56.0735480850555]
We propose a black-box backdoor detection (B3D) method to identify backdoor attacks with only query access to the model.
In addition to backdoor detection, we also propose a simple strategy for reliable predictions using the identified backdoored models.
arXiv Detail & Related papers (2021-03-24T12:06:40Z)
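Several of the entries above (the sparse-trigger, untargeted object-detection, and self-supervised learning items) rely on the same primitive: stamping a small attacker-known trigger patch onto training images. The following is a minimal, hypothetical NumPy sketch of that primitive; the function name and the white-square patch are illustrative and not taken from any of the listed papers.

```python
import numpy as np

def stamp_trigger(images, trigger, x=0, y=0):
    """Paste a small attacker-known trigger patch onto a batch of images.

    images: float array of shape (N, H, W, C); trigger: (h, w, C);
    (x, y) is the top-left corner where the patch is placed."""
    poisoned = images.copy()
    h, w = trigger.shape[:2]
    poisoned[:, y:y + h, x:x + w, :] = trigger  # overwrite a small region
    return poisoned

# Example: stamp a 6x6 white square into the top-left corner of a batch.
batch = np.random.rand(8, 32, 32, 3).astype(np.float32)
white_patch = np.ones((6, 6, 3), dtype=np.float32)
poisoned_batch = stamp_trigger(batch, white_patch)
```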
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.