Quantization Backdoors to Deep Learning Commercial Frameworks
- URL: http://arxiv.org/abs/2108.09187v3
- Date: Thu, 27 Apr 2023 06:08:27 GMT
- Title: Quantization Backdoors to Deep Learning Commercial Frameworks
- Authors: Hua Ma, Huming Qiu, Yansong Gao, Zhi Zhang, Alsharif Abuadbba, Minhui
Xue, Anmin Fu, Jiliang Zhang, Said Al-Sarawi, Derek Abbott
- Abstract summary: We show that the standard quantization toolkits can be abused to activate a backdoor.
This work highlights that a stealthy security threat occurs when an end user utilizes the on-device post-training model quantization frameworks.
- Score: 16.28615808834053
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Currently, there is a burgeoning demand for deploying deep learning (DL)
models on ubiquitous edge Internet of Things (IoT) devices attributed to their
low latency and high privacy preservation. However, DL models are often large
in size and require large-scale computation, which prevents them from being
placed directly onto IoT devices, where resources are constrained and 32-bit
floating-point (float-32) operations are unavailable. Model quantization
powered by commercial frameworks (i.e., sets of toolkits) is a pragmatic
solution that enables DL deployment on mobile devices and embedded systems by
effortlessly post-quantizing a large high-precision model (e.g., float-32) into
a small low-precision model (e.g., int-8) while retaining the model inference
accuracy. However, the usability of these frameworks may be threatened by
security vulnerabilities.
This work reveals that the standard quantization toolkits can be abused to
activate a backdoor. We demonstrate that a full-precision backdoored model
which does not have any backdoor effect in the presence of a trigger -- as the
backdoor is dormant -- can be activated by the default i) TensorFlow-Lite
(TFLite) quantization, the only product-ready quantization framework to date,
and ii) the beta-released PyTorch Mobile framework. When each of the float-32
models is converted into an int-8 model through the standard TFLite or
PyTorch Mobile framework's post-training quantization, the backdoor is
activated in the quantized model, which shows a stable attack success rate
close to 100% upon inputs with the trigger, while it behaves normally upon
non-trigger inputs. This work highlights that a stealthy security threat
arises when an end user utilizes the on-device post-training model quantization
frameworks, and it alerts security researchers to the need for a cross-platform
overhaul of DL models after quantization, even if these models pass front-end
backdoor inspections.
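For concreteness, the following is a minimal sketch of the standard TFLite post-training full-integer quantization flow that an end user would run to obtain the int-8 model, i.e., the default toolkit pipeline the paper shows can turn a dormant backdoor into an active one. The saved-model path, input shape, and random calibration data are illustrative placeholders rather than artifacts from the paper.

```python
# Minimal sketch: default TFLite post-training full-integer quantization.
# Paths, input shape, and calibration data below are placeholders.
import numpy as np
import tensorflow as tf

def representative_dataset():
    # Calibration samples the converter uses to choose int-8 scales;
    # random data stands in here for a small slice of real inputs.
    for _ in range(100):
        yield [np.random.rand(1, 32, 32, 3).astype(np.float32)]

# Load the released float-32 model (potentially backdoored but dormant).
converter = tf.lite.TFLiteConverter.from_saved_model("float32_model/")
converter.optimizations = [tf.lite.Optimize.DEFAULT]
converter.representative_dataset = representative_dataset
# Request full int-8 quantization of weights and activations.
converter.target_spec.supported_ops = [tf.lite.OpsSet.TFLITE_BUILTINS_INT8]
converter.inference_input_type = tf.int8
converter.inference_output_type = tf.int8

tflite_int8_model = converter.convert()
with open("model_int8.tflite", "wb") as f:
    f.write(tflite_int8_model)
```

According to the abstract, no attacker involvement is needed at this step: converting the released float-32 model with the default settings above is enough for the backdoor to become active in the resulting int-8 model.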
Related papers
- Low-bit Model Quantization for Deep Neural Networks: A Survey [123.89598730307208]
This article surveys the recent five-year progress towards low-bit quantization on deep neural networks (DNNs). We discuss and compare the state-of-the-art quantization methods and classify them into 8 main categories and 24 sub-categories according to their core techniques. We shed light on the potential research opportunities in the field of model quantization.
arXiv Detail & Related papers (2025-05-08T13:26:19Z)
- ASPIRER: Bypassing System Prompts With Permutation-based Backdoors in LLMs [17.853862145962292]
We introduce a novel backdoor attack that systematically bypasses system prompts.
Our method achieves an attack success rate (ASR) of up to 99.50% while maintaining a clean accuracy (CACC) of 98.58%.
arXiv Detail & Related papers (2024-10-05T02:58:20Z)
- Privacy Backdoors: Enhancing Membership Inference through Poisoning Pre-trained Models [112.48136829374741]
In this paper, we unveil a new vulnerability: the privacy backdoor attack.
When a victim fine-tunes a backdoored model, their training data will be leaked at a significantly higher rate than if they had fine-tuned a typical model.
Our findings highlight a critical privacy concern within the machine learning community and call for a reevaluation of safety protocols in the use of open-source pre-trained models.
arXiv Detail & Related papers (2024-04-01T16:50:54Z)
- Effective Backdoor Mitigation in Vision-Language Models Depends on the Pre-training Objective [71.39995120597999]
Modern machine learning models are vulnerable to adversarial and backdoor attacks.
Such risks are heightened by the prevalent practice of collecting massive, internet-sourced datasets for training multimodal models.
CleanCLIP is the current state-of-the-art approach to mitigate the effects of backdooring in multimodal models.
arXiv Detail & Related papers (2023-11-25T06:55:13Z)
- Watermarking LLMs with Weight Quantization [61.63899115699713]
This paper proposes a novel watermarking strategy that plants watermarks in the quantization process of large language models.
We successfully plant the watermark into open-source large language model weights including GPT-Neo and LLaMA.
arXiv Detail & Related papers (2023-10-17T13:06:59Z)
- Fault Injection and Safe-Error Attack for Extraction of Embedded Neural Network Models [1.2499537119440245]
We focus on embedded deep neural network models on 32-bit microcontrollers in the Internet of Things (IoT).
We propose a black-box approach to craft a successful attack set.
For a classical convolutional neural network, we successfully recover at least 90% of the most significant bits with about 1500 crafted inputs.
arXiv Detail & Related papers (2023-08-31T13:09:33Z)
- One-bit Flip is All You Need: When Bit-flip Attack Meets Model Training [54.622474306336635]
A new weight modification attack called the bit-flip attack (BFA) was proposed, which exploits memory fault injection techniques.
We propose a training-assisted bit flip attack, in which the adversary is involved in the training stage to build a high-risk model to release.
arXiv Detail & Related papers (2023-08-12T09:34:43Z)
- QuMoS: A Framework for Preserving Security of Quantum Machine Learning Model [10.543277412560233]
Security has always been a critical issue in machine learning (ML) applications.
Model-stealing attack is one of the most fundamental but vitally important issues.
We propose a novel framework, namely QuMoS, to preserve model security.
arXiv Detail & Related papers (2023-04-23T01:17:43Z)
- Publishing Efficient On-device Models Increases Adversarial Vulnerability [58.6975494957865]
In this paper, we study the security considerations of publishing on-device variants of large-scale models.
We first show that an adversary can exploit on-device models to make attacking the large models easier.
We then show that the vulnerability increases as the similarity between a full-scale model and its efficient variant increases.
arXiv Detail & Related papers (2022-12-28T05:05:58Z)
- Backdoor Attacks on Crowd Counting [63.90533357815404]
Crowd counting is a regression task that estimates the number of people in a scene image.
In this paper, we investigate the vulnerability of deep learning based crowd counting models to backdoor attacks.
arXiv Detail & Related papers (2022-07-12T16:17:01Z)
- DeepSight: Mitigating Backdoor Attacks in Federated Learning Through Deep Model Inspection [26.593268413299228]
Federated Learning (FL) allows multiple clients to collaboratively train a Neural Network (NN) model on their private data without revealing the data.
DeepSight is a novel model filtering approach for mitigating backdoor attacks.
We show that it can mitigate state-of-the-art backdoor attacks with a negligible impact on the model's performance on benign data.
arXiv Detail & Related papers (2022-01-03T17:10:07Z)
- Qu-ANTI-zation: Exploiting Quantization Artifacts for Achieving Adversarial Outcomes [5.865029600972316]
Quantization is a technique that transforms the parameter representation of a neural network from floating-point numbers into lower-precision ones.
We propose a new training framework to implement adversarial quantization outcomes.
We show that a single compromised model defeats multiple quantization schemes.
arXiv Detail & Related papers (2021-10-26T10:09:49Z)
- Model Watermarking for Image Processing Networks [120.918532981871]
How to protect the intellectual property of deep models is a very important but seriously under-researched problem.
We propose the first model watermarking framework for protecting image processing models.
arXiv Detail & Related papers (2020-02-25T18:36:18Z)