One-bit Flip is All You Need: When Bit-flip Attack Meets Model Training
- URL: http://arxiv.org/abs/2308.07934v1
- Date: Sat, 12 Aug 2023 09:34:43 GMT
- Title: One-bit Flip is All You Need: When Bit-flip Attack Meets Model Training
- Authors: Jianshuo Dong, Han Qiu, Yiming Li, Tianwei Zhang, Yuanjie Li, Zeqi
Lai, Chao Zhang, Shu-Tao Xia
- Abstract summary: A new weight modification attack called bit flip attack (BFA) was proposed, which exploits memory fault injection techniques.
We propose a training-assisted bit flip attack, in which the adversary is involved in the training stage to build a high-risk model to release.
- Score: 54.622474306336635
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Deep neural networks (DNNs) are widely deployed on real-world devices.
Concerns regarding their security have gained great attention from researchers.
Recently, a new weight modification attack called bit flip attack (BFA) was
proposed, which exploits memory fault injection techniques such as row hammer to
attack quantized models in the deployment stage. With only a few bit flips, the
target model can be rendered useless as a random guesser or even be implanted
with malicious functionalities. In this work, we seek to further reduce the
number of bit flips. We propose a training-assisted bit flip attack, in which
the adversary is involved in the training stage to build a high-risk model to
release. This high-risk model, obtained together with a corresponding malicious
model, behaves normally and can escape various detection methods. The results
on benchmark datasets show that an adversary can easily convert this high-risk
but normal model to a malicious one on the victim's side by flipping only
one critical bit on average in the deployment stage. Moreover, our attack
still poses a significant threat even when defenses are employed. The code for
reproducing the main experiments is available at
https://github.com/jianshuod/TBA.
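The effect the attack ultimately relies on can be illustrated without any of the paper's training machinery: an 8-bit quantized weight is stored as a signed integer, so flipping its most significant bit shifts the stored value by 128 quantization steps. The following is a minimal sketch of that single-bit effect in NumPy, with a hypothetical per-tensor scale; it is not the training-assisted attack itself.

```python
# Minimal sketch (not the paper's TBA procedure): one bit flip on an
# int8-quantized weight, as a Rowhammer-style memory fault would produce.
import numpy as np

def flip_bit_inplace(q_weights: np.ndarray, index: int, bit: int) -> None:
    """Flip one bit of the int8 weight at `index` (bit 7 is the sign/MSB)."""
    raw = q_weights.view(np.uint8)        # reinterpret the stored bytes
    raw[index] ^= np.uint8(1 << bit)      # single-bit XOR on the raw byte

scale = 0.05                                       # hypothetical per-tensor scale
q_weights = np.array([42, -17, 5], dtype=np.int8)  # stored 8-bit weights

print(q_weights[0], q_weights[0] * scale)     # 42  -> +2.10 before the flip
flip_bit_inplace(q_weights, index=0, bit=7)   # flip the most significant bit
print(q_weights[0], q_weights[0] * scale)     # -86 -> -4.30 after the flip
```

On the victim's side the same corruption would come from a hardware fault such as row hammer rather than software, which is why reducing the number of required flips to a single bit is the point of the attack.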
Related papers
- A Realistic Threat Model for Large Language Model Jailbreaks [87.64278063236847]
In this work, we propose a unified threat model for the principled comparison of jailbreak attacks.
Our threat model combines constraints in perplexity, measuring how far a jailbreak deviates from natural text.
We adapt popular attacks to this new, realistic threat model, with which we, for the first time, benchmark these attacks on equal footing.
arXiv Detail & Related papers (2024-10-21T17:27:01Z) - DeepNcode: Encoding-Based Protection against Bit-Flip Attacks on Neural Networks [4.734824660843964]
We introduce an encoding-based protection method against bit-flip attacks on neural networks, titled DeepNcode.
Our results show an increase in protection margin of up to 7.6× for 4-bit and 12.4× for 8-bit quantized networks.
arXiv Detail & Related papers (2024-05-22T18:01:34Z) - Unlearning Backdoor Threats: Enhancing Backdoor Defense in Multimodal Contrastive Learning via Local Token Unlearning [49.242828934501986]
Multimodal contrastive learning has emerged as a powerful paradigm for building high-quality features.
Backdoor attacks subtly embed malicious behaviors within the model during training.
We introduce an innovative token-based localized forgetting training regime.
arXiv Detail & Related papers (2024-03-24T18:33:15Z) - Disarming Steganography Attacks Inside Neural Network Models [4.750077838548593]
We propose a zero-trust prevention strategy based on AI model attack disarm and reconstruction.
We demonstrate a 100% prevention rate, with only a minimal decrease in model accuracy, based on the Qint8 and K-LRBP methods.
arXiv Detail & Related papers (2023-09-06T15:18:35Z) - Fault Injection and Safe-Error Attack for Extraction of Embedded Neural
Network Models [1.3654846342364308]
We focus on embedded deep neural network models on 32-bit microcontrollers in the Internet of Things (IoT).
We propose a black-box approach to craft a successful attack set.
For a classical convolutional neural network, we successfully recover at least 90% of the most significant bits with about 1500 crafted inputs.
arXiv Detail & Related papers (2023-08-31T13:09:33Z) - Backdoor Attack with Sparse and Invisible Trigger [57.41876708712008]
Deep neural networks (DNNs) are vulnerable to backdoor attacks.
Backdoor attacks are an emerging yet serious training-phase threat.
We propose a sparse and invisible backdoor attack (SIBA)
arXiv Detail & Related papers (2023-05-11T10:05:57Z) - Hidden Backdoor Attack against Semantic Segmentation Models [60.0327238844584]
The backdoor attack intends to embed hidden backdoors in deep neural networks (DNNs) by poisoning training data.
We propose a novel attack paradigm, the fine-grained attack, where the target label is treated at the object level instead of the image level.
Experiments show that the proposed methods can successfully attack semantic segmentation models by poisoning only a small proportion of training data.
arXiv Detail & Related papers (2021-03-06T05:50:29Z) - Practical No-box Adversarial Attacks against DNNs [31.808770437120536]
We investigate no-box adversarial examples, where the attacker can access neither the model information nor the training set, and cannot query the model.
We propose three mechanisms for training with a very small dataset and find that prototypical reconstruction is the most effective.
Our approach significantly diminishes the average prediction accuracy of the system to only 15.40%, which is on par with the attack that transfers adversarial examples from a pre-trained Arcface model.
arXiv Detail & Related papers (2020-12-04T11:10:03Z) - Adversarial examples are useful too! [47.64219291655723]
I propose a new method to tell whether a model has been subject to a backdoor attack.
The idea is to generate adversarial examples, targeted or untargeted, using conventional attacks such as FGSM; a minimal FGSM sketch is given below.
It is possible to visually locate the perturbed regions and unveil the attack.
arXiv Detail & Related papers (2020-05-13T01:38:56Z)
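FGSM itself is a standard signed-gradient attack, so the generation step the summary refers to can be sketched briefly. The following is a minimal untargeted FGSM sketch in PyTorch, assuming a classification model, inputs in [0, 1], and an illustrative epsilon; the summarized paper's actual idea, visually inspecting where the perturbation lands to unveil a backdoor, is not reproduced here.

```python
# Minimal untargeted FGSM sketch in PyTorch; the model, epsilon, and the
# [0, 1] pixel range are illustrative assumptions, not the paper's setup.
import torch
import torch.nn.functional as F

def fgsm_example(model: torch.nn.Module, x: torch.Tensor, y: torch.Tensor,
                 epsilon: float = 8 / 255) -> torch.Tensor:
    """Return an untargeted FGSM adversarial example for the input batch x."""
    x_adv = x.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(x_adv), y)           # loss to increase (untargeted)
    loss.backward()
    with torch.no_grad():
        x_adv = x_adv + epsilon * x_adv.grad.sign()   # one signed-gradient step
    return x_adv.clamp(0, 1).detach()                 # keep pixels in valid range
```

Per the summary above, visualizing the resulting perturbation (x_adv - x) is what makes it possible to locate the perturbed regions and unveil a backdoor.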