One-bit Flip is All You Need: When Bit-flip Attack Meets Model Training
- URL: http://arxiv.org/abs/2308.07934v1
- Date: Sat, 12 Aug 2023 09:34:43 GMT
- Title: One-bit Flip is All You Need: When Bit-flip Attack Meets Model Training
- Authors: Jianshuo Dong, Han Qiu, Yiming Li, Tianwei Zhang, Yuanjie Li, Zeqi
Lai, Chao Zhang, Shu-Tao Xia
- Abstract summary: A new weight modification attack called bit flip attack (BFA) was proposed, which exploits memory fault injection techniques.
We propose a training-assisted bit flip attack, in which the adversary is involved in the training stage to build a high-risk model to release.
- Score: 54.622474306336635
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Deep neural networks (DNNs) are widely deployed on real-world devices.
Concerns regarding their security have gained great attention from researchers.
Recently, a new weight modification attack called bit flip attack (BFA) was
proposed, which exploits memory fault injection techniques such as row hammer to
attack quantized models in the deployment stage. With only a few bit flips, the
target model can be rendered useless as a random guesser or even be implanted
with malicious functionalities. In this work, we seek to further reduce the
number of bit flips. We propose a training-assisted bit flip attack, in which
the adversary is involved in the training stage to build a high-risk model to
release. This high-risk model, obtained together with a corresponding malicious
model, behaves normally and can escape various detection methods. The results
on benchmark datasets show that an adversary can easily convert this high-risk
but normal model to a malicious one on the victim's side by flipping only
one critical bit on average in the deployment stage. Moreover, our attack
still poses a significant threat even when defenses are employed. The code for
reproducing the main experiments is available at
https://github.com/jianshuod/TBA.
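The effect the attack ultimately relies on can be illustrated without any of the paper's training machinery: an 8-bit quantized weight is stored as a signed integer, so flipping its most significant bit shifts the stored value by 128 quantization steps. The following is a minimal sketch of that single-bit effect in NumPy, with a hypothetical per-tensor scale; it is not the training-assisted attack itself.

```python
# Minimal sketch (not the paper's TBA procedure): one bit flip on an
# int8-quantized weight, as a Rowhammer-style memory fault would produce.
import numpy as np

def flip_bit_inplace(q_weights: np.ndarray, index: int, bit: int) -> None:
    """Flip one bit of the int8 weight at `index` (bit 7 is the sign/MSB)."""
    raw = q_weights.view(np.uint8)        # reinterpret the stored bytes
    raw[index] ^= np.uint8(1 << bit)      # single-bit XOR on the raw byte

scale = 0.05                                       # hypothetical per-tensor scale
q_weights = np.array([42, -17, 5], dtype=np.int8)  # stored 8-bit weights

print(q_weights[0], q_weights[0] * scale)     # 42  -> +2.10 before the flip
flip_bit_inplace(q_weights, index=0, bit=7)   # flip the most significant bit
print(q_weights[0], q_weights[0] * scale)     # -86 -> -4.30 after the flip
```

On the victim's side the same corruption would come from a hardware fault such as row hammer rather than software, which is why reducing the number of required flips to a single bit is the point of the attack.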
Related papers
- A Realistic Threat Model for Large Language Model Jailbreaks [87.64278063236847]
In this work, we propose a unified threat model for the principled comparison of jailbreak attacks.
Our threat model combines constraints in perplexity, measuring how far a jailbreak deviates from natural text.
We adapt popular attacks to this new, realistic threat model, with which we, for the first time, benchmark these attacks on equal footing.
arXiv Detail & Related papers (2024-10-21T17:27:01Z) - DeepNcode: Encoding-Based Protection against Bit-Flip Attacks on Neural Networks [4.734824660843964]
We introduce an encoding-based protection method against bit-flip attacks on neural networks, titled DeepNcode.
Our results show an increase in protection margin of up to 7.6× for 4-bit and 12.4× for 8-bit quantized networks.
arXiv Detail & Related papers (2024-05-22T18:01:34Z) - Unlearning Backdoor Threats: Enhancing Backdoor Defense in Multimodal Contrastive Learning via Local Token Unlearning [49.242828934501986]
Multimodal contrastive learning has emerged as a powerful paradigm for building high-quality features.
Backdoor attacks subtly embed malicious behaviors within the model during training.
We introduce an innovative token-based localized forgetting training regime.
arXiv Detail & Related papers (2024-03-24T18:33:15Z) - Disarming Steganography Attacks Inside Neural Network Models [4.750077838548593]
We propose a zero-trust prevention strategy based on AI model attack disarm and reconstruction.
We demonstrate a 100% prevention rate, with only a minimal decrease in model accuracy, based on the Qint8 and K-LRBP methods.
arXiv Detail & Related papers (2023-09-06T15:18:35Z) - Fault Injection and Safe-Error Attack for Extraction of Embedded Neural
Network Models [1.3654846342364308]
We focus on embedded deep neural network models on 32-bit microcontrollers in the Internet of Things (IoT).
We propose a black-box approach to craft a successful attack set.
For a classical convolutional neural network, we successfully recover at least 90% of the most significant bits with about 1500 crafted inputs.
arXiv Detail & Related papers (2023-08-31T13:09:33Z) - Backdoor Attack with Sparse and Invisible Trigger [57.41876708712008]
Deep neural networks (DNNs) are vulnerable to backdoor attacks.
Backdoor attacks are an emerging yet serious training-phase threat.
We propose a sparse and invisible backdoor attack (SIBA)
arXiv Detail & Related papers (2023-05-11T10:05:57Z) - Hidden Backdoor Attack against Semantic Segmentation Models [60.0327238844584]
The backdoor attack intends to embed hidden backdoors in deep neural networks (DNNs) by poisoning training data.
We propose a novel attack paradigm, the fine-grained attack, where the target label is treated at the object level instead of the image level.
Experiments show that the proposed methods can successfully attack semantic segmentation models by poisoning only a small proportion of training data.
arXiv Detail & Related papers (2021-03-06T05:50:29Z) - Practical No-box Adversarial Attacks against DNNs [31.808770437120536]
We investigate no-box adversarial examples, where the attacker can access neither the model information nor the training set, and cannot query the model.
We propose three mechanisms for training with a very small dataset and find that prototypical reconstruction is the most effective.
Our approach significantly diminishes the average prediction accuracy of the system to only 15.40%, which is on par with the attack that transfers adversarial examples from a pre-trained Arcface model.
arXiv Detail & Related papers (2020-12-04T11:10:03Z) - Adversarial examples are useful too! [47.64219291655723]
I propose a new method to tell whether a model has been subject to a backdoor attack.
The idea is to generate adversarial examples, targeted or untargeted, using conventional attacks such as FGSM; a minimal FGSM sketch is given below.
It is possible to visually locate the perturbed regions and unveil the attack.
arXiv Detail & Related papers (2020-05-13T01:38:56Z)
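FGSM itself is a standard signed-gradient attack, so the generation step the summary refers to can be sketched briefly. The following is a minimal untargeted FGSM sketch in PyTorch, assuming a classification model, inputs in [0, 1], and an illustrative epsilon; the summarized paper's actual idea, visually inspecting where the perturbation lands to unveil a backdoor, is not reproduced here.

```python
# Minimal untargeted FGSM sketch in PyTorch; the model, epsilon, and the
# [0, 1] pixel range are illustrative assumptions, not the paper's setup.
import torch
import torch.nn.functional as F

def fgsm_example(model: torch.nn.Module, x: torch.Tensor, y: torch.Tensor,
                 epsilon: float = 8 / 255) -> torch.Tensor:
    """Return an untargeted FGSM adversarial example for the input batch x."""
    x_adv = x.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(x_adv), y)           # loss to increase (untargeted)
    loss.backward()
    with torch.no_grad():
        x_adv = x_adv + epsilon * x_adv.grad.sign()   # one signed-gradient step
    return x_adv.clamp(0, 1).detach()                 # keep pixels in valid range
```

Per the summary above, visualizing the resulting perturbation (x_adv - x) is what makes it possible to locate the perturbed regions and unveil a backdoor.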