Related papers: Fault Injection and Safe-Error Attack for Extraction of Embedded Neural Network Models

URL: http://arxiv.org/abs/2308.16703v2
Date: Fri, 15 Nov 2024 14:20:32 GMT
Title: Fault Injection and Safe-Error Attack for Extraction of Embedded Neural Network Models
Authors: Kevin Hector, Pierre-Alain Moellic, Mathieu Dumont, Jean-Max Dutertre
Abstract summary: We focus on embedded deep neural network models on 32-bit microcontrollers in the Internet of Things (IoT). We propose a black-box approach to craft a successful attack set. For a classical convolutional neural network, we successfully recover at least 90% of the most significant bits with about 1500 crafted inputs.
Score: 1.2499537119440245
License: http://creativecommons.org/licenses/by-nc-nd/4.0/
Abstract: Model extraction emerges as a critical security threat with attack vectors exploiting both algorithmic and implementation-based approaches. The main goal of an attacker is to steal as much information as possible about a protected victim model, so that he can mimic it with a substitute model, even with limited access to similar training data. Recently, physical attacks such as fault injection have shown worrying efficiency against the integrity and confidentiality of embedded models. We focus on embedded deep neural network models on 32-bit microcontrollers, a widespread family of hardware platforms in IoT, and the use of a standard fault injection strategy - Safe Error Attack (SEA) - to perform a model extraction attack with an adversary having limited access to training data. Since the attack strongly depends on the input queries, we propose a black-box approach to craft a successful attack set. For a classical convolutional neural network, we successfully recover at least 90% of the most significant bits with about 1500 crafted inputs. This information makes it possible to efficiently train a substitute model, using only 8% of the training dataset, that reaches high fidelity and an accuracy nearly identical to that of the victim model.
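
To make the safe-error idea in the abstract concrete, here is a minimal simulation sketch. It is not the authors' implementation: the bit-set fault model (a fault forces one stored bit to 1), the toy int8 linear classifier, random inputs in place of crafted ones, and the helper names `predict`, `bit_set`, and `safe_error_attack` are all assumptions made for illustration. Under these assumptions, a fault that changes the prediction for at least one input shows that the targeted bit was 0; a fault that never changes it is "safe", so the bit is guessed to be 1.

```python
import numpy as np

def predict(weights, x):
    """Toy victim (assumption): one int8 linear layer followed by argmax."""
    return int(np.argmax(weights.astype(np.int32) @ x.astype(np.int32)))

def bit_set(weights, idx, bit):
    """Simulate a bit-set fault: force one bit of one stored weight to 1."""
    faulted = weights.copy()
    raw = faulted.view(np.uint8).reshape(-1)  # same bytes, unsigned flat view
    raw[idx] |= np.uint8(1 << bit)
    return faulted

def safe_error_attack(weights, inputs, bit=7):
    """Guess `bit` of every weight using only observed predictions.

    `weights` is passed in only to simulate the device; the attacker's
    decision relies on whether the predicted label changes under fault.
    """
    guesses = np.ones(weights.size, dtype=np.uint8)  # default: fault was safe
    reference = [predict(weights, x) for x in inputs]
    for idx in range(weights.size):
        faulted = bit_set(weights, idx, bit)
        for x, ref in zip(inputs, reference):
            if predict(faulted, x) != ref:
                guesses[idx] = 0  # visible error, so the bit was 0
                break
    return guesses

rng = np.random.default_rng(0)
weights = rng.integers(-128, 128, size=(4, 8), dtype=np.int8)
inputs = rng.integers(0, 256, size=(50, 8), dtype=np.uint8)

guesses = safe_error_attack(weights, inputs)
true_msb = (weights.view(np.uint8).reshape(-1) >> 7) & 1
recovered = float(np.mean(guesses == true_msb))
print(f"fraction of most significant bits recovered: {recovered:.2f}")
```

In this sketch a guess of 0 is always correct, while a guess of 1 can be wrong when no input in the set happens to expose the fault. That asymmetry is one way to see why the attack depends so strongly on the choice of input queries.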

Related papers

Protecting the Neural Networks against FGSM Attack Using Machine Unlearning [1.0832844764942349]
We focus on applying unlearning techniques to the LeNet neural network, a popular architecture for image classification. We evaluate the efficacy of unlearning FGSM attacks on the LeNet network and find that it can significantly improve its robustness against these types of attacks.
arXiv Detail & Related papers (2025-11-03T09:21:49Z)
No Query, No Access [50.18709429731724]
We introduce the Victim Data-based Adversarial Attack (VDBA), which operates using only victim texts. To prevent access to the victim model, we create a shadow dataset with publicly available pre-trained models and clustering methods. Experiments on the Emotion and SST5 datasets show that VDBA outperforms state-of-the-art methods, achieving an ASR improvement of 52.08%.
arXiv Detail & Related papers (2025-05-12T06:19:59Z)
A Practical Trigger-Free Backdoor Attack on Neural Networks [33.426207982772226]
We propose a trigger-free backdoor attack that does not require access to any training data. Specifically, we design a novel fine-tuning approach that incorporates the concept of malicious data into the concept of the attacker-specified class. The effectiveness, practicality, and stealthiness of the proposed attack are evaluated on three real-world datasets.
arXiv Detail & Related papers (2024-08-21T08:53:36Z)
One-bit Flip is All You Need: When Bit-flip Attack Meets Model Training [54.622474306336635]
A new weight modification attack called bit flip attack (BFA) was proposed, which exploits memory fault injection techniques. We propose a training-assisted bit flip attack, in which the adversary is involved in the training stage to build a high-risk model to release.
arXiv Detail & Related papers (2023-08-12T09:34:43Z)
Isolation and Induction: Training Robust Deep Neural Networks against Model Stealing Attacks [51.51023951695014]
Existing model stealing defenses add deceptive perturbations to the victim's posterior probabilities to mislead the attackers. This paper proposes Isolation and Induction (InI), a novel and effective training framework for model stealing defenses. In contrast to adding perturbations over model predictions that harm the benign accuracy, we train models to produce uninformative outputs against stealing queries.
arXiv Detail & Related papers (2023-08-02T05:54:01Z)
Boosting Model Inversion Attacks with Adversarial Examples [26.904051413441316]
We propose a new training paradigm for a learning-based model inversion attack that can achieve higher attack accuracy in a black-box setting. First, we regularize the training process of the attack model with an added semantic loss function. Second, we inject adversarial examples into the training data to increase the diversity of the class-related parts.
arXiv Detail & Related papers (2023-06-24T13:40:58Z)
Careful What You Wish For: on the Extraction of Adversarially Trained Models [2.707154152696381]
Recent attacks on Machine Learning (ML) models pose several security and privacy threats. We propose a framework to assess extraction attacks on adversarially trained models. We show that adversarially trained models are more vulnerable to extraction attacks than models obtained under natural training circumstances.
arXiv Detail & Related papers (2022-07-21T16:04:37Z)
Are Your Sensitive Attributes Private? Novel Model Inversion Attribute Inference Attacks on Classification Models [22.569705869469814]
We focus on model inversion attacks where the adversary knows non-sensitive attributes about records in the training data. We devise a novel confidence score-based model inversion attribute inference attack that significantly outperforms the state-of-the-art. We also extend our attacks to the scenario where some of the other (non-sensitive) attributes of a target record are unknown to the adversary.
arXiv Detail & Related papers (2022-01-23T21:27:20Z)
Delving into Data: Effectively Substitute Training for Black-box Attack [84.85798059317963]
We propose a novel perspective on substitute training that focuses on designing the distribution of data used in the knowledge-stealing process. The combination of these two modules can further boost the consistency of the substitute model and target model, which greatly improves the effectiveness of adversarial attack.
arXiv Detail & Related papers (2021-04-26T07:26:29Z)
DeepPayload: Black-box Backdoor Attack on Deep Learning Models through Neural Payload Injection [17.136757440204722]
We introduce a highly practical backdoor attack achieved with a set of reverse-engineering techniques over compiled deep learning models. The injected backdoor can be triggered with a success rate of 93.5%, while incurring less than 2 ms of latency overhead and no more than a 1.4% accuracy decrease. We found 54 apps that were vulnerable to our attack, including popular and security-critical ones.
arXiv Detail & Related papers (2021-01-18T06:29:30Z)
Practical No-box Adversarial Attacks against DNNs [31.808770437120536]
We investigate no-box adversarial examples, where the attacker can neither access the model information or the training set nor query the model. We propose three mechanisms for training with a very small dataset and find that prototypical reconstruction is the most effective. Our approach significantly diminishes the average prediction accuracy of the system to only 15.40%, which is on par with the attack that transfers adversarial examples from a pre-trained Arcface model.
arXiv Detail & Related papers (2020-12-04T11:10:03Z)
How Robust are Randomized Smoothing based Defenses to Data Poisoning? [66.80663779176979]
We present a previously unrecognized threat to robust machine learning models that highlights the importance of training-data quality. We propose a novel bilevel optimization-based data poisoning attack that degrades the robustness guarantees of certifiably robust classifiers. Our attack is effective even when the victim trains the models from scratch using state-of-the-art robust training methods.
arXiv Detail & Related papers (2020-12-02T15:30:21Z)
Learning to Attack: Towards Textual Adversarial Attacking in Real-world Situations [81.82518920087175]
Adversarial attacking aims to fool deep neural networks with adversarial examples. We propose a reinforcement learning based attack model, which can learn from attack history and launch attacks more efficiently.
arXiv Detail & Related papers (2020-09-19T09:12:24Z)
DaST: Data-free Substitute Training for Adversarial Attacks [55.76371274622313]
We propose a data-free substitute training method (DaST) to obtain substitute models for adversarial black-box attacks. To achieve this, DaST utilizes specially designed generative adversarial networks (GANs) to train the substitute models. Experiments demonstrate the substitute models can achieve competitive performance compared with the baseline models.
arXiv Detail & Related papers (2020-03-28T04:28:13Z)

This list is automatically generated from the titles and abstracts of the papers on this site.