SNIFF: Reverse Engineering of Neural Networks with Fault Attacks
- URL: http://arxiv.org/abs/2002.11021v1
- Date: Sun, 23 Feb 2020 05:39:54 GMT
- Title: SNIFF: Reverse Engineering of Neural Networks with Fault Attacks
- Authors: Jakub Breier, Dirmanto Jap, Xiaolu Hou, Shivam Bhasin, Yang Liu
- Abstract summary: We explore the possibility of reverse engineering neural networks using fault attacks.
SNIFF stands for sign bit flip fault, which enables reverse engineering by changing the sign of intermediate values.
We develop the first exact extraction method for deep-layer feature extractor networks that provably allows the recovery of the model parameters.
- Score: 26.542434084399265
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Neural networks have been shown to be vulnerable to fault injection
attacks. These attacks change the physical behavior of the device during
computation, resulting in a change to the value currently being computed. They
can be realized by various fault injection techniques, ranging from
clock/voltage glitching and laser injection to Rowhammer. In this paper we
explore the possibility of reverse engineering neural networks using fault
attacks. SNIFF stands for sign bit flip fault, which enables reverse
engineering by changing the sign of intermediate values. We develop the first
exact extraction method for deep-layer feature extractor networks that provably
allows the recovery of the model parameters. Our experiments with the Keras
library show that the precision error of the parameter recovery for the tested
networks is below $10^{-13}$ when 64-bit floats are used, which improves on the
current state of the art by 6 orders of magnitude. Additionally, we discuss
protection techniques against fault injection attacks that can be applied to
enhance fault resistance.
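To make the fault model concrete, below is a minimal sketch (in Python, since the paper's experiments use Keras) of the sign-bit-flip idea: flip the sign bit of a 64-bit IEEE-754 intermediate value and compare the faulted output with the fault-free one. The single-neuron recovery formula $w_i = (y - y')/(2 x_i)$ and the helper names are illustrative assumptions, not the paper's actual extraction procedure for deep-layer feature extractors.

```python
# Minimal sketch of the sign-bit-flip (SNIFF) idea -- illustrative only; the
# paper's method provably recovers all parameters of deep-layer feature
# extractor networks, which this single-neuron toy does not attempt.
import struct
from typing import Optional

import numpy as np


def flip_sign_bit(x: float) -> float:
    """Flip the sign bit (MSB) of a 64-bit IEEE-754 float."""
    bits = struct.unpack("<Q", struct.pack("<d", x))[0]
    return struct.unpack("<d", struct.pack("<Q", bits ^ (1 << 63)))[0]


def dense_neuron(w: np.ndarray, b: float, x: np.ndarray,
                 fault_index: Optional[int] = None) -> float:
    """Compute y = w.x + b, optionally flipping the sign of one product w_i*x_i."""
    products = w * x
    if fault_index is not None:
        products[fault_index] = flip_sign_bit(products[fault_index])
    return float(products.sum() + b)


rng = np.random.default_rng(0)
w_secret = rng.normal(size=4)   # unknown last-layer weights (attack target)
b_secret = float(rng.normal())  # unknown bias
x = rng.normal(size=4)          # input known to the attacker

# Flipping the sign of the intermediate product w_i * x_i gives
# y - y' = 2 * w_i * x_i, so w_i = (y - y') / (2 * x_i).
y_ok = dense_neuron(w_secret, b_secret, x)
w_rec = np.array([
    (y_ok - dense_neuron(w_secret, b_secret, x, fault_index=i)) / (2.0 * x[i])
    for i in range(w_secret.size)
])
b_rec = y_ok - float(w_rec @ x)

print("max |w - w_rec| =", np.max(np.abs(w_secret - w_rec)))
print("|b - b_rec|     =", abs(b_secret - b_rec))
```

In this toy setting the only recovery error comes from floating-point rounding, which is in line with the sub-$10^{-13}$ precision error the abstract reports for 64-bit floats; the paper itself handles full deep-layer feature extractors rather than a single neuron.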
Related papers
- Augmented Neural Fine-Tuning for Efficient Backdoor Purification [16.74156528484354]
Recent studies have revealed the vulnerability of deep neural networks (DNNs) to various backdoor attacks.
We propose Neural mask Fine-Tuning (NFT) with the aim of optimally re-organizing the neuron activities.
NFT relaxes the trigger synthesis process and eliminates the requirement of the adversarial search module.
arXiv Detail & Related papers (2024-07-14T02:36:54Z)
- DeepNcode: Encoding-Based Protection against Bit-Flip Attacks on Neural Networks [4.734824660843964]
We introduce an encoding-based protection method against bit-flip attacks on neural networks, titled DeepNcode.
Our results show an increase in protection margin of up to $7.6\times$ for $4$-bit and $12.4\times$ for $8$-bit quantized networks.
arXiv Detail & Related papers (2024-05-22T18:01:34Z)
- Rethinking PGD Attack: Is Sign Function Necessary? [131.6894310945647]
We present a theoretical analysis of how such a sign-based update algorithm influences step-wise attack performance.
We propose a new raw gradient descent (RGD) algorithm that eliminates the use of sign.
The effectiveness of the proposed RGD algorithm has been demonstrated extensively in experiments; a minimal sketch contrasting the sign-based and raw-gradient update steps appears after this list.
arXiv Detail & Related papers (2023-12-03T02:26:58Z)
- NeuralFuse: Learning to Recover the Accuracy of Access-Limited Neural Network Inference in Low-Voltage Regimes [52.51014498593644]
Deep neural networks (DNNs) have become ubiquitous in machine learning, but their energy consumption remains a notable issue.
We introduce NeuralFuse, a novel add-on module that addresses the accuracy-energy tradeoff in low-voltage regimes.
At a 1% bit error rate, NeuralFuse can reduce memory access energy by up to 24% while recovering accuracy by up to 57%.
arXiv Detail & Related papers (2023-06-29T11:38:22Z)
- CorrectNet: Robustness Enhancement of Analog In-Memory Computing for Neural Networks by Error Suppression and Compensation [4.570841222958966]
We propose a framework to enhance the robustness of neural networks under variations and noise.
We show that inference accuracy of neural networks can be recovered from as low as 1.69% under variations and noise.
arXiv Detail & Related papers (2022-11-27T19:13:33Z)
- Don't Knock! Rowhammer at the Backdoor of DNN Models [19.13129153353046]
We present an end-to-end backdoor injection attack on a model, realized on actual hardware using Rowhammer as the fault injection method.
We propose a novel network training algorithm based on constrained optimization to achieve a realistic backdoor injection attack in hardware.
arXiv Detail & Related papers (2021-10-14T19:43:53Z)
- Using Undervolting as an On-Device Defense Against Adversarial Machine Learning Attacks [1.9212368803706579]
We propose a novel, lightweight adversarial correction and/or detection mechanism for image classifiers.
We show that the errors induced by undervolting disrupt the adversarial input in a way that can be used either to correct the classification or to detect the input as adversarial.
arXiv Detail & Related papers (2021-07-20T23:21:04Z)
- Targeted Attack against Deep Neural Networks via Flipping Limited Weight Bits [55.740716446995805]
We study a novel attack paradigm, which modifies model parameters in the deployment stage for malicious purposes.
Our goal is to misclassify a specific sample into a target class without any sample modification.
By utilizing the latest techniques in integer programming, we equivalently reformulate this binary integer programming (BIP) problem as a continuous optimization problem.
arXiv Detail & Related papers (2021-02-21T03:13:27Z)
- Defence against adversarial attacks using classical and quantum-enhanced Boltzmann machines [64.62510681492994]
Generative models attempt to learn the distribution underlying a dataset, making them inherently more robust to small perturbations.
We find improvements ranging from 5% to 72% against attacks with Boltzmann machines on the MNIST dataset.
arXiv Detail & Related papers (2020-12-21T19:00:03Z)
- Cassandra: Detecting Trojaned Networks from Adversarial Perturbations [92.43879594465422]
In many cases, pre-trained models are sourced from vendors who may have disrupted the training pipeline to insert Trojan behaviors into the models.
We propose a method to verify if a pre-trained model is Trojaned or benign.
Our method captures fingerprints of neural networks in the form of adversarial perturbations learned from the network gradients.
arXiv Detail & Related papers (2020-07-28T19:00:40Z)
- Scalable Backdoor Detection in Neural Networks [61.39635364047679]
Deep learning models are vulnerable to Trojan attacks, where an attacker can install a backdoor during training time to make the resultant model misidentify samples contaminated with a small trigger patch.
We propose a novel trigger reverse-engineering based approach whose computational complexity does not scale with the number of labels, and is based on a measure that is both interpretable and universal across different network and patch types.
In experiments, we observe that our method achieves a perfect score in separating Trojaned models from pure models, which is an improvement over the current state-of-the-art method.
arXiv Detail & Related papers (2020-06-10T04:12:53Z)
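As referenced in the "Rethinking PGD Attack" entry above, the sketch below contrasts the classic sign-based PGD update with a raw-gradient step. The raw-gradient version is only an assumed plain reading of "eliminates the use of sign"; the cited paper's actual RGD formulation (step sizes, projection, auxiliary variables) may differ, and `grad_fn` is a hypothetical stand-in for the attacked model's loss gradient.

```python
# Contrast between a sign-based PGD step and a raw-gradient step (assumed
# reading of RGD).  grad_fn(x) is a hypothetical callable returning dLoss/dx.
import numpy as np


def pgd_step_sign(x, x_orig, grad_fn, alpha=0.01, eps=0.03):
    """Classic PGD: step along sign(grad), then project onto the L-inf eps-ball."""
    x_new = x + alpha * np.sign(grad_fn(x))
    return np.clip(x_new, x_orig - eps, x_orig + eps)


def pgd_step_raw(x, x_orig, grad_fn, alpha=1.0, eps=0.03):
    """Raw-gradient step: use the gradient itself; the step size usually needs
    rescaling, since raw gradients are not unit-magnitude per coordinate."""
    x_new = x + alpha * grad_fn(x)
    return np.clip(x_new, x_orig - eps, x_orig + eps)


# Toy usage with an analytic "loss" gradient instead of a real model.
target = np.array([0.5, -0.2, 0.1])


def grad_fn(x):
    return 2.0 * (x - target)  # gradient of ||x - target||^2


x_orig = np.zeros(3)
x_sign, x_raw = x_orig.copy(), x_orig.copy()
for _ in range(10):
    x_sign = pgd_step_sign(x_sign, x_orig, grad_fn)
    x_raw = pgd_step_raw(x_raw, x_orig, grad_fn)
print("sign-based iterate:  ", x_sign)
print("raw-gradient iterate:", x_raw)
```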