Fault Injectors for TensorFlow: Evaluation of the Impact of Random
Hardware Faults on Deep CNNs
- URL: http://arxiv.org/abs/2012.07037v1
- Date: Sun, 13 Dec 2020 11:16:25 GMT
- Title: Fault Injectors for TensorFlow: Evaluation of the Impact of Random
Hardware Faults on Deep CNNs
- Authors: Michael Beyer, Andrey Morozov, Emil Valiev, Christoph Schorn, Lydia
Gauerhof, Kai Ding, Klaus Janschek
- Abstract summary: We introduce two new Fault Injection (FI) frameworks for evaluating how Deep Learning (DL) components operate under the presence of random faults.
In this paper, we present the results of FI experiments conducted on four VGG-based Convolutional NNs using two image sets.
Results help to identify the most critical operations and layers, compare the reliability characteristics of functionally similar NNs, and introduce selective fault tolerance mechanisms.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Today, Deep Learning (DL) enhances almost every industrial sector, including
safety-critical areas. The next generation of safety standards will define
appropriate verification techniques for DL-based applications and propose
adequate fault tolerance mechanisms. DL-based applications, like any other
software, are susceptible to common random hardware faults such as bit flips,
which occur in RAM and CPU registers. Such faults can lead to silent data
corruption. Therefore, it is crucial to develop methods and tools that help to
evaluate how DL components operate under the presence of such faults. In this
paper, we introduce two new Fault Injection (FI) frameworks InjectTF and
InjectTF2 for TensorFlow 1 and TensorFlow 2, respectively. Both frameworks are
available on GitHub and allow the configurable injection of random faults into
Neural Networks (NN). In order to demonstrate the feasibility of the
frameworks, we also present the results of FI experiments conducted on four
VGG-based Convolutional NNs using two image sets. The results demonstrate how
random bit flips in the output of particular mathematical operations and layers
of NNs affect the classification accuracy. These results help to identify the
most critical operations and layers, compare the reliability characteristics of
functionally similar NNs, and introduce selective fault tolerance mechanisms.
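The fault model described in the abstract, a random bit flip in the output of an operation, can be sketched in a few lines of NumPy. This is a simplified illustration of the idea only, not the InjectTF/InjectTF2 API; the function name and structure here are our own.

```python
import numpy as np

def flip_random_bit(tensor, rng):
    """Flip one random bit of one random element in a float32 tensor.

    Illustrative stand-in for the kind of fault InjectTF/InjectTF2
    inject into operation outputs; the real frameworks hook into
    TensorFlow graphs and layers instead.
    """
    out = tensor.astype(np.float32).copy()
    flat = out.ravel()                      # view into the copy
    idx = rng.integers(flat.size)           # which element to corrupt
    bit = rng.integers(32)                  # which of the 32 bits to flip
    word = flat[idx:idx + 1].view(np.uint32)
    word ^= np.uint32(1) << np.uint32(bit)  # in-place XOR flips the bit
    return out

rng = np.random.default_rng(0)
x = np.zeros((2, 3), dtype=np.float32)
y = flip_random_bit(x, rng)
# by construction, exactly one bit pattern in y differs from x
```

Injecting such a corruption into a layer's output and re-running classification is the experiment the paper repeats across operations and layers.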
Related papers
- DFA-GNN: Forward Learning of Graph Neural Networks by Direct Feedback Alignment [57.62885438406724]
Graph neural networks are recognized for their strong performance across various applications.
BP has limitations that challenge its biological plausibility and affect the efficiency, scalability and parallelism of training neural networks for graph-based tasks.
We propose DFA-GNN, a novel forward learning framework tailored for GNNs with a case study of semi-supervised learning.
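As a generic illustration of direct feedback alignment (the learning rule DFA-GNN adapts to graphs), a toy two-layer network can be trained by projecting the output error through a fixed random matrix instead of the transposed forward weights. This sketch is ours and makes no claim about the DFA-GNN implementation.

```python
import numpy as np

rng = np.random.default_rng(1)
X = rng.normal(size=(64, 8))                          # toy inputs
Y = (X.sum(axis=1, keepdims=True) > 0).astype(float)  # toy labels

W1 = rng.normal(scale=0.1, size=(8, 16))
W2 = rng.normal(scale=0.1, size=(16, 1))
B = rng.normal(scale=0.1, size=(1, 16))   # fixed random feedback matrix

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

losses = []
for _ in range(300):
    h = np.tanh(X @ W1)
    p = sigmoid(h @ W2)
    e = p - Y                             # output error
    losses.append(float(np.abs(e).mean()))
    # DFA: project the error through B, not through W2.T as backprop would
    dh = (e @ B) * (1.0 - h ** 2)
    W2 -= 0.5 * h.T @ e / len(X)
    W1 -= 0.5 * X.T @ dh / len(X)
# the mean error should shrink over training despite the random feedback
```

Because no backward pass through the forward weights is needed, the hidden-layer updates can be computed as soon as the output error is known, which is the parallelism argument the summary alludes to.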
arXiv Detail & Related papers (2024-06-04T07:24:51Z)
- Cal-DETR: Calibrated Detection Transformer [67.75361289429013]
We propose a mechanism for calibrated detection transformers (Cal-DETR), particularly for Deformable-DETR, UP-DETR and DINO.
We develop an uncertainty-guided logit modulation mechanism that leverages the uncertainty to modulate the class logits.
Results corroborate the effectiveness of Cal-DETR against the competing train-time methods in calibrating both in-domain and out-domain detections.
arXiv Detail & Related papers (2023-11-06T22:13:10Z)
- Large-Scale Application of Fault Injection into PyTorch Models -- an Extension to PyTorchFI for Validation Efficiency [1.7205106391379026]
We introduce a novel fault injection framework called PyTorchALFI (Application Level Fault Injection for PyTorch) based on PyTorchFI.
PyTorchALFI provides an efficient way to define randomly generated and reusable sets of faults to inject into PyTorch models.
arXiv Detail & Related papers (2023-10-30T11:18:35Z)
- ISimDL: Importance Sampling-Driven Acceleration of Fault Injection Simulations for Evaluating the Robustness of Deep Learning [10.757663798809144]
We propose ISimDL, a novel methodology that employs neuron sensitivity to generate importance sampling-based fault-scenarios.
Our experiments show that the importance sampling provides up to 15x higher precision in selecting critical faults than the random uniform sampling, reaching such precision in less than 100 faults.
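The core sampling idea can be illustrated without any DL machinery: draw fault locations in proportion to a per-neuron sensitivity score rather than uniformly, so critical faults turn up after far fewer trials. The sensitivity values below are invented for illustration; ISimDL derives them from the network itself.

```python
import numpy as np

rng = np.random.default_rng(42)

# hypothetical per-neuron sensitivity scores (made up for illustration)
sensitivity = np.array([0.01, 0.02, 0.02, 0.05, 0.40, 0.50])
p = sensitivity / sensitivity.sum()       # normalize to a distribution

uniform_picks = rng.integers(len(p), size=1000)
weighted_picks = rng.choice(len(p), size=1000, p=p)

# fraction of trials that hit the "critical" neurons 4 and 5
crit_uniform = np.isin(uniform_picks, [4, 5]).mean()
crit_weighted = np.isin(weighted_picks, [4, 5]).mean()
```

Under importance sampling the critical neurons dominate the drawn fault scenarios, which is what lets the paper reach its reported precision within a small fault budget.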
arXiv Detail & Related papers (2023-03-14T16:15:28Z)
- The #DNN-Verification Problem: Counting Unsafe Inputs for Deep Neural Networks [94.63547069706459]
The #DNN-Verification problem involves counting the number of input configurations of a DNN that result in a violation of a safety property.
We propose a novel approach that returns the exact count of violations.
We present experimental results on a set of safety-critical benchmarks.
arXiv Detail & Related papers (2023-01-17T18:32:01Z)
- Can pruning improve certified robustness of neural networks? [106.03070538582222]
We show that neural network pruning can improve the empirical robustness of deep neural networks (NNs).
Our experiments show that by appropriately pruning an NN, its certified accuracy can be boosted up to 8.2% under standard training.
We additionally observe the existence of certified lottery tickets that can match both standard and certified robust accuracies of the original dense models.
arXiv Detail & Related papers (2022-06-15T05:48:51Z)
- Fast and Accurate Error Simulation for CNNs against Soft Errors [64.54260986994163]
We present a framework for the reliability analysis of Convolutional Neural Networks (CNNs) via an error simulation engine.
These error models are defined based on the corruption patterns of the output of the CNN operators induced by faults.
We show that our methodology achieves about 99% accuracy of the fault effects w.r.t. SASSIFI, and a speedup ranging from 44x up to 63x w.r.t. FI, which implements only a limited set of error models.
arXiv Detail & Related papers (2022-06-04T19:45:02Z)
- Reliability Assessment of Neural Networks in GPUs: A Framework For Permanent Faults Injections [1.0992151305603266]
This paper proposes a framework, resorting to a binary instrumentation tool, to perform fault injection campaigns on a GPU.
This environment allows for the first time assessing the reliability of CNNs deployed on a GPU considering the presence of permanent faults.
arXiv Detail & Related papers (2022-05-24T16:21:53Z)
- FAT: Training Neural Networks for Reliable Inference Under Hardware Faults [3.191587417198382]
We present a novel methodology called fault-aware training (FAT), which includes error modeling during neural network (NN) training, to make QNNs resilient to specific fault models on the device.
FAT has been validated for numerous classification tasks including CIFAR10, GTSRB, SVHN and ImageNet.
arXiv Detail & Related papers (2020-11-11T16:09:39Z)
- FATNN: Fast and Accurate Ternary Neural Networks [89.07796377047619]
Ternary Neural Networks (TNNs) have received much attention due to being potentially orders of magnitude faster in inference, as well as more power efficient, than full-precision counterparts.
In this work, we show that, under some mild constraints, computational complexity of the ternary inner product can be reduced by a factor of 2.
We elaborately design an implementation-dependent ternary quantization algorithm to mitigate the performance gap.
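One standard way to see why ternary arithmetic is cheap is to store a vector over {-1, 0, +1} as two bitmasks and reduce the inner product to bitwise ANDs and popcounts. This is an illustrative encoding, not necessarily the exact factor-2 decomposition used by FATNN.

```python
def encode(vec):
    """Pack a ternary vector into (positive-mask, negative-mask) ints."""
    pos = sum(1 << i for i, v in enumerate(vec) if v == 1)
    neg = sum(1 << i for i, v in enumerate(vec) if v == -1)
    return pos, neg

def ternary_dot(a, b):
    ap, an = encode(a)
    bp, bn = encode(b)
    # matching signs (+1,+1) or (-1,-1) contribute +1; mixed signs -1;
    # any position holding a 0 contributes nothing
    plus = (ap & bp) | (an & bn)
    minus = (ap & bn) | (an & bp)
    return bin(plus).count("1") - bin(minus).count("1")

a = [1, -1, 0, 1]
b = [1, 1, -1, 1]
# ternary_dot(a, b) equals the naive sum(x * y) over the pair
```

The bitwise form processes a whole machine word of ternary entries per instruction, which is the source of the inference speedups the summary cites.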
arXiv Detail & Related papers (2020-08-12T04:26:18Z)
- FT-CNN: Algorithm-Based Fault Tolerance for Convolutional Neural Networks [13.100954947774163]
Convolutional neural networks (CNNs) are becoming more and more important for solving challenging and critical problems in many fields.
CNN inference applications have been deployed in safety-critical systems, which may suffer from soft errors caused by high-energy particles, high temperature, or abnormal voltage.
Traditional fault tolerance methods are not suitable for CNN inference because error-correcting code is unable to protect computational components.
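FT-CNN builds on algorithm-based fault tolerance (ABFT), whose checksum idea can be sketched for a plain matrix multiply: append a checksum row and column before the multiply, and a single corrupted output element breaks the checksum equality afterwards. The helper names here are ours, and convolution-specific details are omitted.

```python
import numpy as np

def abft_matmul(A, B):
    """Multiply with Huang-Abraham-style checksums appended."""
    Ac = np.vstack([A, A.sum(axis=0)])                  # checksum row
    Br = np.hstack([B, B.sum(axis=1, keepdims=True)])   # checksum column
    return Ac @ Br

def verify(C):
    """Check that the result block still matches its checksums."""
    data = C[:-1, :-1]
    row_ok = np.allclose(C[-1, :-1], data.sum(axis=0))
    col_ok = np.allclose(C[:-1, -1], data.sum(axis=1))
    return row_ok and col_ok

A = np.arange(6.0).reshape(2, 3)
B = np.arange(12.0).reshape(3, 4)
C = abft_matmul(A, B)
assert verify(C)          # clean result passes the check
C[0, 0] += 1.0            # simulate a silent data corruption
assert not verify(C)      # the checksum equality now fails
```

Because the check runs on the computation's result rather than on stored bits, it protects the arithmetic itself, which is exactly what error-correcting codes in memory cannot do.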
arXiv Detail & Related papers (2020-03-27T02:01:54Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of the information and is not responsible for any consequences of its use.