Fault Injectors for TensorFlow: Evaluation of the Impact of Random
Hardware Faults on Deep CNNs
- URL: http://arxiv.org/abs/2012.07037v1
- Date: Sun, 13 Dec 2020 11:16:25 GMT
- Title: Fault Injectors for TensorFlow: Evaluation of the Impact of Random
Hardware Faults on Deep CNNs
- Authors: Michael Beyer, Andrey Morozov, Emil Valiev, Christoph Schorn, Lydia
Gauerhof, Kai Ding, Klaus Janschek
- Abstract summary: We introduce two new Fault Injection (FI) frameworks for evaluating how Deep Learning (DL) components operate under the presence of random faults.
In this paper, we present the results of FI experiments conducted on four VGG-based Convolutional NNs using two image sets.
Results help to identify the most critical operations and layers, compare the reliability characteristics of functionally similar NNs, and introduce selective fault tolerance mechanisms.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Today, Deep Learning (DL) enhances almost every industrial sector, including
safety-critical areas. The next generation of safety standards will define
appropriate verification techniques for DL-based applications and propose
adequate fault tolerance mechanisms. DL-based applications, like any other
software, are susceptible to common random hardware faults such as bit flips,
which occur in RAM and CPU registers. Such faults can lead to silent data
corruption. Therefore, it is crucial to develop methods and tools that help to
evaluate how DL components operate under the presence of such faults. In this
paper, we introduce two new Fault Injection (FI) frameworks InjectTF and
InjectTF2 for TensorFlow 1 and TensorFlow 2, respectively. Both frameworks are
available on GitHub and allow the configurable injection of random faults into
Neural Networks (NN). In order to demonstrate the feasibility of the
frameworks, we also present the results of FI experiments conducted on four
VGG-based Convolutional NNs using two image sets. The results demonstrate how
random bit flips in the output of particular mathematical operations and layers
of NNs affect the classification accuracy. These results help to identify the
most critical operations and layers, compare the reliability characteristics of
functionally similar NNs, and introduce selective fault tolerance mechanisms.
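The fault model described in the abstract, a random bit flip in the output of an operation, can be sketched in a few lines of NumPy. This is a simplified illustration of the idea only, not the InjectTF/InjectTF2 API; the function name and structure here are our own.

```python
import numpy as np

def flip_random_bit(tensor, rng):
    """Flip one random bit of one random element in a float32 tensor.

    Illustrative stand-in for the kind of fault InjectTF/InjectTF2
    inject into operation outputs; the real frameworks hook into
    TensorFlow graphs and layers instead.
    """
    out = tensor.astype(np.float32).copy()
    flat = out.ravel()                      # view into the copy
    idx = rng.integers(flat.size)           # which element to corrupt
    bit = rng.integers(32)                  # which of the 32 bits to flip
    word = flat[idx:idx + 1].view(np.uint32)
    word ^= np.uint32(1) << np.uint32(bit)  # in-place XOR flips the bit
    return out

rng = np.random.default_rng(0)
x = np.zeros((2, 3), dtype=np.float32)
y = flip_random_bit(x, rng)
# by construction, exactly one bit pattern in y differs from x
```

Injecting such a corruption into a layer's output and re-running classification is the experiment the paper repeats across operations and layers.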
Related papers
- DFA-GNN: Forward Learning of Graph Neural Networks by Direct Feedback Alignment [57.62885438406724]
Graph neural networks are recognized for their strong performance across various applications.
BP has limitations that challenge its biological plausibility and affect the efficiency, scalability and parallelism of training neural networks for graph-based tasks.
We propose DFA-GNN, a novel forward learning framework tailored for GNNs with a case study of semi-supervised learning.
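As a generic illustration of direct feedback alignment (the learning rule DFA-GNN adapts to graphs), a toy two-layer network can be trained by projecting the output error through a fixed random matrix instead of the transposed forward weights. This sketch is ours and makes no claim about the DFA-GNN implementation.

```python
import numpy as np

rng = np.random.default_rng(1)
X = rng.normal(size=(64, 8))                          # toy inputs
Y = (X.sum(axis=1, keepdims=True) > 0).astype(float)  # toy labels

W1 = rng.normal(scale=0.1, size=(8, 16))
W2 = rng.normal(scale=0.1, size=(16, 1))
B = rng.normal(scale=0.1, size=(1, 16))   # fixed random feedback matrix

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

losses = []
for _ in range(300):
    h = np.tanh(X @ W1)
    p = sigmoid(h @ W2)
    e = p - Y                             # output error
    losses.append(float(np.abs(e).mean()))
    # DFA: project the error through B, not through W2.T as backprop would
    dh = (e @ B) * (1.0 - h ** 2)
    W2 -= 0.5 * h.T @ e / len(X)
    W1 -= 0.5 * X.T @ dh / len(X)
# the mean error should shrink over training despite the random feedback
```

Because no backward pass through the forward weights is needed, the hidden-layer updates can be computed as soon as the output error is known, which is the parallelism argument the summary alludes to.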
arXiv Detail & Related papers (2024-06-04T07:24:51Z)
- Cal-DETR: Calibrated Detection Transformer [67.75361289429013]
We propose a mechanism for calibrated detection transformers (Cal-DETR), particularly for Deformable-DETR, UP-DETR and DINO.
We develop an uncertainty-guided logit modulation mechanism that leverages the uncertainty to modulate the class logits.
Results corroborate the effectiveness of Cal-DETR against the competing train-time methods in calibrating both in-domain and out-domain detections.
arXiv Detail & Related papers (2023-11-06T22:13:10Z)
- Large-Scale Application of Fault Injection into PyTorch Models -- an Extension to PyTorchFI for Validation Efficiency [1.7205106391379026]
We introduce a novel fault injection framework called PyTorchALFI (Application Level Fault Injection for PyTorch) based on PyTorchFI.
PyTorchALFI provides an efficient way to define randomly generated and reusable sets of faults to inject into PyTorch models.
arXiv Detail & Related papers (2023-10-30T11:18:35Z)
- ISimDL: Importance Sampling-Driven Acceleration of Fault Injection Simulations for Evaluating the Robustness of Deep Learning [10.757663798809144]
We propose ISimDL, a novel methodology that employs neuron sensitivity to generate importance sampling-based fault-scenarios.
Our experiments show that the importance sampling provides up to 15x higher precision in selecting critical faults than the random uniform sampling, reaching such precision in less than 100 faults.
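The core sampling idea can be illustrated without any DL machinery: draw fault locations in proportion to a per-neuron sensitivity score rather than uniformly, so critical faults turn up after far fewer trials. The sensitivity values below are invented for illustration; ISimDL derives them from the network itself.

```python
import numpy as np

rng = np.random.default_rng(42)

# hypothetical per-neuron sensitivity scores (made up for illustration)
sensitivity = np.array([0.01, 0.02, 0.02, 0.05, 0.40, 0.50])
p = sensitivity / sensitivity.sum()       # normalize to a distribution

uniform_picks = rng.integers(len(p), size=1000)
weighted_picks = rng.choice(len(p), size=1000, p=p)

# fraction of trials that hit the "critical" neurons 4 and 5
crit_uniform = np.isin(uniform_picks, [4, 5]).mean()
crit_weighted = np.isin(weighted_picks, [4, 5]).mean()
```

Under importance sampling the critical neurons dominate the drawn fault scenarios, which is what lets the paper reach its reported precision within a small fault budget.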
arXiv Detail & Related papers (2023-03-14T16:15:28Z)
- The #DNN-Verification Problem: Counting Unsafe Inputs for Deep Neural Networks [94.63547069706459]
The #DNN-Verification problem involves counting the number of input configurations of a DNN that result in a violation of a safety property.
We propose a novel approach that returns the exact count of violations.
We present experimental results on a set of safety-critical benchmarks.
arXiv Detail & Related papers (2023-01-17T18:32:01Z)
- Can pruning improve certified robustness of neural networks? [106.03070538582222]
We show that neural network pruning can improve the empirical robustness of deep neural networks (NNs).
Our experiments show that by appropriately pruning an NN, its certified accuracy can be boosted up to 8.2% under standard training.
We additionally observe the existence of certified lottery tickets that can match both standard and certified robust accuracies of the original dense models.
arXiv Detail & Related papers (2022-06-15T05:48:51Z)
- Fast and Accurate Error Simulation for CNNs against Soft Errors [64.54260986994163]
We present a framework for the reliability analysis of Convolutional Neural Networks (CNNs) via an error simulation engine.
These error models are defined based on the corruption patterns of the output of the CNN operators induced by faults.
We show that our methodology achieves about 99% accuracy of the fault effects w.r.t. SASSIFI, and a speedup ranging from 44x up to 63x w.r.t. FI, which implements only a limited set of error models.
arXiv Detail & Related papers (2022-06-04T19:45:02Z)
- Reliability Assessment of Neural Networks in GPUs: A Framework For Permanent Faults Injections [1.0992151305603266]
This paper proposes a framework, resorting to a binary instrumentation tool, to perform fault injection campaigns on a GPU.
This environment allows for the first time assessing the reliability of CNNs deployed on a GPU considering the presence of permanent faults.
arXiv Detail & Related papers (2022-05-24T16:21:53Z)
- FAT: Training Neural Networks for Reliable Inference Under Hardware Faults [3.191587417198382]
We present a novel methodology called fault-aware training (FAT), which includes error modeling during neural network (NN) training, to make QNNs resilient to specific fault models on the device.
FAT has been validated for numerous classification tasks including CIFAR10, GTSRB, SVHN and ImageNet.
arXiv Detail & Related papers (2020-11-11T16:09:39Z)
- FATNN: Fast and Accurate Ternary Neural Networks [89.07796377047619]
Ternary Neural Networks (TNNs) have received much attention due to being potentially orders of magnitude faster in inference, as well as more power efficient, than full-precision counterparts.
In this work, we show that, under some mild constraints, computational complexity of the ternary inner product can be reduced by a factor of 2.
We elaborately design an implementation-dependent ternary quantization algorithm to mitigate the performance gap.
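One standard way to see why ternary arithmetic is cheap is to store a vector over {-1, 0, +1} as two bitmasks and reduce the inner product to bitwise ANDs and popcounts. This is an illustrative encoding, not necessarily the exact factor-2 decomposition used by FATNN.

```python
def encode(vec):
    """Pack a ternary vector into (positive-mask, negative-mask) ints."""
    pos = sum(1 << i for i, v in enumerate(vec) if v == 1)
    neg = sum(1 << i for i, v in enumerate(vec) if v == -1)
    return pos, neg

def ternary_dot(a, b):
    ap, an = encode(a)
    bp, bn = encode(b)
    # matching signs (+1,+1) or (-1,-1) contribute +1; mixed signs -1;
    # any position holding a 0 contributes nothing
    plus = (ap & bp) | (an & bn)
    minus = (ap & bn) | (an & bp)
    return bin(plus).count("1") - bin(minus).count("1")

a = [1, -1, 0, 1]
b = [1, 1, -1, 1]
# ternary_dot(a, b) equals the naive sum(x * y) over the pair
```

The bitwise form processes a whole machine word of ternary entries per instruction, which is the source of the inference speedups the summary cites.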
arXiv Detail & Related papers (2020-08-12T04:26:18Z)
- FT-CNN: Algorithm-Based Fault Tolerance for Convolutional Neural Networks [13.100954947774163]
Convolutional neural networks (CNNs) are becoming more and more important for solving challenging and critical problems in many fields.
CNN inference applications have been deployed in safety-critical systems, which may suffer from soft errors caused by high-energy particles, high temperature, or abnormal voltage.
Traditional fault tolerance methods are not suitable for CNN inference because error-correcting code is unable to protect computational components.
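FT-CNN builds on algorithm-based fault tolerance (ABFT), whose checksum idea can be sketched for a plain matrix multiply: append a checksum row and column before the multiply, and a single corrupted output element breaks the checksum equality afterwards. The helper names here are ours, and convolution-specific details are omitted.

```python
import numpy as np

def abft_matmul(A, B):
    """Multiply with Huang-Abraham-style checksums appended."""
    Ac = np.vstack([A, A.sum(axis=0)])                  # checksum row
    Br = np.hstack([B, B.sum(axis=1, keepdims=True)])   # checksum column
    return Ac @ Br

def verify(C):
    """Check that the result block still matches its checksums."""
    data = C[:-1, :-1]
    row_ok = np.allclose(C[-1, :-1], data.sum(axis=0))
    col_ok = np.allclose(C[:-1, -1], data.sum(axis=1))
    return row_ok and col_ok

A = np.arange(6.0).reshape(2, 3)
B = np.arange(12.0).reshape(3, 4)
C = abft_matmul(A, B)
assert verify(C)          # clean result passes the check
C[0, 0] += 1.0            # simulate a silent data corruption
assert not verify(C)      # the checksum equality now fails
```

Because the check runs on the computation's result rather than on stored bits, it protects the arithmetic itself, which is exactly what error-correcting codes in memory cannot do.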
arXiv Detail & Related papers (2020-03-27T02:01:54Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of the information and is not responsible for any consequences of its use.