Fast and Accurate Error Simulation for CNNs against Soft Errors
- URL: http://arxiv.org/abs/2206.02051v1
- Date: Sat, 4 Jun 2022 19:45:02 GMT
- Title: Fast and Accurate Error Simulation for CNNs against Soft Errors
- Authors: Cristiana Bolchini and Luca Cassano and Antonio Miele and Alessandro
Toschi
- Abstract summary: We present a framework for the reliability analysis of Convolutional Neural Networks (CNNs) via an error simulation engine.
These error models are defined based on the corruption patterns of the output of the CNN operators induced by faults.
We show that our methodology achieves about 99% accuracy in reproducing fault effects w.r.t. SASSIFI, and a speedup ranging from 44x up to 63x w.r.t. TensorFI, which only implements a limited set of error models.
- Score: 64.54260986994163
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: The great quest for adopting AI-based computation for
safety-/mission-critical applications motivates the interest towards methods
for assessing the robustness of the application w.r.t. not only its
training/tuning but also errors due to faults, in particular soft errors,
affecting the underlying hardware. Two strategies exist: architecture-level
fault injection and application-level functional error simulation. We present a
framework for the reliability analysis of Convolutional Neural Networks (CNNs)
via an error simulation engine that exploits a set of validated error models
extracted from a detailed fault injection campaign. These error models are
defined based on the corruption patterns of the output of the CNN operators
induced by faults and bridge the gap between fault injection and error
simulation, exploiting the advantages of both approaches. We compared our
methodology against SASSIFI for the accuracy of functional error simulation
w.r.t. fault injection, and against TensorFI in terms of speedup for the error
simulation strategy. Experimental results show that our methodology achieves
about 99% accuracy in reproducing fault effects w.r.t. SASSIFI, and a speedup
ranging from 44x up to 63x w.r.t. TensorFI, which only implements a limited
set of error models.
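The application-level error simulation described above, corrupting the output of a CNN operator according to a predefined error model instead of injecting faults into the hardware, can be sketched as follows. The function, the pattern names, and the value ranges are illustrative assumptions, not the paper's actual validated error models:

```python
import numpy as np

def inject_error(feature_map, pattern="single", seed=None):
    """Corrupt a CNN operator's output according to a simple error model.

    The patterns here (one corrupted value, one corrupted row of a
    channel) are hypothetical stand-ins for validated error models.
    """
    rng = np.random.default_rng(seed)
    corrupted = feature_map.copy()
    if pattern == "single":
        # Replace a single value with a random out-of-range float.
        idx = tuple(rng.integers(0, s) for s in feature_map.shape)
        corrupted[idx] = rng.uniform(-1e3, 1e3)
    elif pattern == "row":
        # Corrupt an entire row of one channel (a bulk corruption pattern).
        c = rng.integers(0, feature_map.shape[0])
        r = rng.integers(0, feature_map.shape[1])
        corrupted[c, r, :] = rng.uniform(-1e3, 1e3, feature_map.shape[2])
    return corrupted

fmap = np.zeros((4, 8, 8), dtype=np.float32)  # dummy CxHxW feature map
print((inject_error(fmap, "row", seed=0) != fmap).sum())
```

Operating on operator outputs like this avoids re-running a low-level fault injector for every experiment, which is where the reported speedup over fault injection comes from.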
Related papers
- Large-Scale Application of Fault Injection into PyTorch Models -- an
Extension to PyTorchFI for Validation Efficiency [1.7205106391379026]
We introduce a novel fault injection framework called PyTorchALFI (Application Level Fault Injection for PyTorch) based on PyTorchFI.
PyTorchALFI provides an efficient way to define randomly generated and reusable sets of faults to inject into PyTorch models.
arXiv Detail & Related papers (2023-10-30T11:18:35Z)
- Bridging Precision and Confidence: A Train-Time Loss for Calibrating Object Detection [58.789823426981044]
We propose a novel auxiliary loss formulation that aims to align the class confidence of bounding boxes with the accuracy of predictions.
Our results reveal that our train-time loss surpasses strong calibration baselines in reducing calibration error for both in and out-domain scenarios.
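The calibration goal in the summary above, making confidence track accuracy, is commonly quantified with the expected calibration error (ECE). A minimal sketch follows; the binning scheme and toy numbers are illustrative and are not the paper's train-time loss:

```python
import numpy as np

def ece(conf, correct, n_bins=5):
    """Expected calibration error: weighted average, over confidence
    bins, of the gap between mean accuracy and mean confidence."""
    bins = np.minimum((conf * n_bins).astype(int), n_bins - 1)
    total = 0.0
    for b in range(n_bins):
        mask = bins == b
        if mask.any():
            # Weight each bin's |accuracy - confidence| gap by its size.
            total += mask.mean() * abs(correct[mask].mean() - conf[mask].mean())
    return total

conf = np.array([0.9, 0.9, 0.6, 0.6])      # predicted confidences
correct = np.array([1.0, 1.0, 1.0, 0.0])   # 1 if prediction was correct
print(ece(conf, correct))
```

A perfectly calibrated model would score 0: in every bin, the fraction of correct predictions would equal the average confidence.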
arXiv Detail & Related papers (2023-03-25T08:56:21Z)
- ISimDL: Importance Sampling-Driven Acceleration of Fault Injection Simulations for Evaluating the Robustness of Deep Learning [10.757663798809144]
We propose ISimDL, a novel methodology that employs neuron sensitivity to generate importance sampling-based fault-scenarios.
Our experiments show that the importance sampling provides up to 15x higher precision in selecting critical faults than the random uniform sampling, reaching such precision in less than 100 faults.
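The importance-sampling idea can be sketched in a few lines: draw fault-injection sites with probability proportional to a per-neuron sensitivity score rather than uniformly, so that critical neurons are hit far more often. The sensitivity values below are made up for illustration; ISimDL derives them from the network itself:

```python
import numpy as np

rng = np.random.default_rng(42)
# Hypothetical per-neuron sensitivity scores (e.g., from gradient analysis).
sensitivity = np.array([0.90, 0.05, 0.03, 0.02])
probs = sensitivity / sensitivity.sum()

# Importance sampling: the critical neuron 0 dominates the fault list,
# whereas uniform sampling would hit each neuron about 25 times.
sites = rng.choice(len(sensitivity), size=100, p=probs)
counts = np.bincount(sites, minlength=len(sensitivity))
print(counts)
```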
arXiv Detail & Related papers (2023-03-14T16:15:28Z)
- Simulation-to-reality UAV Fault Diagnosis with Deep Learning [20.182411473467656]
We propose a deep learning model that addresses the simulation-to-reality gap in fault diagnosis of quadrotors.
Our proposed approach achieves an accuracy of 96% in detecting propeller faults.
This is the first reliable and efficient method for simulation-to-reality fault diagnosis of quadrotor propellers.
arXiv Detail & Related papers (2023-02-09T02:37:48Z)
- Statistical Modeling of Soft Error Influence on Neural Networks [12.298356981085316]
We develop a series of statistical models to analyze the behavior of NN models under soft errors in general.
The statistical models reveal not only the correlation between soft errors and NN model accuracy, but also how NN parameters such as quantization and architecture affect the reliability of NNs.
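One reason such statistical models matter is that a soft error's impact depends heavily on which bit it flips: in a float32 value, flipping an exponent bit can change the magnitude by many orders of saturation, while flipping the mantissa LSB is nearly harmless. A small illustration, not taken from the paper:

```python
import struct

def flip_bit_f32(x, bit):
    """Flip one bit (0..31) of a float32 value and return the result."""
    (bits,) = struct.unpack("<I", struct.pack("<f", x))
    (y,) = struct.unpack("<f", struct.pack("<I", bits ^ (1 << bit)))
    return y

print(flip_bit_f32(1.0, 30))  # high exponent bit: 1.0 becomes inf
print(flip_bit_f32(1.0, 0))   # mantissa LSB: value barely changes
```

This asymmetry is also why quantization (e.g., int8) tends to improve resilience: every bit flip in a bounded integer format produces a bounded error.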
arXiv Detail & Related papers (2022-10-12T02:28:21Z)
- Real-to-Sim: Predicting Residual Errors of Robotic Systems with Sparse Data using a Learning-based Unscented Kalman Filter [65.93205328894608]
We learn the residual errors between a dynamics and/or simulator model and the real robot.
We show that with the learned residual errors, we can further close the reality gap between dynamic models, simulations, and actual hardware.
arXiv Detail & Related papers (2022-09-07T15:15:12Z)
- Leveraging Unlabeled Data to Predict Out-of-Distribution Performance [63.740181251997306]
Real-world machine learning deployments are characterized by mismatches between the source (training) and target (test) distributions.
In this work, we investigate methods for predicting the target domain accuracy using only labeled source data and unlabeled target data.
We propose Average Thresholded Confidence (ATC), a practical method that learns a threshold on the model's confidence and predicts target accuracy as the fraction of unlabeled examples whose confidence exceeds that threshold.
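The thresholding step can be sketched in a few lines: pick the threshold so that the fraction of source examples above it matches the source accuracy, then report the fraction of target examples above that threshold as the predicted target accuracy. The data and variable names below are toy illustrations:

```python
import numpy as np

def atc_predict(src_conf, src_correct, tgt_conf):
    """Predict target accuracy from confidences (ATC-style sketch)."""
    # Threshold at which the fraction of source points above it
    # matches the observed source accuracy.
    t = np.quantile(src_conf, 1.0 - src_correct.mean())
    return float((tgt_conf >= t).mean())

src_conf = np.array([0.9, 0.8, 0.7, 0.4, 0.3])   # source max-confidences
src_correct = np.array([1, 1, 1, 0, 0])          # source correctness (60% acc)
tgt_conf = np.array([0.95, 0.85, 0.5, 0.2])      # unlabeled target confidences
print(atc_predict(src_conf, src_correct, tgt_conf))
```

The method needs no target labels at all, only the model's confidences on unlabeled target data.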
arXiv Detail & Related papers (2022-01-11T23:01:12Z)
- Adaptive Anomaly Detection for Internet of Things in Hierarchical Edge Computing: A Contextual-Bandit Approach [81.5261621619557]
We propose an adaptive anomaly detection scheme with hierarchical edge computing (HEC).
We first construct multiple anomaly detection DNN models with increasing complexity, and associate each of them to a corresponding HEC layer.
Then, we design an adaptive model selection scheme that is formulated as a contextual-bandit problem and solved by using a reinforcement learning policy network.
arXiv Detail & Related papers (2021-08-09T08:45:47Z)
- FT-CNN: Algorithm-Based Fault Tolerance for Convolutional Neural Networks [13.100954947774163]
Convolutional neural networks (CNNs) are becoming more and more important for solving challenging and critical problems in many fields.
CNN inference applications have been deployed in safety-critical systems, which may suffer from soft errors caused by high-energy particles, high temperature, or abnormal voltage.
Traditional fault tolerance methods are not suitable for CNN inference because error-correcting code is unable to protect computational components.
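Algorithm-based fault tolerance (ABFT), the alternative that FT-CNN builds on, protects a linear operator with checksums instead of error-correcting codes. Below is a minimal checksum check for a matrix product in the spirit of the classic Huang-Abraham scheme, simplified for illustration and not FT-CNN's exact algorithm:

```python
import numpy as np

def abft_matmul_ok(A, B, corrupt=False):
    """Compute C = A @ B with a checksum row and verify it afterwards."""
    Ac = np.vstack([A, A.sum(axis=0)])  # append a column-checksum row to A
    C = Ac @ B                          # last row of C is now a checksum of the rest
    if corrupt:
        C[0, 0] += 100.0                # simulate a soft error in the result
    # The checksum row must equal the column sums of the data rows.
    return np.allclose(C[-1], C[:-1].sum(axis=0))

A = np.arange(6, dtype=float).reshape(2, 3)
B = np.arange(12, dtype=float).reshape(3, 4)
print(abft_matmul_ok(A, B))                 # True: checksums consistent
print(abft_matmul_ok(A, B, corrupt=True))   # False: corruption detected
```

Because the check rides along with the multiplication itself, it protects the computational datapath, which is exactly what memory-oriented error-correcting codes cannot do.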
arXiv Detail & Related papers (2020-03-27T02:01:54Z)
- Uncertainty Estimation Using a Single Deep Deterministic Neural Network [66.26231423824089]
We propose a method for training a deterministic deep model that can find and reject out of distribution data points at test time with a single forward pass.
We scale training of these models with a novel loss function and centroid updating scheme, matching the accuracy of softmax models.
arXiv Detail & Related papers (2020-03-04T12:27:36Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information and is not responsible for any consequences of its use.