Large-Scale Application of Fault Injection into PyTorch Models -- an
Extension to PyTorchFI for Validation Efficiency
- URL: http://arxiv.org/abs/2310.19449v1
- Date: Mon, 30 Oct 2023 11:18:35 GMT
- Title: Large-Scale Application of Fault Injection into PyTorch Models -- an
Extension to PyTorchFI for Validation Efficiency
- Authors: Ralf Graafe, Qutub Syed Sha, Florian Geissler, Michael Paulitsch
- Abstract summary: We introduce a novel fault injection framework called PyTorchALFI (Application Level Fault Injection for PyTorch) based on PyTorchFI.
PyTorchALFI provides an efficient way to define randomly generated and reusable sets of faults to inject into PyTorch models.
- Score: 1.7205106391379026
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Transient or permanent faults in hardware can render the output of Neural
Networks (NN) incorrect without user-specific traces of the error, i.e. silent
data errors (SDE). On the other hand, modern NNs also possess an inherent
redundancy that can tolerate specific faults. To establish a safety case, it is
necessary to distinguish and quantify both types of corruptions. To study the
effects of hardware (HW) faults on software (SW) in general and NN models in
particular, several fault injection (FI) methods have been established in
recent years. Current FI methods focus on the methodology of injecting faults
but often fall short of accounting for large-scale FI tests, where many fault
locations based on a particular fault model need to be analyzed in a short
time. Results need to be concise, repeatable, and comparable. To address these
requirements and enable fault injection as the default component in a machine
learning development cycle, we introduce a novel fault injection framework
called PyTorchALFI (Application Level Fault Injection for PyTorch) based on
PyTorchFI. PyTorchALFI provides an efficient way to define randomly generated and reusable sets of faults to inject into PyTorch models, to define complex test scenarios, to enhance data sets, and to generate test KPIs, while tightly coupling the fault-free, faulty, and modified NNs. In this paper, we provide details about
the definition of test scenarios, software architecture, and several examples
of how to use the new framework to apply iterative changes in fault location
and number, compare different model modifications, and analyze test results.
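
Since the abstract outlines the intended workflow rather than code, here is a minimal sketch in plain PyTorch of what such a setup can look like. It is not the PyTorchALFI API; all function names are illustrative, and the single-bit weight-flip fault model is just one common choice.

```python
# Minimal sketch (plain PyTorch, not the PyTorchALFI API): draw a reusable, randomly
# generated set of fault locations once, apply it to a copy of the model as single-bit
# weight flips, and compare fault-free vs. faulty predictions to count silent data errors.
import copy
import random
import torch
import torch.nn as nn

def sample_weight_faults(model, num_faults, seed=0):
    """Draw a reusable list of (parameter name, flat index, bit position) faults."""
    rng = random.Random(seed)
    params = list(model.named_parameters())
    faults = []
    for _ in range(num_faults):
        name, p = rng.choice(params)
        # Bits 0-30 only, so the XOR mask below stays within int32 range.
        faults.append((name, rng.randrange(p.numel()), rng.randrange(31)))
    return faults

def apply_weight_faults(model, faults):
    """Return a deep copy of the model with the chosen bits flipped in its weights."""
    faulty = copy.deepcopy(model)
    params = dict(faulty.named_parameters())
    with torch.no_grad():
        for name, idx, bit in faults:
            flat = params[name].detach().view(-1)
            flat[idx : idx + 1].view(torch.int32).bitwise_xor_(1 << bit)
    return faulty

model = nn.Sequential(nn.Conv2d(3, 8, 3, padding=1), nn.ReLU(),
                      nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(8, 10)).eval()
fault_set = sample_weight_faults(model, num_faults=10)   # reusable across experiments
faulty_model = apply_weight_faults(model, fault_set)

x = torch.randn(4, 3, 32, 32)
with torch.no_grad():
    mismatches = (model(x).argmax(1) != faulty_model(x).argmax(1)).sum().item()
print(f"silent data errors in this batch: {mismatches}")
```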
Related papers
- Global Context Aggregation Network for Lightweight Saliency Detection of Surface Defects [70.48554424894728]
We develop a Global Context Aggregation Network (GCANet) for lightweight saliency detection of surface defects, built on an encoder-decoder structure.
First, we introduce a novel transformer encoder on the top layer of the lightweight backbone, which captures global context information through a novel Depth-wise Self-Attention (DSA) module.
The experimental results on three public defect datasets demonstrate that the proposed network achieves a better trade-off between accuracy and running efficiency compared with 17 other state-of-the-art methods.
arXiv Detail & Related papers (2023-09-22T06:19:11Z)
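
The entry above names a Depth-wise Self-Attention (DSA) module without spelling it out, so the following is only a generic, hypothetical sketch of channel-wise self-attention built from depth-wise convolutions in PyTorch; it is not GCANet's actual DSA design.

```python
# Hypothetical sketch of a lightweight "depth-wise" (channel-wise) self-attention block.
# Attention is computed across channels rather than across all spatial positions, which
# keeps the cost low for a lightweight backbone. This is a generic illustration only.
import torch
import torch.nn as nn

class ChannelSelfAttention(nn.Module):
    def __init__(self, channels: int):
        super().__init__()
        # Depth-wise 3x3 convs produce query/key/value without mixing channels.
        self.q = nn.Conv2d(channels, channels, 3, padding=1, groups=channels)
        self.k = nn.Conv2d(channels, channels, 3, padding=1, groups=channels)
        self.v = nn.Conv2d(channels, channels, 3, padding=1, groups=channels)
        self.proj = nn.Conv2d(channels, channels, 1)

    def forward(self, x):
        b, c, h, w = x.shape
        q = self.q(x).flatten(2)                # (b, c, h*w)
        k = self.k(x).flatten(2)
        v = self.v(x).flatten(2)
        attn = torch.softmax(q @ k.transpose(1, 2) / (h * w) ** 0.5, dim=-1)  # (b, c, c)
        out = (attn @ v).view(b, c, h, w)       # aggregate global context per channel
        return x + self.proj(out)               # residual connection

feat = torch.randn(2, 64, 16, 16)
print(ChannelSelfAttention(64)(feat).shape)     # torch.Size([2, 64, 16, 16])
```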
- MRFI: An Open Source Multi-Resolution Fault Injection Framework for Neural Network Processing [8.871260896931211]
MRFI is a highly configurable, multi-resolution fault injection tool for deep neural networks.
It integrates extensive fault analysis functionalities from different perspectives.
It does not modify PyTorch, the underlying neural network computing framework.
arXiv Detail & Related papers (2023-06-20T06:46:54Z)
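
To make the "no framework modification" point in the MRFI entry concrete, here is an illustrative hook-based sketch of multi-resolution (layer- vs. neuron-level) activation injection in plain PyTorch; the configuration format and function names are assumptions, not MRFI's actual interface.

```python
# Illustrative sketch (not MRFI's API): fault injection at a configurable resolution,
# implemented entirely with forward hooks so that PyTorch itself is left unmodified.
import torch
import torch.nn as nn

def make_activation_fault_hook(resolution, error_rate, std=1.0):
    """Return a forward hook that perturbs a layer's output at the chosen granularity."""
    def hook(module, inputs, output):
        if resolution == "layer":          # coarse: perturb the whole output tensor
            return output + std * torch.randn_like(output)
        if resolution == "neuron":         # fine: perturb a random subset of activations
            mask = torch.rand_like(output) < error_rate
            return torch.where(mask, output + std * torch.randn_like(output), output)
        return output
    return hook

model = nn.Sequential(nn.Linear(16, 32), nn.ReLU(), nn.Linear(32, 4)).eval()
config = {"0": ("neuron", 1e-2), "2": ("layer", 1.0)}   # e.g. loaded from a config file
handles = []
for name, module in model.named_modules():
    if name in config:
        resolution, error_rate = config[name]
        handles.append(module.register_forward_hook(
            make_activation_fault_hook(resolution, error_rate)))

with torch.no_grad():
    print(model(torch.randn(8, 16)).shape)   # faulty forward pass
for h in handles:                            # removing the hooks restores the model
    h.remove()
```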
- Bridging Precision and Confidence: A Train-Time Loss for Calibrating Object Detection [58.789823426981044]
We propose a novel auxiliary loss formulation that aims to align the class confidence of bounding boxes with the accuracy of predictions.
Our results reveal that our train-time loss surpasses strong calibration baselines in reducing calibration error for both in-domain and out-of-domain scenarios.
arXiv Detail & Related papers (2023-03-25T08:56:21Z)
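
As a rough, simplified stand-in for such a train-time calibration objective (not the paper's exact auxiliary loss), the sketch below penalizes the gap between a sample's predicted confidence and whether its prediction is correct, added on top of the usual task loss.

```python
# Simplified stand-in for a train-time calibration auxiliary loss: confidence should
# match correctness, so penalize the squared gap between the two.
import torch
import torch.nn.functional as F

def calibration_aux_loss(logits: torch.Tensor, targets: torch.Tensor) -> torch.Tensor:
    probs = logits.softmax(dim=-1)
    conf, pred = probs.max(dim=-1)            # predicted confidence and class
    correct = (pred == targets).float()       # 1 if the prediction is right, else 0
    return ((conf - correct) ** 2).mean()

logits = torch.randn(32, 10, requires_grad=True)
targets = torch.randint(0, 10, (32,))
loss = F.cross_entropy(logits, targets) + 0.5 * calibration_aux_loss(logits, targets)
loss.backward()
print(float(loss))
```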
- ISimDL: Importance Sampling-Driven Acceleration of Fault Injection Simulations for Evaluating the Robustness of Deep Learning [10.757663798809144]
We propose ISimDL, a novel methodology that employs neuron sensitivity to generate importance-sampling-based fault scenarios.
Our experiments show that importance sampling provides up to 15x higher precision in selecting critical faults than random uniform sampling, reaching this precision with fewer than 100 faults.
arXiv Detail & Related papers (2023-03-14T16:15:28Z)
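
A minimal sketch of the general idea behind the ISimDL entry, with a plain gradient-magnitude score standing in for the paper's neuron-sensitivity metric: sample fault locations in proportion to sensitivity instead of uniformly.

```python
# Hedged sketch of importance-sampling-driven fault selection: estimate a per-weight
# sensitivity score (here simply |dLoss/dw|, one possible proxy) and draw fault
# locations proportionally to it rather than uniformly at random.
import torch
import torch.nn as nn
import torch.nn.functional as F

model = nn.Sequential(nn.Linear(20, 64), nn.ReLU(), nn.Linear(64, 5))
x, y = torch.randn(128, 20), torch.randint(0, 5, (128,))

# 1) Sensitivity pass: accumulate gradient magnitudes for every parameter element.
model.zero_grad()
F.cross_entropy(model(x), y).backward()
sensitivity = torch.cat([p.grad.abs().flatten() for p in model.parameters()])

# 2) Importance sampling: sensitive locations are drawn far more often than under
#    uniform sampling, so fewer injections are needed to find critical faults.
num_faults = 100
important = torch.multinomial(sensitivity / sensitivity.sum(), num_faults, replacement=True)
uniform = torch.randint(0, sensitivity.numel(), (num_faults,))
print("mean sensitivity (importance):", sensitivity[important].mean().item())
print("mean sensitivity (uniform):   ", sensitivity[uniform].mean().item())
```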
- enpheeph: A Fault Injection Framework for Spiking and Compressed Deep Neural Networks [10.757663798809144]
We present enpheeph, a Fault Injection Framework for Spiking and Compressed Deep Neural Networks (DNNs).
By injecting a random and increasing number of faults, we show that DNNs can suffer an accuracy drop of more than 40% at a fault rate as low as 7 x 10^-7 faults per parameter.
arXiv Detail & Related papers (2022-07-31T00:30:59Z)
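
For intuition on such fault rates, a small back-of-the-envelope sketch: the number of faults to inject scales with the parameter count, so even 7 x 10^-7 faults per parameter is non-negligible for large DNNs (the 25M-parameter figure below is just an example).

```python
# Back-of-the-envelope sketch: faults to inject for a given fault rate and model size.
import math

def faults_to_inject(num_parameters: int, fault_rate: float) -> int:
    return max(1, math.ceil(num_parameters * fault_rate))

num_parameters = 25_000_000                     # roughly a ResNet-50-sized model
for rate in (1e-8, 7e-7, 1e-5, 1e-3):
    print(f"rate {rate:8.0e} -> {faults_to_inject(num_parameters, rate):>6d} faults")
```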
- Fast and Accurate Error Simulation for CNNs against Soft Errors [64.54260986994163]
We present a framework for the reliability analysis of Convolutional Neural Networks (CNNs) via an error simulation engine.
These error models are defined based on the corruption patterns of the output of the CNN operators induced by faults.
We show that our methodology achieves about 99% accuracy in reproducing the fault effects w.r.t. SASSIFI, and a speedup ranging from 44x up to 63x w.r.t. FI, which only implements a limited set of error models.
arXiv Detail & Related papers (2022-06-04T19:45:02Z)
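
Illustrative sketch of the operator-level error-model idea in the entry above (not the paper's simulator): a corruption pattern is expressed as a function of an operator's output and applied with a forward hook; the row-saturation pattern below is hypothetical.

```python
# An "error model" here is a function reproducing a corruption pattern on an operator's
# output, applied via a forward hook instead of simulating the underlying HW fault.
import torch
import torch.nn as nn

def row_corruption(output: torch.Tensor) -> torch.Tensor:
    # Hypothetical pattern: one feature-map row of one channel is overwritten.
    out = output.clone()
    b = torch.randint(0, out.shape[0], (1,)).item()
    c = torch.randint(0, out.shape[1], (1,)).item()
    r = torch.randint(0, out.shape[2], (1,)).item()
    out[b, c, r, :] = out.abs().max()            # saturate the corrupted row
    return out

model = nn.Sequential(nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(),
                      nn.Conv2d(16, 8, 3, padding=1)).eval()
handle = model[0].register_forward_hook(lambda m, i, o: row_corruption(o))

with torch.no_grad():
    y = model(torch.randn(2, 3, 32, 32))
print(y.shape)
handle.remove()
```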
- Training on Test Data with Bayesian Adaptation for Covariate Shift [96.3250517412545]
Deep neural networks often make inaccurate predictions with unreliable uncertainty estimates.
We derive a Bayesian model that provides for a well-defined relationship between unlabeled inputs under distributional shift and model parameters.
We show that our method improves both accuracy and uncertainty estimation.
arXiv Detail & Related papers (2021-09-27T01:09:08Z)
- Adaptive Anomaly Detection for Internet of Things in Hierarchical Edge Computing: A Contextual-Bandit Approach [81.5261621619557]
We propose an adaptive anomaly detection scheme with hierarchical edge computing (HEC).
We first construct multiple anomaly detection DNN models with increasing complexity, and associate each of them to a corresponding HEC layer.
Then, we design an adaptive model selection scheme that is formulated as a contextual-bandit problem and solved by using a reinforcement learning policy network.
arXiv Detail & Related papers (2021-08-09T08:45:47Z)
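
A simplified stand-in for the adaptive model-selection loop described above (an epsilon-greedy contextual bandit with a linear scorer instead of the paper's reinforcement-learning policy network); the context features and reward below are synthetic.

```python
# Epsilon-greedy contextual bandit: given a context vector describing the input, pick
# one of several detectors of increasing complexity and learn from the observed reward.
import torch
import torch.nn as nn

num_models, context_dim, eps = 3, 8, 0.1
policy = nn.Linear(context_dim, num_models)        # one score ("arm value") per detector
optimizer = torch.optim.SGD(policy.parameters(), lr=0.05)

def select_model(context: torch.Tensor) -> int:
    if torch.rand(()) < eps:                       # explore occasionally
        return int(torch.randint(0, num_models, ()))
    return int(policy(context).argmax())           # otherwise exploit the best arm

def update(context: torch.Tensor, arm: int, reward: float):
    # Regress the chosen arm's score toward the observed reward (e.g. detection
    # accuracy minus a cost for running a heavier model).
    optimizer.zero_grad()
    ((policy(context)[arm] - reward) ** 2).backward()
    optimizer.step()

for step in range(100):                            # toy interaction loop
    ctx = torch.randn(context_dim)
    arm = select_model(ctx)
    reward = float(ctx.mean() > 0) if arm == 0 else 0.8 - 0.1 * arm  # synthetic reward
    update(ctx, arm, reward)
print("learned arm scores:", policy(torch.zeros(context_dim)).detach())
```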
- Fault Injectors for TensorFlow: Evaluation of the Impact of Random Hardware Faults on Deep CNNs [4.854070123523902]
We introduce two new Fault Injection (FI) frameworks for evaluating how Deep Learning (DL) components operate under the presence of random faults.
In this paper, we present the results of FI experiments conducted on four VGG-based Convolutional NNs using two image sets.
Results help to identify the most critical operations and layers, compare the reliability characteristics of functionally similar NNs, and introduce selective fault tolerance mechanisms.
arXiv Detail & Related papers (2020-12-13T11:16:25Z)
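
The layer-criticality analysis mentioned in the entry above can be sketched in a framework-agnostic way; the toy example below uses PyTorch (to match the rest of this page rather than the paper's TensorFlow frameworks) and Gaussian weight perturbation as a crude stand-in for random hardware faults, with an untrained model purely for illustration.

```python
# Layer-wise vulnerability sweep: perturb one layer's parameters at a time and record
# the accuracy drop to rank the most critical layers.
import copy
import torch
import torch.nn as nn

def accuracy(model, x, y):
    with torch.no_grad():
        return (model(x).argmax(1) == y).float().mean().item()

model = nn.Sequential(nn.Linear(16, 32), nn.ReLU(),
                      nn.Linear(32, 32), nn.ReLU(), nn.Linear(32, 4)).eval()
x, y = torch.randn(256, 16), torch.randint(0, 4, (256,))  # random data, illustration only
baseline = accuracy(model, x, y)

for name, _ in model.named_parameters():
    faulty = copy.deepcopy(model)
    with torch.no_grad():
        p = dict(faulty.named_parameters())[name]
        p.add_(torch.randn_like(p))            # crude stand-in for a hardware fault
    drop = baseline - accuracy(faulty, x, y)
    print(f"{name:12s} accuracy drop: {drop:+.3f}")
```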
- NADS: Neural Architecture Distribution Search for Uncertainty Awareness [79.18710225716791]
Machine learning (ML) systems often encounter Out-of-Distribution (OoD) errors when dealing with test data drawn from a distribution different from the training data.
Existing OoD detection approaches are prone to errors and even sometimes assign higher likelihoods to OoD samples.
We propose Neural Architecture Distribution Search (NADS) to identify common building blocks among all uncertainty-aware architectures.
arXiv Detail & Related papers (2020-06-11T17:39:07Z)
This list is automatically generated from the titles and abstracts of the papers on this site.