ISimDL: Importance Sampling-Driven Acceleration of Fault Injection
Simulations for Evaluating the Robustness of Deep Learning
- URL: http://arxiv.org/abs/2303.08035v2
- Date: Thu, 25 May 2023 07:54:27 GMT
- Title: ISimDL: Importance Sampling-Driven Acceleration of Fault Injection
Simulations for Evaluating the Robustness of Deep Learning
- Authors: Alessio Colucci, Andreas Steininger, Muhammad Shafique
- Abstract summary: We propose ISimDL, a novel methodology that employs neuron sensitivity to generate importance sampling-based fault-scenarios.
Our experiments show that the importance sampling provides up to 15x higher precision in selecting critical faults than the random uniform sampling, reaching such precision in less than 100 faults.
- Score: 10.757663798809144
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Deep Learning (DL) systems have proliferated in many applications, requiring
specialized hardware accelerators and chips. In the nano-era, devices have
become increasingly more susceptible to permanent and transient faults.
Therefore, we need an efficient methodology for analyzing the resilience of
advanced DL systems against such faults, and for understanding how faults in
neural accelerator chips manifest as errors at the DL application level, where
faults can lead to undetectable and unrecoverable errors. Using fault
injection, we can perform resilience investigations of the DL system by
modifying neuron weights and outputs at the software-level, as if the hardware
had been affected by a transient fault. Existing fault models reduce the
search space, allowing faster analysis, but they require a-priori knowledge of
the model and do not allow further analysis of the filtered-out search space.
Therefore,
we propose ISimDL, a novel methodology that employs neuron sensitivity to
generate importance sampling-based fault-scenarios. Without any a-priori
knowledge of the model-under-test, ISimDL provides a reduction of the search
space equivalent to that of existing works, while allowing long simulations to
cover
all the possible faults, improving on existing model requirements. Our
experiments show that importance sampling provides up to 15x higher precision
in selecting critical faults than random uniform sampling, reaching such
precision with fewer than 100 faults. Additionally, we showcase
another practical use-case for importance sampling for reliable DNN design,
namely Fault Aware Training (FAT). By using ISimDL to select the faults leading
to errors, we can insert the faults during the DNN training process to harden
the DNN against such faults. Using importance sampling in FAT reduces the
overhead required for finding faults that lead to a predetermined drop in
accuracy by more than 12x.
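To make the core idea concrete, the following is a minimal sketch of sensitivity-driven importance sampling of fault sites for a PyTorch model. It only illustrates the approach described in the abstract and is not ISimDL's actual implementation; the helper names (neuron_sensitivity, sample_fault_sites) and the gradient-based sensitivity score are assumptions.

# Minimal sketch (assumption, not ISimDL's actual API): score neurons by the mean
# absolute gradient of the loss w.r.t. each layer's output, then sample fault
# sites with probability proportional to that sensitivity.
import numpy as np
import torch
import torch.nn.functional as F

def neuron_sensitivity(model, data_loader, device="cpu"):
    """Return {layer_name: per-neuron sensitivity} for Linear/Conv2d layers."""
    grads, hooks = {}, []

    def make_hook(name):
        def hook(module, grad_input, grad_output):
            g = grad_output[0].detach().abs()
            # Average over batch (and spatial dims for conv) -> one score per neuron/channel.
            g = g.mean(dim=[d for d in range(g.dim()) if d != 1])
            grads[name] = grads.get(name, 0) + g
        return hook

    for name, m in model.named_modules():
        if isinstance(m, (torch.nn.Linear, torch.nn.Conv2d)):
            hooks.append(m.register_full_backward_hook(make_hook(name)))

    model.to(device)
    for x, y in data_loader:
        model.zero_grad()
        F.cross_entropy(model(x.to(device)), y.to(device)).backward()
    for h in hooks:
        h.remove()
    return {name: g.cpu().numpy() for name, g in grads.items()}

def sample_fault_sites(scores, n_faults, seed=0):
    """Importance sampling: draw (layer, neuron) pairs proportionally to sensitivity."""
    rng = np.random.default_rng(seed)
    flat = [(layer, i, s) for layer, arr in scores.items() for i, s in enumerate(arr)]
    p = np.array([s for _, _, s in flat], dtype=np.float64)
    p /= p.sum()
    idx = rng.choice(len(flat), size=n_faults, replace=False, p=p)
    return [(flat[i][0], flat[i][1]) for i in idx]

The sampled (layer, neuron) sites can then be targeted with transient-fault injections (e.g., bit flips) during inference to estimate how critical each fault is.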
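The abstract also mentions Fault Aware Training (FAT). Below is a similarly hedged sketch: fault sites given as (layer_name, neuron_index) pairs are perturbed through forward hooks while the network trains, so it learns to tolerate such faults. The single-bit-flip fault model and the hook-based injection are illustrative assumptions, not the paper's exact procedure.

# Minimal Fault-Aware Training sketch (illustrative assumptions throughout):
# flip one bit of a chosen neuron's output during the forward pass while training.
import random
import struct
import torch

def bitflip(value: float, bit: int) -> float:
    """Flip one bit of a float32 value (simple transient-fault model)."""
    packed = struct.unpack("<I", struct.pack("<f", value))[0]
    return struct.unpack("<f", struct.pack("<I", packed ^ (1 << bit)))[0]

def make_fault_hook(neuron: int, bit: int, p_inject: float = 0.1):
    """Forward hook that corrupts the given neuron's output with probability p_inject."""
    def hook(module, inputs, output):
        if random.random() > p_inject:
            return output
        out = output.clone()
        if out.dim() == 2:                    # Linear: (batch, features)
            out[0, neuron] = bitflip(float(out[0, neuron]), bit)
        elif out.dim() == 4:                  # Conv2d: (batch, C, H, W)
            out[0, neuron, 0, 0] = bitflip(float(out[0, neuron, 0, 0]), bit)
        return out
    return hook

def fault_aware_training(model, fault_sites, data_loader, epochs=1, lr=1e-3):
    """Train while injecting faults at the selected (layer_name, neuron) sites."""
    modules = dict(model.named_modules())
    handles = [modules[layer].register_forward_hook(make_fault_hook(neuron, bit=30))
               for layer, neuron in fault_sites]
    opt = torch.optim.SGD(model.parameters(), lr=lr)
    loss_fn = torch.nn.CrossEntropyLoss()
    for _ in range(epochs):
        for x, y in data_loader:
            opt.zero_grad()
            loss_fn(model(x), y).backward()
            opt.step()
    for h in handles:
        h.remove()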
Related papers
- A Multimodal Lightweight Approach to Fault Diagnosis of Induction Motors in High-Dimensional Dataset [1.148237645450678]
An accurate AI-based diagnostic system for induction motors (IMs) holds the potential to enhance proactive maintenance, mitigating unplanned downtime and curbing overall maintenance costs within an industrial environment.
Researchers have proposed various fault diagnosis approaches using signal processing (SP), machine learning (ML), deep learning (DL) and hybrid architectures for broken rotor bar (BRB) faults.
This paper applies a transfer-learning-based lightweight DL model, ShuffleNetV2, to large-scale BRB fault data, diagnosing one, two, three, and four BRB faults from current and vibration signal data.
arXiv Detail & Related papers (2025-01-07T12:40:11Z)
- Designing DNNs for a trade-off between robustness and processing performance in embedded devices [1.474723404975345]
Machine learning-based embedded systems need to be robust against soft errors.
This paper investigates the suitability of using bounded activation functions (AFs) to improve model robustness against perturbations.
We analyze encoder-decoder fully convolutional models aimed at performing semantic segmentation tasks on hyperspectral images for scene understanding in autonomous driving.
arXiv Detail & Related papers (2024-12-04T19:34:33Z)
- Analyzing Adversarial Inputs in Deep Reinforcement Learning [53.3760591018817]
We present a comprehensive analysis of the characterization of adversarial inputs, through the lens of formal verification.
We introduce a novel metric, the Adversarial Rate, to classify models based on their susceptibility to such perturbations.
Our analysis empirically demonstrates how adversarial inputs can affect the safety of a given DRL system with respect to such perturbations.
arXiv Detail & Related papers (2024-02-07T21:58:40Z)
- Causal Disentanglement Hidden Markov Model for Fault Diagnosis [55.90917958154425]
We propose a Causal Disentanglement Hidden Markov model (CDHM) to learn the causality in the bearing fault mechanism.
Specifically, we make full use of the time-series data and progressively disentangle the vibration signal into fault-relevant and fault-irrelevant factors.
To expand the scope of the application, we adopt unsupervised domain adaptation to transfer the learned disentangled representations to other working environments.
arXiv Detail & Related papers (2023-08-06T05:58:45Z)
- Fast and Accurate Error Simulation for CNNs against Soft Errors [64.54260986994163]
We present a framework for the reliability analysis of Convolutional Neural Networks (CNNs) via an error simulation engine.
These error models are defined based on the corruption patterns of the output of the CNN operators induced by faults.
We show that our methodology achieves about 99% accuracy of the fault effects w.r.t. SASSIFI, and a speedup ranging from 44x up to 63x w.r.t. SASSIFI, which only implements a limited set of error models.
arXiv Detail & Related papers (2022-06-04T19:45:02Z)
- Truncated tensor Schatten p-norm based approach for spatiotemporal traffic data imputation with complicated missing patterns [77.34726150561087]
We introduce four complicated missing patterns, including random missing and three fiber-like missing cases according to the mode-driven fibers.
Despite the nonconvexity of the objective function in our model, we derive the optimal solutions by integrating the alternating direction method of multipliers (ADMM).
arXiv Detail & Related papers (2022-05-19T08:37:56Z)
- Adaptive Anomaly Detection for Internet of Things in Hierarchical Edge Computing: A Contextual-Bandit Approach [81.5261621619557]
We propose an adaptive anomaly detection scheme with hierarchical edge computing (HEC).
We first construct multiple anomaly detection DNN models with increasing complexity, and associate each of them to a corresponding HEC layer.
Then, we design an adaptive model selection scheme that is formulated as a contextual-bandit problem and solved by using a reinforcement learning policy network; a minimal illustrative sketch of this kind of selection appears after this list.
arXiv Detail & Related papers (2021-08-09T08:45:47Z)
- High-level Modeling of Manufacturing Faults in Deep Neural Network Accelerators [2.6258269516366557]
Google's Tensor Processing Unit (TPU) is a neural network accelerator that uses systolic array-based matrix multiplication hardware at its core.
Manufacturing faults at any state element of the matrix multiplication unit can cause unexpected errors in these inference networks.
We propose a formal model of permanent faults and their propagation in a TPU using the Discrete-Time Markov Chain (DTMC) formalism; a toy DTMC example illustrating the mechanics appears after this list.
arXiv Detail & Related papers (2020-06-05T18:11:14Z)
- A Survey on Impact of Transient Faults on BNN Inference Accelerators [0.9667631210393929]
The boom of big data enables us to easily access and analyze very large data sets.
Deep learning models require significant computation power and extremely high memory accesses.
In this study, we demonstrate that the impact of soft errors on a customized deep learning algorithm might cause drastic image misclassification.
arXiv Detail & Related papers (2020-04-10T16:15:55Z)
- SUOD: Accelerating Large-Scale Unsupervised Heterogeneous Outlier Detection [63.253850875265115]
Outlier detection (OD) is a key machine learning (ML) task for identifying abnormal objects from general samples.
We propose a modular acceleration system, called SUOD, to speed up the training and prediction of large numbers of heterogeneous unsupervised OD models.
arXiv Detail & Related papers (2020-03-11T00:22:50Z)
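As referenced in the hierarchical edge computing entry above, the following is a minimal sketch of contextual-bandit model selection. It uses a simple epsilon-greedy bandit with a per-arm linear reward estimate as a stand-in for the cited paper's reinforcement-learning policy network; all names and numbers are illustrative assumptions.

# Toy contextual-bandit model selection (assumption: epsilon-greedy with a linear
# value estimate per arm, NOT the cited paper's policy-network method).
import numpy as np

class EpsilonGreedySelector:
    def __init__(self, n_models: int, context_dim: int, eps: float = 0.1, lr: float = 0.01):
        self.eps, self.lr = eps, lr
        self.w = np.zeros((n_models, context_dim))  # per-arm linear reward estimate

    def select(self, context: np.ndarray) -> int:
        if np.random.random() < self.eps:
            return np.random.randint(len(self.w))   # explore
        return int(np.argmax(self.w @ context))     # exploit

    def update(self, arm: int, context: np.ndarray, reward: float) -> None:
        # SGD step toward the observed reward (e.g., detection accuracy minus latency cost).
        err = reward - self.w[arm] @ context
        self.w[arm] += self.lr * err * context

In the HEC setting described above, the reward would trade off detection accuracy against the cost of escalating to a more complex model at a higher layer.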
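Likewise, for the DTMC-based fault modeling referenced in the TPU entry above, the toy example below shows only the general mechanics: a hand-made transition matrix over abstract fault states and the long-run probability of reaching an output error. The states and probabilities are invented for illustration and are not taken from the cited paper.

# Toy Discrete-Time Markov Chain: probability that a latent fault in a systolic
# array eventually corrupts the output. States and numbers are illustrative only.
import numpy as np

# States: 0 = fault dormant, 1 = fault activated, 2 = masked (absorbing), 3 = output error (absorbing)
P = np.array([
    [0.70, 0.20, 0.10, 0.00],   # dormant -> may activate or get masked
    [0.00, 0.40, 0.25, 0.35],   # activated -> may be masked or reach the output
    [0.00, 0.00, 1.00, 0.00],   # masked (absorbing)
    [0.00, 0.00, 0.00, 1.00],   # output error (absorbing)
])

start = np.array([1.0, 0.0, 0.0, 0.0])          # fault starts dormant
dist = start @ np.linalg.matrix_power(P, 1000)  # long-run distribution
print(f"P(output error) ~= {dist[3]:.3f}, P(masked) ~= {dist[2]:.3f}")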