Thales: Formulating and Estimating Architectural Vulnerability Factors
for DNN Accelerators
- URL: http://arxiv.org/abs/2212.02649v2
- Date: Sun, 7 Jan 2024 19:07:39 GMT
- Title: Thales: Formulating and Estimating Architectural Vulnerability Factors
for DNN Accelerators
- Authors: Abhishek Tyagi and Yiming Gan and Shaoshan Liu and Bo Yu and Paul
Whatmough and Yuhao Zhu
- Abstract summary: This paper focuses on quantifying the DNN accuracy given that a transient error has occurred, which tells us how well a network behaves when a transient error does occur.
We show that the existing Resiliency Accuracy (RA) formulation is fundamentally inaccurate because it incorrectly assumes that software variables have equal faulty probability under hardware transient faults.
We present an algorithm that captures the faulty probabilities of DNN variables under transient faults and thus provides correct RA estimations validated against hardware.
- Score: 6.8082132475259405
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: As Deep Neural Networks (DNNs) are increasingly deployed in safety-critical
and privacy-sensitive applications such as autonomous driving and biometric
authentication, it is critical to understand the fault-tolerance characteristics
of DNNs. Prior work primarily focuses on metrics such as the Failures In Time
(FIT) rate and the Silent Data Corruption (SDC) rate, which quantify how often a
device fails. Instead, this paper focuses on quantifying the DNN accuracy given
that a transient error has occurred, which tells us how well a network behaves
when a transient error does occur. We call this metric Resiliency Accuracy (RA).
We show that the existing RA formulation is fundamentally inaccurate because it
incorrectly assumes that software variables (model weights/activations) have
equal faulty probability under hardware transient faults. We present an
algorithm that captures the faulty probabilities of DNN variables under
transient faults and thus provides correct RA estimations validated against
hardware. To accelerate RA estimation, we reformulate the RA calculation as a
Monte Carlo integration problem and solve it using importance sampling driven by
DNN-specific heuristics. Using our lightweight RA estimation method, we show
that transient faults lead to far greater accuracy degradation than today's DNN
resiliency tools estimate. We also show how our RA estimation tool can help
design more resilient DNNs by integrating it with a Network Architecture Search
framework.
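To make the contrast concrete, below is a minimal Python sketch of the two RA formulations the abstract describes: a uniform-probability estimate (the formulation the paper argues is inaccurate) and a Monte Carlo estimate with importance sampling weighted by per-variable fault probabilities. The helper names (`accuracy_under_fault`, `p_fault`, `proposal_q`) are illustrative stand-ins, not the paper's actual API.

```python
import random

def ra_uniform(variables, accuracy_under_fault, n_samples=1000):
    # Prior formulation: assumes every variable is equally likely to be faulty.
    total = sum(accuracy_under_fault(random.choice(variables))
                for _ in range(n_samples))
    return total / n_samples

def ra_importance_sampled(variables, p_fault, proposal_q,
                          accuracy_under_fault, n_samples=1000):
    # Corrected formulation: RA = sum_v p_fault[v] * accuracy_under_fault(v),
    # estimated as a Monte Carlo integral. Samples are drawn from a
    # DNN-specific proposal distribution and re-weighted by p_fault / proposal_q.
    weights = [proposal_q[v] for v in variables]
    total = 0.0
    for v in random.choices(variables, weights=weights, k=n_samples):
        total += (p_fault[v] / proposal_q[v]) * accuracy_under_fault(v)
    return total / n_samples
```

In this sketch, the importance-sampled estimator remains unbiased as long as `proposal_q` is nonzero wherever `p_fault` is, which is what lets DNN-specific heuristics concentrate samples on high-impact variables without skewing the estimate.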
Related papers
- Special Session: Approximation and Fault Resiliency of DNN Accelerators [0.9126382223122612]
This paper explores the approximation and fault resiliency of Deep Neural Network accelerators.
We propose to use approximate (AxC) arithmetic circuits to emulate errors in hardware without performing fault injection on the DNN.
We also propose a fine-grain analysis of fault resiliency by examining fault propagation and masking in networks.
arXiv Detail & Related papers (2023-05-31T19:27:45Z)
- Bridging Precision and Confidence: A Train-Time Loss for Calibrating Object Detection [58.789823426981044]
We propose a novel auxiliary loss formulation that aims to align the class confidence of bounding boxes with the accuracy of predictions.
Our results reveal that our train-time loss surpasses strong calibration baselines in reducing calibration error for both in-domain and out-of-domain scenarios.
arXiv Detail & Related papers (2023-03-25T08:56:21Z)
- DeepVigor: Vulnerability Value Ranges and Factors for DNNs' Reliability Assessment [1.189955933770711]
Deep Neural Networks (DNNs) and their accelerators are being deployed more frequently in safety-critical applications.
We propose a novel accurate, fine-grain, metric-oriented, and accelerator-agnostic method called DeepVigor.
arXiv Detail & Related papers (2023-03-13T08:55:10Z)
- CRAFT: Criticality-Aware Fault-Tolerance Enhancement Techniques for Emerging Memories-Based Deep Neural Networks [7.566423455230909]
Deep Neural Networks (DNNs) have emerged as the most effective programming paradigm for computer vision and natural language processing applications.
This paper proposes CRAFT, i.e., Criticality-Aware Fault-Tolerance Enhancement Techniques to enhance the reliability of NVM-based DNNs.
arXiv Detail & Related papers (2023-02-08T03:39:11Z)
- The #DNN-Verification Problem: Counting Unsafe Inputs for Deep Neural Networks [94.63547069706459]
The #DNN-Verification problem involves counting the number of input configurations of a DNN that result in a violation of a safety property.
We propose a novel approach that returns the exact count of violations.
We present experimental results on a set of safety-critical benchmarks.
arXiv Detail & Related papers (2023-01-17T18:32:01Z)
- Fault-Aware Design and Training to Enhance DNNs Reliability with Zero-Overhead [67.87678914831477]
Deep Neural Networks (DNNs) enable a wide range of technological advancements.
Recent findings indicate that transient hardware faults may dramatically corrupt the model's predictions.
In this work, we propose to tackle the reliability issue both at training and model design time.
arXiv Detail & Related papers (2022-05-28T13:09:30Z)
- Adaptive Anomaly Detection for Internet of Things in Hierarchical Edge Computing: A Contextual-Bandit Approach [81.5261621619557]
We propose an adaptive anomaly detection scheme with hierarchical edge computing (HEC).
We first construct multiple anomaly detection DNN models with increasing complexity, and associate each of them to a corresponding HEC layer.
Then, we design an adaptive model selection scheme that is formulated as a contextual-bandit problem and solved by using a reinforcement learning policy network.
arXiv Detail & Related papers (2021-08-09T08:45:47Z)
- Exploring Fault-Energy Trade-offs in Approximate DNN Hardware Accelerators [2.9649783577150837]
We present an extensive layer-wise and bit-wise fault resilience and energy analysis of different AxDNNs.
Our results characterize the trade-off between fault resilience and energy efficiency in AxDNNs.
arXiv Detail & Related papers (2021-01-08T05:52:12Z)
- Frequentist Uncertainty in Recurrent Neural Networks via Blockwise Influence Functions [121.10450359856242]
Recurrent neural networks (RNNs) are instrumental in modelling sequential and time-series data.
Existing approaches for uncertainty quantification in RNNs are based predominantly on Bayesian methods.
We develop a frequentist alternative that: (a) does not interfere with model training or compromise its accuracy, (b) applies to any RNN architecture, and (c) provides theoretical coverage guarantees on the estimated uncertainty intervals.
arXiv Detail & Related papers (2020-06-20T22:45:32Z)
- GraN: An Efficient Gradient-Norm Based Detector for Adversarial and Misclassified Examples [77.99182201815763]
Deep neural networks (DNNs) are vulnerable to adversarial examples and other data perturbations.
GraN is a time- and parameter-efficient method that is easily adaptable to any DNN.
GraN achieves state-of-the-art performance on numerous problem set-ups.
arXiv Detail & Related papers (2020-04-20T10:09:27Z)
- Interval Neural Networks: Uncertainty Scores [11.74565957328407]
We propose a fast, non-Bayesian method for producing uncertainty scores in the output of pre-trained deep neural networks (DNNs).
This interval neural network (INN) has interval valued parameters and propagates its input using interval arithmetic.
In numerical experiments on an image reconstruction task, we demonstrate the practical utility of INNs as a proxy for the prediction error.
arXiv Detail & Related papers (2020-03-25T18:03:51Z)
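As a gloss on the interval-arithmetic propagation mentioned in the last entry, here is a minimal sketch of how an interval passes through a single scalar affine unit; the layer shape and the use of interval width as an uncertainty score are illustrative assumptions, not the paper's architecture.

```python
def interval_affine(x_lo, x_hi, w_lo, w_hi, b_lo, b_hi):
    # Propagate the input interval [x_lo, x_hi] through y = w*x + b, where the
    # weight and bias are themselves intervals (interval-valued parameters).
    candidates = (w_lo * x_lo, w_lo * x_hi, w_hi * x_lo, w_hi * x_hi)
    y_lo = min(candidates) + b_lo
    y_hi = max(candidates) + b_hi
    # The output width (y_hi - y_lo) can serve as a per-output uncertainty score.
    return y_lo, y_hi
```

Stacking such layers propagates input and parameter uncertainty to the network output, which is what allows an INN to act as a proxy for prediction error.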