FT-CNN: Algorithm-Based Fault Tolerance for Convolutional Neural
Networks
- URL: http://arxiv.org/abs/2003.12203v4
- Date: Mon, 7 Sep 2020 20:34:16 GMT
- Title: FT-CNN: Algorithm-Based Fault Tolerance for Convolutional Neural
Networks
- Authors: Kai Zhao, Sheng Di, Sihuan Li, Xin Liang, Yujia Zhai, Jieyang Chen,
Kaiming Ouyang, Franck Cappello, Zizhong Chen
- Abstract summary: Convolutional neural networks (CNNs) are becoming more and more important for solving challenging and critical problems in many fields.
CNN inference applications have been deployed in safety-critical systems, which may suffer from soft errors caused by high-energy particles, high temperature, or abnormal voltage.
Traditional fault tolerance methods are not suitable for CNN inference because error-correcting code is unable to protect computational components.
- Score: 13.100954947774163
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Convolutional neural networks (CNNs) are becoming more and more important for
solving challenging and critical problems in many fields. CNN inference
applications have been deployed in safety-critical systems, which may suffer
from soft errors caused by high-energy particles, high temperature, or abnormal
voltage. Of critical importance is ensuring the stability of the CNN inference
process against soft errors. Traditional fault tolerance methods are not
suitable for CNN inference because error-correcting code is unable to protect
computational components, instruction duplication techniques incur high
overhead, and existing algorithm-based fault tolerance (ABFT) techniques cannot
protect all convolution implementations. In this paper, we focus on how to
protect the CNN inference process against soft errors as efficiently as
possible, with the following three contributions. (1) We propose several
systematic ABFT schemes based on checksum techniques and analyze their fault
protection ability and runtime thoroughly. Unlike traditional ABFT based on
matrix-matrix multiplication, our schemes support any convolution
implementations. (2) We design a novel workflow integrating all the proposed
schemes to obtain a high detection/correction ability with limited total
runtime overhead. (3) We perform our evaluation using ImageNet with well-known
CNN models including AlexNet, VGG-19, ResNet-18, and YOLOv2. Experimental
results demonstrate that our implementation can handle soft errors with very
limited runtime overhead (4%~8% in both error-free and error-injected
situations).
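To make the checksum idea concrete: by linearity of convolution, convolving the input with the element-wise sum of all filters must equal the sum of the per-filter outputs, so a mismatch beyond floating-point tolerance signals a soft error. Below is a minimal NumPy sketch of that filter-checksum identity, not the paper's optimized implementation; all function and variable names are illustrative.

```python
import numpy as np

def conv2d(x, w):
    """Direct 'valid' 2-D convolution (cross-correlation) of a single-channel image x with filter w."""
    H, W = x.shape
    k, _ = w.shape
    out = np.zeros((H - k + 1, W - k + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(x[i:i + k, j:j + k] * w)
    return out

def checksum_detect(x, filters, outputs, tol=1e-4):
    """Filter-checksum check: conv(x, sum_k W_k) must equal sum_k conv(x, W_k).

    Any corruption of the output channels larger than floating-point noise
    shows up as a mismatch between the two sides.
    """
    checksum_filter = np.sum(filters, axis=0)         # element-wise sum of all filters
    expected = conv2d(x, checksum_filter)             # one extra convolution
    observed = np.sum(outputs, axis=0)                # checksum of the (possibly faulty) outputs
    return np.max(np.abs(expected - observed)) > tol  # True => soft error detected

# Usage: inject a fault into one output channel and detect it.
rng = np.random.default_rng(0)
x = rng.standard_normal((8, 8))
filters = rng.standard_normal((4, 3, 3))             # 4 output channels, 3x3 kernels
outputs = np.stack([conv2d(x, w) for w in filters])

print(checksum_detect(x, filters, outputs))           # False: no fault
outputs[2, 1, 1] += 10.0                              # simulate a soft error (corrupted value)
print(checksum_detect(x, filters, outputs))           # True: fault detected
```

The paper's workflow combines several such schemes to balance detection/correction ability against total runtime overhead.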
Related papers
- Cost-Effective Fault Tolerance for CNNs Using Parameter Vulnerability Based Hardening and Pruning [0.4660328753262075]
This paper introduces a model-level hardening approach for CNNs by integrating error correction directly into the neural networks.
The proposed method demonstrates fault resilience nearly equivalent to TMR-based correction but with significantly reduced overhead.
Remarkably, the hardened pruned CNNs perform up to 24% faster than the hardened un-pruned ones.
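For reference, the TMR baseline mentioned above triplicates the computation and votes on the result, which is the overhead the hardening approach avoids. A minimal sketch of element-wise median voting over three replicas (equivalent to majority voting when at most one replica is corrupted); this illustrates only the baseline, not the paper's hardening method, and the toy layer is hypothetical.

```python
import numpy as np

def tmr_vote(layer_fn, x):
    """Triple modular redundancy: run the layer three times and take the
    element-wise median, which equals the majority value when at most one
    replica is corrupted. Costs ~200% extra compute."""
    replicas = np.stack([layer_fn(x) for _ in range(3)])
    return np.median(replicas, axis=0)

# Usage with a toy "layer" and a fault injected into one replica.
rng = np.random.default_rng(1)
W = rng.standard_normal((4, 4))
calls = {"n": 0}

def faulty_layer(x):
    y = W @ x
    calls["n"] += 1
    if calls["n"] == 2:          # corrupt only the second replica
        y[0] += 100.0
    return y

x = rng.standard_normal(4)
print(np.allclose(tmr_vote(faulty_layer, x), W @ x))   # True: the fault is out-voted
```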
arXiv Detail & Related papers (2024-05-17T09:42:44Z)
- Fast and Accurate Error Simulation for CNNs against Soft Errors [64.54260986994163]
We present a framework for the reliability analysis of Convolutional Neural Networks (CNNs) via an error simulation engine.
These error models are defined based on the corruption patterns of the output of the CNN operators induced by faults.
We show that our methodology achieves about 99% accuracy of the fault effects w.r.t. SASSIFI, and a speedup ranging from 44x up to 63x w.r.t. FI, which implements only a limited set of error models.
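A minimal sketch of the kind of operator-level error model such an engine applies: corrupt the output feature map of one operator according to a simple pattern (here, a single corrupted value) and check whether the prediction changes. The pattern, the stand-in classifier head, and all names are illustrative assumptions, not the framework's actual error models.

```python
import numpy as np

def inject_single_point(fmap, rng, magnitude=50.0):
    """Toy error model: corrupt one randomly chosen element of an
    operator's output feature map (a 'single point' corruption pattern)."""
    corrupted = fmap.copy()
    idx = tuple(rng.integers(0, s) for s in fmap.shape)
    corrupted[idx] += magnitude
    return corrupted

# Usage with a stand-in operator output and a stand-in classifier head.
rng = np.random.default_rng(2)
fmap = rng.standard_normal((8, 6, 6))        # C x H x W output of some operator
head = rng.standard_normal((10, 8 * 6 * 6))  # hypothetical linear classifier

clean_pred = np.argmax(head @ fmap.ravel())
faulty_pred = np.argmax(head @ inject_single_point(fmap, rng).ravel())
print("misclassification:", clean_pred != faulty_pred)
```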
arXiv Detail & Related papers (2022-06-04T19:45:02Z)
- Performance and accuracy assessments of an incompressible fluid solver coupled with a deep Convolutional Neural Network [0.0]
The resolution of the Poisson equation is usually one of the most computationally intensive steps for incompressible fluid solvers.
A CNN has been introduced to solve this equation, leading to a significant reduction in inference time.
A hybrid strategy is developed, which couples a CNN with a traditional iterative solver to ensure a user-defined accuracy level.
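A minimal sketch of that hybrid strategy: take the network's fast prediction as the initial guess and refine it with a traditional iterative method (Jacobi here) until a user-defined residual tolerance is met; `cnn_guess` is a hypothetical stand-in for the network's output.

```python
import numpy as np

def jacobi_refine(u0, f, h, tol=1e-6, max_iters=10_000):
    """Refine an initial guess u0 for the 2-D Poisson problem -lap(u) = f
    (zero Dirichlet boundary) with Jacobi iterations until the max-norm
    update falls below a user-defined tolerance (or max_iters is reached)."""
    u = u0.copy()
    for _ in range(max_iters):
        u_new = u.copy()
        u_new[1:-1, 1:-1] = 0.25 * (u[:-2, 1:-1] + u[2:, 1:-1] +
                                    u[1:-1, :-2] + u[1:-1, 2:] +
                                    h * h * f[1:-1, 1:-1])
        if np.max(np.abs(u_new - u)) < tol:
            return u_new
        u = u_new
    return u

# Usage: a (hypothetical) CNN prediction would replace the zero initial guess,
# typically cutting the number of iterations needed to reach the tolerance.
n, h = 33, 1.0 / 32
f = np.ones((n, n))
cnn_guess = np.zeros((n, n))          # stand-in for the network's prediction
u = jacobi_refine(cnn_guess, f, h)
```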
arXiv Detail & Related papers (2021-09-20T08:30:29Z)
- Neural Architecture Dilation for Adversarial Robustness [56.18555072877193]
A shortcoming of convolutional neural networks is that they are vulnerable to adversarial attacks.
This paper aims to improve the adversarial robustness of the backbone CNNs that have a satisfactory accuracy.
With minimal computational overhead, the dilation architecture is expected to preserve the standard performance of the backbone CNN.
arXiv Detail & Related papers (2021-08-16T03:58:00Z)
- Adaptive Anomaly Detection for Internet of Things in Hierarchical Edge Computing: A Contextual-Bandit Approach [81.5261621619557]
We propose an adaptive anomaly detection scheme with hierarchical edge computing (HEC).
We first construct multiple anomaly detection DNN models with increasing complexity, and associate each of them to a corresponding HEC layer.
Then, we design an adaptive model selection scheme that is formulated as a contextual-bandit problem and solved by using a reinforcement learning policy network.
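A minimal sketch of adaptive model selection in that spirit, using a plain epsilon-greedy bandit (no context features, no policy network) as a simplified stand-in for the paper's contextual-bandit formulation; the reward trade-off and model stubs are illustrative assumptions.

```python
import numpy as np

class EpsilonGreedySelector:
    """Pick one of K detection models per round; reward trades off
    detection quality against the cost of heavier models."""
    def __init__(self, n_models, eps=0.1, seed=0):
        self.rng = np.random.default_rng(seed)
        self.eps = eps
        self.counts = np.zeros(n_models)
        self.values = np.zeros(n_models)   # running mean reward per model

    def select(self):
        if self.rng.random() < self.eps:
            return int(self.rng.integers(len(self.values)))
        return int(np.argmax(self.values))

    def update(self, k, reward):
        self.counts[k] += 1
        self.values[k] += (reward - self.values[k]) / self.counts[k]

# Usage: reward = detection accuracy minus a per-model latency penalty.
selector = EpsilonGreedySelector(n_models=3)
latency_penalty = np.array([0.0, 0.1, 0.3])     # heavier models cost more
for _ in range(1000):
    k = selector.select()
    accuracy = [0.6, 0.8, 0.85][k]               # stand-in detection quality
    selector.update(k, accuracy - latency_penalty[k])
print(np.argmax(selector.values))                 # model 1 wins the trade-off here
```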
arXiv Detail & Related papers (2021-08-09T08:45:47Z)
- Learning to Solve the AC-OPF using Sensitivity-Informed Deep Neural Networks [52.32646357164739]
We propose a deep neural network (DNN) to learn the solutions of the AC optimal power flow (AC-OPF) problem.
The proposed sensitivity-informed DNN (SIDNN) is compatible with a broad range of OPF schemes.
It can be seamlessly integrated in other learning-to-OPF schemes.
arXiv Detail & Related papers (2021-03-27T00:45:23Z)
- BreakingBED -- Breaking Binary and Efficient Deep Neural Networks by Adversarial Attacks [65.2021953284622]
We study robustness of CNNs against white-box and black-box adversarial attacks.
Results are shown for distilled CNNs, agent-based state-of-the-art pruned models, and binarized neural networks.
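To make the white-box setting concrete, here is a minimal FGSM-style sketch on a tiny logistic-regression model whose input gradient is available in closed form; FGSM is a standard attack, but the toy model and its parameters are illustrative, not the compressed networks studied in the paper.

```python
import numpy as np

def fgsm(x, y, w, b, eps=0.1):
    """Fast Gradient Sign Method on binary logistic regression:
    perturb x by eps * sign(d loss / d x) to increase the loss."""
    p = 1.0 / (1.0 + np.exp(-(w @ x + b)))   # predicted probability of class 1
    grad_x = (p - y) * w                      # gradient of cross-entropy w.r.t. x
    return x + eps * np.sign(grad_x)

# Usage: a correctly classified point is pushed across the decision boundary.
rng = np.random.default_rng(3)
w, b = rng.standard_normal(5), 0.0
x = w * 0.05                                  # weakly positive example
y = 1.0
print((w @ x + b) > 0)                        # True: classified as class 1
x_adv = fgsm(x, y, w, b, eps=0.2)
print((w @ x_adv + b) > 0)                    # False: flipped by the attack
```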
arXiv Detail & Related papers (2021-03-14T20:43:19Z)
- Fault Injectors for TensorFlow: Evaluation of the Impact of Random Hardware Faults on Deep CNNs [4.854070123523902]
We introduce two new Fault Injection (FI) frameworks for evaluating how Deep Learning (DL) components operate in the presence of random faults.
In this paper, we present the results of FI experiments conducted on four VGG-based Convolutional NNs using two image sets.
Results help to identify the most critical operations and layers, compare the reliability characteristics of functionally similar NNs, and introduce selective fault tolerance mechanisms.
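A minimal sketch of the kind of random hardware fault such frameworks emulate: flip one random bit of one randomly chosen float32 weight by reinterpreting its bits as an unsigned integer. The frameworks' actual APIs differ; this standalone NumPy version is only illustrative.

```python
import numpy as np

def flip_random_bit(weights, rng):
    """Flip one random bit of one random float32 weight (in place) by
    viewing the underlying storage as unsigned 32-bit integers."""
    flat = weights.reshape(-1)          # view into the same storage
    idx = int(rng.integers(flat.size))
    pos = int(rng.integers(32))         # bit position 0 (LSB) .. 31 (sign)
    as_int = flat[idx:idx + 1].view(np.uint32)
    as_int[0] ^= np.uint32(1 << pos)
    return idx, pos

# Usage: high-order exponent-bit flips usually cause the largest deviations.
rng = np.random.default_rng(4)
w = rng.standard_normal((3, 3)).astype(np.float32)
before = w.copy()
idx, pos = flip_random_bit(w, rng)
print(idx, pos, np.max(np.abs(w - before)))
```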
arXiv Detail & Related papers (2020-12-13T11:16:25Z)
- FATNN: Fast and Accurate Ternary Neural Networks [89.07796377047619]
Ternary Neural Networks (TNNs) have received much attention due to being potentially orders of magnitude faster in inference, as well as more power efficient, than full-precision counterparts.
In this work, we show that, under some mild constraints, the computational complexity of the ternary inner product can be reduced by a factor of 2.
We elaborately design an implementation-dependent ternary quantization algorithm to mitigate the performance gap.
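A minimal sketch of threshold-based ternary quantization (a standard scheme, not necessarily the paper's implementation-dependent algorithm): weights are mapped to a scale alpha times {-1, 0, +1}, after which the inner product needs only additions and subtractions.

```python
import numpy as np

def ternarize(w, delta_ratio=0.7):
    """Map weights to alpha * {-1, 0, +1}: zero out small weights
    (|w| <= delta) and represent the rest by their sign, with alpha
    chosen as the mean magnitude of the surviving weights."""
    delta = delta_ratio * np.mean(np.abs(w))
    t = np.sign(w) * (np.abs(w) > delta)
    alpha = np.abs(w[t != 0]).mean() if np.any(t) else 0.0
    return t.astype(np.int8), alpha

# Usage: the ternary inner product then needs only additions/subtractions
# (plus one final multiply by alpha), which is what makes TNN inference fast.
rng = np.random.default_rng(5)
w, x = rng.standard_normal(16), rng.standard_normal(16)
t, alpha = ternarize(w)
approx = alpha * np.sum(x[t == 1]) - alpha * np.sum(x[t == -1])
print(approx, w @ x)   # ternary approximation vs. full-precision product
```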
arXiv Detail & Related papers (2020-08-12T04:26:18Z)
- Making Convolutions Resilient via Algorithm-Based Error Detection Techniques [2.696566807900575]
Convolutional Neural Networks (CNNs) accurately process real-time telemetry.
CNNs must execute correctly in the presence of hardware faults.
Full duplication provides the needed assurance but incurs a 100% overhead.
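For context on the 100% overhead figure: full duplication runs the convolution twice and compares the results, detecting (but not locating or correcting) a soft error. A minimal sketch under that assumption; `conv_fn` is a placeholder for any convolution implementation.

```python
import numpy as np

def duplicated_conv(conv_fn, x, tol=1e-6):
    """Full duplication: execute the convolution twice (100% extra compute)
    and flag a soft error if the two results disagree."""
    y1, y2 = conv_fn(x), conv_fn(x)
    if np.max(np.abs(y1 - y2)) > tol:
        raise RuntimeError("soft error detected; re-execute or vote")
    return y1

# Usage with any convolution routine, e.g. the conv2d sketch above:
# y = duplicated_conv(lambda img: conv2d(img, kernel), image)
```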
arXiv Detail & Related papers (2020-06-08T23:17:57Z)
- HarDNN: Feature Map Vulnerability Evaluation in CNNs [23.24111155295923]
This paper presents HarDNN, a software-directed approach to identify vulnerable computations during a CNN inference.
We show that HarDNN can accurately estimate relative vulnerability of a feature map (fmap) in CNNs using a statistical error injection campaign.
Results show that the improvement in resilience for the added computation is superlinear with HarDNN.
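A minimal sketch of the statistical error-injection idea: repeatedly corrupt a single element of a candidate feature map and estimate how often the top-1 prediction flips, using that rate as a relative vulnerability score. The tiny stand-in classifier head and error model are illustrative assumptions, not HarDNN's actual procedure.

```python
import numpy as np

def vulnerability(fmap, head, n_trials=200, magnitude=20.0, seed=0):
    """Estimate how often corrupting a single element of this feature map
    flips the top-1 prediction of a (stand-in) classifier head."""
    rng = np.random.default_rng(seed)
    clean = np.argmax(head @ fmap.ravel())
    flips = 0
    for _ in range(n_trials):
        corrupted = fmap.copy()
        idx = tuple(rng.integers(0, s) for s in fmap.shape)
        corrupted[idx] += magnitude * rng.standard_normal()
        flips += np.argmax(head @ corrupted.ravel()) != clean
    return flips / n_trials

# Usage: rank feature maps by estimated vulnerability and protect
# (e.g., duplicate) only the most vulnerable ones.
rng = np.random.default_rng(6)
fmaps = [rng.standard_normal((4, 5, 5)) for _ in range(3)]
head = rng.standard_normal((10, 4 * 5 * 5))
print([round(vulnerability(f, head), 2) for f in fmaps])
```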
arXiv Detail & Related papers (2020-02-22T23:05:03Z)