Making Convolutions Resilient via Algorithm-Based Error Detection
Techniques
- URL: http://arxiv.org/abs/2006.04984v1
- Date: Mon, 8 Jun 2020 23:17:57 GMT
- Title: Making Convolutions Resilient via Algorithm-Based Error Detection
Techniques
- Authors: Siva Kumar Sastry Hari, Michael B. Sullivan, Timothy Tsai, and Stephen
W. Keckler
- Abstract summary: Convolutional Neural Networks (CNNs) accurately process real-time telemetry.
CNNs must execute correctly in the presence of hardware faults.
Full duplication provides the needed assurance but incurs a 100% overhead.
- Score: 2.696566807900575
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The ability of Convolutional Neural Networks (CNNs) to accurately process
real-time telemetry has boosted their use in safety-critical and
high-performance computing systems. As such systems require high levels of
resilience to errors, CNNs must execute correctly in the presence of hardware
faults. Full duplication provides the needed assurance but incurs a prohibitive
100% overhead. Algorithmic techniques are known to offer low-cost solutions,
but the practical feasibility and performance of such techniques have never
been studied for CNN deployment platforms (e.g., TensorFlow or TensorRT on
GPUs). In this paper, we focus on algorithmically verifying Convolutions, which
are the most resource-demanding operations in CNNs. We use checksums to verify
convolutions, adding a small amount of redundancy, far less than
full duplication. We first identify the challenges that arise in employing
Algorithm-Based Error Detection (ABED) for Convolutions in optimized inference
platforms that fuse multiple network layers and use reduced-precision
operations, and demonstrate how to overcome them. We propose and evaluate
variations of ABED techniques that trade off implementation complexity, runtime
overhead, and coverage. Results show that ABED can detect all transient
hardware errors that might otherwise corrupt output, and does so while
incurring low runtime overheads (6-23%), delivering at least 1.6x the
throughput of full duplication.
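The checksum approach exploits the linearity of convolution: the per-pixel sum of all output channels equals a single convolution with the channel-summed filter, so one extra output channel of work can verify all C_out channels at once. Below is a minimal PyTorch sketch of this invariant; the function name, the unfused structure, and the error tolerance are our illustrative assumptions, not the paper's fused, reduced-precision TensorRT implementation.

```python
# Minimal sketch of checksum-based convolution verification (ABED-style).
# Illustrates the linearity invariant only; NOT the paper's fused,
# reduced-precision implementation. The tolerance is an assumption
# chosen to absorb FP32 rounding error.
import torch
import torch.nn.functional as F

def conv2d_with_checksum(x: torch.Tensor, weight: torch.Tensor,
                         atol: float = 1e-3) -> torch.Tensor:
    """Compute conv2d(x, weight) and verify it with one extra channel.

    x:      activations, shape (N, C_in, H, W)
    weight: filters,     shape (C_out, C_in, kH, kW)
    """
    y = F.conv2d(x, weight)                   # the protected computation

    # Checksum filter: sum the filters across the output-channel axis.
    w_sum = weight.sum(dim=0, keepdim=True)   # shape (1, C_in, kH, kW)
    checksum = F.conv2d(x, w_sum)             # one extra "channel" of work

    # Linearity: the per-pixel sum of y's channels must equal the checksum.
    mismatch = (y.sum(dim=1, keepdim=True) - checksum).abs().max()
    if mismatch > atol:
        raise RuntimeError(f"ABED checksum mismatch: {mismatch:.3e}")
    return y

y = conv2d_with_checksum(torch.randn(2, 3, 32, 32), torch.randn(16, 3, 3, 3))
```

The redundant work here is roughly one channel out of C_out, consistent with the small overheads reported in the abstract; the paper's contribution is making such checks work under layer fusion and reduced precision, where rounding makes the comparison threshold nontrivial.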
Related papers
- Global Context Aggregation Network for Lightweight Saliency Detection of Surface Defects [70.48554424894728]
We develop a Global Context Aggregation Network (GCANet) for lightweight saliency detection of surface defects, built on an encoder-decoder structure.
First, we introduce a novel transformer encoder on the top layer of the lightweight backbone, which captures global context information through a Depth-wise Self-Attention (DSA) module.
Experimental results on three public defect datasets demonstrate that the proposed network achieves a better trade-off between accuracy and running efficiency than 17 state-of-the-art methods.
arXiv Detail & Related papers (2023-09-22T06:19:11Z)
- A Generalization of Continuous Relaxation in Structured Pruning [0.3277163122167434]
Trends indicate that deeper, larger neural networks with more parameters achieve higher accuracy than smaller ones.
We generalize structured pruning with algorithms for network augmentation, pruning, sub-network collapse and removal.
The resulting CNN executes efficiently on GPU hardware without computationally expensive sparse matrix operations.
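For readers unfamiliar with the term, the sketch below shows what generic structured (channel) pruning looks like in PyTorch: whole filters are removed, so the result remains a dense convolution that runs on GPUs without sparse kernels. The L1-norm criterion and the prune_conv_channels helper are our illustrative assumptions, not the paper's augmentation/collapse/removal algorithms.

```python
# Generic structured (channel) pruning sketch -- illustrative only.
import torch
import torch.nn as nn

def prune_conv_channels(conv: nn.Conv2d, keep_ratio: float = 0.5) -> nn.Conv2d:
    """Drop the output channels with the smallest L1 filter norms."""
    n_keep = max(1, int(conv.out_channels * keep_ratio))
    # Rank filters by L1 norm over (in_channels, kH, kW).
    norms = conv.weight.detach().abs().sum(dim=(1, 2, 3))
    keep = torch.topk(norms, n_keep).indices.sort().values

    pruned = nn.Conv2d(conv.in_channels, n_keep, conv.kernel_size,
                       stride=conv.stride, padding=conv.padding,
                       bias=conv.bias is not None)
    with torch.no_grad():
        pruned.weight.copy_(conv.weight[keep])
        if conv.bias is not None:
            pruned.bias.copy_(conv.bias[keep])
    return pruned  # still a dense conv: no sparse matrix ops needed

conv = nn.Conv2d(16, 32, 3, padding=1)
small = prune_conv_channels(conv, keep_ratio=0.25)  # 32 -> 8 filters
```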
arXiv Detail & Related papers (2023-08-28T14:19:13Z)
- One-Shot Online Testing of Deep Neural Networks Based on Distribution Shift Detection [0.6091702876917281]
We propose a one-shot testing approach that can test NNs accelerated on memristive crossbars with only one test vector.
Our approach consistently achieves 100% fault coverage across several large topologies.
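As a rough illustration of distribution-shift-based testing (our sketch, not the paper's method for memristive crossbars), one can record a statistic of the network's response to a single stored test vector on fault-free hardware and flag a fault when the deployed response drifts. The model, statistic, and threshold below are all assumptions.

```python
# One-shot test sketch: detect faults as a distribution shift in the
# response to a single stored test vector. The statistic (mean/std of
# the logits) and the tolerance are illustrative choices, ours alone.
import torch
import torch.nn as nn

def output_stats(model: nn.Module, test_vec: torch.Tensor) -> torch.Tensor:
    logits = model(test_vec)
    return torch.stack([logits.mean(), logits.std()])

model = nn.Sequential(nn.Linear(64, 128), nn.ReLU(), nn.Linear(128, 10))
test_vec = torch.randn(1, 64)            # the single stored test vector
golden = output_stats(model, test_vec)   # recorded on a fault-free run

def one_shot_test(model, test_vec, golden, tol=0.05):
    """Return True if the deployed model passes the one-shot check."""
    shift = (output_stats(model, test_vec) - golden).abs().max()
    return bool(shift <= tol)

# Simulate a stuck-at fault in one weight, then re-run the check.
with torch.no_grad():
    model[0].weight[0, 0] = 50.0          # injected fault
print(one_shot_test(model, test_vec, golden))  # likely False after the fault
```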
arXiv Detail & Related papers (2023-05-16T11:06:09Z)
- QVIP: An ILP-based Formal Verification Approach for Quantized Neural Networks [14.766917269393865]
Quantization has emerged as a promising technique to reduce the size of neural networks while retaining accuracy comparable to their floating-point counterparts.
We propose a novel and efficient formal verification approach for QNNs.
In particular, we are the first to propose an encoding that reduces the verification problem of QNNs to solving integer linear constraints.
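To make the reduction concrete, here is a toy big-M encoding of a single quantized ReLU neuron as integer linear constraints, written with the open-source pulp modeler. Both the encoding and the choice of solver are our illustrative assumptions, not QVIP's actual construction.

```python
# Toy big-M ILP encoding (our illustration, not QVIP's encoding) of one
# quantized ReLU neuron y = max(0, w.x + b) over 8-bit integer inputs.
import pulp

w, b = [3, -2], 5          # fixed integer (quantized) weights and bias
M = 10**6                  # big-M constant dominating any pre-activation

prob = pulp.LpProblem("qnn_neuron", pulp.LpMaximize)
xs = [pulp.LpVariable(f"x{i}", lowBound=0, upBound=255, cat="Integer")
      for i in range(len(w))]
y = pulp.LpVariable("y", lowBound=0, cat="Integer")
d = pulp.LpVariable("d", cat="Binary")     # 1 iff the ReLU is "active"

z = pulp.lpSum(wi * xi for wi, xi in zip(w, xs)) + b   # pre-activation
prob += y >= z                  # y is at least z
prob += y <= z + M * (1 - d)    # if d == 1, force y <= z (so y == z)
prob += y <= M * d              # if d == 0, force y == 0
prob += y                       # objective: maximize the neuron's output

prob.solve(pulp.PULP_CBC_CMD(msg=0))
print(pulp.LpStatus[prob.status], pulp.value(y))   # expect: Optimal 770.0
```

Maximizing y over all 8-bit inputs is a verification-style query: the solver returns 770 (attained at x = (255, 0)), proving a bound on the neuron's output.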
arXiv Detail & Related papers (2022-12-10T03:00:29Z)
- AEGNN: Asynchronous Event-based Graph Neural Networks [54.528926463775946]
Event-based Graph Neural Networks generalize standard GNNs to process events as "evolving" spatio-temporal graphs.
AEGNNs are easily trained on synchronous inputs and can be converted to efficient, "asynchronous" networks at test time.
arXiv Detail & Related papers (2022-03-31T16:21:12Z)
- Enabling certification of verification-agnostic networks via memory-efficient semidefinite programming [97.40955121478716]
We propose a first-order dual SDP algorithm that requires memory only linear in the total number of network activations.
We significantly improve L-inf verified robust accuracy from 1% to 88% and from 6% to 40%, respectively.
We also demonstrate tight verification of a quadratic stability specification for the decoder of a variational autoencoder.
arXiv Detail & Related papers (2020-10-22T12:32:29Z)
- FATNN: Fast and Accurate Ternary Neural Networks [89.07796377047619]
Ternary Neural Networks (TNNs) have received much attention because they are potentially orders of magnitude faster at inference, and far more power efficient, than full-precision counterparts.
In this work, we show that, under some mild constraints, the computational complexity of the ternary inner product can be reduced by a factor of 2.
We also design an implementation-dependent ternary quantization algorithm to mitigate the performance gap.
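For background on why ternary inner products map to cheap bitwise hardware, the sketch below shows the standard two-bitmask formulation (not the paper's factor-of-2 optimization): each ternary vector is packed into positive and negative bitmasks, and the inner product is evaluated with AND and popcount.

```python
# Background sketch: ternary ({-1, 0, +1}) inner product via two bitmasks
# and popcounts. Standard bitwise formulation, not the paper's specific
# factor-of-2 trick. Uses int.bit_count(), available in Python 3.10+.
def pack_ternary(vals):
    """Encode a ternary vector as (positive-bits, negative-bits) masks."""
    pos = neg = 0
    for i, v in enumerate(vals):
        if v == 1:
            pos |= 1 << i
        elif v == -1:
            neg |= 1 << i
    return pos, neg

def ternary_dot(a, b):
    """Inner product of two packed ternary vectors using AND + popcount."""
    a_pos, a_neg = a
    b_pos, b_neg = b
    agree = (a_pos & b_pos) | (a_neg & b_neg)     # products equal to +1
    differ = (a_pos & b_neg) | (a_neg & b_pos)    # products equal to -1
    return agree.bit_count() - differ.bit_count()

x = [1, 0, -1, 1, -1, 0, 1, 1]
w = [1, -1, -1, 0, 1, 1, -1, 1]
assert ternary_dot(pack_ternary(x), pack_ternary(w)) == \
       sum(xi * wi for xi, wi in zip(x, w))
```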
arXiv Detail & Related papers (2020-08-12T04:26:18Z)
- AQD: Towards Accurate Fully-Quantized Object Detection [94.06347866374927]
We propose an Accurate Quantized object Detection solution, termed AQD, that eliminates floating-point computation entirely.
Our AQD achieves comparable or even better performance compared with the full-precision counterpart under extremely low-bit schemes.
arXiv Detail & Related papers (2020-07-14T09:07:29Z)
- Efficient Integer-Arithmetic-Only Convolutional Neural Networks [87.01739569518513]
We replace the conventional ReLU with a Bounded ReLU, having found that the accuracy decline is caused by activation quantization.
Our integer networks match the performance of the corresponding floating-point networks while using only 1/4 of the memory and running 2x faster on modern GPUs.
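The role of the Bounded ReLU can be seen in a short sketch (ours; the bound B = 6 and the 8-bit width are illustrative choices, not the paper's exact recipe): clipping activations to a fixed range [0, B] makes the quantization scale a constant, which is what lets activations be mapped to integers.

```python
# Sketch of Bounded ReLU + uniform activation quantization. The bound B
# and the 8-bit width are illustrative assumptions.
import torch

def bounded_relu(x: torch.Tensor, B: float = 6.0) -> torch.Tensor:
    """ReLU clipped to [0, B], giving activations a fixed dynamic range."""
    return x.clamp(min=0.0, max=B)

def quantize_activation(x: torch.Tensor, B: float = 6.0, bits: int = 8):
    """Map [0, B] activations to integers in [0, 2^bits - 1]."""
    scale = (2 ** bits - 1) / B           # constant scale: B is fixed
    q = torch.round(bounded_relu(x, B) * scale).to(torch.uint8)
    return q, scale                       # dequantize later with q / scale

x = torch.randn(4) * 4
q, scale = quantize_activation(x)
x_hat = q.float() / scale                 # reconstruction for comparison
print(x, q, x_hat)
```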
arXiv Detail & Related papers (2020-06-21T08:23:03Z)
- FT-CNN: Algorithm-Based Fault Tolerance for Convolutional Neural Networks [13.100954947774163]
Convolutional neural networks (CNNs) are becoming increasingly important for solving challenging and critical problems in many fields.
CNN inference applications have been deployed in safety-critical systems, which may suffer from soft errors caused by high-energy particles, high temperature, or abnormal voltage.
Traditional fault-tolerance methods are not suitable for CNN inference because error-correcting codes cannot protect computational components.
arXiv Detail & Related papers (2020-03-27T02:01:54Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and accepts no responsibility for any consequences of its use.