One-Shot Online Testing of Deep Neural Networks Based on Distribution
Shift Detection
- URL: http://arxiv.org/abs/2305.09348v1
- Date: Tue, 16 May 2023 11:06:09 GMT
- Title: One-Shot Online Testing of Deep Neural Networks Based on Distribution
Shift Detection
- Authors: Soyed Tuhin Ahmed, Mehdi B. Tahoori
- Abstract summary: We propose a \emph{one-shot} testing approach that can test NNs accelerated on memristive crossbars with only one test vector.
Our approach can consistently achieve $100\%$ fault coverage across several large topologies.
- Score: 0.6091702876917281
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Neural networks (NNs) are capable of learning complex patterns and
relationships in data to make predictions with high accuracy, making them
useful for various tasks. However, NNs are both computation-intensive and
memory-intensive methods, making them challenging for edge applications. To
accelerate the most common operations (matrix-vector multiplication) in NNs,
hardware accelerator architectures such as computation-in-memory (CiM) with
non-volatile memristive crossbars are utilized. Although they offer benefits
such as power efficiency, parallelism, and nonvolatility, they suffer from
various faults and variations, both during manufacturing and lifetime
operations. This can lead to faulty computations and, in turn, degradation of
post-mapping inference accuracy, which is unacceptable for many applications,
including safety-critical applications. Therefore, proper testing of NN
hardware accelerators is required. In this paper, we propose a \emph{one-shot}
testing approach that can test NNs accelerated on memristive crossbars with
only one test vector, making it very suitable for online testing applications.
Our approach can consistently achieve $100\%$ fault coverage across several
large topologies with up to $201$ layers and challenging tasks like semantic
segmentation. Moreover, compared to existing methods, the fault coverage is
improved by up to $24\%$, the memory overhead is only $0.0123$ MB, a reduction
of up to $19980\times$, and the number of test vectors is reduced by
$10000\times$.
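The abstract does not spell out the detection mechanics, but the core idea of applying a single test vector to a crossbar-mapped layer and flagging a shift in the response statistics relative to a fault-free reference can be illustrated with a minimal sketch. The test vector, fault model, statistic, and threshold below are illustrative assumptions, not the authors' actual construction.

```python
import numpy as np

rng = np.random.default_rng(0)

def crossbar_mvm(conductances, x):
    """Idealized model of the analog matrix-vector multiply performed by one
    memristive crossbar (no noise or quantization modeled)."""
    return conductances @ x

def one_shot_test(conductances, test_vector, golden_mu, golden_sigma, k=3.0):
    """Apply a single test vector and flag a distribution shift when the mean
    of the crossbar response drifts more than k golden standard deviations."""
    y = crossbar_mvm(conductances, test_vector)
    return abs(y.mean() - golden_mu) > k * golden_sigma   # True -> suspected fault

# Illustrative usage on random data (not the paper's benchmark networks).
W = rng.normal(size=(64, 128))            # programmed weights / conductances
t = np.ones(128)                          # the single test vector (assumed)
golden = crossbar_mvm(W, t)               # fault-free reference response
mu, sigma = golden.mean(), golden.std()   # reference statistics stored on-chip

faulty = W.copy()
stuck = rng.random(W.shape) < 0.05        # 5% of cells faulty (assumed rate)
faulty[stuck] = 10.0                      # assumed stuck-on conductance value

print(one_shot_test(W, t, mu, sigma))       # False: no shift on the healthy array
print(one_shot_test(faulty, t, mu, sigma))  # True: stuck cells shift the response
```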
Related papers
- BasisN: Reprogramming-Free RRAM-Based In-Memory-Computing by Basis Combination for Deep Neural Networks [9.170451418330696]
We propose BasisN framework to accelerate deep neural networks (DNNs) on any number of crossbars without reprogramming.
We show that the cycles per inference and the energy-delay product are reduced to below 1% of those required when reprogramming the crossbars.
arXiv Detail & Related papers (2024-07-04T08:47:05Z)
- Few-Shot Testing: Estimating Uncertainty of Memristive Deep Neural Networks Using One Bayesian Test Vector [0.0]
We propose a test vector generation framework that can estimate the model uncertainty of NNs implemented on memristor-based CIM hardware.
Our method is evaluated on different model dimensions, tasks, fault rates, and variation noise to show that it can consistently achieve $100\%$ coverage with only $0.024$ MB of memory overhead.
arXiv Detail & Related papers (2024-05-29T08:53:16Z)
- Concurrent Self-testing of Neural Networks Using Uncertainty Fingerprint [0.32634122554914]
We propose a dual-head NN topology specifically designed to produce uncertainty fingerprints and the primary prediction of the NN in \emph{a single shot}.
Compared to existing works, memory overhead is reduced by up to $243.7$ MB, multiply-and-accumulate (MAC) operations are reduced by up to $10000\times$, and false-positive rates are reduced by up to $89\%$.
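A dual-head topology that emits the primary prediction and an auxiliary uncertainty signal in a single forward pass can be sketched as follows; the layer sizes, fingerprint dimension, and deviation check are assumptions made for illustration and do not reproduce the paper's fingerprint definition or training objective.

```python
import torch
import torch.nn as nn

class DualHeadNet(nn.Module):
    """Sketch of a dual-head topology: a shared backbone feeds the primary
    prediction head and an auxiliary head whose output can act as an
    uncertainty fingerprint. All layer sizes are illustrative assumptions."""

    def __init__(self, in_dim=784, hidden=256, num_classes=10, fp_dim=16):
        super().__init__()
        self.backbone = nn.Sequential(
            nn.Linear(in_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
        )
        self.prediction_head = nn.Linear(hidden, num_classes)
        self.fingerprint_head = nn.Linear(hidden, fp_dim)

    def forward(self, x):
        h = self.backbone(x)
        return self.prediction_head(h), self.fingerprint_head(h)

# One forward pass yields both outputs; comparing the fingerprint against a
# reference captured on known-good hardware would flag faults concurrently
# with normal inference.
model = DualHeadNet()
logits, fingerprint = model(torch.randn(1, 784))
reference = fingerprint.detach()                   # stored at deployment time
deviation = torch.norm(fingerprint - reference)    # zero here; grows under faults
```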
arXiv Detail & Related papers (2024-01-02T23:05:07Z)
- Communication-Efficient Adam-Type Algorithms for Distributed Data Mining [93.50424502011626]
We propose a class of novel distributed Adam-type algorithms (\emph{i.e.}, SketchedAMSGrad) utilizing sketching.
Our new algorithm achieves a fast convergence rate of $O(\frac{1}{\sqrt{nT}} + \frac{1}{(k/d)^2 T})$ with the communication cost of $O(k\log(d))$ at each iteration.
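A rough sketch of the two ingredients, count-sketch compression of the gradient (what a worker would communicate) and an AMSGrad-style update from the de-sketched estimate, is shown below; the hashing setup, dimensions, and hyperparameters are assumptions, and this is not the authors' exact SketchedAMSGrad algorithm.

```python
import numpy as np

rng = np.random.default_rng(0)
d, k = 1000, 100                            # parameter and sketch sizes (assumed)
bucket = rng.integers(0, k, size=d)         # shared hash: coordinate -> bucket
sign = rng.choice([-1.0, 1.0], size=d)      # shared random signs

def sketch(g):
    """Count-sketch compression of a d-dim gradient into k buckets
    (this k-dim vector is what a worker would communicate)."""
    s = np.zeros(k)
    np.add.at(s, bucket, sign * g)
    return s

def desketch(s):
    """Unbiased estimate of the original gradient recovered from the sketch."""
    return sign * s[bucket]

def amsgrad_step(theta, g, m, v, v_hat, lr=1e-3, b1=0.9, b2=0.999, eps=1e-8):
    """One AMSGrad update driven by the (de)sketched gradient estimate."""
    m = b1 * m + (1 - b1) * g
    v = b2 * v + (1 - b2) * g ** 2
    v_hat = np.maximum(v_hat, v)
    theta = theta - lr * m / (np.sqrt(v_hat) + eps)
    return theta, m, v, v_hat

# Illustrative single step: compress, (pretend to) communicate, decompress, update.
theta = np.zeros(d)
m, v, v_hat = np.zeros(d), np.zeros(d), np.zeros(d)
g = rng.normal(size=d)                      # a worker's local gradient
g_hat = desketch(sketch(g))                 # estimate after communication
theta, m, v, v_hat = amsgrad_step(theta, g_hat, m, v, v_hat)
```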
arXiv Detail & Related papers (2022-10-14T01:42:05Z)
- Automatic Mapping of the Best-Suited DNN Pruning Schemes for Real-Time Mobile Acceleration [71.80326738527734]
We propose a general, fine-grained structured pruning scheme and corresponding compiler optimizations.
We show that our pruning scheme mapping methods, together with the general fine-grained structured pruning scheme, outperform the state-of-the-art DNN optimization framework.
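One common flavor of fine-grained structured pruning is block-based magnitude pruning, sketched below; the block shape, sparsity level, and ranking criterion are illustrative assumptions, and the paper's scheme selection and compiler mapping are not reproduced here.

```python
import numpy as np

def block_prune(weight, block=(4, 1), sparsity=0.5):
    """Block-based magnitude pruning: rank fixed-size blocks of the weight
    matrix by L1 norm and zero out the weakest fraction. Block shape and
    sparsity level are illustrative assumptions."""
    rows, cols = weight.shape
    br, bc = block
    assert rows % br == 0 and cols % bc == 0, "block must tile the matrix"
    blocks = weight.reshape(rows // br, br, cols // bc, bc)
    scores = np.abs(blocks).sum(axis=(1, 3))              # one score per block
    threshold = np.quantile(scores, sparsity)
    mask = (scores >= threshold)[:, None, :, None]        # keep the strong blocks
    return (blocks * mask).reshape(rows, cols)

W = np.random.default_rng(0).normal(size=(8, 8))
W_pruned = block_prune(W, block=(4, 1), sparsity=0.5)     # ~half of the blocks zeroed
```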
arXiv Detail & Related papers (2021-11-22T23:53:14Z)
- Quantized Neural Networks via {-1, +1} Encoding Decomposition and Acceleration [83.84684675841167]
We propose a novel encoding scheme using {-1, +1} to decompose quantized neural networks (QNNs) into multi-branch binary networks.
We validate the effectiveness of our method on large-scale image classification, object detection, and semantic segmentation tasks.
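The decomposition can be illustrated with bit-plane re-encoding: an M-bit unsigned quantized weight matrix is rewritten as a scaled sum of {-1, +1} branch matrices plus a constant correction, so each branch only performs a binary matrix-vector product. The quantization grid and correction term below are assumptions chosen to keep the identity exact; the paper's exact encoding may differ.

```python
import numpy as np

def decompose_pm1(q, num_bits):
    """Rewrite unsigned integer weights q in [0, 2^M - 1] as M binary branches
    B_i with entries in {-1, +1}:  q = (sum_i 2^i * B_i + (2^M - 1)) / 2."""
    bit_planes = [(q >> i) & 1 for i in range(num_bits)]   # 0/1 bit planes
    return [2.0 * b - 1.0 for b in bit_planes]             # map {0, 1} -> {-1, +1}

rng = np.random.default_rng(0)
M = 4
Q = rng.integers(0, 2 ** M, size=(8, 16))                  # quantized weight matrix
x = rng.normal(size=16)

branches = decompose_pm1(Q, M)
scales = 2.0 ** np.arange(M)
# Multi-branch evaluation: one cheap {-1, +1} matrix-vector product per branch,
# followed by a constant correction that depends only on sum(x).
y = (sum(s * (B @ x) for s, B in zip(scales, branches)) + (2 ** M - 1) * x.sum()) / 2.0
assert np.allclose(y, Q @ x)                                # matches the original QNN layer
```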
arXiv Detail & Related papers (2021-06-18T03:11:15Z)
- Encoding the latent posterior of Bayesian Neural Networks for uncertainty quantification [10.727102755903616]
We aim for efficient deep BNNs amenable to complex computer vision architectures.
We achieve this by leveraging variational autoencoders (VAEs) to learn the interaction and the latent distribution of the parameters at each network layer.
Our approach, Latent-Posterior BNN (LP-BNN), is compatible with the recent BatchEnsemble method, leading to highly efficient (in terms of computation and memory during both training and testing) ensembles.
arXiv Detail & Related papers (2020-12-04T19:50:09Z)
- Enabling certification of verification-agnostic networks via memory-efficient semidefinite programming [97.40955121478716]
We propose a first-order dual SDP algorithm that requires memory only linear in the total number of network activations.
We significantly improve $\ell_\infty$ verified robust accuracy from 1% to 88% and from 6% to 40%, respectively.
We also demonstrate tight verification of a quadratic stability specification for the decoder of a variational autoencoder.
arXiv Detail & Related papers (2020-10-22T12:32:29Z)
- AQD: Towards Accurate Fully-Quantized Object Detection [94.06347866374927]
We propose an Accurate Quantized object Detection solution, termed AQD, to get rid of floating-point computation.
Our AQD achieves comparable or even better performance compared with the full-precision counterpart under extremely low-bit schemes.
arXiv Detail & Related papers (2020-07-14T09:07:29Z)
- Making Convolutions Resilient via Algorithm-Based Error Detection Techniques [2.696566807900575]
Convolutional Neural Networks (CNNs) accurately process real-time telemetry.
CNNs must execute correctly in the presence of hardware faults.
Full duplication provides the needed assurance but incurs a 100% overhead.
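Algorithm-based error detection for a convolution can exploit linearity: the per-pixel sum of all output channels must equal one extra "checksum" convolution with the channel-summed filter, which is far cheaper than full duplication. The sketch below is a simplified stand-in for the paper's technique; the tolerance and fault handling are assumptions.

```python
import torch
import torch.nn.functional as F

def checksum_conv2d(x, weight, bias=None, atol=1e-3):
    """Checksum-style error detection for a convolution: by linearity, the
    per-pixel sum over output channels must equal one extra convolution with
    the channel-summed filter (and summed bias). A mismatch flags a fault at a
    fraction of the cost of full duplication."""
    y = F.conv2d(x, weight, bias)                          # the protected computation
    w_sum = weight.sum(dim=0, keepdim=True)                # (1, C_in, kH, kW)
    b_sum = None if bias is None else bias.sum().view(1)
    checksum = F.conv2d(x, w_sum, b_sum)                   # (N, 1, H', W')
    ok = torch.allclose(y.sum(dim=1, keepdim=True), checksum, atol=atol)
    return y, ok                                           # ok == False -> suspected error

x = torch.randn(1, 3, 32, 32)
w = torch.randn(8, 3, 3, 3)
b = torch.randn(8)
_, ok = checksum_conv2d(x, w, b)
print(ok)   # True when the convolution executed correctly
```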
arXiv Detail & Related papers (2020-06-08T23:17:57Z)
- Widening and Squeezing: Towards Accurate and Efficient QNNs [125.172220129257]
Quantization neural networks (QNNs) are very attractive to the industry because of their extremely cheap computation and storage overhead, but their performance is still worse than that of networks with full-precision parameters.
Most existing methods aim to enhance the performance of QNNs, especially binary neural networks, by exploiting more effective training techniques.
We address this problem by projecting features in original full-precision networks to high-dimensional quantization features.
arXiv Detail & Related papers (2020-02-03T04:11:13Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.