Computing-In-Memory Neural Network Accelerators for Safety-Critical
Systems: Can Small Device Variations Be Disastrous?
- URL: http://arxiv.org/abs/2207.07626v1
- Date: Fri, 15 Jul 2022 17:38:01 GMT
- Title: Computing-In-Memory Neural Network Accelerators for Safety-Critical
Systems: Can Small Device Variations Be Disastrous?
- Authors: Zheyu Yan, Xiaobo Sharon Hu, Yiyu Shi
- Abstract summary: NVM devices suffer from various non-idealities, especially device-to-device variations due to fabrication defects and cycle-to-cycle variations due to the stochastic behavior of devices.
We propose a method to effectively find the specific combination of device variations in the high-dimensional space that leads to the worst-case performance.
- Score: 15.760502065894778
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Computing-in-Memory (CiM) architectures based on emerging non-volatile memory
(NVM) devices have demonstrated great potential for deep neural network (DNN)
acceleration thanks to their high energy efficiency. However, NVM devices
suffer from various non-idealities, especially device-to-device variations due
to fabrication defects and cycle-to-cycle variations due to the stochastic
behavior of devices. As such, the DNN weights actually mapped to NVM devices
could deviate significantly from the expected values, leading to large
performance degradation. To address this issue, most existing works focus on
maximizing average performance under device variations. This objective would
work well for general-purpose scenarios. But for safety-critical applications,
the worst-case performance must also be considered. Unfortunately, this has
been rarely explored in the literature. In this work, we formulate the problem
of determining the worst-case performance of CiM DNN accelerators under the
impact of device variations. We further propose a method to effectively find
the specific combination of device variations in the high-dimensional space that
leads to the worst-case performance. We find that even with very small device
variations, the accuracy of a DNN can drop drastically, causing concerns when
deploying CiM accelerators in safety-critical applications. Finally, we show
that, surprisingly, none of the existing methods used to enhance average DNN
performance in CiM accelerators are very effective when extended to enhance the
worst-case performance, and further research down the road is needed to address
this problem.
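At a high level, the worst-case search described in the abstract can be viewed as an adversarial optimization over weight perturbations. The sketch below is a minimal illustration under an assumed model of i.i.d. per-weight deviations clipped at three standard deviations; the paper's actual formulation, bounds, and optimizer may differ, and all names and constants here are illustrative.

```python
import torch
import torch.nn as nn
from torch.func import functional_call  # PyTorch >= 2.0

def find_worst_case_weights(model, x, y, sigma=0.03, steps=20, lr=0.01):
    """PGD-style search for a per-weight deviation, bounded to +/- 3*sigma
    (an assumed device-variation model), that maximizes the loss on (x, y)."""
    loss_fn = nn.CrossEntropyLoss()
    base = {n: p.detach() for n, p in model.named_parameters()}
    # One deviation tensor per weight tensor, optimized to *increase* the loss.
    delta = {n: torch.zeros_like(p, requires_grad=True) for n, p in base.items()}
    bound = 3.0 * sigma

    for _ in range(steps):
        perturbed = {n: base[n] + delta[n] for n in base}
        loss = loss_fn(functional_call(model, perturbed, (x,)), y)
        grads = torch.autograd.grad(loss, list(delta.values()))
        with torch.no_grad():
            for d, g in zip(delta.values(), grads):
                d += lr * g.sign()         # ascend the loss
                d.clamp_(-bound, bound)    # stay inside the variation range
    return {n: (base[n] + delta[n]).detach() for n in base}

# Example: worst = find_worst_case_weights(model, x_batch, y_batch)
# Loading `worst` into a copy of the model gives one estimate of worst-case accuracy.
```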
Related papers
- TSB: Tiny Shared Block for Efficient DNN Deployment on NVCIM Accelerators [11.496631244103773]
"Tiny Shared Block (TSB)" integrates a small shared 1x1 convolution block into the Deep Neural Network architecture.
TSB achieves an over 20x improvement in the inference accuracy gap, an over 5x training speedup, and a reduction in weights-to-device mapping cost.
arXiv Detail & Related papers (2024-05-08T20:53:38Z)
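For the TSB entry above, the core architectural idea, a small 1x1 convolution block whose weights are shared across the network, might look roughly like the toy model below. The placement of the shared block and all layer sizes are assumptions for illustration, not the paper's exact design.

```python
import torch
import torch.nn as nn

class SharedBlockCNN(nn.Module):
    """Toy CNN in which a single small 1x1 convolution block is reused after
    every backbone stage (an assumed placement), so only this small tensor
    needs an accurate or retrainable mapping onto the NVM devices."""
    def __init__(self, channels=32, num_classes=10):
        super().__init__()
        self.stem = nn.Conv2d(3, channels, 3, padding=1)
        self.stages = nn.ModuleList(
            nn.Conv2d(channels, channels, 3, padding=1) for _ in range(3)
        )
        # The tiny shared block: one 1x1 conv reused at every stage.
        self.shared_block = nn.Sequential(nn.Conv2d(channels, channels, 1), nn.ReLU())
        self.head = nn.Linear(channels, num_classes)

    def forward(self, x):
        x = torch.relu(self.stem(x))
        for stage in self.stages:
            x = torch.relu(stage(x))
            x = self.shared_block(x)   # same parameters applied at every stage
        x = x.mean(dim=(2, 3))         # global average pooling
        return self.head(x)

# model = SharedBlockCNN(); logits = model(torch.randn(2, 3, 32, 32))
```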
- Compute-in-Memory based Neural Network Accelerators for Safety-Critical Systems: Worst-Case Scenarios and Protections [8.813981342105151]
We study the problem of pinpointing the worst-case performance of CiM accelerators affected by device variations.
We propose a novel worst-case-aware training technique named A-TRICE that efficiently combines adversarial training and noise-injection training.
Our experimental results demonstrate that A-TRICE improves the worst-case accuracy under device variations by up to 33%.
arXiv Detail & Related papers (2023-12-11T05:56:00Z)
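The A-TRICE entry above combines adversarial training with noise-injection training. The sketch below shows one training step containing those two generic ingredients; the actual A-TRICE schedule, noise model, and loss terms are more involved, and the names and constants here are illustrative.

```python
import torch
import torch.nn as nn

def trice_like_step(model, x, y, optimizer, eps=0.01, weight_sigma=0.02):
    """One step combining (i) an FGSM-style adversarial example and
    (ii) Gaussian weight noise emulating device variations."""
    loss_fn = nn.CrossEntropyLoss()

    # (i) adversarial input
    x_adv = x.clone().detach().requires_grad_(True)
    loss = loss_fn(model(x_adv), y)
    grad = torch.autograd.grad(loss, x_adv)[0]
    x_adv = (x_adv + eps * grad.sign()).detach()

    # (ii) temporary weight noise
    noise = []
    with torch.no_grad():
        for p in model.parameters():
            n = torch.randn_like(p) * weight_sigma
            p.add_(n)
            noise.append(n)

    optimizer.zero_grad()
    loss = loss_fn(model(x_adv), y)
    loss.backward()

    # remove the injected noise before updating the clean weights
    with torch.no_grad():
        for p, n in zip(model.parameters(), noise):
            p.sub_(n)
    optimizer.step()
    return loss.item()
```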
- Scaling #DNN-Verification Tools with Efficient Bound Propagation and Parallel Computing [57.49021927832259]
Deep Neural Networks (DNNs) are powerful tools that have shown extraordinary results in many scenarios.
However, their intricate designs and lack of transparency raise safety concerns when applied in real-world applications.
Formal Verification (FV) of DNNs has emerged as a valuable solution for providing provable safety guarantees.
arXiv Detail & Related papers (2023-12-10T13:51:25Z)
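As an illustration of the bound-propagation idea behind the #DNN-Verification entry above, the sketch below pushes an L-infinity input box through a stack of linear and ReLU layers with plain interval arithmetic. Real verification tools use much tighter relaxations plus branch-and-bound and parallel search; this is only the basic building block, with illustrative names.

```python
import torch
import torch.nn as nn

def interval_bound_propagation(layers, x, eps):
    """Propagate the input box [x - eps, x + eps] through nn.Linear / nn.ReLU
    layers and return elementwise lower and upper bounds on the output."""
    lo, hi = x - eps, x + eps
    for layer in layers:
        if isinstance(layer, nn.Linear):
            w, b = layer.weight, layer.bias
            mid, rad = (lo + hi) / 2, (hi - lo) / 2
            new_mid = mid @ w.t() + b
            new_rad = rad @ w.abs().t()
            lo, hi = new_mid - new_rad, new_mid + new_rad
        elif isinstance(layer, nn.ReLU):
            lo, hi = lo.clamp(min=0), hi.clamp(min=0)
    return lo, hi

# net = nn.Sequential(nn.Linear(4, 8), nn.ReLU(), nn.Linear(8, 2))
# lb, ub = interval_bound_propagation(list(net), torch.zeros(1, 4), eps=0.1)
```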
- A Multi-Head Ensemble Multi-Task Learning Approach for Dynamical Computation Offloading [62.34538208323411]
We propose a multi-head ensemble multi-task learning (MEMTL) approach with a shared backbone and multiple prediction heads (PHs).
MEMTL outperforms benchmark methods in both the inference accuracy and mean square error without requiring additional training data.
arXiv Detail & Related papers (2023-09-02T11:01:16Z)
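The MEMTL entry above centers on a shared backbone with multiple prediction heads whose outputs form an ensemble. A minimal sketch of that structure is below; the dimensions, number of heads, and plain output averaging are illustrative assumptions rather than the paper's configuration.

```python
import torch
import torch.nn as nn

class MultiHeadEnsemble(nn.Module):
    """Shared backbone with several prediction heads averaged at inference."""
    def __init__(self, in_dim=16, hidden=64, out_dim=4, num_heads=3):
        super().__init__()
        self.backbone = nn.Sequential(
            nn.Linear(in_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
        )
        self.heads = nn.ModuleList(nn.Linear(hidden, out_dim) for _ in range(num_heads))

    def forward(self, x):
        z = self.backbone(x)
        # Ensemble by averaging the heads' predictions.
        return torch.stack([h(z) for h in self.heads], dim=0).mean(dim=0)

# model = MultiHeadEnsemble(); y = model(torch.randn(8, 16))
```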
- Improving Realistic Worst-Case Performance of NVCiM DNN Accelerators through Training with Right-Censored Gaussian Noise [16.470952550714394]
We propose to use the k-th percentile performance (KPP) to capture the realistic worst-case performance of DNN models executing on CiM accelerators.
Our method achieves up to a 26% improvement in KPP compared to the state-of-the-art methods employed to enhance robustness under the impact of device variations.
arXiv Detail & Related papers (2023-07-29T01:06:37Z)
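For the right-censored-Gaussian-noise entry above, the k-th percentile performance (KPP) metric itself can be estimated by Monte-Carlo sampling of weight noise, as sketched below. The training side of that paper (injecting Gaussian noise with its right tail clipped) is not shown; the noise scale, trial count, and the `evaluate` helper are assumptions.

```python
import torch

def kth_percentile_accuracy(model, loader, evaluate, sigma=0.03, k=1, trials=200):
    """Monte-Carlo estimate of the k-th percentile accuracy of a model under
    Gaussian weight noise of scale sigma.  `evaluate(model, loader)` is
    assumed to return a scalar accuracy."""
    base = [p.detach().clone() for p in model.parameters()]
    accs = []
    for _ in range(trials):
        with torch.no_grad():
            for p, w0 in zip(model.parameters(), base):
                p.copy_(w0 + torch.randn_like(w0) * sigma)
        accs.append(evaluate(model, loader))
    with torch.no_grad():
        for p, w0 in zip(model.parameters(), base):
            p.copy_(w0)  # restore the clean weights
    return torch.quantile(torch.tensor(accs), k / 100.0).item()
```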
- Special Session: Approximation and Fault Resiliency of DNN Accelerators [0.9126382223122612]
This paper explores the approximation and fault resiliency of Deep Neural Network accelerators.
We propose to use approximate (AxC) arithmetic circuits to emulate errors in hardware without performing fault injection on the DNN.
We also propose a fine-grained analysis of fault resiliency by examining fault propagation and masking in networks.
arXiv Detail & Related papers (2023-05-31T19:27:45Z)
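The Special Session entry above emulates hardware errors with approximate arithmetic circuits, which cannot be reproduced directly in a short Python snippet. As a loose stand-in, the sketch below shows a conventional single-bit-flip fault injection into one float32 weight, the kind of perturbation whose propagation and masking such resiliency analyses study. The helper name and sampling choices are assumptions.

```python
import torch

def flip_random_weight_bit(model):
    """Flip one random bit (bits 0-30; the sign bit is skipped for simplicity)
    of one randomly chosen float32 parameter, emulating a single-event upset."""
    params = [p for p in model.parameters() if p.dtype == torch.float32]
    p = params[torch.randint(len(params), (1,)).item()]
    idx = torch.randint(p.numel(), (1,)).item()
    bit = torch.randint(31, (1,)).item()
    with torch.no_grad():
        flat = p.view(-1)
        word = flat[idx:idx + 1].view(torch.int32)              # reinterpret bits
        flat[idx:idx + 1] = (word ^ (1 << bit)).view(torch.float32)
    return idx, bit
```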
- Negative Feedback Training: A Novel Concept to Improve Robustness of NVCIM DNN Accelerators [11.832487701641723]
Non-volatile memory (NVM) devices excel in energy efficiency and latency when performing Deep Neural Network (DNN) inference.
We propose a novel training concept, Negative Feedback Training (NFT), which leverages multi-scale noisy information captured from the network.
Our methods outperform existing state-of-the-art methods with up to a 46.71% improvement in inference accuracy.
arXiv Detail & Related papers (2023-05-23T22:56:26Z)
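The summary of the Negative Feedback Training entry above gives few details, so the sketch below shows only one plausible reading of the concept: running clean and noisy forward passes at several noise scales and penalizing their divergence so that noise-induced drift is fed back into the loss. Treat it as an assumption-laden illustration, not the paper's actual NFT formulation.

```python
import torch
import torch.nn.functional as F
from torch.func import functional_call  # PyTorch >= 2.0

def nft_like_loss(model, x, y, sigmas=(0.01, 0.02, 0.04), beta=1.0):
    """Cross-entropy on the clean forward pass plus a feedback term that
    penalizes divergence of noisy forward passes from the clean one."""
    params = dict(model.named_parameters())
    clean_logits = model(x)
    loss = F.cross_entropy(clean_logits, y)
    for sigma in sigmas:
        noisy = {n: p + torch.randn_like(p) * sigma for n, p in params.items()}
        noisy_logits = functional_call(model, noisy, (x,))
        loss = loss + beta * F.kl_div(
            F.log_softmax(noisy_logits, dim=1),
            F.softmax(clean_logits.detach(), dim=1),
            reduction="batchmean",
        )
    return loss

# loss = nft_like_loss(model, x_batch, y_batch); loss.backward(); optimizer.step()
```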
- Fault-Aware Design and Training to Enhance DNNs Reliability with Zero-Overhead [67.87678914831477]
Deep Neural Networks (DNNs) enable a wide range of technological advancements.
Recent findings indicate that transient hardware faults may dramatically corrupt the model's predictions.
In this work, we propose to tackle the reliability issue both at training and model design time.
arXiv Detail & Related papers (2022-05-28T13:09:30Z)
- FPGA-optimized Hardware acceleration for Spiking Neural Networks [69.49429223251178]
This work presents the development of a hardware accelerator for an SNN, with off-line training, applied to an image recognition task.
The design targets a Xilinx Artix-7 FPGA, using in total around 40% of the available hardware resources.
It reduces classification time by three orders of magnitude, with a small 4.5% accuracy loss compared to its full-precision software counterpart.
arXiv Detail & Related papers (2022-01-18T13:59:22Z)
- Towards Real-Time DNN Inference on Mobile Platforms with Model Pruning and Compiler Optimization [56.3111706960878]
High-end mobile platforms serve as primary computing devices for a wide range of Deep Neural Network (DNN) applications.
However, constrained computation and storage resources on these devices pose significant challenges for real-time inference execution.
We propose a set of hardware-friendly structured model pruning and compiler optimization techniques to accelerate DNN executions on mobile devices.
arXiv Detail & Related papers (2020-04-22T03:18:23Z)
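The entry above pairs hardware-friendly structured pruning with compiler optimizations. The compiler side has no short Python equivalent, but the pruning side can be sketched with PyTorch's built-in structured pruning utility, as below; the pruning ratio and the L2 criterion are illustrative choices, not the paper's exact scheme.

```python
import torch.nn as nn
from torch.nn.utils import prune

def prune_conv_filters(model, amount=0.5):
    """Structured (filter-level) pruning of every Conv2d layer, removing the
    fraction `amount` of output channels with the smallest L2 norm."""
    for module in model.modules():
        if isinstance(module, nn.Conv2d):
            prune.ln_structured(module, name="weight", amount=amount, n=2, dim=0)
            prune.remove(module, "weight")   # bake the mask into the weights
    return model
```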
- PatDNN: Achieving Real-Time DNN Execution on Mobile Devices with Pattern-based Weight Pruning [57.20262984116752]
We introduce a new dimension, fine-grained pruning patterns inside the coarse-grained structures, revealing a previously unknown point in design space.
With the higher accuracy enabled by fine-grained pruning patterns, the key insight is to use the compiler to regain and guarantee high hardware efficiency.
arXiv Detail & Related papers (2020-01-01T04:52:07Z)
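For the PatDNN entry above, the fine-grained pattern idea amounts to keeping only a few non-zero positions inside each convolution kernel. The sketch below keeps the `keep` largest-magnitude weights per 3x3 kernel; PatDNN additionally restricts every kernel to a small dictionary of shared patterns so the compiler can generate efficient code, which is not reproduced here, and the names and defaults are illustrative.

```python
import torch
import torch.nn as nn

def apply_kernel_patterns(conv, keep=4):
    """Zero all but the `keep` largest-magnitude weights inside every 3x3
    kernel of a Conv2d layer (fine-grained, pattern-style sparsity)."""
    with torch.no_grad():
        w = conv.weight                              # (out_c, in_c, 3, 3)
        flat = w.view(w.size(0), w.size(1), -1)
        k_drop = flat.size(-1) - keep
        # indices of the smallest-magnitude entries in each kernel
        _, drop = flat.abs().topk(k_drop, dim=-1, largest=False)
        mask = torch.ones_like(flat)
        mask.scatter_(-1, drop, 0.0)
        w.mul_(mask.view_as(w))
    return conv

# conv = nn.Conv2d(16, 32, 3, padding=1); apply_kernel_patterns(conv, keep=4)
```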
This list is automatically generated from the titles and abstracts of the papers on this site.