Related papers: Compute-in-Memory based Neural Network Accelerators for Safety-Critical Systems: Worst-Case Scenarios and Protections

Compute-in-Memory based Neural Network Accelerators for Safety-Critical Systems: Worst-Case Scenarios and Protections

URL: http://arxiv.org/abs/2312.06137v1
Date: Mon, 11 Dec 2023 05:56:00 GMT
Title: Compute-in-Memory based Neural Network Accelerators for Safety-Critical Systems: Worst-Case Scenarios and Protections
Authors: Zheyu Yan, Xiaobo Sharon Hu, Yiyu Shi
Abstract summary: We study the problem of pinpointing the worst-case performance of CiM accelerators affected by device variations. We propose a novel worst-case-aware training technique named A-TRICE that efficiently combines adversarial training and noise-injection training. Our experimental results demonstrate that A-TRICE improves the worst-case accuracy under device variations by up to 33%.
Score: 8.813981342105151
License: http://creativecommons.org/licenses/by-nc-sa/4.0/
Abstract: Emerging non-volatile memory (NVM)-based Computing-in-Memory (CiM) architectures show substantial promise in accelerating deep neural networks (DNNs) due to their exceptional energy efficiency. However, NVM devices are prone to device variations. Consequently, the actual DNN weights mapped to NVM devices can differ considerably from their targeted values, inducing significant performance degradation. Many existing solutions aim to optimize average performance amidst device variations, which is a suitable strategy for general-purpose conditions. However, the worst-case performance that is crucial for safety-critical applications is largely overlooked in current research. In this study, we define the problem of pinpointing the worst-case performance of CiM DNN accelerators affected by device variations. Additionally, we introduce a strategy to identify a specific pattern of the device value deviations in the complex, high-dimensional value deviation space, responsible for this worst-case outcome. Our findings reveal that even subtle device variations can precipitate a dramatic decline in DNN accuracy, posing risks for CiM-based platforms in supporting safety-critical applications. Notably, we observe that prevailing techniques to bolster average DNN performance in CiM accelerators fall short in enhancing worst-case scenarios. In light of this issue, we propose a novel worst-case-aware training technique named A-TRICE that efficiently combines adversarial training and noise-injection training with right-censored Gaussian noise to improve the DNN accuracy in the worst-case scenarios. Our experimental results demonstrate that A-TRICE improves the worst-case accuracy under device variations by up to 33%.

Related papers

Synergistic Development of Perovskite Memristors and Algorithms for Robust Analog Computing [53.77822620185878]
We propose a synergistic methodology to concurrently optimize perovskite memristor fabrication and develop robust analog DNNs. We develop "BayesMulti", a training strategy utilizing BO-guided noise injection to improve the resistance of analog DNNs to memristor imperfections. Our integrated approach enables use of analog computing in much deeper and wider networks, achieving up to 100-fold improvements.
arXiv Detail & Related papers (2024-12-03T19:20:08Z)
Cal-DETR: Calibrated Detection Transformer [67.75361289429013]
We propose a mechanism for calibrated detection transformers (Cal-DETR), particularly for Deformable-DETR, UP-DETR and DINO. We develop an uncertainty-guided logit modulation mechanism that leverages the uncertainty to modulate the class logits. Results corroborate the effectiveness of Cal-DETR against the competing train-time methods in calibrating both in-domain and out-domain detections.
arXiv Detail & Related papers (2023-11-06T22:13:10Z)
A Multi-Head Ensemble Multi-Task Learning Approach for Dynamical Computation Offloading [62.34538208323411]
We propose a multi-head ensemble multi-task learning (MEMTL) approach with a shared backbone and multiple prediction heads (PHs) MEMTL outperforms benchmark methods in both the inference accuracy and mean square error without requiring additional training data.
arXiv Detail & Related papers (2023-09-02T11:01:16Z)
Improving Realistic Worst-Case Performance of NVCiM DNN Accelerators through Training with Right-Censored Gaussian Noise [16.470952550714394]
We propose to use the k-th percentile performance (KPP) to capture the realistic worst-case performance of DNN models executing on CiM accelerators. Our method achieves up to a 26% improvement in KPP compared to the state-of-the-art methods employed to enhance robustness under the impact of device variations.
arXiv Detail & Related papers (2023-07-29T01:06:37Z)
Special Session: Approximation and Fault Resiliency of DNN Accelerators [0.9126382223122612]
This paper explores the approximation and fault resiliency of Deep Neural Network accelerators. We propose to use approximate (AxC) arithmetic circuits to emulate errors in hardware without performing fault injection on the DNN. We also propose a fine-grain analysis of fault resiliency by examining fault propagation and masking in networks.
arXiv Detail & Related papers (2023-05-31T19:27:45Z)
Negative Feedback Training: A Novel Concept to Improve Robustness of NVCIM DNN Accelerators [11.832487701641723]
Non-volatile memory (NVM) devices excel in energy efficiency and latency when performing Deep Neural Network (DNN) inference. We propose a novel training concept: Negative Feedback Training (NFT) leveraging the multi-scale noisy information captured from network. Our methods outperform existing state-of-the-art methods with up to a 46.71% improvement in inference accuracy.
arXiv Detail & Related papers (2023-05-23T22:56:26Z)
Computing-In-Memory Neural Network Accelerators for Safety-Critical Systems: Can Small Device Variations Be Disastrous? [15.760502065894778]
NVM devices suffer from various non-idealities, especially device-to-device variations due to fabrication defects and cycle-to-cycle variations due to the behavior of devices. We propose a method to effectively find the specific combination of device variation in the high-dimensional space that leads to the worst-case performance.
arXiv Detail & Related papers (2022-07-15T17:38:01Z)
Fault-Aware Design and Training to Enhance DNNs Reliability with Zero-Overhead [67.87678914831477]
Deep Neural Networks (DNNs) enable a wide series of technological advancements. Recent findings indicate that transient hardware faults may corrupt the models prediction dramatically. In this work, we propose to tackle the reliability issue both at training and model design time.
arXiv Detail & Related papers (2022-05-28T13:09:30Z)
FPGA-optimized Hardware acceleration for Spiking Neural Networks [69.49429223251178]
This work presents the development of a hardware accelerator for an SNN, with off-line training, applied to an image recognition task. The design targets a Xilinx Artix-7 FPGA, using in total around the 40% of the available hardware resources. It reduces the classification time by three orders of magnitude, with a small 4.5% impact on the accuracy, if compared to its software, full precision counterpart.
arXiv Detail & Related papers (2022-01-18T13:59:22Z)
Automatic Mapping of the Best-Suited DNN Pruning Schemes for Real-Time Mobile Acceleration [71.80326738527734]
We propose a general, fine-grained structured pruning scheme and corresponding compiler optimizations. We show that our pruning scheme mapping methods, together with the general fine-grained structured pruning scheme, outperform the state-of-the-art DNN optimization framework.
arXiv Detail & Related papers (2021-11-22T23:53:14Z)
Uncertainty Modeling of Emerging Device-based Computing-in-Memory Neural Accelerators with Application to Neural Architecture Search [25.841113960607334]
Emerging device-based Computing-in-memory (CiM) has been proved to be a promising candidate for high-energy efficiency deep neural network (DNN) computations. Most emerging devices suffer uncertainty issues, resulting in a difference between actual data stored and the weight value it is designed to be. This leads to an accuracy drop from trained models to actually deployed platforms.
arXiv Detail & Related papers (2021-07-06T23:29:36Z)
PatDNN: Achieving Real-Time DNN Execution on Mobile Devices with Pattern-based Weight Pruning [57.20262984116752]
We introduce a new dimension, fine-grained pruning patterns inside the coarse-grained structures, revealing a previously unknown point in design space. With the higher accuracy enabled by fine-grained pruning patterns, the unique insight is to use the compiler to re-gain and guarantee high hardware efficiency.
arXiv Detail & Related papers (2020-01-01T04:52:07Z)

This list is automatically generated from the titles and abstracts of the papers in this site.