Improving Realistic Worst-Case Performance of NVCiM DNN Accelerators
through Training with Right-Censored Gaussian Noise
- URL: http://arxiv.org/abs/2307.15853v1
- Date: Sat, 29 Jul 2023 01:06:37 GMT
- Title: Improving Realistic Worst-Case Performance of NVCiM DNN Accelerators
through Training with Right-Censored Gaussian Noise
- Authors: Zheyu Yan, Yifan Qin, Wujie Wen, Xiaobo Sharon Hu, Yiyu Shi
- Abstract summary: We propose to use the k-th percentile performance (KPP) to capture the realistic worst-case performance of DNN models executing on CiM accelerators.
Our method achieves up to a 26% improvement in KPP compared to the state-of-the-art methods employed to enhance robustness under the impact of device variations.
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Compute-in-Memory (CiM), built upon non-volatile memory (NVM) devices, is
promising for accelerating deep neural networks (DNNs) owing to its in-situ
data processing capability and superior energy efficiency. Unfortunately, the
well-trained model parameters, after being mapped to NVM devices, can often
exhibit large deviations from their intended values due to device variations,
resulting in notable performance degradation in these CiM-based DNN
accelerators. There exists a long list of solutions to address this issue.
However, they mainly focus on improving the mean performance of CiM DNN
accelerators. How to guarantee the worst-case performance under the impact of
device variations, which is crucial for many safety-critical applications such
as self-driving cars, has been far less explored. In this work, we propose to
use the k-th percentile performance (KPP) to capture the realistic worst-case
performance of DNN models executing on CiM accelerators. Through a formal
analysis of the properties of KPP and the noise injection-based DNN training,
we demonstrate that injecting a novel right-censored Gaussian noise, as opposed
to the conventional Gaussian noise, significantly improves the KPP of DNNs. We
further propose an automated method to determine the optimal hyperparameters
for injecting this right-censored Gaussian noise during the training process.
Our method achieves up to a 26% improvement in KPP compared to the
state-of-the-art methods employed to enhance DNN robustness under the impact of
device variations.
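As a concrete illustration of the two key ideas, here is a minimal PyTorch sketch (not the authors' code): right-censoring clips the right tail of a Gaussian sample at a threshold, training injects that noise into the weights at each step, and KPP is estimated by Monte Carlo over noisy instances of the model. The multiplicative noise model, the censoring threshold of zero, and all hyperparameter values are illustrative assumptions; the paper's automated hyperparameter search is omitted.

```python
import torch
import torch.nn.functional as F

def right_censored_gaussian(like, sigma, cap=0.0):
    """n = min(g, cap) with g ~ N(0, sigma^2): the right tail is censored."""
    g = torch.randn_like(like) * sigma
    return torch.minimum(g, torch.full_like(g, cap))

def train_step(model, x, y, opt, sigma=0.1):
    """One noise-injection training step: perturb the weights, backprop
    through the perturbed model, restore clean weights, then update."""
    saved = [p.detach().clone() for p in model.parameters()]
    with torch.no_grad():
        for p in model.parameters():
            # Assumed multiplicative noise model on each weight.
            p.mul_(1.0 + right_censored_gaussian(p, sigma))
    loss = F.cross_entropy(model(x), y)
    opt.zero_grad()
    loss.backward()                      # gradients at the perturbed weights
    with torch.no_grad():
        for p, s in zip(model.parameters(), saved):
            p.copy_(s)                   # restore before the optimizer step
    opt.step()
    return loss.item()

@torch.no_grad()
def kpp(model, loader, sigma=0.1, k=1, runs=200):
    """Monte Carlo KPP: accuracy of `runs` simulated device-variation
    instances of the model, reporting the k-th percentile."""
    accs = []
    for _ in range(runs):
        saved = [p.detach().clone() for p in model.parameters()]
        for p in model.parameters():
            p.mul_(1.0 + torch.randn_like(p) * sigma)
        correct = total = 0
        for x, y in loader:
            correct += (model(x).argmax(1) == y).sum().item()
            total += y.numel()
        for p, s in zip(model.parameters(), saved):
            p.copy_(s)
        accs.append(correct / total)
    return torch.quantile(torch.tensor(accs), k / 100).item()
```

Note that evaluation injects plain Gaussian noise while training injects the right-censored variant; this assumes device variations are modeled as Gaussian at deployment time, with censoring applied only during training.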
Related papers
- DiSK: Differentially Private Optimizer with Simplified Kalman Filter for Noise Reduction [57.83978915843095]
This paper introduces DiSK, a novel framework designed to significantly enhance the performance of differentially private optimizers.
To ensure practicality for large-scale training, we simplify the Kalman filtering process, minimizing its memory and computational demands.
arXiv Detail & Related papers (2024-10-04T19:30:39Z)
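The blurb above gives only the gist of DiSK. Purely as a hypothetical illustration of Kalman-style gradient smoothing (not DiSK's actual recursion), a fixed-gain predictor-corrector over privatized gradients might look like the following; the gain `kappa` and the class itself are assumptions.

```python
import torch

class FixedGainFilter:
    """Hypothetical fixed-gain (EMA-style) smoother for noisy DP gradients."""
    def __init__(self, params, kappa=0.3):
        self.kappa = kappa                                    # filter gain (assumed)
        self.state = [torch.zeros_like(p) for p in params]    # filtered gradients

    def step(self, noisy_grads):
        out = []
        for s, g in zip(self.state, noisy_grads):
            s.mul_(1 - self.kappa).add_(g, alpha=self.kappa)  # correct toward g
            out.append(s.clone())
        return out
```

In DP-SGD, such a filter would sit between gradient privatization (clipping plus noise) and the optimizer update.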
- Compute-in-Memory based Neural Network Accelerators for Safety-Critical Systems: Worst-Case Scenarios and Protections [8.813981342105151]
We study the problem of pinpointing the worst-case performance of CiM accelerators affected by device variations.
We propose a novel worst-case-aware training technique named A-TRICE that efficiently combines adversarial training and noise-injection training.
Our experimental results demonstrate that A-TRICE improves the worst-case accuracy under device variations by up to 33%.
arXiv Detail & Related papers (2023-12-11T05:56:00Z)
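The A-TRICE blurb names two ingredients, adversarial training and noise-injection training. The sketch below combines them in the most generic way (FGSM on inputs plus Gaussian weight noise) and is an assumption-laden stand-in, not the A-TRICE recipe; `eps` and `sigma` are arbitrary.

```python
import torch
import torch.nn.functional as F

def adversarial_noise_step(model, x, y, opt, eps=8 / 255, sigma=0.05):
    # 1) FGSM: perturb the input along the sign of its loss gradient.
    x_adv = x.clone().requires_grad_(True)
    grad, = torch.autograd.grad(F.cross_entropy(model(x_adv), y), x_adv)
    x_adv = (x + eps * grad.sign()).detach()

    # 2) Inject Gaussian weight noise to emulate device variations.
    saved = [p.detach().clone() for p in model.parameters()]
    with torch.no_grad():
        for p in model.parameters():
            p.add_(torch.randn_like(p) * sigma * p.abs())

    # 3) Train on the adversarial batch under noisy weights, restoring the
    #    clean weights before the optimizer applies the update.
    opt.zero_grad()
    F.cross_entropy(model(x_adv), y).backward()
    with torch.no_grad():
        for p, s in zip(model.parameters(), saved):
            p.copy_(s)
    opt.step()
```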
- EPIM: Efficient Processing-In-Memory Accelerators based on Epitome [78.79382890789607]
We introduce the Epitome, a lightweight neural operator offering convolution-like functionality.
On the software side, we evaluate epitomes' latency and energy on PIM accelerators.
We introduce a PIM-aware layer-wise design method to enhance their hardware efficiency.
arXiv Detail & Related papers (2023-11-12T17:56:39Z)
- Cal-DETR: Calibrated Detection Transformer [67.75361289429013]
We propose a mechanism for calibrated detection transformers (Cal-DETR), particularly for Deformable-DETR, UP-DETR and DINO.
We develop an uncertainty-guided logit modulation mechanism that leverages the uncertainty to modulate the class logits.
Results corroborate the effectiveness of Cal-DETR against the competing train-time methods in calibrating both in-domain and out-domain detections.
arXiv Detail & Related papers (2023-11-06T22:13:10Z)
- Negative Feedback Training: A Novel Concept to Improve Robustness of NVCIM DNN Accelerators [11.832487701641723]
Non-volatile memory (NVM) devices excel in energy efficiency and latency when performing Deep Neural Network (DNN) inference.
We propose a novel training concept, Negative Feedback Training (NFT), which leverages multi-scale noisy information captured from the network.
Our methods outperform existing state-of-the-art methods with up to a 46.71% improvement in inference accuracy.
arXiv Detail & Related papers (2023-05-23T22:56:26Z)
- Recurrent Bilinear Optimization for Binary Neural Networks [58.972212365275595]
BNNs neglect the intrinsic bilinear relationship of real-valued weights and scale factors.
Our work is the first attempt to optimize BNNs from the bilinear perspective.
We obtain robust RBONNs, which show impressive performance over state-of-the-art BNNs on various models and datasets.
arXiv Detail & Related papers (2022-09-04T06:45:33Z)
- Energy-efficient DNN Inference on Approximate Accelerators Through Formal Property Exploration [1.0323063834827415]
We present an automated framework for weight-to-approximation mapping for approximate Deep Neural Networks (DNNs).
At the MAC unit level, our evaluation surpassed already energy-efficient mappings by more than $2\times$ in terms of energy gains.
arXiv Detail & Related papers (2022-07-25T17:07:00Z)
- Computing-In-Memory Neural Network Accelerators for Safety-Critical Systems: Can Small Device Variations Be Disastrous? [15.760502065894778]
NVM devices suffer from various non-idealities, especially device-to-device variations due to fabrication defects and cycle-to-cycle variations due to the stochastic behavior of devices.
We propose a method to effectively find the specific combination of device variations in the high-dimensional space that leads to the worst-case performance.
arXiv Detail & Related papers (2022-07-15T17:38:01Z)
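The entry does not reveal the search algorithm, so the following is a generic stand-in rather than the paper's method: projected gradient ascent over a per-weight perturbation bounded by a relative magnitude `r`, using `torch.func.functional_call` so the perturbation stays differentiable.

```python
import torch
import torch.nn.functional as F
from torch.func import functional_call  # requires PyTorch >= 2.0

def worst_case_perturbation(model, x, y, r=0.1, steps=100, lr=1e-2):
    """Projected gradient ascent over per-weight perturbations bounded by
    |delta| <= r * |w|, returning the perturbation that maximizes the loss."""
    base = dict(model.named_parameters())
    deltas = {n: torch.zeros_like(p, requires_grad=True) for n, p in base.items()}
    for _ in range(steps):
        perturbed = {n: base[n] + deltas[n] for n in base}
        loss = F.cross_entropy(functional_call(model, perturbed, (x,)), y)
        grads = torch.autograd.grad(loss, list(deltas.values()))
        with torch.no_grad():
            for (n, d), g in zip(deltas.items(), grads):
                d.add_(lr * g.sign())                            # ascend the loss
                d.clamp_(-r * base[n].abs(), r * base[n].abs())  # project to bound
    return {n: d.detach() for n, d in deltas.items()}
```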
- MemSE: Fast MSE Prediction for Noisy Memristor-Based DNN Accelerators [5.553959304125023]
We theoretically analyze the mean squared error of DNNs that use memristors to compute matrix-vector multiplications (MVM).
We take into account both the quantization noise, due to the necessity of reducing the DNN model size, and the programming noise, stemming from the variability during the programming of the memristance value.
The proposed method is almost two orders of magnitude faster than Monte Carlo simulation, thus making it possible to optimize the implementation parameters to achieve minimal error for a given power constraint.
arXiv Detail & Related papers (2022-05-03T18:10:43Z)
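As a toy version of what such an analysis buys, the sketch below compares a closed-form MSE for a noisy matrix-vector product against Monte Carlo. The i.i.d. additive programming-noise and uniform quantization-noise model here is a simplification, not the paper's full model; `sp` and `delta` are assumed values.

```python
import torch

torch.manual_seed(0)
W = torch.randn(64, 128)      # ideal weight array
x = torch.randn(128)          # input vector
sp, delta = 0.02, 0.01        # programming-noise std, quantization step (assumed)

# Closed form under i.i.d. noise: per-output MSE = (sp^2 + delta^2/12) * ||x||^2
analytic = (sp ** 2 + delta ** 2 / 12) * x.pow(2).sum()

# Monte Carlo over simulated noisy arrays
runs, err = 5000, 0.0
for _ in range(runs):
    noise = torch.randn_like(W) * sp                 # programming noise
    quant = (torch.rand_like(W) - 0.5) * delta       # quantization noise
    err += ((W + noise + quant) @ x - W @ x).pow(2).mean().item()
print(f"analytic {analytic.item():.6f} vs Monte Carlo {err / runs:.6f}")
```

Even this toy makes the speed claim plausible: the closed form is a single expression, while the Monte Carlo estimate needs thousands of simulated arrays.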
- Automatic Mapping of the Best-Suited DNN Pruning Schemes for Real-Time Mobile Acceleration [71.80326738527734]
We propose a general, fine-grained structured pruning scheme and corresponding compiler optimizations.
We show that our pruning scheme mapping methods, together with the general fine-grained structured pruning scheme, outperform the state-of-the-art DNN optimization framework.
arXiv Detail & Related papers (2021-11-22T23:53:14Z)
- Bit Error Robustness for Energy-Efficient DNN Accelerators [93.58572811484022]
We show that a combination of robust fixed-point quantization, weight clipping, and random bit error training (RandBET) improves robustness against random bit errors.
This leads to high energy savings from both low-voltage operation and low-precision quantization.
arXiv Detail & Related papers (2020-06-24T18:23:10Z)
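To make the bit-error model in the RandBET entry concrete, here is a rough sketch of random bit error injection on quantized weights; the 8-bit symmetric quantization and the error rate `p` are generic assumptions, not the paper's exact setup.

```python
import torch

def inject_bit_errors(w, p=1e-3, bits=8):
    """Quantize w to `bits`-bit signed integers, flip every bit independently
    with probability p, and dequantize back."""
    scale = w.abs().max() / (2 ** (bits - 1) - 1)
    q = (w / scale).round().clamp(-(2 ** (bits - 1)), 2 ** (bits - 1) - 1)
    u = (q + 2 ** (bits - 1)).to(torch.int64)         # offset to [0, 2^bits)
    for b in range(bits):
        flip = (torch.rand_like(w) < p).to(torch.int64)
        u = u ^ (flip << b)                           # flip bit b with prob. p
    return (u.to(w.dtype) - 2 ** (bits - 1)) * scale  # dequantize
```

During training, forward passes through weights passed through this function (combined with weight clipping, as the entry notes) would teach the network to tolerate such flips.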