FlatENN: Train Flat for Enhanced Fault Tolerance of Quantized Deep
Neural Networks
- URL: http://arxiv.org/abs/2301.00675v1
- Date: Thu, 29 Dec 2022 06:06:14 GMT
- Title: FlatENN: Train Flat for Enhanced Fault Tolerance of Quantized Deep
Neural Networks
- Authors: Akul Malhotra and Sumeet Kumar Gupta
- Abstract summary: We investigate the impact of bit-flip and stuck-at faults on activation-sparse quantized DNNs (QDNNs)
We show that a high level of activation sparsity comes at the cost of larger vulnerability to faults.
We propose the mitigation of the impact of faults by employing a sharpness-aware quantization scheme.
- Score: 0.03807314298073299
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Model compression via quantization and sparsity enhancement has gained an
immense interest to enable the deployment of deep neural networks (DNNs) in
resource-constrained edge environments. Although these techniques have shown
promising results in reducing the energy, latency and memory requirements of
the DNNs, their performance in non-ideal real-world settings (such as in the
presence of hardware faults) is yet to be completely understood. In this paper,
we investigate the impact of bit-flip and stuck-at faults on activation-sparse
quantized DNNs (QDNNs). We show that a high level of activation sparsity comes
at the cost of larger vulnerability to faults. For instance, activation-sparse
QDNNs exhibit up to 17.32% lower accuracy than the standard QDNNs. We also
establish that one of the major causes of the degraded accuracy is the sharper
minima in the loss landscape of activation-sparse QDNNs, which make them more
sensitive to perturbations in the weight values due to faults. Based on this
observation, we propose the mitigation of the impact of faults by employing a
sharpness-aware quantization (SAQ) training scheme. The activation-sparse and
standard QDNNs trained with SAQ have up to 36.71% and 24.76% higher inference
accuracy, respectively, compared to their conventionally trained equivalents.
Moreover, we show that SAQ-trained activation-sparse QDNNs exhibit better accuracy
in faulty settings than conventionally trained standard QDNNs. Thus, the
proposed technique can be instrumental in achieving sparsity-related
energy/latency benefits without compromising on fault tolerance.
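The proposed mitigation builds on sharpness-aware minimization (SAM): each update first perturbs the weights toward the locally worst-case direction and then descends with the gradient taken at that perturbed point, biasing training toward flatter minima that are less sensitive to the weight perturbations caused by faults. Below is a minimal PyTorch-style sketch of a generic SAM update step of this kind, offered only to illustrate the idea behind SAQ; the function name, the rho value, and the use of a plain full-precision model are assumptions, not the paper's actual implementation or hyperparameters.

```python
import torch

def sam_step(model, loss_fn, x, y, base_optimizer, rho=0.05):
    """One SAM-style update (sketch): ascend to a nearby worst-case weight
    perturbation, then descend using the gradient taken there."""
    # 1) Gradient at the current weights w
    loss = loss_fn(model(x), y)
    loss.backward()
    grads = [p.grad for p in model.parameters() if p.grad is not None]
    grad_norm = torch.norm(torch.stack([g.norm(p=2) for g in grads]), p=2)

    # 2) Perturb the weights: w <- w + rho * grad / ||grad||
    eps = []
    with torch.no_grad():
        for p in model.parameters():
            if p.grad is None:
                eps.append(None)
                continue
            e = rho * p.grad / (grad_norm + 1e-12)
            p.add_(e)
            eps.append(e)
    model.zero_grad()

    # 3) Gradient at the perturbed weights w + e (the "sharpness-aware" gradient)
    loss_fn(model(x), y).backward()

    # 4) Restore the original weights and step with that gradient
    with torch.no_grad():
        for p, e in zip(model.parameters(), eps):
            if e is not None:
                p.sub_(e)
    base_optimizer.step()
    base_optimizer.zero_grad()
    return loss.item()
```

In SAQ, an update of this two-pass kind would be applied within quantization-aware training, so that the quantized weights settle in flatter regions of the loss landscape and tolerate bit-flip and stuck-at faults better.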
Related papers
- Memory Faults in Activation-sparse Quantized Deep Neural Networks: Analysis and Mitigation using Sharpness-aware Training [0.0]
We investigate the impact of memory faults on activation-sparse quantized DNNs (AS QDNNs).
We show that a high level of activation sparsity comes at the cost of larger vulnerability to faults, with AS QDNNs exhibiting up to 11.13% lower accuracy than the standard QDNNs.
We employ sharpness-aware quantization (SAQ) training to mitigate the impact of memory faults.
arXiv Detail & Related papers (2024-06-15T06:40:48Z)
- Benign Overfitting in Deep Neural Networks under Lazy Training [72.28294823115502]
We show that when the data distribution is well-separated, DNNs can achieve Bayes-optimal test error for classification.
Our results indicate that interpolating with smoother functions leads to better generalization.
arXiv Detail & Related papers (2023-05-30T19:37:44Z)
- RescueSNN: Enabling Reliable Executions on Spiking Neural Network Accelerators under Permanent Faults [15.115813664357436]
RescueSNN is a novel methodology to mitigate permanent faults in the compute engine of SNN chips.
RescueSNN improves accuracy by up to 80% while keeping the throughput reduction below 25% at high fault rates.
arXiv Detail & Related papers (2023-04-08T15:24:57Z)
- Fault-Aware Design and Training to Enhance DNNs Reliability with Zero-Overhead [67.87678914831477]
Deep Neural Networks (DNNs) enable a wide range of technological advancements.
Recent findings indicate that transient hardware faults may dramatically corrupt the model's predictions.
In this work, we propose to tackle the reliability issue both at training and model design time.
arXiv Detail & Related papers (2022-05-28T13:09:30Z)
- SoftSNN: Low-Cost Fault Tolerance for Spiking Neural Network Accelerators under Soft Errors [15.115813664357436]
SoftSNN is a novel methodology to mitigate soft errors in the weight registers (synapses) and neurons of SNN accelerators without re-execution.
For a 900-neuron network, even at a high fault rate, our SoftSNN keeps the accuracy degradation below 3%, while reducing latency and energy by up to 3x and 2.3x, respectively.
arXiv Detail & Related papers (2022-03-10T18:20:28Z)
- HIRE-SNN: Harnessing the Inherent Robustness of Energy-Efficient Deep Spiking Neural Networks by Training with Crafted Input Noise [13.904091056365765]
We present an SNN training algorithm that uses crafted input noise and incurs no additional training time.
Compared to standard-trained direct-input SNNs, our trained models yield up to 13.7% higher classification accuracy.
Our models also outperform inherently robust SNNs trained on rate-coded inputs with improved or similar classification performance on attack-generated images.
arXiv Detail & Related papers (2021-10-06T16:48:48Z)
- Towards Low-Latency Energy-Efficient Deep SNNs via Attention-Guided Compression [12.37129078618206]
Deep spiking neural networks (SNNs) have emerged as a potential alternative to traditional deep learning frameworks.
Most SNN training frameworks yield large inference latency which translates to increased spike activity and reduced energy efficiency.
This paper presents a non-iterative SNN training technique that achieves ultra-high compression with reduced spiking activity.
arXiv Detail & Related papers (2021-07-16T18:23:36Z)
- Random and Adversarial Bit Error Robustness: Energy-Efficient and Secure DNN Accelerators [105.60654479548356]
We show that a combination of robust fixed-point quantization, weight clipping, and random bit error training (RandBET) significantly improves robustness against random or adversarial bit errors in quantized DNN weights (a minimal bit-flip injection sketch in this spirit appears after this list).
This leads to high energy savings from low-voltage operation as well as low-precision quantization, and also improves the security of DNN accelerators.
arXiv Detail & Related papers (2021-04-16T19:11:14Z)
- S2-BNN: Bridging the Gap Between Self-Supervised Real and 1-bit Neural Networks via Guided Distribution Calibration [74.5509794733707]
We present a novel guided learning paradigm that distills real-valued networks into binary networks via the final prediction distribution.
Our proposed method can boost the simple contrastive learning baseline by an absolute gain of 5.5~15% on BNNs.
Our method achieves substantial improvement over the simple contrastive learning baseline, and is even comparable to many mainstream supervised BNN methods.
arXiv Detail & Related papers (2021-02-17T18:59:28Z)
- Bit Error Robustness for Energy-Efficient DNN Accelerators [93.58572811484022]
We show that a combination of robust fixed-point quantization, weight clipping, and random bit error training (RandBET) improves robustness against random bit errors.
This leads to high energy savings from both low-voltage operation as well as low-precision quantization.
arXiv Detail & Related papers (2020-06-24T18:23:10Z)
- Widening and Squeezing: Towards Accurate and Efficient QNNs [125.172220129257]
Quantization neural networks (QNNs) are very attractive to the industry because of their extremely cheap calculation and storage overhead, but their performance is still worse than that of networks with full-precision parameters.
Most existing methods aim to enhance the performance of QNNs, especially binary neural networks, by exploiting more effective training techniques.
We address this problem by projecting features in the original full-precision networks to high-dimensional quantization features.
arXiv Detail & Related papers (2020-02-03T04:11:13Z)
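Both the fault analysis in the main paper and the RandBET line of work above revolve around bit-level corruption of quantized weight memory. The sketch below (referenced from the RandBET entry) is a hypothetical illustration of that fault model: weights are quantized to signed 8-bit values and each stored bit is flipped independently with a given bit-error rate. The symmetric per-tensor quantization, function names, tensor sizes, and bit-error rate are illustrative assumptions, not the exact setup of any of the listed papers.

```python
import torch

def quantize_int8(w: torch.Tensor):
    """Symmetric per-tensor 8-bit quantization (illustrative scheme)."""
    scale = w.abs().max() / 127.0
    q = torch.clamp(torch.round(w / scale), -128, 127).to(torch.int8)
    return q, scale

def inject_bit_flips(q: torch.Tensor, ber: float, seed: int = 0):
    """Flip each of the 8 stored bits independently with probability `ber`,
    mimicking random bit-flip faults in the weight memory."""
    gen = torch.Generator().manual_seed(seed)
    bits = q.view(torch.uint8)          # reinterpret the two's-complement bytes
    flip_mask = torch.zeros_like(bits)
    for b in range(8):
        flip_mask |= (torch.rand(bits.shape, generator=gen) < ber).to(torch.uint8) << b
    return (bits ^ flip_mask).view(torch.int8)

# Hypothetical usage: corrupt one layer's quantized weights and inspect the damage.
w = torch.randn(64, 64)                 # stand-in for a layer's weight tensor
q, scale = quantize_int8(w)
q_faulty = inject_bit_flips(q, ber=1e-3)
print((q_faulty != q).float().mean().item())                        # fraction of corrupted weights
print(((q_faulty.float() - q.float()) * scale).abs().max().item())  # worst-case weight error
```

Training against faults in the RandBET spirit would amount to applying such corruption to the quantized weights during training; stuck-at faults can be modeled analogously by OR-ing (stuck-at-1) or AND-ing (stuck-at-0) fixed masks instead of XOR-ing a random one.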
This list is automatically generated from the titles and abstracts of the papers on this site.