FlatENN: Train Flat for Enhanced Fault Tolerance of Quantized Deep
Neural Networks
- URL: http://arxiv.org/abs/2301.00675v1
- Date: Thu, 29 Dec 2022 06:06:14 GMT
- Title: FlatENN: Train Flat for Enhanced Fault Tolerance of Quantized Deep
Neural Networks
- Authors: Akul Malhotra and Sumeet Kumar Gupta
- Abstract summary: We investigate the impact of bit-flip and stuck-at faults on activation-sparse quantized DNNs (QDNNs)
We show that a high level of activation sparsity comes at the cost of larger vulnerability to faults.
We propose the mitigation of the impact of faults by employing a sharpness-aware quantization scheme.
- Score: 0.03807314298073299
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Model compression via quantization and sparsity enhancement has gained an
immense interest to enable the deployment of deep neural networks (DNNs) in
resource-constrained edge environments. Although these techniques have shown
promising results in reducing the energy, latency and memory requirements of
the DNNs, their performance in non-ideal real-world settings (such as in the
presence of hardware faults) is yet to be completely understood. In this paper,
we investigate the impact of bit-flip and stuck-at faults on activation-sparse
quantized DNNs (QDNNs). We show that a high level of activation sparsity comes
at the cost of larger vulnerability to faults. For instance, activation-sparse
QDNNs exhibit up to 17.32% lower accuracy than the standard QDNNs. We also
establish that one of the major causes of the degraded accuracy is the sharper
minima in the loss landscape of activation-sparse QDNNs, which make them more
sensitive to perturbations in the weight values due to faults. Based on this
observation, we propose the mitigation of the impact of faults by employing a
sharpness-aware quantization (SAQ) training scheme. The activation-sparse and
standard QDNNs trained with SAQ have up to 36.71% and 24.76% higher inference
accuracy, respectively, compared to their conventionally trained equivalents.
Moreover, we show that SAQ-trained activation-sparse QDNNs exhibit better accuracy
in faulty settings than conventionally trained standard QDNNs. Thus, the
proposed technique can be instrumental in achieving sparsity-related
energy/latency benefits without compromising on fault tolerance.
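The proposed mitigation builds on sharpness-aware minimization (SAM): each update first perturbs the weights toward the locally worst-case direction and then descends with the gradient taken at that perturbed point, biasing training toward flatter minima that are less sensitive to the weight perturbations caused by faults. Below is a minimal PyTorch-style sketch of a generic SAM update step of this kind, offered only to illustrate the idea behind SAQ; the function name, the rho value, and the use of a plain full-precision model are assumptions, not the paper's actual implementation or hyperparameters.

```python
import torch

def sam_step(model, loss_fn, x, y, base_optimizer, rho=0.05):
    """One SAM-style update (sketch): ascend to a nearby worst-case weight
    perturbation, then descend using the gradient taken there."""
    # 1) Gradient at the current weights w
    loss = loss_fn(model(x), y)
    loss.backward()
    grads = [p.grad for p in model.parameters() if p.grad is not None]
    grad_norm = torch.norm(torch.stack([g.norm(p=2) for g in grads]), p=2)

    # 2) Perturb the weights: w <- w + rho * grad / ||grad||
    eps = []
    with torch.no_grad():
        for p in model.parameters():
            if p.grad is None:
                eps.append(None)
                continue
            e = rho * p.grad / (grad_norm + 1e-12)
            p.add_(e)
            eps.append(e)
    model.zero_grad()

    # 3) Gradient at the perturbed weights w + e (the "sharpness-aware" gradient)
    loss_fn(model(x), y).backward()

    # 4) Restore the original weights and step with that gradient
    with torch.no_grad():
        for p, e in zip(model.parameters(), eps):
            if e is not None:
                p.sub_(e)
    base_optimizer.step()
    base_optimizer.zero_grad()
    return loss.item()
```

In SAQ, an update of this two-pass kind would be applied within quantization-aware training, so that the quantized weights settle in flatter regions of the loss landscape and tolerate bit-flip and stuck-at faults better.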
Related papers
- Memory Faults in Activation-sparse Quantized Deep Neural Networks: Analysis and Mitigation using Sharpness-aware Training [0.0]
We investigate the impact of memory faults on activation-sparse quantized DNNs (AS QDNNs).
We show that a high level of activation sparsity comes at the cost of larger vulnerability to faults, with AS QDNNs exhibiting up to 11.13% lower accuracy than the standard QDNNs.
We employ sharpness-aware quantization (SAQ) training to mitigate the impact of memory faults.
arXiv Detail & Related papers (2024-06-15T06:40:48Z)
- Benign Overfitting in Deep Neural Networks under Lazy Training [72.28294823115502]
We show that when the data distribution is well-separated, DNNs can achieve Bayes-optimal test error for classification.
Our results indicate that interpolating with smoother functions leads to better generalization.
arXiv Detail & Related papers (2023-05-30T19:37:44Z)
- RescueSNN: Enabling Reliable Executions on Spiking Neural Network Accelerators under Permanent Faults [15.115813664357436]
RescueSNN is a novel methodology to mitigate permanent faults in the compute engine of SNN chips.
RescueSNN improves accuracy by up to 80% while keeping the throughput reduction below 25% at high fault rates.
arXiv Detail & Related papers (2023-04-08T15:24:57Z)
- Fault-Aware Design and Training to Enhance DNNs Reliability with Zero-Overhead [67.87678914831477]
Deep Neural Networks (DNNs) enable a wide range of technological advancements.
Recent findings indicate that transient hardware faults may dramatically corrupt the model's predictions.
In this work, we propose to tackle the reliability issue both at training and model design time.
arXiv Detail & Related papers (2022-05-28T13:09:30Z)
- SoftSNN: Low-Cost Fault Tolerance for Spiking Neural Network Accelerators under Soft Errors [15.115813664357436]
SoftSNN is a novel methodology to mitigate soft errors in the weight registers (synapses) and neurons of SNN accelerators without re-execution.
For a 900-neuron network, even at a high fault rate, our SoftSNN keeps the accuracy degradation below 3%, while reducing latency and energy by up to 3x and 2.3x, respectively.
arXiv Detail & Related papers (2022-03-10T18:20:28Z)
- HIRE-SNN: Harnessing the Inherent Robustness of Energy-Efficient Deep Spiking Neural Networks by Training with Crafted Input Noise [13.904091056365765]
We present an SNN training algorithm that uses crafted input noise and incurs no additional training time.
Compared to standard-trained direct-input SNNs, our trained models yield up to 13.7% higher classification accuracy.
Our models also outperform inherently robust SNNs trained on rate-coded inputs with improved or similar classification performance on attack-generated images.
arXiv Detail & Related papers (2021-10-06T16:48:48Z)
- Towards Low-Latency Energy-Efficient Deep SNNs via Attention-Guided Compression [12.37129078618206]
Deep spiking neural networks (SNNs) have emerged as a potential alternative to traditional deep learning frameworks.
Most SNN training frameworks yield large inference latency which translates to increased spike activity and reduced energy efficiency.
This paper presents a non-iterative SNN training technique that achieves ultra-high compression with reduced spiking activity.
arXiv Detail & Related papers (2021-07-16T18:23:36Z)
- Random and Adversarial Bit Error Robustness: Energy-Efficient and Secure DNN Accelerators [105.60654479548356]
We show that a combination of robust fixed-point quantization, weight clipping, and random bit error training (RandBET) significantly improves robustness against random or adversarial bit errors in quantized DNN weights (a minimal bit-flip injection sketch in this spirit appears after this list).
This leads to high energy savings from low-voltage operation as well as low-precision quantization, and also improves the security of DNN accelerators.
arXiv Detail & Related papers (2021-04-16T19:11:14Z)
- S2-BNN: Bridging the Gap Between Self-Supervised Real and 1-bit Neural Networks via Guided Distribution Calibration [74.5509794733707]
We present a novel guided learning paradigm that distills real-valued networks into binary networks via the final prediction distribution.
Our proposed method can boost the simple contrastive learning baseline by an absolute gain of 5.5~15% on BNNs.
Our method achieves substantial improvement over the simple contrastive learning baseline, and is even comparable to many mainstream supervised BNN methods.
arXiv Detail & Related papers (2021-02-17T18:59:28Z)
- Bit Error Robustness for Energy-Efficient DNN Accelerators [93.58572811484022]
We show that a combination of robust fixed-point quantization, weight clipping, and random bit error training (RandBET) improves robustness against random bit errors.
This leads to high energy savings from both low-voltage operation as well as low-precision quantization.
arXiv Detail & Related papers (2020-06-24T18:23:10Z)
- Widening and Squeezing: Towards Accurate and Efficient QNNs [125.172220129257]
Quantization neural networks (QNNs) are very attractive to the industry because of their extremely cheap calculation and storage overhead, but their performance is still worse than that of networks with full-precision parameters.
Most existing methods aim to enhance the performance of QNNs, especially binary neural networks, by exploiting more effective training techniques.
We address this problem by projecting features in the original full-precision networks to high-dimensional quantization features.
arXiv Detail & Related papers (2020-02-03T04:11:13Z)
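Both the fault analysis in the main paper and the RandBET line of work above revolve around bit-level corruption of quantized weight memory. The sketch below (referenced from the RandBET entry) is a hypothetical illustration of that fault model: weights are quantized to signed 8-bit values and each stored bit is flipped independently with a given bit-error rate. The symmetric per-tensor quantization, function names, tensor sizes, and bit-error rate are illustrative assumptions, not the exact setup of any of the listed papers.

```python
import torch

def quantize_int8(w: torch.Tensor):
    """Symmetric per-tensor 8-bit quantization (illustrative scheme)."""
    scale = w.abs().max() / 127.0
    q = torch.clamp(torch.round(w / scale), -128, 127).to(torch.int8)
    return q, scale

def inject_bit_flips(q: torch.Tensor, ber: float, seed: int = 0):
    """Flip each of the 8 stored bits independently with probability `ber`,
    mimicking random bit-flip faults in the weight memory."""
    gen = torch.Generator().manual_seed(seed)
    bits = q.view(torch.uint8)          # reinterpret the two's-complement bytes
    flip_mask = torch.zeros_like(bits)
    for b in range(8):
        flip_mask |= (torch.rand(bits.shape, generator=gen) < ber).to(torch.uint8) << b
    return (bits ^ flip_mask).view(torch.int8)

# Hypothetical usage: corrupt one layer's quantized weights and inspect the damage.
w = torch.randn(64, 64)                 # stand-in for a layer's weight tensor
q, scale = quantize_int8(w)
q_faulty = inject_bit_flips(q, ber=1e-3)
print((q_faulty != q).float().mean().item())                        # fraction of corrupted weights
print(((q_faulty.float() - q.float()) * scale).abs().max().item())  # worst-case weight error
```

Training against faults in the RandBET spirit would amount to applying such corruption to the quantized weights during training; stuck-at faults can be modeled analogously by OR-ing (stuck-at-1) or AND-ing (stuck-at-0) fixed masks instead of XOR-ing a random one.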
This list is automatically generated from the titles and abstracts of the papers on this site.