RescueSNN: Enabling Reliable Executions on Spiking Neural Network
Accelerators under Permanent Faults
- URL: http://arxiv.org/abs/2304.04041v1
- Date: Sat, 8 Apr 2023 15:24:57 GMT
- Title: RescueSNN: Enabling Reliable Executions on Spiking Neural Network
Accelerators under Permanent Faults
- Authors: Rachmad Vidya Wicaksana Putra, Muhammad Abdullah Hanif, Muhammad
Shafique
- Abstract summary: RescueSNN is a novel methodology to mitigate permanent faults in the compute engine of SNN chips.
RescueSNN improves accuracy by up to 80% while keeping the throughput reduction below 25% at high fault rates.
- Score: 15.115813664357436
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: To maximize the performance and energy efficiency of Spiking Neural Network
(SNN) processing on resource-constrained embedded systems, specialized hardware
accelerators/chips are employed. However, these SNN chips may suffer from
permanent faults which can affect the functionality of weight memory and neuron
behavior, thereby causing potentially significant accuracy degradation and
system malfunctioning. Such permanent faults may come from manufacturing
defects during the fabrication process, and/or from device/transistor damages
(e.g., due to wear out) during the run-time operation. However, the impact of
permanent faults in SNN chips and the respective mitigation techniques have not
been thoroughly investigated yet. Toward this, we propose RescueSNN, a novel
methodology to mitigate permanent faults in the compute engine of SNN chips
without requiring additional retraining, thereby significantly cutting down the
design time and retraining costs, while maintaining the throughput and quality.
The key ideas of our RescueSNN methodology are (1) analyzing the
characteristics of SNN under permanent faults; (2) leveraging this analysis to
improve the SNN fault-tolerance through effective fault-aware mapping (FAM);
and (3) devising lightweight hardware enhancements to support FAM. Our FAM
technique leverages the fault map of SNN compute engine for (i) minimizing
weight corruption when mapping weight bits on the faulty memory cells, and (ii)
selectively employing faulty neurons that do not cause significant accuracy
degradation to maintain accuracy and throughput, while considering the SNN
operations and processing dataflow. The experimental results show that our
RescueSNN improves accuracy by up to 80% while maintaining the throughput
reduction below 25% at a high fault rate (e.g., faults in 50% of the potential
fault locations), as compared to running SNNs on the faulty chip without mitigation.
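The first part of FAM, placing weight bits so that faulty memory cells corrupt only low-significance bits, can be sketched as follows. This is a minimal, hypothetical illustration: the rotation-based placement and the helper names `corruption_cost` and `fault_aware_rotation` are assumptions for exposition, not the paper's implementation.

```python
BITS = 8  # 8-bit fixed-point weights; bit 7 is the MSB

def corruption_cost(faulty_cells, rot):
    """Worst-case weight corruption if weight bit b is stored in
    memory cell (b + rot) % BITS: sum of the significances of the
    bits that land on faulty cells."""
    return sum(2 ** ((c - rot) % BITS) for c in faulty_cells)

def fault_aware_rotation(faulty_cells):
    """Pick the bit rotation that minimizes corruption, i.e. pushes
    faulty cells onto the least-significant bit positions."""
    return min(range(BITS), key=lambda r: corruption_cost(faulty_cells, r))

faulty_cells = [7]  # fault map for one row: stuck-at fault in the MSB cell
rot = fault_aware_rotation(faulty_cells)
print(rot, corruption_cost(faulty_cells, rot))  # best rotation hits only the LSB
```

Rotating a row's bit order is one cheap placement that lightweight hardware support could realize; the paper's actual FAM additionally considers the SNN operations and dataflow and selectively employs only those faulty neurons that do not cause significant accuracy degradation.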
Related papers
- Scalable Mechanistic Neural Networks [52.28945097811129]
We propose an enhanced neural network framework designed for scientific machine learning applications involving long temporal sequences.
By reformulating the original Mechanistic Neural Network (MNN), we reduce the computational time and space complexities from cubic and quadratic in the sequence length, respectively, to linear.
Extensive experiments demonstrate that the resulting Scalable MNN (S-MNN) matches the original MNN in precision while substantially reducing computational resources.
arXiv Detail & Related papers (2024-10-08T14:27:28Z) - Towards Low-latency Event-based Visual Recognition with Hybrid Step-wise Distillation Spiking Neural Networks [50.32980443749865]
Spiking neural networks (SNNs) have garnered significant attention for their low power consumption and high biological plausibility.
Current SNNs struggle to balance accuracy and latency in neuromorphic datasets.
We propose a Hybrid Step-wise Distillation (HSD) method, tailored for neuromorphic datasets.
arXiv Detail & Related papers (2024-09-19T06:52:34Z) - Enhancing Adversarial Robustness in SNNs with Sparse Gradients [46.15229142258264]
Spiking Neural Networks (SNNs) have attracted great attention for their energy-efficient operations and biologically inspired structures.
Existing techniques, whether adapted from ANNs or specifically designed for SNNs, exhibit limitations in training SNNs or defending against strong attacks.
We propose a novel approach to enhance the robustness of SNNs through gradient sparsity regularization.
arXiv Detail & Related papers (2024-05-30T05:39:27Z) - Improving Reliability of Spiking Neural Networks through Fault Aware
Threshold Voltage Optimization [0.0]
Spiking neural networks (SNNs) have made breakthroughs in computer vision by lending themselves to neuromorphic hardware.
Systolic-array SNN accelerators (systolicSNNs) have been proposed recently, but their reliability is still a major concern.
We present a novel fault mitigation method, i.e., fault-aware threshold voltage optimization in retraining (FalVolt).
arXiv Detail & Related papers (2023-01-12T19:30:21Z) - FlatENN: Train Flat for Enhanced Fault Tolerance of Quantized Deep
Neural Networks [0.03807314298073299]
We investigate the impact of bit-flip and stuck-at faults on activation-sparse quantized DNNs (QDNNs)
We show that a high level of activation sparsity comes at the cost of larger vulnerability to faults.
We propose the mitigation of the impact of faults by employing a sharpness-aware quantization scheme.
arXiv Detail & Related papers (2022-12-29T06:06:14Z) - Fault-Aware Design and Training to Enhance DNNs Reliability with
Zero-Overhead [67.87678914831477]
Deep Neural Networks (DNNs) enable a wide series of technological advancements.
Recent findings indicate that transient hardware faults may dramatically corrupt the model's predictions.
In this work, we propose to tackle the reliability issue both at training and model design time.
arXiv Detail & Related papers (2022-05-28T13:09:30Z) - SoftSNN: Low-Cost Fault Tolerance for Spiking Neural Network
Accelerators under Soft Errors [15.115813664357436]
SoftSNN is a novel methodology to mitigate soft errors in the weight registers (synapses) and neurons of SNN accelerators without re-execution.
For a 900-neuron network with even a high fault rate, our SoftSNN maintains the accuracy degradation below 3%, while reducing latency and energy by up to 3x and 2.3x respectively.
arXiv Detail & Related papers (2022-03-10T18:20:28Z) - ReSpawn: Energy-Efficient Fault-Tolerance for Spiking Neural Networks
considering Unreliable Memories [14.933137030206286]
Spiking neural networks (SNNs) have shown potential for low-energy operation with unsupervised learning capabilities.
They may suffer from accuracy degradation if their processing is performed in the presence of hardware-induced faults in memories.
We propose ReSpawn, a novel framework for mitigating the negative impacts of faults in both the off-chip and on-chip memories.
arXiv Detail & Related papers (2021-08-23T16:17:33Z) - Random and Adversarial Bit Error Robustness: Energy-Efficient and Secure
DNN Accelerators [105.60654479548356]
We show that a combination of robust fixed-point quantization, weight clipping, as well as random bit error training (RandBET) improves robustness against random or adversarial bit errors in quantized DNN weights significantly.
This leads to high energy savings for low-voltage operation as well as low-precision quantization, but also improves security of DNN accelerators.
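The bit-error-injection step that random bit error training relies on can be sketched as follows. This is a minimal illustration under assumed details: the symmetric quantization scheme and the helper names `quantize` and `inject_bit_errors` are hypothetical, not the paper's code.

```python
import numpy as np

rng = np.random.default_rng(0)

def quantize(w, bits=8):
    """Symmetric fixed-point quantization to signed integers."""
    scale = np.abs(w).max() / (2 ** (bits - 1) - 1)
    q = np.clip(np.round(w / scale), -(2 ** (bits - 1)), 2 ** (bits - 1) - 1)
    return q.astype(np.int32), scale

def inject_bit_errors(q, p, bits=8):
    """Flip each stored bit independently with probability p, emulating
    low-voltage memory bit errors on the two's-complement representation
    of the quantized weights."""
    u = q.astype(np.uint32) & ((1 << bits) - 1)    # keep the low `bits` bits
    flips = rng.random(q.shape + (bits,)) < p      # per-bit coin flips
    mask = (flips * (1 << np.arange(bits))).sum(axis=-1).astype(np.uint32)
    u ^= mask
    signed = u.astype(np.int32)
    signed[signed >= 2 ** (bits - 1)] -= 2 ** bits  # back to signed range
    return signed

# A RandBET-style training step would run the forward pass with
# perturbed weights so the network learns to tolerate the flips:
w = np.array([0.5, -1.0, 0.25])
q, scale = quantize(w)
w_noisy = inject_bit_errors(q, p=0.01) * scale
```

Training against such injected errors is what lets the accelerator later operate at lower voltage (where real bit errors appear) without a large accuracy loss.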
arXiv Detail & Related papers (2021-04-16T19:11:14Z) - Exploring Fault-Energy Trade-offs in Approximate DNN Hardware
Accelerators [2.9649783577150837]
We present an extensive layer-wise and bit-wise fault resilience and energy analysis of different AxDNNs.
Our results demonstrate that fault resilience in AxDNNs trades off with energy efficiency.
arXiv Detail & Related papers (2021-01-08T05:52:12Z) - Bit Error Robustness for Energy-Efficient DNN Accelerators [93.58572811484022]
We show that a combination of robust fixed-point quantization, weight clipping, and random bit error training (RandBET) improves robustness against random bit errors.
This leads to high energy savings from both low-voltage operation as well as low-precision quantization.
arXiv Detail & Related papers (2020-06-24T18:23:10Z)
This list is automatically generated from the titles and abstracts of the papers in this site.