Mitigating multiple single-event upsets during deep neural network inference using fault-aware training
- URL: http://arxiv.org/abs/2502.09374v1
- Date: Thu, 13 Feb 2025 14:43:22 GMT
- Title: Mitigating multiple single-event upsets during deep neural network inference using fault-aware training
- Authors: Toon Vinck, Naïn Jonckers, Gert Dekkers, Jeffrey Prinzie, Peter Karsmakers
- Abstract summary: Deep neural networks (DNNs) are increasingly used in safety-critical applications.
This study analyses the impact of multiple single-bit single-event upsets in DNNs by performing fault injection at the level of a model.
A fault-aware training (FAT) methodology is proposed that improves the DNNs' robustness to faults without any modification to the hardware.
- Abstract: Deep neural networks (DNNs) are increasingly used in safety-critical applications. Reliable fault analysis and mitigation are essential to ensure their functionality in harsh environments with high radiation levels. This study analyses the impact of multiple single-bit single-event upsets in DNNs by performing fault injection at the level of a DNN model. Additionally, a fault-aware training (FAT) methodology is proposed that improves the DNNs' robustness to faults without any modification to the hardware. Experimental results show that the FAT methodology improves the tolerance to faults by up to a factor of 3.
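
To make the idea of model-level fault injection concrete, here is a minimal sketch of flipping random single bits in a weight tensor. It is an illustration only, not the authors' implementation: the function name `inject_bit_flips`, the float32 weight representation, and the uniform choice of which bit to flip are all assumptions that the abstract does not confirm.

```python
import numpy as np


def inject_bit_flips(weights, n_flips, rng):
    """Return a float32 copy of `weights` with `n_flips` random single-bit flips.

    Hypothetical helper: each flip picks one element and one of its 32 bits
    uniformly at random. Drawing the same (element, bit) pair twice would
    undo the earlier flip, which is acceptable for a sketch.
    """
    # Work on a private, contiguous float32 copy so the input is left untouched.
    flat = np.ascontiguousarray(weights, dtype=np.float32).ravel().copy()
    # Reinterpret the same memory as unsigned integers to reach the raw bits.
    bits = flat.view(np.uint32)

    elements = rng.integers(0, bits.size, size=n_flips)
    positions = rng.integers(0, 32, size=n_flips).astype(np.uint32)
    for element, position in zip(elements, positions):
        bits[element] ^= np.uint32(1) << position  # toggle exactly one bit

    return flat.reshape(np.shape(weights))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    clean = rng.standard_normal((4, 8)).astype(np.float32)
    faulty = inject_bit_flips(clean, n_flips=3, rng=rng)
    print("elements changed:", int(np.count_nonzero(clean != faulty)))
```

One plausible way to use such a helper during training is to run the forward pass with faulty copies of the weights so that the loss reflects the corrupted behaviour; whether the paper's FAT methodology does this is not stated in the abstract.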
Related papers
- BDefects4NN: A Backdoor Defect Database for Controlled Localization Studies in Neural Networks [65.666913051617]
We introduce BDefects4NN, the first backdoor defect database for localization studies.
BDefects4NN provides labeled backdoor-defected DNNs at the neuron granularity and enables controlled localization studies of defect root causes.
We conduct experiments on evaluating six fault localization criteria and two defect repair techniques, which show limited effectiveness for backdoor defects.
arXiv Detail & Related papers (2024-12-01T09:52:48Z) - DFA-GNN: Forward Learning of Graph Neural Networks by Direct Feedback Alignment [57.62885438406724]
Graph neural networks are recognized for their strong performance across various applications.
Backpropagation (BP) has limitations that challenge its biological plausibility and affect the efficiency, scalability and parallelism of training neural networks for graph-based tasks.
We propose DFA-GNN, a novel forward learning framework tailored for GNNs with a case study of semi-supervised learning.
arXiv Detail & Related papers (2024-06-04T07:24:51Z) - Enhancing Fault Resilience of QNNs by Selective Neuron Splitting [1.1091582432763736]
Quantized Neural Networks (QNNs) have emerged to tackle the complexity of Deep Neural Networks (DNNs).
In this paper, a recent analytical resilience assessment method is adapted for QNNs to identify critical neurons based on a Neuron Vulnerability Factor (NVF).
A novel method for splitting the critical neurons is proposed that enables the design of a Lightweight Correction Unit (LCU) in the accelerator without redesigning its computational part.
arXiv Detail & Related papers (2023-06-16T17:11:55Z) - RescueSNN: Enabling Reliable Executions on Spiking Neural Network Accelerators under Permanent Faults [15.115813664357436]
RescueSNN is a novel methodology to mitigate permanent faults in the compute engine of SNN chips.
RescueSNN improves accuracy by up to 80% while keeping the throughput reduction below 25% at high fault rates.
arXiv Detail & Related papers (2023-04-08T15:24:57Z) - FlatENN: Train Flat for Enhanced Fault Tolerance of Quantized Deep
Neural Networks [0.03807314298073299]
We investigate the impact of bit-flip and stuck-at faults on activation-sparse quantized DNNs (QDNNs)
We show that a high level of activation sparsity comes at the cost of larger vulnerability to faults.
We propose the mitigation of the impact of faults by employing a sharpness-aware quantization scheme.
arXiv Detail & Related papers (2022-12-29T06:06:14Z) - Fault-Aware Design and Training to Enhance DNNs Reliability with
Zero-Overhead [67.87678914831477]
Deep Neural Networks (DNNs) enable a wide series of technological advancements.
Recent findings indicate that transient hardware faults may corrupt the model's predictions dramatically.
In this work, we propose to tackle the reliability issue both at training and model design time.
arXiv Detail & Related papers (2022-05-28T13:09:30Z) - FitAct: Error Resilient Deep Neural Networks via Fine-Grained
Post-Trainable Activation Functions [0.05249805590164901]
Deep neural networks (DNNs) are increasingly being deployed in safety-critical systems such as personal healthcare devices and self-driving cars.
In this paper, we propose FitAct, a low-cost approach to enhance the error resilience of DNNs by deploying fine-grained post-trainable activation functions.
arXiv Detail & Related papers (2021-12-27T07:07:50Z) - Spatial-Temporal-Fusion BNN: Variational Bayesian Feature Layer [77.78479877473899]
We design a spatial-temporal-fusion BNN for efficiently scaling BNNs to large models.
Compared to vanilla BNNs, our approach can greatly reduce the training time and the number of parameters, which helps scale BNNs efficiently.
arXiv Detail & Related papers (2021-12-12T17:13:14Z) - A Biased Graph Neural Network Sampler with Near-Optimal Regret [57.70126763759996]
Graph neural networks (GNN) have emerged as a vehicle for applying deep network architectures to graph and relational data.
In this paper, we build upon existing work and treat GNN neighbor sampling as a multi-armed bandit problem.
We introduce a newly-designed reward function that introduces some degree of bias designed to reduce variance and avoid unstable, possibly-unbounded payouts.
arXiv Detail & Related papers (2021-03-01T15:55:58Z) - GraN: An Efficient Gradient-Norm Based Detector for Adversarial and
Misclassified Examples [77.99182201815763]
Deep neural networks (DNNs) are vulnerable to adversarial examples and other data perturbations.
GraN is a time- and parameter-efficient method that is easily adaptable to any DNN.
GraN achieves state-of-the-art performance on numerous problem set-ups.
arXiv Detail & Related papers (2020-04-20T10:09:27Z) - A Low-cost Fault Corrector for Deep Neural Networks through Range
Restriction [1.8907108368038215]
Deep neural networks (DNNs) in safety-critical domains have engendered serious reliability concerns.
This work proposes Ranger, a low-cost fault corrector, which directly rectifies the faulty output due to transient faults without re-computation.
arXiv Detail & Related papers (2020-03-30T23:53:55Z)