Fault-Aware Design and Training to Enhance DNNs Reliability with
Zero-Overhead
- URL: http://arxiv.org/abs/2205.14420v1
- Date: Sat, 28 May 2022 13:09:30 GMT
- Title: Fault-Aware Design and Training to Enhance DNNs Reliability with
Zero-Overhead
- Authors: Niccolò Cavagnero, Fernando Dos Santos, Marco Ciccone, Giuseppe
Averta, Tatiana Tommasi, Paolo Rech
- Abstract summary: Deep Neural Networks (DNNs) enable a wide range of technological advancements.
Recent findings indicate that transient hardware faults can dramatically corrupt a model's predictions.
In this work, we propose to tackle the reliability issue both at training and at model design time.
- Score: 67.87678914831477
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Deep Neural Networks (DNNs) enable a wide range of technological
advancements, from clinical imaging to predictive industrial maintenance and
autonomous driving. However, recent findings indicate that transient hardware
faults can dramatically corrupt a model's predictions. For instance, the
radiation-induced misprediction probability can be high enough to impede the
safe deployment of DNN models at scale, urging the need for efficient and
effective hardening solutions. In this work, we propose to tackle the
reliability issue both at training and at model design time. First, we show
that vanilla models are highly affected by transient faults, which can induce
a performance drop of up to 37%. Hence, we provide three zero-overhead
solutions, based on DNN re-design and re-training, that can improve DNN
reliability to transient faults by up to one order of magnitude. We complement
our work with extensive ablation studies to quantify the performance gain of
each hardening component.
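The hardening levers named in the abstract are model re-design and re-training. A minimal sketch of that general recipe, assuming a toy MLP, a single overwritten weight as the transient-fault model, and an arbitrary injection schedule (none of which are the authors' exact setup): bound the activations so a corrupted value cannot grow without limit, and expose training to injected faults.

```python
import torch
import torch.nn as nn

class FaultAwareMLP(nn.Module):
    """Toy model with a bounded activation (ReLU6 in place of ReLU)."""
    def __init__(self, in_dim=32, hidden=64, classes=10):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(in_dim, hidden),
            nn.ReLU6(),  # bounded activation: caps fault-corrupted values at 6
            nn.Linear(hidden, classes),
        )

    def forward(self, x):
        return self.net(x)

def inject_transient_fault(model, magnitude=1e3):
    """Overwrite one random weight with a large value, mimicking a bit flip."""
    with torch.no_grad():
        mats = [p for p in model.parameters() if p.dim() > 1]
        w = mats[torch.randint(len(mats), (1,)).item()]
        idx = tuple(torch.randint(s, (1,)).item() for s in w.shape)
        old = w[idx].item()
        w[idx] = magnitude
    return w, idx, old

model = FaultAwareMLP()
opt = torch.optim.SGD(model.parameters(), lr=1e-2)
loss_fn = nn.CrossEntropyLoss()
x, y = torch.randn(128, 32), torch.randint(0, 10, (128,))

for step in range(100):
    faulty = step % 5 == 0
    if faulty:  # expose training to a transient fault every few steps
        w, idx, old = inject_transient_fault(model)
    loss = loss_fn(model(x), y)
    opt.zero_grad()
    loss.backward()
    opt.step()
    if faulty:  # the fault is transient: restore the weight afterwards
        with torch.no_grad():
            w[idx] = old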
Related papers
- Data-Driven Lipschitz Continuity: A Cost-Effective Approach to Improve Adversarial Robustness [47.9744734181236]
We explore the concept of Lipschitz continuity to certify the robustness of deep neural networks (DNNs) against adversarial attacks.
We propose a novel algorithm that remaps the input domain into a constrained range, reducing the Lipschitz constant and potentially enhancing robustness.
Our method achieves the best robust accuracy for CIFAR10, CIFAR100, and ImageNet datasets on the RobustBench leaderboard.
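A minimal sketch of the input-remapping idea, assuming inputs normalized to [0, 1]; the affine contraction and the chosen range below are illustrative, not the paper's specific algorithm.

```python
import torch

def remap_input(x, lo=0.05, hi=0.95):
    """Affinely squeeze inputs from [0, 1] into the narrower range [lo, hi].

    The map has slope (hi - lo) < 1, so composing it with a network multiplies
    any end-to-end Lipschitz bound by that contraction factor.
    """
    return lo + (hi - lo) * x

x = torch.rand(4, 3, 32, 32)   # a batch of images assumed to lie in [0, 1]
x_remapped = remap_input(x)    # feed this to the network instead of x
```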
arXiv Detail & Related papers (2024-06-28T03:10:36Z)
- Special Session: Approximation and Fault Resiliency of DNN Accelerators [0.9126382223122612]
This paper explores the approximation and fault resiliency of Deep Neural Network accelerators.
We propose to use approximate (AxC) arithmetic circuits to emulate errors in hardware without performing fault injection on the DNN.
We also propose a fine-grain analysis of fault resiliency by examining fault propagation and masking in networks.
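As a rough software analogue of this error emulation, one can corrupt an occasional product with a bit flip, standing in for the error behaviour of an approximate multiplier; the bit position, flip probability, and float32 reinterpretation trick below are illustrative assumptions, not the paper's circuits.

```python
import numpy as np

def bitflip_float32(x, bit):
    """Flip one bit of a float32 value, emulating a hardware-level error."""
    as_bits = np.array(x, dtype=np.float32).view(np.uint32)
    as_bits ^= np.uint32(1 << bit)
    return float(as_bits.view(np.float32))

def approx_multiply(a, b, flip_prob=1e-3, bit=20, rng=np.random.default_rng(0)):
    """Exact product, occasionally corrupted the way an AxC circuit might be."""
    out = np.float32(a) * np.float32(b)
    if rng.random() < flip_prob:
        out = bitflip_float32(out, bit)
    return float(out)

# Sweeping the flipped bit position (here bit 20 of the mantissa) lets one
# study how severe a circuit's error must be before accuracy degrades.
print(approx_multiply(1.5, 2.0))
```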
arXiv Detail & Related papers (2023-05-31T19:27:45Z)
- RescueSNN: Enabling Reliable Executions on Spiking Neural Network Accelerators under Permanent Faults [15.115813664357436]
RescueSNN is a novel methodology to mitigate permanent faults in the compute engine of SNN chips.
RescueSNN improves accuracy by up to 80% while keeping the throughput reduction below 25% under high fault rates.
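A hypothetical sketch of fault-aware mapping in this spirit: given a mask of permanently faulty processing elements, skip the work that would land on them rather than computing garbage. RescueSNN's actual mapping policy is more sophisticated; everything below is assumed for illustration only.

```python
import numpy as np

def map_weights_avoiding_faults(weights, pe_fault_map):
    """Zero out (skip) weights that would be assigned to faulty PEs.

    `pe_fault_map` is a boolean mask, True where a processing element is
    permanently broken. A simplified stand-in for a fault-aware mapping rule.
    """
    return np.where(pe_fault_map, 0.0, weights)

weights = np.random.default_rng(0).standard_normal((8, 8)).astype(np.float32)
fault_map = np.random.default_rng(1).random((8, 8)) < 0.1  # ~10% faulty PEs
print(map_weights_avoiding_faults(weights, fault_map))
```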
arXiv Detail & Related papers (2023-04-08T15:24:57Z)
- Uncertainty-aware deep learning for digital twin-driven monitoring: Application to fault detection in power lines [0.0]
Deep neural networks (DNNs) are often coupled with physics-based models or data-driven surrogate models to perform fault detection and health monitoring of systems in the low data regime.
These models can exhibit parametric uncertainty that propagates to the generated data.
In this article, we quantify the impact of both these sources of uncertainty on the performance of the DNN.
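A minimal Monte Carlo sketch of propagating parametric uncertainty: sample the uncertain model parameter, regenerate the data, and measure the spread a downstream DNN would face. The toy surrogate and Gaussian parameter model are assumptions, not the article's setup.

```python
import numpy as np

rng = np.random.default_rng(42)

def surrogate(x, theta):
    """Toy physics-based surrogate with one uncertain parameter theta."""
    return np.sin(theta * x)

def propagate_uncertainty(x, theta_mean=1.0, theta_std=0.05, n_samples=1000):
    """Sample theta, regenerate data, and report the induced spread."""
    thetas = rng.normal(theta_mean, theta_std, n_samples)
    ys = np.stack([surrogate(x, t) for t in thetas])
    return ys.mean(axis=0), ys.std(axis=0)

x = np.linspace(0, 2 * np.pi, 100)
mean, std = propagate_uncertainty(x)
print(f"max predictive std from parametric uncertainty: {std.max():.3f}")
```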
arXiv Detail & Related papers (2023-03-20T09:27:58Z)
- Linearity Grafting: Relaxed Neuron Pruning Helps Certifiable Robustness [172.61581010141978]
Certifiable robustness is a desirable property for adopting deep neural networks (DNNs) in safety-critical scenarios.
We propose a novel solution to strategically manipulate neurons, by "grafting" appropriate levels of linearity.
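A toy sketch of the grafting intuition: a per-neuron mask turns selected ReLUs into identity (linear) functions, which introduce no relaxation error during verification. The random mask below is illustrative; the paper selects which neurons to graft strategically.

```python
import torch
import torch.nn as nn

class GraftedReLU(nn.Module):
    """ReLU with a per-neuron mask: masked neurons become linear (identity)."""
    def __init__(self, num_features, linear_frac=0.3):
        super().__init__()
        # True entries are "grafted": the neuron passes its input through.
        mask = torch.rand(num_features) < linear_frac
        self.register_buffer("linear_mask", mask)

    def forward(self, x):
        return torch.where(self.linear_mask, x, torch.relu(x))

act = GraftedReLU(8)
print(act(torch.randn(2, 8)))
```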
arXiv Detail & Related papers (2022-06-15T22:42:29Z)
- Training High-Performance Low-Latency Spiking Neural Networks by Differentiation on Spike Representation [70.75043144299168]
Spiking Neural Network (SNN) is a promising energy-efficient AI model when implemented on neuromorphic hardware.
SNNs are hard to train efficiently, however, because spiking activity is non-differentiable.
We propose the Differentiation on Spike Representation (DSR) method, which achieves high performance.
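For context on why non-differentiability bites, the sketch below uses a generic surrogate-gradient trick (Heaviside spike forward, smooth derivative backward). DSR itself differentiates through a spike representation rather than using a surrogate, but the motivation, bypassing the zero-almost-everywhere spike gradient, is the same; the fast-sigmoid surrogate and threshold here are assumptions.

```python
import torch

class SpikeSurrogate(torch.autograd.Function):
    """Heaviside spike forward; smooth fast-sigmoid derivative backward."""

    @staticmethod
    def forward(ctx, v, threshold):
        ctx.save_for_backward(v)
        ctx.threshold = threshold
        return (v >= threshold).float()

    @staticmethod
    def backward(ctx, grad_output):
        (v,) = ctx.saved_tensors
        surrogate = 1.0 / (1.0 + 10.0 * (v - ctx.threshold).abs()) ** 2
        return grad_output * surrogate, None  # no gradient for the threshold

v = torch.randn(5, requires_grad=True)
spikes = SpikeSurrogate.apply(v, 1.0)
spikes.sum().backward()
print(v.grad)  # nonzero even though the spike itself is a step function
```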
arXiv Detail & Related papers (2022-05-01T12:44:49Z)
- Enhanced physics-constrained deep neural networks for modeling vanadium redox flow battery [62.997667081978825]
We propose an enhanced version of the physics-constrained deep neural network (PCDNN) approach to provide high-accuracy voltage predictions.
The ePCDNN can accurately capture the voltage response throughout the charge-discharge cycle, including the tail region of the voltage discharge curve.
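A generic sketch of the physics-constrained loss pattern (data misfit plus a penalty on violating a governing equation), assuming a toy ODE residual rather than the paper's battery model; `lam` and the residual function are illustrative.

```python
import torch

def physics_constrained_loss(model, x, y_measured, physics_residual, lam=0.5):
    """Data loss plus a penalty on violating a known physics relation.

    `physics_residual` should be ~0 when predictions obey the governing
    equation; `lam` weights the physics term against the data term.
    """
    y_pred = model(x)
    data_loss = torch.mean((y_pred - y_measured) ** 2)
    physics_loss = torch.mean(physics_residual(x, y_pred) ** 2)
    return data_loss + lam * physics_loss

# Toy governing equation: enforce dy/dx = cos(x), i.e. y behaves like sin(x).
model = torch.nn.Sequential(
    torch.nn.Linear(1, 32), torch.nn.Tanh(), torch.nn.Linear(32, 1)
)

def residual(x, y_pred):
    (dy_dx,) = torch.autograd.grad(y_pred.sum(), x, create_graph=True)
    return dy_dx - torch.cos(x)

x = torch.linspace(0, 3, 64).unsqueeze(1).requires_grad_(True)
loss = physics_constrained_loss(model, x, torch.sin(x.detach()), residual)
loss.backward()
```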
arXiv Detail & Related papers (2022-03-03T19:56:24Z)
- FitAct: Error Resilient Deep Neural Networks via Fine-Grained Post-Trainable Activation Functions [0.05249805590164901]
Deep neural networks (DNNs) are increasingly being deployed in safety-critical systems such as personal healthcare devices and self-driving cars.
In this paper, we propose FitAct, a low-cost approach to enhance the error resilience of DNNs by deploying fine-grained post-trainable activation functions.
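A minimal sketch of a post-trainable bounded activation in this spirit: a learnable clamp keeps bit-flipped activations from exploding, and the bound itself receives gradients so it can be tuned after training. Names and initialization below are assumptions, not FitAct's exact formulation.

```python
import torch
import torch.nn as nn

class TrainableClippedReLU(nn.Module):
    """ReLU with a learnable upper bound that can be tuned post-training."""
    def __init__(self, init_bound=6.0):
        super().__init__()
        self.bound = nn.Parameter(torch.tensor(init_bound))

    def forward(self, x):
        # Clip from below at 0 (ReLU) and from above at the learned bound.
        return torch.minimum(torch.relu(x), self.bound.abs())

act = TrainableClippedReLU()
y = act(torch.randn(4) * 10)
y.sum().backward()       # gradient flows into the bound itself
print(act.bound.grad)
```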
arXiv Detail & Related papers (2021-12-27T07:07:50Z)
- GOAT: GPU Outsourcing of Deep Learning Training With Asynchronous Probabilistic Integrity Verification Inside Trusted Execution Environment [0.0]
Machine learning models based on Deep Neural Networks (DNNs) are increasingly deployed in applications ranging from self-driving cars to COVID-19 treatment discovery.
To support the computational power necessary to learn a DNN, cloud environments with dedicated hardware support have emerged as critical infrastructure.
Various approaches have been developed to address the resulting integrity challenges, building on trusted execution environments (TEEs).
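A hedged sketch of probabilistic integrity verification: the TEE re-executes a random fraction of the outsourced steps and compares digests, so a server that corrupts a fraction f of N steps is caught with probability about 1 - (1 - s)^(fN) for sample rate s. The API below is hypothetical, not GOAT's.

```python
import hashlib
import random

def untrusted_gpu_step(weights, batch):
    """Stand-in for one training step outsourced to an untrusted GPU."""
    return [w + 0.01 * b for w, b in zip(weights, batch)]

def verify_probabilistically(step_inputs, batches, results, sample_rate=0.1, seed=0):
    """Inside the TEE: recompute a random subset of steps and compare digests."""
    rng = random.Random(seed)
    for i, (w, batch, claimed) in enumerate(zip(step_inputs, batches, results)):
        if rng.random() < sample_rate:
            recomputed = untrusted_gpu_step(w, batch)
            if (hashlib.sha256(repr(recomputed).encode()).digest()
                    != hashlib.sha256(repr(claimed).encode()).digest()):
                return False, i  # integrity violation detected at step i
    return True, None

# Honest server: every sampled step verifies.
w, step_inputs, results = [0.0, 0.0], [], []
batches = [[1.0, 2.0]] * 20
for b in batches:
    step_inputs.append(w)
    w = untrusted_gpu_step(w, b)
    results.append(w)
print(verify_probabilistically(step_inputs, batches, results))  # (True, None)
```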
arXiv Detail & Related papers (2020-10-17T20:09:05Z)
- Triple Wins: Boosting Accuracy, Robustness and Efficiency Together by Enabling Input-Adaptive Inference [119.19779637025444]
Deep networks have recently been suggested to face a trade-off between accuracy (on clean natural images) and robustness (on adversarially perturbed images).
This paper studies multi-exit networks associated with input-adaptive inference, showing their strong promise in achieving a "sweet point" in co-optimizing model accuracy, robustness and efficiency.
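A minimal sketch of input-adaptive, multi-exit inference: an intermediate classifier lets confident inputs exit early, so only hard inputs pay for the full depth. Thresholds and layer sizes are illustrative assumptions.

```python
import torch
import torch.nn as nn

class MultiExitNet(nn.Module):
    """Backbone with an intermediate classifier; easy inputs exit early."""
    def __init__(self, classes=10):
        super().__init__()
        self.block1 = nn.Sequential(nn.Linear(32, 64), nn.ReLU())
        self.exit1 = nn.Linear(64, classes)
        self.block2 = nn.Sequential(nn.Linear(64, 64), nn.ReLU())
        self.exit2 = nn.Linear(64, classes)

    def forward(self, x, threshold=0.9):
        h = self.block1(x)
        logits1 = self.exit1(h)
        conf1 = torch.softmax(logits1, dim=-1).max(dim=-1).values
        if conf1.min() >= threshold:       # all samples confident: stop early
            return logits1
        return self.exit2(self.block2(h))  # otherwise run the full depth

net = MultiExitNet()
print(net(torch.randn(4, 32)).shape)
```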
arXiv Detail & Related papers (2020-02-24T00:40:22Z)