Fault-Aware Design and Training to Enhance DNNs Reliability with
Zero-Overhead
- URL: http://arxiv.org/abs/2205.14420v1
- Date: Sat, 28 May 2022 13:09:30 GMT
- Title: Fault-Aware Design and Training to Enhance DNNs Reliability with
Zero-Overhead
- Authors: Niccolò Cavagnero, Fernando Dos Santos, Marco Ciccone, Giuseppe
Averta, Tatiana Tommasi, Paolo Rech
- Abstract summary: Deep Neural Networks (DNNs) enable a wide range of technological advancements.
Recent findings indicate that transient hardware faults can dramatically corrupt a model's predictions.
In this work, we propose to tackle the reliability issue both at training and at model design time.
- Score: 67.87678914831477
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Deep Neural Networks (DNNs) enable a wide range of technological
advancements, from clinical imaging to predictive industrial maintenance and
autonomous driving. However, recent findings indicate that transient hardware
faults can dramatically corrupt a model's predictions. For instance, the
radiation-induced misprediction probability can be high enough to impede the
safe deployment of DNN models at scale, urging the need for efficient and
effective hardening solutions. In this work, we propose to tackle the
reliability issue both at training and at model design time. First, we show
that vanilla models are highly affected by transient faults, which can induce
a performance drop of up to 37%. Hence, we provide three zero-overhead
solutions, based on DNN re-design and re-training, that can improve DNN
reliability to transient faults by up to one order of magnitude. We complement
our work with extensive ablation studies to quantify the performance gain of
each hardening component.
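The hardening levers named in the abstract are model re-design and re-training. A minimal sketch of that general recipe, assuming a toy MLP, a single overwritten weight as the transient-fault model, and an arbitrary injection schedule (none of which are the authors' exact setup): bound the activations so a corrupted value cannot grow without limit, and expose training to injected faults.

```python
import torch
import torch.nn as nn

class FaultAwareMLP(nn.Module):
    """Toy model with a bounded activation (ReLU6 in place of ReLU)."""
    def __init__(self, in_dim=32, hidden=64, classes=10):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(in_dim, hidden),
            nn.ReLU6(),  # bounded activation: caps fault-corrupted values at 6
            nn.Linear(hidden, classes),
        )

    def forward(self, x):
        return self.net(x)

def inject_transient_fault(model, magnitude=1e3):
    """Overwrite one random weight with a large value, mimicking a bit flip."""
    with torch.no_grad():
        mats = [p for p in model.parameters() if p.dim() > 1]
        w = mats[torch.randint(len(mats), (1,)).item()]
        idx = tuple(torch.randint(s, (1,)).item() for s in w.shape)
        old = w[idx].item()
        w[idx] = magnitude
    return w, idx, old

model = FaultAwareMLP()
opt = torch.optim.SGD(model.parameters(), lr=1e-2)
loss_fn = nn.CrossEntropyLoss()
x, y = torch.randn(128, 32), torch.randint(0, 10, (128,))

for step in range(100):
    faulty = step % 5 == 0
    if faulty:  # expose training to a transient fault every few steps
        w, idx, old = inject_transient_fault(model)
    loss = loss_fn(model(x), y)
    opt.zero_grad()
    loss.backward()
    opt.step()
    if faulty:  # the fault is transient: restore the weight afterwards
        with torch.no_grad():
            w[idx] = old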
Related papers
- Data-Driven Lipschitz Continuity: A Cost-Effective Approach to Improve Adversarial Robustness [47.9744734181236]
We explore the concept of Lipschitz continuity to certify the robustness of deep neural networks (DNNs) against adversarial attacks.
We propose a novel algorithm that remaps the input domain into a constrained range, reducing the Lipschitz constant and potentially enhancing robustness.
Our method achieves the best robust accuracy for CIFAR10, CIFAR100, and ImageNet datasets on the RobustBench leaderboard.
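A minimal sketch of the input-remapping idea, assuming inputs normalized to [0, 1]; the affine contraction and the chosen range below are illustrative, not the paper's specific algorithm.

```python
import torch

def remap_input(x, lo=0.05, hi=0.95):
    """Affinely squeeze inputs from [0, 1] into the narrower range [lo, hi].

    The map has slope (hi - lo) < 1, so composing it with a network multiplies
    any end-to-end Lipschitz bound by that contraction factor.
    """
    return lo + (hi - lo) * x

x = torch.rand(4, 3, 32, 32)   # a batch of images assumed to lie in [0, 1]
x_remapped = remap_input(x)    # feed this to the network instead of x
```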
arXiv Detail & Related papers (2024-06-28T03:10:36Z)
- Special Session: Approximation and Fault Resiliency of DNN Accelerators [0.9126382223122612]
This paper explores the approximation and fault resiliency of Deep Neural Network accelerators.
We propose to use approximate (AxC) arithmetic circuits to emulate errors in hardware without performing fault injection on the DNN.
We also propose a fine-grain analysis of fault resiliency by examining fault propagation and masking in networks.
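As a rough software analogue of this error emulation, one can corrupt an occasional product with a bit flip, standing in for the error behaviour of an approximate multiplier; the bit position, flip probability, and float32 reinterpretation trick below are illustrative assumptions, not the paper's circuits.

```python
import numpy as np

def bitflip_float32(x, bit):
    """Flip one bit of a float32 value, emulating a hardware-level error."""
    as_bits = np.array(x, dtype=np.float32).view(np.uint32)
    as_bits ^= np.uint32(1 << bit)
    return float(as_bits.view(np.float32))

def approx_multiply(a, b, flip_prob=1e-3, bit=20, rng=np.random.default_rng(0)):
    """Exact product, occasionally corrupted the way an AxC circuit might be."""
    out = np.float32(a) * np.float32(b)
    if rng.random() < flip_prob:
        out = bitflip_float32(out, bit)
    return float(out)

# Sweeping the flipped bit position (here bit 20 of the mantissa) lets one
# study how severe a circuit's error must be before accuracy degrades.
print(approx_multiply(1.5, 2.0))
```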
arXiv Detail & Related papers (2023-05-31T19:27:45Z)
- RescueSNN: Enabling Reliable Executions on Spiking Neural Network Accelerators under Permanent Faults [15.115813664357436]
RescueSNN is a novel methodology to mitigate permanent faults in the compute engine of SNN chips.
RescueSNN improves accuracy by up to 80% while keeping the throughput reduction below 25% under high fault rates.
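A hypothetical sketch of fault-aware mapping in this spirit: given a mask of permanently faulty processing elements, skip the work that would land on them rather than computing garbage. RescueSNN's actual mapping policy is more sophisticated; everything below is assumed for illustration only.

```python
import numpy as np

def map_weights_avoiding_faults(weights, pe_fault_map):
    """Zero out (skip) weights that would be assigned to faulty PEs.

    `pe_fault_map` is a boolean mask, True where a processing element is
    permanently broken. A simplified stand-in for a fault-aware mapping rule.
    """
    return np.where(pe_fault_map, 0.0, weights)

weights = np.random.default_rng(0).standard_normal((8, 8)).astype(np.float32)
fault_map = np.random.default_rng(1).random((8, 8)) < 0.1  # ~10% faulty PEs
print(map_weights_avoiding_faults(weights, fault_map))
```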
arXiv Detail & Related papers (2023-04-08T15:24:57Z)
- Uncertainty-aware deep learning for digital twin-driven monitoring: Application to fault detection in power lines [0.0]
Deep neural networks (DNNs) are often coupled with physics-based models or data-driven surrogate models to perform fault detection and health monitoring of systems in the low data regime.
These models can exhibit parametric uncertainty that propagates to the generated data.
In this article, we quantify the impact of both these sources of uncertainty on the performance of the DNN.
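A minimal Monte Carlo sketch of propagating parametric uncertainty: sample the uncertain model parameter, regenerate the data, and measure the spread a downstream DNN would face. The toy surrogate and Gaussian parameter model are assumptions, not the article's setup.

```python
import numpy as np

rng = np.random.default_rng(42)

def surrogate(x, theta):
    """Toy physics-based surrogate with one uncertain parameter theta."""
    return np.sin(theta * x)

def propagate_uncertainty(x, theta_mean=1.0, theta_std=0.05, n_samples=1000):
    """Sample theta, regenerate data, and report the induced spread."""
    thetas = rng.normal(theta_mean, theta_std, n_samples)
    ys = np.stack([surrogate(x, t) for t in thetas])
    return ys.mean(axis=0), ys.std(axis=0)

x = np.linspace(0, 2 * np.pi, 100)
mean, std = propagate_uncertainty(x)
print(f"max predictive std from parametric uncertainty: {std.max():.3f}")
```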
arXiv Detail & Related papers (2023-03-20T09:27:58Z)
- Linearity Grafting: Relaxed Neuron Pruning Helps Certifiable Robustness [172.61581010141978]
Certifiable robustness is a desirable property for adopting deep neural networks (DNNs) in safety-critical scenarios.
We propose a novel solution to strategically manipulate neurons, by "grafting" appropriate levels of linearity.
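A toy sketch of the grafting intuition: a per-neuron mask turns selected ReLUs into identity (linear) functions, which introduce no relaxation error during verification. The random mask below is illustrative; the paper selects which neurons to graft strategically.

```python
import torch
import torch.nn as nn

class GraftedReLU(nn.Module):
    """ReLU with a per-neuron mask: masked neurons become linear (identity)."""
    def __init__(self, num_features, linear_frac=0.3):
        super().__init__()
        # True entries are "grafted": the neuron passes its input through.
        mask = torch.rand(num_features) < linear_frac
        self.register_buffer("linear_mask", mask)

    def forward(self, x):
        return torch.where(self.linear_mask, x, torch.relu(x))

act = GraftedReLU(8)
print(act(torch.randn(2, 8)))
```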
arXiv Detail & Related papers (2022-06-15T22:42:29Z)
- Training High-Performance Low-Latency Spiking Neural Networks by Differentiation on Spike Representation [70.75043144299168]
Spiking Neural Network (SNN) is a promising energy-efficient AI model when implemented on neuromorphic hardware.
SNNs are hard to train efficiently, however, because spiking activity is non-differentiable.
We propose the Differentiation on Spike Representation (DSR) method, which achieves high performance.
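For context on why non-differentiability bites, the sketch below uses a generic surrogate-gradient trick (Heaviside spike forward, smooth derivative backward). DSR itself differentiates through a spike representation rather than using a surrogate, but the motivation, bypassing the zero-almost-everywhere spike gradient, is the same; the fast-sigmoid surrogate and threshold here are assumptions.

```python
import torch

class SpikeSurrogate(torch.autograd.Function):
    """Heaviside spike forward; smooth fast-sigmoid derivative backward."""

    @staticmethod
    def forward(ctx, v, threshold):
        ctx.save_for_backward(v)
        ctx.threshold = threshold
        return (v >= threshold).float()

    @staticmethod
    def backward(ctx, grad_output):
        (v,) = ctx.saved_tensors
        surrogate = 1.0 / (1.0 + 10.0 * (v - ctx.threshold).abs()) ** 2
        return grad_output * surrogate, None  # no gradient for the threshold

v = torch.randn(5, requires_grad=True)
spikes = SpikeSurrogate.apply(v, 1.0)
spikes.sum().backward()
print(v.grad)  # nonzero even though the spike itself is a step function
```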
arXiv Detail & Related papers (2022-05-01T12:44:49Z)
- Enhanced physics-constrained deep neural networks for modeling vanadium redox flow battery [62.997667081978825]
We propose an enhanced version of the physics-constrained deep neural network (PCDNN) approach to provide high-accuracy voltage predictions.
The ePCDNN can accurately capture the voltage response throughout the charge-discharge cycle, including the tail region of the voltage discharge curve.
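A generic sketch of the physics-constrained loss pattern (data misfit plus a penalty on violating a governing equation), assuming a toy ODE residual rather than the paper's battery model; `lam` and the residual function are illustrative.

```python
import torch

def physics_constrained_loss(model, x, y_measured, physics_residual, lam=0.5):
    """Data loss plus a penalty on violating a known physics relation.

    `physics_residual` should be ~0 when predictions obey the governing
    equation; `lam` weights the physics term against the data term.
    """
    y_pred = model(x)
    data_loss = torch.mean((y_pred - y_measured) ** 2)
    physics_loss = torch.mean(physics_residual(x, y_pred) ** 2)
    return data_loss + lam * physics_loss

# Toy governing equation: enforce dy/dx = cos(x), i.e. y behaves like sin(x).
model = torch.nn.Sequential(
    torch.nn.Linear(1, 32), torch.nn.Tanh(), torch.nn.Linear(32, 1)
)

def residual(x, y_pred):
    (dy_dx,) = torch.autograd.grad(y_pred.sum(), x, create_graph=True)
    return dy_dx - torch.cos(x)

x = torch.linspace(0, 3, 64).unsqueeze(1).requires_grad_(True)
loss = physics_constrained_loss(model, x, torch.sin(x.detach()), residual)
loss.backward()
```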
arXiv Detail & Related papers (2022-03-03T19:56:24Z)
- FitAct: Error Resilient Deep Neural Networks via Fine-Grained Post-Trainable Activation Functions [0.05249805590164901]
Deep neural networks (DNNs) are increasingly being deployed in safety-critical systems such as personal healthcare devices and self-driving cars.
In this paper, we propose FitAct, a low-cost approach to enhance the error resilience of DNNs by deploying fine-grained post-trainable activation functions.
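A minimal sketch of a post-trainable bounded activation in this spirit: a learnable clamp keeps bit-flipped activations from exploding, and the bound itself receives gradients so it can be tuned after training. Names and initialization below are assumptions, not FitAct's exact formulation.

```python
import torch
import torch.nn as nn

class TrainableClippedReLU(nn.Module):
    """ReLU with a learnable upper bound that can be tuned post-training."""
    def __init__(self, init_bound=6.0):
        super().__init__()
        self.bound = nn.Parameter(torch.tensor(init_bound))

    def forward(self, x):
        # Clip from below at 0 (ReLU) and from above at the learned bound.
        return torch.minimum(torch.relu(x), self.bound.abs())

act = TrainableClippedReLU()
y = act(torch.randn(4) * 10)
y.sum().backward()       # gradient flows into the bound itself
print(act.bound.grad)
```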
arXiv Detail & Related papers (2021-12-27T07:07:50Z)
- GOAT: GPU Outsourcing of Deep Learning Training With Asynchronous Probabilistic Integrity Verification Inside Trusted Execution Environment [0.0]
Machine learning models based on Deep Neural Networks (DNNs) are increasingly deployed in applications ranging from self-driving cars to COVID-19 treatment discovery.
To support the computational power necessary to learn a DNN, cloud environments with dedicated hardware support have emerged as critical infrastructure.
Various approaches have been developed to address the resulting integrity challenges, building on trusted execution environments (TEEs).
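A hedged sketch of probabilistic integrity verification: the TEE re-executes a random fraction of the outsourced steps and compares digests, so a server that corrupts a fraction f of N steps is caught with probability about 1 - (1 - s)^(fN) for sample rate s. The API below is hypothetical, not GOAT's.

```python
import hashlib
import random

def untrusted_gpu_step(weights, batch):
    """Stand-in for one training step outsourced to an untrusted GPU."""
    return [w + 0.01 * b for w, b in zip(weights, batch)]

def verify_probabilistically(step_inputs, batches, results, sample_rate=0.1, seed=0):
    """Inside the TEE: recompute a random subset of steps and compare digests."""
    rng = random.Random(seed)
    for i, (w, batch, claimed) in enumerate(zip(step_inputs, batches, results)):
        if rng.random() < sample_rate:
            recomputed = untrusted_gpu_step(w, batch)
            if (hashlib.sha256(repr(recomputed).encode()).digest()
                    != hashlib.sha256(repr(claimed).encode()).digest()):
                return False, i  # integrity violation detected at step i
    return True, None

# Honest server: every sampled step verifies.
w, step_inputs, results = [0.0, 0.0], [], []
batches = [[1.0, 2.0]] * 20
for b in batches:
    step_inputs.append(w)
    w = untrusted_gpu_step(w, b)
    results.append(w)
print(verify_probabilistically(step_inputs, batches, results))  # (True, None)
```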
arXiv Detail & Related papers (2020-10-17T20:09:05Z)
- Triple Wins: Boosting Accuracy, Robustness and Efficiency Together by Enabling Input-Adaptive Inference [119.19779637025444]
Deep networks have recently been suggested to face a trade-off between accuracy (on clean natural images) and robustness (on adversarially perturbed images).
This paper studies multi-exit networks associated with input-adaptive inference, showing their strong promise in achieving a "sweet point" in co-optimizing model accuracy, robustness and efficiency.
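A minimal sketch of input-adaptive, multi-exit inference: an intermediate classifier lets confident inputs exit early, so only hard inputs pay for the full depth. Thresholds and layer sizes are illustrative assumptions.

```python
import torch
import torch.nn as nn

class MultiExitNet(nn.Module):
    """Backbone with an intermediate classifier; easy inputs exit early."""
    def __init__(self, classes=10):
        super().__init__()
        self.block1 = nn.Sequential(nn.Linear(32, 64), nn.ReLU())
        self.exit1 = nn.Linear(64, classes)
        self.block2 = nn.Sequential(nn.Linear(64, 64), nn.ReLU())
        self.exit2 = nn.Linear(64, classes)

    def forward(self, x, threshold=0.9):
        h = self.block1(x)
        logits1 = self.exit1(h)
        conf1 = torch.softmax(logits1, dim=-1).max(dim=-1).values
        if conf1.min() >= threshold:       # all samples confident: stop early
            return logits1
        return self.exit2(self.block2(h))  # otherwise run the full depth

net = MultiExitNet()
print(net(torch.randn(4, 32)).shape)
```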
arXiv Detail & Related papers (2020-02-24T00:40:22Z)