Exploration of Activation Fault Reliability in Quantized Systolic
Array-Based DNN Accelerators
- URL: http://arxiv.org/abs/2401.09509v1
- Date: Wed, 17 Jan 2024 12:55:17 GMT
- Title: Exploration of Activation Fault Reliability in Quantized Systolic
Array-Based DNN Accelerators
- Authors: Mahdi Taheri, Natalia Cherezova, Mohammad Saeed Ansari, Maksim
Jenihhin, Ali Mahani, Masoud Daneshtalab, Jaan Raik
- Abstract summary: This paper presents a comprehensive methodology for exploring and enabling a holistic assessment of the impact of quantization on model accuracy, activation fault reliability, and hardware efficiency.
A fully automated framework is introduced that is capable of applying various quantization-aware techniques, fault injection, and hardware implementation.
The experiments on established benchmarks demonstrate the analysis flow and the profound implications of quantization on reliability, hardware performance, and network accuracy.
- Score: 0.8796261172196743
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: The stringent reliability requirements for Deep Neural Network (DNN)
accelerators stand alongside the need to reduce the computational burden on
hardware platforms, i.e., to reduce energy consumption and execution time
and to increase the efficiency of DNN accelerators. Moreover, the growing
demand for specialized DNN accelerators with tailored requirements,
particularly for safety-critical applications, necessitates a comprehensive
design space exploration to enable the development of efficient and robust
accelerators that meet those requirements. Therefore, the trade-off between
hardware performance, i.e. area and delay, and the reliability of the DNN
accelerator implementation becomes critical and requires tools for analysis.
This paper presents a comprehensive methodology for exploring and enabling a
holistic assessment of the trilateral impact of quantization on model accuracy,
activation fault reliability, and hardware efficiency. A fully automated
framework is introduced that is capable of applying various quantization-aware
techniques, fault injection, and hardware implementation, thus enabling the
measurement of hardware parameters. Moreover, this paper proposes a novel
lightweight protection technique integrated within the framework to ensure the
dependable deployment of the final systolic-array-based FPGA implementation.
The experiments on established benchmarks demonstrate the analysis flow and the
profound implications of quantization on reliability, hardware performance, and
network accuracy, particularly concerning the transient faults in the network's
activations.
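The interaction between quantization and transient activation faults described in the abstract can be illustrated with a minimal Python sketch. This is not the paper's framework: the function names (`quantize`, `flip_bit`, `inject_fault`), the symmetric uniform quantizer, and the single-bit-flip fault model are all assumptions chosen for illustration. The sketch quantizes a list of activations to signed integers and flips one random bit to emulate a transient fault.

```python
import random


def quantize(values, bits=8, max_abs=1.0):
    """Symmetric uniform quantization of floats to signed integers (assumed scheme)."""
    qmax = (1 << (bits - 1)) - 1            # e.g. 127 for 8 bits
    scale = max_abs / qmax
    quantized = []
    for v in values:
        q = round(v / scale)
        quantized.append(max(-qmax - 1, min(qmax, q)))  # clamp to the signed range
    return quantized, scale


def flip_bit(q, bit, bits=8):
    """Flip one bit of a signed integer stored in two's complement."""
    mask = (1 << bits) - 1
    raw = (q & mask) ^ (1 << bit)           # unsigned view with the chosen bit flipped
    return raw - (1 << bits) if raw >= (1 << (bits - 1)) else raw


def inject_fault(quantized, bits=8, rng=random):
    """Return a copy of the activations with one randomly chosen bit flipped."""
    faulty = list(quantized)
    index = rng.randrange(len(faulty))
    bit = rng.randrange(bits)
    faulty[index] = flip_bit(faulty[index], bit, bits)
    return faulty, index, bit


if __name__ == "__main__":
    rng = random.Random(0)                  # fixed seed so runs are repeatable
    activations = [rng.uniform(0.0, 1.0) for _ in range(16)]
    for bits in (8, 4):
        q, scale = quantize(activations, bits)
        faulty, index, bit = inject_fault(q, bits, rng)
        error = abs(faulty[index] - q[index]) * scale
        print(f"{bits}-bit: flipped bit {bit} of activation {index}, "
              f"abs error {error:.4f}")
```

Under this assumed fault model, flipping bit b changes the stored integer by exactly 2^b, so the real-valued error is 2^b times the quantization scale. Narrower bit widths therefore leave fewer bit positions to corrupt but give each position a larger share of the representable range.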
Related papers
- Task-Oriented Real-time Visual Inference for IoVT Systems: A Co-design Framework of Neural Networks and Edge Deployment [61.20689382879937]
Task-oriented edge computing addresses this by shifting data analysis to the edge.
Existing methods struggle to balance high model performance with low resource consumption.
We propose a novel co-design framework to optimize neural network architecture.
arXiv Detail & Related papers (2024-10-29T19:02:54Z)
- Enhancing Dropout-based Bayesian Neural Networks with Multi-Exit on FPGA [20.629635991749808]
This paper proposes an algorithm and hardware co-design framework that can generate field-programmable gate array (FPGA)-based accelerators for efficient BayesNNs.
At the algorithm level, we propose novel multi-exit dropout-based BayesNNs with reduced computational and memory overheads.
At the hardware level, this paper introduces a transformation framework that can generate FPGA-based accelerators for the proposed efficient BayesNNs.
arXiv Detail & Related papers (2024-06-20T17:08:42Z)
- PNAS-MOT: Multi-Modal Object Tracking with Pareto Neural Architecture Search [64.28335667655129]
Multiple object tracking is a critical task in autonomous driving.
As tracking accuracy improves, neural networks become increasingly complex, posing challenges for their practical application in real driving scenarios due to the high level of latency.
In this paper, we explore the use of neural architecture search (NAS) methods to search for efficient architectures for tracking, aiming for low real-time latency while maintaining relatively high accuracy.
arXiv Detail & Related papers (2024-03-23T04:18:49Z)
- SAFFIRA: a Framework for Assessing the Reliability of Systolic-Array-Based DNN Accelerators [0.4391603054571586]
This paper introduces a novel hierarchical software-based hardware-aware fault injection strategy tailored for systolic array-based Deep Neural Network (DNN) accelerators.
arXiv Detail & Related papers (2024-03-05T13:17:09Z)
- Scaling #DNN-Verification Tools with Efficient Bound Propagation and Parallel Computing [57.49021927832259]
Deep Neural Networks (DNNs) are powerful tools that have shown extraordinary results in many scenarios.
However, their intricate designs and lack of transparency raise safety concerns when applied in real-world applications.
Formal Verification (FV) of DNNs has emerged as a valuable solution to provide provable guarantees on the safety aspect.
arXiv Detail & Related papers (2023-12-10T13:51:25Z)
- On-Chip Hardware-Aware Quantization for Mixed Precision Neural Networks [52.97107229149988]
We propose an On-Chip Hardware-Aware Quantization framework, performing hardware-aware mixed-precision quantization on deployed edge devices.
For efficiency metrics, we built an On-Chip Quantization Aware pipeline, which allows the quantization process to perceive the actual hardware efficiency of the quantization operator.
For accuracy metrics, we propose Mask-Guided Quantization Estimation technology to effectively estimate the accuracy impact of operators in the on-chip scenario.
arXiv Detail & Related papers (2023-09-05T04:39:34Z)
- Special Session: Approximation and Fault Resiliency of DNN Accelerators [0.9126382223122612]
This paper explores the approximation and fault resiliency of Deep Neural Network accelerators.
We propose to use approximate (AxC) arithmetic circuits to emulate errors in hardware without performing fault injection on the DNN.
We also propose a fine-grain analysis of fault resiliency by examining fault propagation and masking in networks.
arXiv Detail & Related papers (2023-05-31T19:27:45Z)
- DeepAxe: A Framework for Exploration of Approximation and Reliability Trade-offs in DNN Accelerators [0.9556128246747769]
The role of Deep Neural Networks (DNNs) in safety-critical applications is expanding.
DNNs are experiencing massive growth in terms of computation power.
This raises the need to improve the reliability of DNN accelerators.
arXiv Detail & Related papers (2023-03-14T20:42:38Z)
- Quantization-aware Interval Bound Propagation for Training Certifiably Robust Quantized Neural Networks [58.195261590442406]
We study the problem of training and certifying adversarially robust quantized neural networks (QNNs).
Recent work has shown that floating-point neural networks that have been verified to be robust can become vulnerable to adversarial attacks after quantization.
We present quantization-aware interval bound propagation (QA-IBP), a novel method for training robust QNNs.
arXiv Detail & Related papers (2022-11-29T13:32:38Z)
- FPGA-optimized Hardware acceleration for Spiking Neural Networks [69.49429223251178]
This work presents the development of a hardware accelerator for an SNN, with off-line training, applied to an image recognition task.
The design targets a Xilinx Artix-7 FPGA, using around 40% of the available hardware resources in total.
It reduces the classification time by three orders of magnitude, with a small 4.5% impact on accuracy, compared to its full-precision software counterpart.
arXiv Detail & Related papers (2022-01-18T13:59:22Z)
- High-Performance FPGA-based Accelerator for Bayesian Neural Networks [5.86877988129171]
This work proposes a novel FPGA-based hardware architecture to accelerate BNNs inferred through Monte Carlo Dropout.
Compared with other state-of-the-art BNN accelerators, the proposed accelerator can achieve up to 4 times higher energy efficiency and 9 times better compute efficiency.
arXiv Detail & Related papers (2021-05-12T06:20:44Z)