DeepAxe: A Framework for Exploration of Approximation and Reliability
Trade-offs in DNN Accelerators
- URL: http://arxiv.org/abs/2303.08226v1
- Date: Tue, 14 Mar 2023 20:42:38 GMT
- Title: DeepAxe: A Framework for Exploration of Approximation and Reliability
Trade-offs in DNN Accelerators
- Authors: Mahdi Taheri, Mohammad Riazati, Mohammad Hasan Ahmadilivani, Maksim
Jenihhin, Masoud Daneshtalab, Jaan Raik, Mikael Sjödin, and Björn Lisper
- Abstract summary: The role of Deep Neural Networks (DNNs) in safety-critical applications is expanding, while emerging DNNs demand rapidly growing computational power.
This raises the need to improve the reliability of DNN accelerators.
- Score: 0.9556128246747769
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: While the role of Deep Neural Networks (DNNs) in a wide range of
safety-critical applications is expanding, emerging DNNs demand rapidly growing
computational power. This raises the need to improve the reliability of DNN
accelerators while also reducing the computational burden on the hardware
platform, i.e., reducing energy consumption and execution time and increasing
the efficiency of DNN accelerators. Therefore, the trade-off between hardware
performance, i.e., area, power, and delay, and the reliability of the DNN
accelerator implementation becomes critical and requires tools for analysis.
In this paper, we propose DeepAxe, a framework for design space exploration of
FPGA-based DNN implementations that considers the trilateral impact of applying
functional approximation on accuracy, reliability, and hardware performance.
The framework enables selective approximation of reliability-critical DNNs and
provides a set of Pareto-optimal DNN implementation design space points for the
target resource utilization requirements. The design flow starts with a
pre-trained network in Keras, uses the high-level synthesis environment
DeepHLS, and results in a set of Pareto-optimal design space points as a guide
for the designer. The framework is demonstrated in a case study of custom and
state-of-the-art DNNs and datasets.
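As a rough illustration of the design-space filtering step, the sketch below shows one way Pareto-optimal points over accuracy, reliability, and resource cost might be extracted. The DesignPoint fields, values, and names are hypothetical, not DeepAxe's actual interface.

```python
from dataclasses import dataclass

@dataclass
class DesignPoint:
    name: str           # e.g. which layers were approximated (hypothetical)
    accuracy: float     # classification accuracy, higher is better
    reliability: float  # e.g. fraction of injected faults masked, higher is better
    luts: int           # FPGA resource utilization, lower is better

def dominates(a: DesignPoint, b: DesignPoint) -> bool:
    """True if `a` is no worse than `b` on every objective and
    strictly better on at least one."""
    no_worse = (a.accuracy >= b.accuracy and a.reliability >= b.reliability
                and a.luts <= b.luts)
    better = (a.accuracy > b.accuracy or a.reliability > b.reliability
              or a.luts < b.luts)
    return no_worse and better

def pareto_front(points):
    """Keep only the design points not dominated by any other point."""
    return [p for p in points
            if not any(dominates(q, p) for q in points if q is not p)]

candidates = [
    DesignPoint("exact",        accuracy=0.92, reliability=0.88, luts=12000),
    DesignPoint("approx-conv1", accuracy=0.91, reliability=0.90, luts=9000),
    DesignPoint("approx-all",   accuracy=0.85, reliability=0.84, luts=6000),
    DesignPoint("approx-bad",   accuracy=0.84, reliability=0.80, luts=7000),  # dominated
]
for p in pareto_front(candidates):
    print(p)
```

Only the first three points survive: the last is worse than "approx-all" on every objective, so it is filtered out.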
Related papers
- DNN Partitioning, Task Offloading, and Resource Allocation in Dynamic Vehicular Networks: A Lyapunov-Guided Diffusion-Based Reinforcement Learning Approach [49.56404236394601]
We formulate the problem of joint DNN partitioning, task offloading, and resource allocation in Vehicular Edge Computing.
Our objective is to minimize the DNN-based task completion time while guaranteeing system stability over time.
We propose a Multi-Agent Diffusion-based Deep Reinforcement Learning (MAD2RL) algorithm, incorporating the innovative use of diffusion models.
arXiv Detail & Related papers (2024-06-11T06:31:03Z)
- Exploration of Activation Fault Reliability in Quantized Systolic Array-Based DNN Accelerators [0.8796261172196743]
This paper presents a comprehensive methodology for holistically assessing the impact of quantization on model accuracy, activation fault reliability, and hardware efficiency.
A fully automated framework is introduced that is capable of applying various quantization-aware techniques, fault injection, and hardware implementation.
The experiments on established benchmarks demonstrate the analysis flow and the profound implications of quantization on reliability, hardware performance, and network accuracy.
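To make the activation-fault idea concrete, here is a minimal sketch (not the paper's framework) of flipping random bits in int8-quantized activations to model transient faults in an accelerator datapath. The quantization scale and bit-error rate are illustrative.

```python
import numpy as np

def quantize_int8(x, scale):
    """Uniform symmetric quantization of activations to int8."""
    return np.clip(np.round(x / scale), -128, 127).astype(np.int8)

def inject_bitflips(q, bit_error_rate, rng):
    """Flip random bits in an int8 activation tensor to model
    transient hardware faults."""
    flat = q.view(np.uint8).copy().ravel()
    n_bits = flat.size * 8
    n_flips = rng.binomial(n_bits, bit_error_rate)
    for idx in rng.integers(0, n_bits, size=n_flips):
        flat[idx // 8] ^= np.uint8(1 << (idx % 8))   # flip one bit
    return flat.view(np.int8).reshape(q.shape)

rng = np.random.default_rng(0)
acts = rng.normal(size=(4, 16)).astype(np.float32)
q = quantize_int8(acts, scale=0.05)
faulty = inject_bitflips(q, bit_error_rate=1e-3, rng=rng)
print("corrupted values:", int((faulty != q).sum()))
```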
arXiv Detail & Related papers (2024-01-17T12:55:17Z)
- Hardware-Aware DNN Compression via Diverse Pruning and Mixed-Precision Quantization [1.0235078178220354]
We propose an automated framework to compress Deep Neural Networks (DNNs) in a hardware-aware manner by jointly employing pruning and quantization.
Our framework achieves a 39% average energy reduction across datasets for only a 1.7% average accuracy loss, and significantly outperforms state-of-the-art approaches.
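A minimal sketch of the two compression primitives the summary mentions, assuming unstructured magnitude pruning and symmetric uniform fake-quantization; the paper's joint, hardware-aware optimization is more involved.

```python
import numpy as np

def magnitude_prune(w, sparsity):
    """Zero out the smallest-magnitude weights (unstructured pruning)."""
    k = int(w.size * sparsity)
    if k == 0:
        return w.copy()
    thresh = np.partition(np.abs(w).ravel(), k - 1)[k - 1]
    return np.where(np.abs(w) <= thresh, 0.0, w)

def quantize_uniform(w, n_bits):
    """Fake-quantize weights onto a symmetric uniform n_bits grid."""
    levels = 2 ** (n_bits - 1) - 1
    scale = np.abs(w).max() / levels if w.any() else 1.0
    return np.round(w / scale) * scale

rng = np.random.default_rng(1)
w = rng.normal(size=(64, 64))
w_pq = quantize_uniform(magnitude_prune(w, sparsity=0.7), n_bits=4)
print("sparsity:", float((w_pq == 0).mean()))
```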
arXiv Detail & Related papers (2023-12-23T18:50:13Z)
- Special Session: Approximation and Fault Resiliency of DNN Accelerators [0.9126382223122612]
This paper explores the approximation and fault resiliency of Deep Neural Network accelerators.
We propose to use approximate computing (AxC) arithmetic circuits to emulate errors in hardware without performing fault injection on the DNN.
We also propose a fine-grain analysis of fault resiliency by examining fault propagation and masking in networks.
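One way to emulate an approximate arithmetic circuit in software is to model a truncated multiplier; the sketch below uses simple operand truncation as an illustrative stand-in for the paper's AxC circuits, not its actual error models.

```python
def approx_mul8(a: int, b: int, drop_bits: int = 2) -> int:
    """Software model of an approximate 8-bit unsigned multiplier:
    the low `drop_bits` bits of each operand are zeroed before
    multiplying, a simple operand-truncation approximation."""
    a_t = (a >> drop_bits) << drop_bits
    b_t = (b >> drop_bits) << drop_bits
    return a_t * b_t

# Characterize the error against exact multiplication over the
# full 8-bit operand range, as one would for a real AxC circuit.
errs = [abs(a * b - approx_mul8(a, b))
        for a in range(256) for b in range(256)]
print("mean absolute error:", sum(errs) / len(errs))
```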
arXiv Detail & Related papers (2023-05-31T19:27:45Z)
- End-to-end codesign of Hessian-aware quantized neural networks for FPGAs and ASICs [49.358119307844035]
We develop an end-to-end workflow for the training and implementation of co-designed neural networks (NNs).
This makes efficient NN implementations in hardware accessible to nonexperts, in a single open-sourced workflow.
We demonstrate the workflow in a particle physics application involving trigger decisions that must operate at the 40 MHz collision rate of the Large Hadron Collider (LHC).
We implement an optimized mixed-precision NN for high-momentum particle jets in simulated LHC proton-proton collisions.
arXiv Detail & Related papers (2023-04-13T18:00:01Z)
- Fault-Aware Design and Training to Enhance DNNs Reliability with Zero-Overhead [67.87678914831477]
Deep Neural Networks (DNNs) enable a wide range of technological advancements.
Recent findings indicate that transient hardware faults may dramatically corrupt a model's predictions.
In this work, we propose to tackle the reliability issue both at training and model design time.
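The summary above is high-level; as one illustrative (and much simpler) instance of injecting faults at training time, the toy sketch below perturbs random weights in the forward pass of a linear model so that training sees fault-like disturbances. It is not the paper's actual method.

```python
import numpy as np

def fault_aware_step(W, x, y, lr, hit_prob, rng):
    """One illustrative training step where random weight
    perturbations are injected before the forward pass, so the
    model learns to tolerate transient faults."""
    hits = rng.random(W.shape) < hit_prob          # which weights are hit
    W_faulty = W + hits * rng.normal(scale=0.1, size=W.shape)
    err = W_faulty @ x - y                         # forward pass with faults
    grad = np.outer(err, x)                        # squared-loss gradient
    return W - lr * grad

rng = np.random.default_rng(4)
W = rng.normal(size=(2, 8))
x, y = rng.normal(size=8), rng.normal(size=2)
for _ in range(100):
    W = fault_aware_step(W, x, y, lr=0.01, hit_prob=0.05, rng=rng)
print("residual:", np.abs(W @ x - y).max())
```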
arXiv Detail & Related papers (2022-05-28T13:09:30Z)
- Comparative Analysis of Interval Reachability for Robust Implicit and Feedforward Neural Networks [64.23331120621118]
We use interval reachability analysis to obtain robustness guarantees for implicit neural networks (INNs).
INNs are a class of implicit learning models that use implicit equations as layers.
We show that our approach performs at least as well as, and generally better than, applying state-of-the-art interval bound propagation methods to INNs.
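For reference, the interval bound propagation baseline mentioned in the summary looks roughly like this for a single affine-plus-ReLU layer; this is a sketch of the baseline, not the paper's reachability analysis for implicit models.

```python
import numpy as np

def ibp_affine_relu(lo, hi, W, b):
    """Propagate an elementwise interval [lo, hi] through
    y = relu(W @ x + b) using interval arithmetic."""
    W_pos, W_neg = np.maximum(W, 0.0), np.minimum(W, 0.0)
    # A positive weight maps lower bound to lower bound; a negative
    # weight swaps the roles of lo and hi.
    y_lo = W_pos @ lo + W_neg @ hi + b
    y_hi = W_pos @ hi + W_neg @ lo + b
    # ReLU is monotone, so applying it to the bounds is sound.
    return np.maximum(y_lo, 0.0), np.maximum(y_hi, 0.0)

rng = np.random.default_rng(2)
W, b = rng.normal(size=(3, 4)), rng.normal(size=3)
x = rng.normal(size=4)
lo, hi = ibp_affine_relu(x - 0.1, x + 0.1, W, b)
assert (lo <= hi).all()
print(lo, hi)
```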
arXiv Detail & Related papers (2022-04-01T03:31:27Z)
- Hardware Approximate Techniques for Deep Neural Network Accelerators: A Survey [4.856755747052137]
Deep Neural Networks (DNNs) are very popular because of their high performance in various cognitive Machine Learning (ML) tasks.
Recent advancements in DNNs have achieved beyond-human accuracy in many tasks, but at the cost of high computational complexity.
This article provides a comprehensive survey and analysis of hardware approximation techniques for DNN accelerators.
arXiv Detail & Related papers (2022-03-16T16:33:13Z)
- High-Performance FPGA-based Accelerator for Bayesian Neural Networks [5.86877988129171]
This work proposes a novel FPGA-based hardware architecture to accelerate Bayesian Neural Network (BNN) inference based on Monte Carlo Dropout.
Compared with other state-of-the-art BNN accelerators, the proposed accelerator can achieve up to 4 times higher energy efficiency and 9 times better compute efficiency.
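A minimal sketch of Monte Carlo Dropout inference, the algorithm such an accelerator implements: dropout stays active at test time and predictions are averaged over repeated stochastic passes. The weights, dropout rate, and sample count here are illustrative.

```python
import numpy as np

def mc_dropout_forward(x, W1, W2, p, n_samples, rng):
    """Monte Carlo Dropout inference: average over `n_samples`
    stochastic forward passes; the sample spread estimates
    predictive uncertainty."""
    outs = []
    for _ in range(n_samples):
        h = np.maximum(W1 @ x, 0.0)
        mask = rng.random(h.shape) >= p        # drop units with prob p
        outs.append(W2 @ (h * mask / (1.0 - p)))
    outs = np.stack(outs)
    return outs.mean(axis=0), outs.std(axis=0)

rng = np.random.default_rng(3)
W1, W2 = rng.normal(size=(32, 8)), rng.normal(size=(2, 32))
mean, std = mc_dropout_forward(rng.normal(size=8), W1, W2,
                               p=0.2, n_samples=100, rng=rng)
print("prediction:", mean, "uncertainty:", std)
```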
arXiv Detail & Related papers (2021-05-12T06:20:44Z)
- Learning to Solve the AC-OPF using Sensitivity-Informed Deep Neural Networks [52.32646357164739]
We propose a deep neural network (DNN) to solve the AC optimal power flow (AC-OPF) problem.
The proposed sensitivity-informed DNN (SIDNN) is compatible with a broad range of OPF schemes.
It can be seamlessly integrated into other learning-to-OPF schemes.
arXiv Detail & Related papers (2021-03-27T00:45:23Z)
- PatDNN: Achieving Real-Time DNN Execution on Mobile Devices with Pattern-based Weight Pruning [57.20262984116752]
We introduce a new dimension, fine-grained pruning patterns inside the coarse-grained structures, revealing a previously unknown point in the design space.
With the higher accuracy enabled by fine-grained pruning patterns, the key insight is to use the compiler to regain and guarantee high hardware efficiency.
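A sketch of pattern-based pruning with a hypothetical four-pattern library: each 3x3 kernel is masked by the library pattern that preserves the most weight magnitude. The actual PatDNN pattern set and compiler support are not modeled here.

```python
import numpy as np

# Hypothetical 4-entry pattern library: each pattern keeps 4 of the
# 9 weights in a 3x3 convolution kernel.
PATTERNS = np.array([
    [[1, 1, 0], [1, 1, 0], [0, 0, 0]],
    [[0, 1, 1], [0, 1, 1], [0, 0, 0]],
    [[0, 0, 0], [1, 1, 0], [1, 1, 0]],
    [[0, 0, 0], [0, 1, 1], [0, 1, 1]],
], dtype=np.float32)

def pattern_prune(kernels):
    """For each 3x3 kernel, pick the library pattern that preserves
    the most weight magnitude, then mask the kernel with it."""
    out = np.empty_like(kernels)
    for i, k in enumerate(kernels):
        scores = [(np.abs(k) * p).sum() for p in PATTERNS]
        out[i] = k * PATTERNS[int(np.argmax(scores))]
    return out

rng = np.random.default_rng(5)
kernels = rng.normal(size=(16, 3, 3)).astype(np.float32)
pruned = pattern_prune(kernels)
print("kept fraction:", float((pruned != 0).mean()))  # 4/9 of weights remain
```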
arXiv Detail & Related papers (2020-01-01T04:52:07Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.