HCM: Hardware-Aware Complexity Metric for Neural Network Architectures
- URL: http://arxiv.org/abs/2004.08906v2
- Date: Sun, 26 Apr 2020 12:56:25 GMT
- Title: HCM: Hardware-Aware Complexity Metric for Neural Network Architectures
- Authors: Alex Karbachevsky, Chaim Baskin, Evgenii Zheltonozhskii, Yevgeny
Yermolin, Freddy Gabbay, Alex M. Bronstein, Avi Mendelson
- Abstract summary: This paper introduces a hardware-aware complexity metric that aims to assist system designers of neural network architectures.
We demonstrate how the proposed metric can help evaluate different design alternatives of neural network models on resource-restricted devices.
- Score: 6.556553154231475
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Convolutional Neural Networks (CNNs) have become common in many fields
including computer vision, speech recognition, and natural language processing.
Although CNN hardware accelerators are already included as part of many SoC
architectures, the task of achieving high accuracy on resource-restricted
devices is still considered challenging, mainly due to the vast number of
design parameters that need to be balanced to achieve an efficient solution.
Quantization techniques, when applied to the network parameters, lead to a
reduction of power and area and may also change the ratio between communication
and computation. As a result, some algorithmic solutions may suffer from lack
of memory bandwidth or computational resources and fail to achieve the expected
performance due to hardware constraints. Thus, the system designer and the
micro-architect need to understand at early development stages the impact of
their high-level decisions (e.g., the architecture of the CNN and the number of
bits used to represent its parameters) on the final product (e.g., the expected
power saving, area, and accuracy). Unfortunately, existing tools fall short of
supporting such decisions.
This paper introduces a hardware-aware complexity metric that aims to assist
the system designer of neural network architectures through the entire
project lifetime (especially at its early stages) by predicting the impact of
architectural and micro-architectural decisions on the final product. We
demonstrate how the proposed metric can help evaluate different design
alternatives of neural network models on resource-restricted devices such as
real-time embedded systems, and avoid design mistakes at early stages.
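The metric itself is not reproduced here, but the trade-off it targets can be illustrated with a rough, roofline-style accounting of compute versus memory traffic at a chosen bit width. The sketch below is a simplifying illustration, not the paper's HCM definition; the layer description and byte-counting rules are assumptions.

```python
# Illustrative sketch only: a roofline-style estimate of compute vs. memory
# traffic for a convolution layer under a given bit width. This is NOT the
# paper's HCM metric; the layer fields and formulas are simplifying
# assumptions made for illustration.
from dataclasses import dataclass

@dataclass
class ConvLayer:
    in_ch: int
    out_ch: int
    kernel: int   # square kernel size; stride 1 and "same" padding assumed
    out_h: int
    out_w: int

def conv_cost(layer: ConvLayer, weight_bits: int, act_bits: int):
    """Return (MAC operations, bytes moved, arithmetic intensity in MACs/byte)."""
    macs = layer.out_ch * layer.in_ch * layer.kernel ** 2 * layer.out_h * layer.out_w
    weight_bytes = layer.out_ch * layer.in_ch * layer.kernel ** 2 * weight_bits / 8
    # Count each input/output feature map once (ignores receptive-field reuse
    # and tiling effects -- a deliberate simplification).
    act_bytes = (layer.in_ch + layer.out_ch) * layer.out_h * layer.out_w * act_bits / 8
    bytes_moved = weight_bytes + act_bytes
    return macs, bytes_moved, macs / bytes_moved

layer = ConvLayer(in_ch=64, out_ch=128, kernel=3, out_h=56, out_w=56)
for wb, ab in [(32, 32), (8, 8), (4, 8)]:
    macs, traffic, intensity = conv_cost(layer, wb, ab)
    print(f"W{wb}A{ab}: {macs / 1e6:.0f} MMACs, {traffic / 1e6:.2f} MB, "
          f"{intensity:.0f} MACs/byte")
```

Lowering the weight and activation bit widths shrinks the memory traffic while the MAC count stays fixed, which is precisely the shift in the communication-to-computation ratio discussed in the abstract.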
Related papers
- Task-Oriented Real-time Visual Inference for IoVT Systems: A Co-design Framework of Neural Networks and Edge Deployment [61.20689382879937]
Task-oriented edge computing addresses this challenge by shifting data analysis to the edge.
Existing methods struggle to balance high model performance with low resource consumption.
We propose a novel co-design framework to optimize neural network architecture.
arXiv Detail & Related papers (2024-10-29T19:02:54Z) - Fault-tolerant resource estimation using graph-state compilation on a modular superconducting architecture [30.01663013636363]
Development of fault-tolerant quantum computers (FTQCs) is gaining increased attention within the quantum computing community.
We present a resource estimation framework and software tool that estimates the physical resources required to execute specific quantum algorithms.
This tool can predict the size, power consumption, and execution time of these algorithms as they approach utility scale.
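As a rough, generic illustration of this kind of estimate (not the paper's graph-state compilation flow), the sketch below sizes a surface-code machine from a logical qubit count and circuit depth; the error-suppression constants and overhead factors are textbook-style assumptions.

```python
# Generic back-of-the-envelope surface-code resource estimate; NOT the
# graph-state compilation flow described in the paper. Constants (error
# suppression factor, threshold) are illustrative assumptions.
def surface_code_estimate(logical_qubits: int, logical_depth: int,
                          p_phys: float = 1e-3, p_target: float = 1e-9,
                          cycle_time_s: float = 1e-6):
    p_th = 1e-2          # assumed threshold error rate
    # Find the smallest odd code distance d whose per-round logical error
    # rate, ~0.1 * (p_phys / p_th)**((d + 1) / 2), fits the error budget.
    budget = p_target / (logical_qubits * logical_depth)
    d = 3
    while 0.1 * (p_phys / p_th) ** ((d + 1) / 2) > budget:
        d += 2
    phys_qubits = logical_qubits * 2 * d * d      # data + ancilla patches
    runtime_s = logical_depth * d * cycle_time_s  # d rounds per logical cycle
    return d, phys_qubits, runtime_s

print(surface_code_estimate(logical_qubits=100, logical_depth=10**8))
```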
arXiv Detail & Related papers (2024-06-10T04:30:48Z) - A Proper Orthogonal Decomposition approach for parameters reduction of
Single Shot Detector networks [0.0]
We propose a dimensionality reduction framework based on Proper Orthogonal Decomposition, a classical model order reduction technique.
We apply this framework to the SSD300 architecture on the PASCAL VOC dataset, demonstrating a reduction of the network dimension and a remarkable speedup in fine-tuning the network in a transfer-learning context.
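For context, Proper Orthogonal Decomposition reduces to a truncated SVD when applied to a weight matrix; the sketch below shows that generic low-rank factorization step on synthetic weights and is not the paper's SSD300 pipeline.

```python
# A minimal sketch of POD/SVD-style parameter reduction on a single weight
# matrix: generic low-rank factorization, not the paper's exact SSD300 pipeline.
import numpy as np

def reduce_layer(W: np.ndarray, energy: float = 0.99):
    """Factor W (out x in) into two thin matrices that retain `energy`
    of the spectral energy, replacing one dense layer by two smaller ones."""
    U, S, Vt = np.linalg.svd(W, full_matrices=False)
    cum = np.cumsum(S ** 2) / np.sum(S ** 2)
    k = int(np.searchsorted(cum, energy)) + 1   # smallest rank meeting the target
    A = U[:, :k] * S[:k]                        # (out x k)
    B = Vt[:k, :]                               # (k x in)
    return A, B, k

rng = np.random.default_rng(0)
# Synthetic, approximately low-rank weights just for demonstration.
W = rng.standard_normal((1024, 64)) @ rng.standard_normal((64, 4096))
W += 0.01 * rng.standard_normal(W.shape)
A, B, k = reduce_layer(W)
print(f"rank {k}: {W.size} -> {A.size + B.size} parameters "
      f"({(A.size + B.size) / W.size:.1%})")
```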
arXiv Detail & Related papers (2022-07-27T14:43:14Z) - FPGA-optimized Hardware acceleration for Spiking Neural Networks [69.49429223251178]
This work presents the development of a hardware accelerator for an SNN, with off-line training, applied to an image recognition task.
The design targets a Xilinx Artix-7 FPGA, using around 40% of the available hardware resources in total.
It reduces classification time by three orders of magnitude compared to its full-precision software counterpart, with a small 4.5% impact on accuracy.
arXiv Detail & Related papers (2022-01-18T13:59:22Z) - An Adaptive Device-Edge Co-Inference Framework Based on Soft
Actor-Critic [72.35307086274912]
High-dimensional parameter models and large-scale mathematical calculations restrict execution efficiency, especially on Internet of Things (IoT) devices.
We propose a new Deep Reinforcement Learning (DRL) approach, Soft Actor-Critic for discrete (SAC-d), which generates the exit point and compressing bits via soft policy iterations.
Based on a latency- and accuracy-aware reward design, the computation adapts well to complex environments such as dynamic wireless channels and arbitrary processing, and is capable of supporting 5G URLLC.
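As a hedged illustration of what a latency- and accuracy-aware reward can look like (the weights, budget, and penalty form below are assumptions, not the paper's reward), consider:

```python
# Illustrative latency- and accuracy-aware reward, the kind of signal a DRL
# agent (e.g., a discrete soft actor-critic) could optimize when choosing an
# exit point and compression level. Weights and penalty form are assumptions.
def reward(accuracy: float, latency_ms: float,
           latency_budget_ms: float = 50.0, lam: float = 1.0) -> float:
    # Reward accuracy, penalize normalized latency, and add an extra penalty
    # when the latency budget is violated.
    over = max(0.0, latency_ms - latency_budget_ms) / latency_budget_ms
    return accuracy - lam * (latency_ms / latency_budget_ms) - 2.0 * over

# Example: an early exit with compressed features vs. a slower, more accurate path.
print(reward(accuracy=0.87, latency_ms=35.0))   # fast, slightly less accurate
print(reward(accuracy=0.92, latency_ms=80.0))   # accurate but over budget
```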
arXiv Detail & Related papers (2022-01-09T09:31:50Z) - ISyNet: Convolutional Neural Networks design for AI accelerator [0.0]
Current state-of-the-art architectures are found with neural architecture search (NAS) taking model complexity into account.
We propose a measure of hardware efficiency for neural architecture search spaces, the matrix efficiency measure (MEM); a search space comprising hardware-efficient operations; and a latency-aware scaling method.
We show the advantage of the designed architectures for the NPU devices on ImageNet and the generalization ability for the downstream classification and detection tasks.
arXiv Detail & Related papers (2021-09-04T20:57:05Z) - MS-RANAS: Multi-Scale Resource-Aware Neural Architecture Search [94.80212602202518]
We propose Multi-Scale Resource-Aware Neural Architecture Search (MS-RANAS).
We employ a one-shot architecture search approach to reduce the search cost.
We achieve state-of-the-art results in terms of the accuracy-speed trade-off.
arXiv Detail & Related papers (2020-09-29T11:56:01Z) - Automated Search for Resource-Efficient Branched Multi-Task Networks [81.48051635183916]
We propose a principled approach, rooted in differentiable neural architecture search, to automatically define branching structures in a multi-task neural network.
We show that our approach consistently finds high-performing branching structures within limited resource budgets.
arXiv Detail & Related papers (2020-08-24T09:49:19Z) - Multi-Objective Optimization for Size and Resilience of Spiking Neural
Networks [0.9449650062296823]
Neuromorphic computing architectures model Spiking Neural Networks (SNNs) in silicon.
We study Spiking Neural Networks in two neuromorphic architecture implementations with the goal of decreasing their size.
We propose a multi-objective fitness function to optimize the size and resiliency of the SNN.
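A generic sketch of such a multi-objective treatment follows; the size proxy, resilience measure, and weights are illustrative assumptions, not the paper's fitness function.

```python
# Generic multi-objective fitness sketch trading off SNN size against
# resilience (e.g., accuracy retained under injected faults). The weighting
# and the resilience proxy are illustrative assumptions.
from typing import NamedTuple

class Candidate(NamedTuple):
    n_neurons: int
    n_synapses: int
    resilience: float   # accuracy retained under injected faults, in [0, 1]

def dominates(a: Candidate, b: Candidate) -> bool:
    """True if `a` is no worse than `b` in both objectives and strictly better in one."""
    size_a, size_b = a.n_neurons + a.n_synapses, b.n_neurons + b.n_synapses
    return (size_a <= size_b and a.resilience >= b.resilience
            and (size_a < size_b or a.resilience > b.resilience))

def scalar_fitness(c: Candidate, w_size: float = 1e-4, w_res: float = 1.0) -> float:
    return w_res * c.resilience - w_size * (c.n_neurons + c.n_synapses)

pop = [Candidate(120, 900, 0.91), Candidate(80, 600, 0.88), Candidate(200, 1500, 0.92)]
best = max(pop, key=scalar_fitness)                       # weighted-sum view
pareto = [c for c in pop if not any(dominates(o, c) for o in pop)]  # Pareto view
print(best)
print(pareto)
```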
arXiv Detail & Related papers (2020-02-04T16:58:25Z) - Lightweight Residual Densely Connected Convolutional Neural Network [18.310331378001397]
Lightweight residual densely connected blocks are proposed to guarantee deep supervision, efficient gradient flow, and feature reuse in convolutional neural networks.
The proposed method decreases the cost of training and inference without requiring any special hardware or software.
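As a hedged illustration of the general idea, the PyTorch sketch below combines dense concatenation inside a block with a residual shortcut around it; the channel counts and layout are assumptions, not the paper's exact block design.

```python
# Generic residual densely connected block: dense feature concatenation inside
# the block plus a residual shortcut around it. Channel sizes are arbitrary.
import torch
import torch.nn as nn

class ResidualDenseBlock(nn.Module):
    def __init__(self, channels: int, growth: int = 16, layers: int = 3):
        super().__init__()
        self.convs = nn.ModuleList()
        in_ch = channels
        for _ in range(layers):
            self.convs.append(nn.Sequential(
                nn.Conv2d(in_ch, growth, kernel_size=3, padding=1, bias=False),
                nn.BatchNorm2d(growth),
                nn.ReLU(inplace=True)))
            in_ch += growth                      # dense: inputs accumulate
        self.fuse = nn.Conv2d(in_ch, channels, kernel_size=1, bias=False)

    def forward(self, x):
        feats = [x]
        for conv in self.convs:
            feats.append(conv(torch.cat(feats, dim=1)))
        return x + self.fuse(torch.cat(feats, dim=1))   # residual shortcut

block = ResidualDenseBlock(channels=32)
print(block(torch.randn(1, 32, 56, 56)).shape)   # torch.Size([1, 32, 56, 56])
```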
arXiv Detail & Related papers (2020-01-02T17:15:32Z) - PatDNN: Achieving Real-Time DNN Execution on Mobile Devices with
Pattern-based Weight Pruning [57.20262984116752]
We introduce a new dimension, fine-grained pruning patterns inside the coarse-grained structures, revealing a previously unknown point in design space.
With the higher accuracy enabled by fine-grained pruning patterns, the unique insight is to use the compiler to re-gain and guarantee high hardware efficiency.
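To make the idea of pattern-based kernel pruning concrete, the sketch below keeps, for every 3x3 kernel, the entries of whichever candidate pattern preserves the most weight magnitude; the pattern set is illustrative, not PatDNN's actual library, and the compiler code-generation step is omitted.

```python
# Minimal sketch of pattern-based kernel pruning: every 3x3 kernel keeps only
# the entries of one pattern from a small candidate set, chosen to preserve
# the most weight magnitude. The patterns below are illustrative only.
import numpy as np

PATTERNS = np.array([                      # 1 = keep, 0 = prune (4 weights kept)
    [[0, 1, 0], [1, 1, 0], [0, 1, 0]],
    [[0, 1, 0], [0, 1, 1], [0, 1, 0]],
    [[0, 1, 0], [1, 1, 1], [0, 0, 0]],
    [[0, 0, 0], [1, 1, 1], [0, 1, 0]],
], dtype=np.float32)

def apply_patterns(weights: np.ndarray) -> np.ndarray:
    """weights: (out_ch, in_ch, 3, 3) -> pattern-pruned copy."""
    pruned = np.empty_like(weights)
    for o in range(weights.shape[0]):
        for i in range(weights.shape[1]):
            k = np.abs(weights[o, i])
            scores = [(k * p).sum() for p in PATTERNS]     # magnitude retained
            best = PATTERNS[int(np.argmax(scores))]
            pruned[o, i] = weights[o, i] * best
    return pruned

w = np.random.default_rng(0).standard_normal((8, 4, 3, 3)).astype(np.float32)
print(f"kept fraction: {np.count_nonzero(apply_patterns(w)) / w.size:.2f}")
```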
arXiv Detail & Related papers (2020-01-01T04:52:07Z)