Related papers: HSCoNAS: Hardware-Software Co-Design of Efficient DNNs via Neural Architecture Search

URL: http://arxiv.org/abs/2103.08325v1
Date: Thu, 11 Mar 2021 12:21:21 GMT
Title: HSCoNAS: Hardware-Software Co-Design of Efficient DNNs via Neural Architecture Search
Authors: Xiangzhong Luo, Di Liu, Shuo Huai, and Weichen Liu
Abstract summary: We present a novel hardware-aware neural architecture search (NAS) framework, namely HSCoNAS, to automate the design of deep neural networks (DNNs). To accomplish this goal, we first propose an effective hardware performance modeling method to approximate the runtime latency of DNNs on target hardware. We also propose two novel techniques: dynamic channel scaling, to maximize the accuracy under the specified latency, and progressive space shrinking, to refine the search space towards target hardware.
Score: 6.522258468923919
License: http://creativecommons.org/licenses/by-nc-sa/4.0/
Abstract: In this paper, we present a novel multi-objective hardware-aware neural architecture search (NAS) framework, namely HSCoNAS, to automate the design of deep neural networks (DNNs) with high accuracy but low latency upon target hardware. To accomplish this goal, we first propose an effective hardware performance modeling method to approximate the runtime latency of DNNs on target hardware, which will be integrated into HSCoNAS to avoid the tedious on-device measurements. Besides, we propose two novel techniques, i.e., dynamic channel scaling to maximize the accuracy under the specified latency and progressive space shrinking to refine the search space towards target hardware as well as alleviate the search overheads. These two techniques jointly work to allow HSCoNAS to perform fine-grained and efficient explorations. Finally, an evolutionary algorithm (EA) is incorporated to conduct the architecture search. Extensive experiments on ImageNet are conducted upon diverse target hardware, i.e., GPU, CPU, and edge device to demonstrate the superiority of HSCoNAS over recent state-of-the-art approaches.
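
The abstract describes an evolutionary search that is guided by a latency model instead of on-device measurements. As a rough illustration only, the sketch below runs a latency-constrained evolutionary search over a toy search space. Everything in it is an assumption made for illustration and is not taken from the paper: the lookup table `LATENCY_LUT` and its values, the operator names, the `proxy_accuracy` stand-in, and the mutation scheme. It does not implement HSCoNAS's dynamic channel scaling or progressive space shrinking.

```python
import random

# Hypothetical per-layer latency table in milliseconds, keyed by
# (operator, channel scaling factor). All values are made up.
LATENCY_LUT = {
    ("conv3x3", 0.5): 0.8, ("conv3x3", 1.0): 1.5,
    ("conv5x5", 0.5): 1.3, ("conv5x5", 1.0): 2.4,
    ("skip", 0.5): 0.1, ("skip", 1.0): 0.1,
}
OPS = ["conv3x3", "conv5x5", "skip"]
SCALES = [0.5, 1.0]
NUM_LAYERS = 6


def predict_latency(arch):
    """Approximate network latency as the sum of per-layer table entries."""
    return sum(LATENCY_LUT[layer] for layer in arch)


def proxy_accuracy(arch):
    """Toy stand-in for accuracy; a real search would evaluate a trained network."""
    gain = {"conv3x3": 1.0, "conv5x5": 1.4, "skip": 0.0}
    return sum(gain[op] * scale for op, scale in arch)


def random_arch(rng):
    """Sample one architecture: an (operator, scale) choice per layer."""
    return tuple((rng.choice(OPS), rng.choice(SCALES)) for _ in range(NUM_LAYERS))


def mutate(arch, rng):
    """Resample the choice of one randomly selected layer."""
    layers = list(arch)
    layers[rng.randrange(len(layers))] = (rng.choice(OPS), rng.choice(SCALES))
    return tuple(layers)


def evolutionary_search(latency_budget, generations=30, population_size=20, seed=0):
    """Return the best architecture found whose predicted latency fits the budget."""
    cheapest = min(LATENCY_LUT, key=LATENCY_LUT.get)
    if predict_latency((cheapest,) * NUM_LAYERS) > latency_budget:
        raise ValueError("no architecture in the search space meets the budget")

    rng = random.Random(seed)
    # Initial population: rejection-sample architectures within the budget.
    population = []
    while len(population) < population_size:
        candidate = random_arch(rng)
        if predict_latency(candidate) <= latency_budget:
            population.append(candidate)

    for _ in range(generations):
        # Keep the better half as parents, refill with feasible mutants.
        population.sort(key=proxy_accuracy, reverse=True)
        parents = population[: population_size // 2]
        children = []
        while len(children) < population_size - len(parents):
            child = mutate(rng.choice(parents), rng)
            if predict_latency(child) <= latency_budget:
                children.append(child)
        population = parents + children

    return max(population, key=proxy_accuracy)


if __name__ == "__main__":
    best = evolutionary_search(latency_budget=8.0)
    print(best, predict_latency(best), proxy_accuracy(best))
```

The point of the sketch is that candidates are filtered by the predicted latency before they are ever scored, so no candidate needs to be timed on the device during the search.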

Related papers

Combining Neural Architecture Search and Automatic Code Optimization: A Survey [0.8796261172196743]
Two notable techniques are Hardware-aware Neural Architecture Search (HW-NAS) and Automatic Code Optimization (ACO). HW-NAS automatically designs accurate yet hardware-friendly neural networks, while ACO involves searching for the best compiler optimizations to apply to neural networks. This survey explores recent works that combine these two techniques within a single framework.
arXiv Detail & Related papers (2024-08-07T22:40:05Z)
Multi-objective Differentiable Neural Architecture Search [58.67218773054753]
We propose a novel NAS algorithm that encodes user preferences for the trade-off between performance and hardware metrics. Our method outperforms existing MOO NAS methods across a broad range of qualitatively different search spaces and datasets.
arXiv Detail & Related papers (2024-02-28T10:09:04Z)
Hardware Aware Evolutionary Neural Architecture Search using Representation Similarity Metric [12.52012450501367]
Hardware-aware Neural Architecture Search (HW-NAS) is a technique used to automatically design the architecture of a neural network for a specific task and target hardware. Evaluating the performance of candidate architectures is a key challenge in HW-NAS, as it requires significant computational resources. We propose an efficient hardware-aware evolution-based NAS approach called HW-EvRSNAS.
arXiv Detail & Related papers (2023-11-07T11:58:40Z)
MAPLE-X: Latency Prediction with Explicit Microprocessor Prior Knowledge [87.41163540910854]
Deep neural network (DNN) latency characterization is a time-consuming process. We propose MAPLE-X which extends MAPLE by incorporating explicit prior knowledge of hardware devices and DNN architecture latency.
arXiv Detail & Related papers (2022-05-25T11:08:20Z)
FPGA-optimized Hardware acceleration for Spiking Neural Networks [69.49429223251178]
This work presents the development of a hardware accelerator for an SNN, with off-line training, applied to an image recognition task. The design targets a Xilinx Artix-7 FPGA, using in total around 40% of the available hardware resources. It reduces the classification time by three orders of magnitude, with a small 4.5% impact on the accuracy, compared to its full-precision software counterpart.
arXiv Detail & Related papers (2022-01-18T13:59:22Z)
Algorithm and Hardware Co-design for Reconfigurable CNN Accelerator [3.1431240233552007]
Recent advances in algorithm-hardware co-design for deep neural networks (DNNs) have demonstrated their potential in automatically designing neural architectures and hardware designs. However, it is still a challenging optimization problem due to the expensive training cost and the time-consuming hardware implementation. We propose a novel three-phase co-design framework. The network and hardware configuration we found achieves 2% to 6% higher accuracy, 2x to 26x smaller latency, and 8.5x higher energy efficiency.
arXiv Detail & Related papers (2021-11-24T20:37:50Z)
NAS-FCOS: Efficient Search for Object Detection Architectures [113.47766862146389]
We propose an efficient method to obtain better object detectors by searching for the feature pyramid network (FPN) and the prediction head of a simple anchor-free object detector. With carefully designed search space, search algorithms, and strategies for evaluating network quality, we are able to find top-performing detection architectures within 4 days using 8 V100 GPUs.
arXiv Detail & Related papers (2021-10-24T12:20:04Z)
ISyNet: Convolutional Neural Networks design for AI accelerator [0.0]
Current state-of-the-art architectures are found with neural architecture search (NAS) taking model complexity into account. We propose a measure of the hardware efficiency of a neural architecture search space, the matrix efficiency measure (MEM); a search space comprising hardware-efficient operations; and a latency-aware scaling method. We show the advantage of the designed architectures for NPU devices on ImageNet and their generalization ability on downstream classification and detection tasks.
arXiv Detail & Related papers (2021-09-04T20:57:05Z)
FLASH: Fast Neural Architecture Search with Hardware Optimization [7.263481020106725]
Neural architecture search (NAS) is a promising technique to design efficient and high-performance deep neural networks (DNNs). This paper proposes FLASH, a very fast NAS methodology that co-optimizes the DNN accuracy and performance on a real hardware platform.
arXiv Detail & Related papers (2021-08-01T23:46:48Z)
MS-RANAS: Multi-Scale Resource-Aware Neural Architecture Search [94.80212602202518]
We propose Multi-Scale Resource-Aware Neural Architecture Search (MS-RANAS). We employ a one-shot architecture search approach in order to obtain a reduced search cost. We achieve state-of-the-art results in terms of accuracy-speed trade-off.
arXiv Detail & Related papers (2020-09-29T11:56:01Z)
PatDNN: Achieving Real-Time DNN Execution on Mobile Devices with Pattern-based Weight Pruning [57.20262984116752]
We introduce a new dimension, fine-grained pruning patterns inside the coarse-grained structures, revealing a previously unknown point in design space. With the higher accuracy enabled by fine-grained pruning patterns, the unique insight is to use the compiler to re-gain and guarantee high hardware efficiency.
arXiv Detail & Related papers (2020-01-01T04:52:07Z)

This list is automatically generated from the titles and abstracts of the papers on this site.