EH-DNAS: End-to-End Hardware-aware Differentiable Neural Architecture Search
- URL: http://arxiv.org/abs/2111.12299v1
- Date: Wed, 24 Nov 2021 06:45:30 GMT
- Title: EH-DNAS: End-to-End Hardware-aware Differentiable Neural Architecture Search
- Authors: Qian Jiang, Xiaofan Zhang, Deming Chen, Minh N. Do, Raymond A. Yeh
- Abstract summary: We propose End-to-end Hardware-aware DNAS (EH-DNAS) to deliver hardware-efficient deep neural networks on various platforms.
EH-DNAS improves the hardware performance by an average of $1.4\times$ on customized accelerators and $1.6\times$ on existing hardware processors.
- Score: 32.23992012207146
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In hardware-aware Differentiable Neural Architecture Search (DNAS), it is challenging to compute gradients of hardware metrics to perform architecture search. Existing works rely on linear approximations with limited support for customized hardware accelerators. In this work, we propose End-to-end Hardware-aware DNAS (EH-DNAS), a seamless integration of end-to-end hardware benchmarking and fully automated DNAS to deliver hardware-efficient deep neural networks on various platforms, including Edge GPUs, Edge TPUs, Mobile CPUs, and customized accelerators. Given a desired hardware platform, we propose to learn a differentiable model that predicts the end-to-end hardware performance of neural network architectures for DNAS. We also introduce E2E-Perf, an end-to-end hardware benchmarking tool for customized accelerators. Experiments on CIFAR10 and ImageNet show that EH-DNAS improves hardware performance by an average of $1.4\times$ on customized accelerators and $1.6\times$ on existing hardware processors while maintaining classification accuracy.
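The key mechanism, a learned differentiable model of end-to-end hardware performance folded into the DNAS objective, can be illustrated with a minimal PyTorch sketch. The MLP predictor, the softmax relaxation of the architecture parameters, and the trade-off weight `lambda_hw` below are illustrative assumptions, not the authors' implementation.

```python
import torch
import torch.nn as nn

class LatencyPredictor(nn.Module):
    """Differentiable surrogate mapping relaxed architecture choices to a
    predicted end-to-end latency (standing in for E2E-Perf measurements)."""
    def __init__(self, num_edges: int, num_ops: int, hidden: int = 64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(num_edges * num_ops, hidden),
            nn.ReLU(),
            nn.Linear(hidden, 1),
        )

    def forward(self, alpha: torch.Tensor) -> torch.Tensor:
        # alpha: (num_edges, num_ops) logits; softmax gives a soft operator
        # selection, so the latency estimate stays differentiable in alpha.
        probs = torch.softmax(alpha, dim=-1)
        return self.net(probs.flatten()).squeeze()

num_edges, num_ops = 14, 8                # illustrative supernet size
alpha = nn.Parameter(torch.zeros(num_edges, num_ops))
predictor = LatencyPredictor(num_edges, num_ops)
# The predictor would first be fit on (architecture, measured latency)
# pairs collected from the target platform, e.g. with a tool like E2E-Perf.

task_loss = torch.tensor(1.0)             # placeholder for the usual CE loss
lambda_hw = 0.1                           # trade-off weight (assumed)
loss = task_loss + lambda_hw * predictor(alpha)
loss.backward()                           # hardware gradients reach alpha
print(alpha.grad.shape)                   # torch.Size([14, 8])
```

Because the surrogate is differentiable in the architecture parameters, one backward pass yields hardware-aware gradients alongside the task loss, which is what lets the search run end to end.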
Related papers
- SCAN-Edge: Finding MobileNet-speed Hybrid Networks for Diverse Edge Devices via Hardware-Aware Evolutionary Search [10.48978386515933]
We propose a unified NAS framework that searches over self-attention, convolution, and activation operators to accommodate the wide variety of edge devices.
SCAN-Edge relies on a hardware-aware evolutionary algorithm that improves the quality of the search space to accelerate the sampling process.
arXiv Detail & Related papers (2024-08-27T20:39:09Z)
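A hardware-aware evolutionary loop of the kind SCAN-Edge describes might look roughly like the sketch below; the operator set, mutation scheme, and the `evaluate_accuracy`/`measure_latency` stubs are illustrative assumptions rather than the paper's interface.

```python
import random

OPS = ["conv3x3", "conv5x5", "self_attention", "relu", "hswish"]

def random_arch(depth=10):
    return [random.choice(OPS) for _ in range(depth)]

def mutate(arch, rate=0.2):
    return [random.choice(OPS) if random.random() < rate else op for op in arch]

def evaluate_accuracy(arch):
    return random.random()        # stub: would score a trained subnetwork

def measure_latency(arch):
    # stub: would query the target edge device; attention is costlier here
    return sum(2.0 if op == "self_attention" else 1.0 for op in arch)

def evolutionary_search(budget_ms, pop_size=20, generations=30):
    pop = [random_arch() for _ in range(pop_size)]
    for _ in range(generations):
        # hardware awareness: discard candidates over the latency budget
        feasible = [a for a in pop if measure_latency(a) <= budget_ms]
        if not feasible:
            pop = [random_arch() for _ in range(pop_size)]
            continue
        parents = sorted(feasible, key=evaluate_accuracy, reverse=True)
        parents = parents[: max(2, pop_size // 4)]
        pop = parents + [mutate(random.choice(parents))
                         for _ in range(pop_size - len(parents))]
    return max(pop, key=evaluate_accuracy)

print(evolutionary_search(budget_ms=12.0))
```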
- Multi-objective Differentiable Neural Architecture Search [58.67218773054753]
We propose a novel NAS algorithm that encodes user preferences for the trade-off between performance and hardware metrics.
Our method outperforms existing multi-objective optimization (MOO) NAS methods across a broad range of qualitatively different search spaces and datasets.
arXiv Detail & Related papers (2024-02-28T10:09:04Z)
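One simple way to encode such a user preference is to scalarize the two objectives. The weighted sum below is an assumed scheme for illustration; the paper's actual preference encoding may differ.

```python
import torch

def scalarized_loss(task_loss: torch.Tensor,
                    hw_cost: torch.Tensor,
                    preference: float) -> torch.Tensor:
    """preference in [0, 1]: 0 = accuracy only, 1 = hardware cost only.
    Both terms stay differentiable, so one gradient step serves both."""
    return (1.0 - preference) * task_loss + preference * hw_cost

# Sweeping the preference traces out an approximate Pareto front:
for pref in (0.1, 0.5, 0.9):
    loss = scalarized_loss(torch.tensor(2.3), torch.tensor(0.8), pref)
    print(f"preference={pref}: loss={loss.item():.3f}")
```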
- Using the Abstract Computer Architecture Description Language to Model AI Hardware Accelerators [77.89070422157178]
Manufacturers of AI-integrated products face a critical challenge: selecting an accelerator that aligns with their product's performance requirements.
The Abstract Computer Architecture Description Language (ACADL) is a concise formalization of computer architecture block diagrams.
In this paper, we demonstrate how to use the ACADL to model AI hardware accelerators, use their ACADL description to map DNNs onto them, and explain the timing simulation semantics to gather performance results.
arXiv Detail & Related papers (2024-01-30T19:27:16Z)
- Search-time Efficient Device Constraints-Aware Neural Architecture Search [6.527454079441765]
Deep learning techniques like computer vision and natural language processing can be computationally expensive and memory-intensive.
We automate the construction of task-specific deep learning architectures optimized for device constraints through Neural Architecture Search (NAS).
We present DCA-NAS, a principled method of fast neural network architecture search that incorporates edge-device constraints.
arXiv Detail & Related papers (2023-07-10T09:52:28Z)
- MAPLE-X: Latency Prediction with Explicit Microprocessor Prior Knowledge [87.41163540910854]
Deep neural network (DNN) latency characterization is a time-consuming process.
We propose MAPLE-X which extends MAPLE by incorporating explicit prior knowledge of hardware devices and DNN architecture latency.
arXiv Detail & Related papers (2022-05-25T11:08:20Z)
- MAPLE-Edge: A Runtime Latency Predictor for Edge Devices [80.01591186546793]
We propose MAPLE-Edge, an edge device-oriented extension of MAPLE, the state-of-the-art latency predictor for general purpose hardware.
Compared to MAPLE, MAPLE-Edge can describe the runtime and target device platform using a much smaller set of CPU performance counters.
We also demonstrate that, unlike MAPLE, which performs best when trained on a pool of devices sharing a common runtime, MAPLE-Edge can effectively generalize across runtimes.
arXiv Detail & Related papers (2022-04-27T14:00:48Z)
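Counter-based latency prediction might be used roughly as in the sketch below, which regresses latency from concatenated architecture features and hypothetical counter readings on synthetic data; the counter names and the random-forest regressor are our assumptions, not the paper's exact setup.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

# Hypothetical device descriptor: a small set of CPU performance counters.
COUNTERS = ["instructions", "cache_misses", "branch_misses", "cpu_cycles"]

rng = np.random.default_rng(0)
n = 200
arch_feats = rng.random((n, 16))              # e.g. op/channel encodings
dev_feats = rng.random((n, len(COUNTERS)))    # counter readings per device
X = np.hstack([arch_feats, dev_feats])
y = X @ rng.random(X.shape[1]) + rng.normal(0, 0.05, n)  # synthetic latency

model = RandomForestRegressor(n_estimators=100, random_state=0)
model.fit(X[:150], y[:150])
print("held-out R^2:", model.score(X[150:], y[150:]))
```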
- FPGA-optimized Hardware acceleration for Spiking Neural Networks [69.49429223251178]
This work presents the development of a hardware accelerator for an SNN, with off-line training, applied to an image recognition task.
The design targets a Xilinx Artix-7 FPGA, using around 40% of the available hardware resources in total.
It reduces classification time by three orders of magnitude compared to its full-precision software counterpart, with a small 4.5% impact on accuracy.
arXiv Detail & Related papers (2022-01-18T13:59:22Z)
- MAPLE: Microprocessor A Priori for Latency Estimation [81.91509153539566]
Modern deep neural networks must demonstrate state-of-the-art accuracy while exhibiting low latency and energy consumption.
Measuring the latency of every evaluated architecture adds a significant amount of time to the NAS process.
We propose Microprocessor A Priori for Latency Estimation (MAPLE), which does not rely on transfer learning or domain adaptation.
arXiv Detail & Related papers (2021-11-30T03:52:15Z)
- HSCoNAS: Hardware-Software Co-Design of Efficient DNNs via Neural Architecture Search [6.522258468923919]
We present a novel hardware-aware neural architecture search (NAS) framework, namely HSCoNAS, to automate the design of deep neural networks (DNNs).
To accomplish this goal, we first propose an effective hardware performance modeling method to approximate the runtime latency of DNNs on target hardware.
We also propose two novel techniques: dynamic channel scaling, which maximizes accuracy under a specified latency, and progressive space shrinking, which refines the search space towards the target hardware.
arXiv Detail & Related papers (2021-03-11T12:21:21Z)
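Progressive space shrinking can be pictured with a small sketch: periodically keep only the top-scoring operators on each edge so the search narrows towards hardware-friendly candidates. The shrinking schedule and the top-k criterion here are assumptions for illustration.

```python
import torch

def shrink_search_space(alpha: torch.Tensor,
                        active: torch.Tensor,
                        keep_per_edge: int) -> torch.Tensor:
    """Keep the top-k operators per edge.
    alpha:  (num_edges, num_ops) architecture logits
    active: (num_edges, num_ops) boolean mask of still-active operators
    """
    masked = alpha.masked_fill(~active, float("-inf"))
    topk = masked.topk(keep_per_edge, dim=-1).indices
    new_active = torch.zeros_like(active)
    new_active.scatter_(-1, topk, True)
    return new_active

alpha = torch.randn(6, 8)                         # 6 edges, 8 operators
active = torch.ones(6, 8, dtype=torch.bool)
for keep in (4, 2):                               # 8 -> 4 -> 2 per edge
    active = shrink_search_space(alpha, active, keep)
    print("ops remaining per edge:", active.sum(dim=-1).tolist())
```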
- Rethinking Co-design of Neural Architectures and Hardware Accelerators [31.342964958282092]
We systematically study the importance and strategies of co-designing neural architectures and hardware accelerators.
Our experiments show that the joint search method consistently outperforms previous platform-aware neural architecture search.
Our method can reduce energy consumption of an edge accelerator by up to 2x under the same accuracy constraint.
arXiv Detail & Related papers (2021-02-17T07:55:58Z)
- DANCE: Differentiable Accelerator/Network Co-Exploration [8.540518473228078]
This work presents a differentiable approach towards the co-exploration of the hardware accelerator and network architecture design.
By modeling the hardware evaluation software with a neural network, DANCE makes the relation between the accelerator architecture and the hardware metrics differentiable.
Compared to naive existing approaches, our method performs co-exploration in significantly less time while achieving superior accuracy and hardware cost metrics.
arXiv Detail & Related papers (2020-09-14T07:43:27Z)
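DANCE's central move, substituting a neural surrogate for the hardware evaluation software so that accelerator and network parameters can be optimized jointly by gradient descent, can be sketched as follows. The surrogate shape, both parameterizations, and the loss weights are all illustrative assumptions.

```python
import torch
import torch.nn as nn

# Neural surrogate standing in for the hardware evaluation software:
# (accelerator config, architecture encoding) -> (latency, energy).
surrogate = nn.Sequential(nn.Linear(4 + 8, 32), nn.ReLU(), nn.Linear(32, 2))
for p in surrogate.parameters():
    p.requires_grad_(False)      # assume it was already fit to tool outputs

hw_params = nn.Parameter(torch.rand(4))    # e.g. PE array / buffer sizing
arch_params = nn.Parameter(torch.rand(8))  # relaxed operator choices

opt = torch.optim.Adam([hw_params, arch_params], lr=1e-2)
for step in range(100):
    latency, energy = surrogate(torch.cat([hw_params, arch_params]))
    task_loss = torch.tensor(1.0)          # placeholder for accuracy loss
    loss = task_loss + 0.1 * latency + 0.05 * energy
    opt.zero_grad()
    loss.backward()                        # gradients reach both designs
    opt.step()
print(hw_params.detach(), arch_params.detach())
```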