EH-DNAS: End-to-End Hardware-aware Differentiable Neural Architecture Search
- URL: http://arxiv.org/abs/2111.12299v1
- Date: Wed, 24 Nov 2021 06:45:30 GMT
- Title: EH-DNAS: End-to-End Hardware-aware Differentiable Neural Architecture Search
- Authors: Qian Jiang, Xiaofan Zhang, Deming Chen, Minh N. Do, Raymond A. Yeh
- Abstract summary: We propose End-to-end Hardware-aware DNAS (EH-DNAS) to deliver hardware-efficient deep neural networks on various platforms.
EH-DNAS improves the hardware performance by an average of $1.4\times$ on customized accelerators and $1.6\times$ on existing hardware processors.
- Score: 32.23992012207146
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In hardware-aware Differentiable Neural Architecture Search (DNAS), it is challenging to compute gradients of hardware metrics to perform architecture search. Existing works rely on linear approximations with limited support for customized hardware accelerators. In this work, we propose End-to-end Hardware-aware DNAS (EH-DNAS), a seamless integration of end-to-end hardware benchmarking and fully automated DNAS to deliver hardware-efficient deep neural networks on various platforms, including Edge GPUs, Edge TPUs, Mobile CPUs, and customized accelerators. Given a desired hardware platform, we propose to learn a differentiable model that predicts the end-to-end hardware performance of neural network architectures for DNAS. We also introduce E2E-Perf, an end-to-end hardware benchmarking tool for customized accelerators. Experiments on CIFAR10 and ImageNet show that EH-DNAS improves hardware performance by an average of $1.4\times$ on customized accelerators and $1.6\times$ on existing hardware processors while maintaining classification accuracy.
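The key mechanism, a learned differentiable model of end-to-end hardware performance folded into the DNAS objective, can be illustrated with a minimal PyTorch sketch. The MLP predictor, the softmax relaxation of the architecture parameters, and the trade-off weight `lambda_hw` below are illustrative assumptions, not the authors' implementation.

```python
import torch
import torch.nn as nn

class LatencyPredictor(nn.Module):
    """Differentiable surrogate mapping relaxed architecture choices to a
    predicted end-to-end latency (standing in for E2E-Perf measurements)."""
    def __init__(self, num_edges: int, num_ops: int, hidden: int = 64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(num_edges * num_ops, hidden),
            nn.ReLU(),
            nn.Linear(hidden, 1),
        )

    def forward(self, alpha: torch.Tensor) -> torch.Tensor:
        # alpha: (num_edges, num_ops) logits; softmax gives a soft operator
        # selection, so the latency estimate stays differentiable in alpha.
        probs = torch.softmax(alpha, dim=-1)
        return self.net(probs.flatten()).squeeze()

num_edges, num_ops = 14, 8                # illustrative supernet size
alpha = nn.Parameter(torch.zeros(num_edges, num_ops))
predictor = LatencyPredictor(num_edges, num_ops)
# The predictor would first be fit on (architecture, measured latency)
# pairs collected from the target platform, e.g. with a tool like E2E-Perf.

task_loss = torch.tensor(1.0)             # placeholder for the usual CE loss
lambda_hw = 0.1                           # trade-off weight (assumed)
loss = task_loss + lambda_hw * predictor(alpha)
loss.backward()                           # hardware gradients reach alpha
print(alpha.grad.shape)                   # torch.Size([14, 8])
```

Because the surrogate is differentiable in the architecture parameters, one backward pass yields hardware-aware gradients alongside the task loss, which is what lets the search run end to end.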
Related papers
- SCAN-Edge: Finding MobileNet-speed Hybrid Networks for Diverse Edge Devices via Hardware-Aware Evolutionary Search [10.48978386515933]
We propose a unified NAS framework that searches over self-attention, convolution, and activation operators to accommodate the wide variety of edge devices.
SCAN-Edge relies on a hardware-aware evolutionary algorithm that improves the quality of the search space to accelerate the sampling process.
arXiv Detail & Related papers (2024-08-27T20:39:09Z)
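A hardware-aware evolutionary loop of the kind SCAN-Edge describes might look roughly like the sketch below; the operator set, mutation scheme, and the `evaluate_accuracy`/`measure_latency` stubs are illustrative assumptions rather than the paper's interface.

```python
import random

OPS = ["conv3x3", "conv5x5", "self_attention", "relu", "hswish"]

def random_arch(depth=10):
    return [random.choice(OPS) for _ in range(depth)]

def mutate(arch, rate=0.2):
    return [random.choice(OPS) if random.random() < rate else op for op in arch]

def evaluate_accuracy(arch):
    return random.random()        # stub: would score a trained subnetwork

def measure_latency(arch):
    # stub: would query the target edge device; attention is costlier here
    return sum(2.0 if op == "self_attention" else 1.0 for op in arch)

def evolutionary_search(budget_ms, pop_size=20, generations=30):
    pop = [random_arch() for _ in range(pop_size)]
    for _ in range(generations):
        # hardware awareness: discard candidates over the latency budget
        feasible = [a for a in pop if measure_latency(a) <= budget_ms]
        if not feasible:
            pop = [random_arch() for _ in range(pop_size)]
            continue
        parents = sorted(feasible, key=evaluate_accuracy, reverse=True)
        parents = parents[: max(2, pop_size // 4)]
        pop = parents + [mutate(random.choice(parents))
                         for _ in range(pop_size - len(parents))]
    return max(pop, key=evaluate_accuracy)

print(evolutionary_search(budget_ms=12.0))
```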
- Multi-objective Differentiable Neural Architecture Search [58.67218773054753]
We propose a novel NAS algorithm that encodes user preferences for the trade-off between performance and hardware metrics.
Our method outperforms existing multi-objective optimization (MOO) NAS methods across a broad range of qualitatively different search spaces and datasets.
arXiv Detail & Related papers (2024-02-28T10:09:04Z)
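One simple way to encode such a user preference is to scalarize the two objectives. The weighted sum below is an assumed scheme for illustration; the paper's actual preference encoding may differ.

```python
import torch

def scalarized_loss(task_loss: torch.Tensor,
                    hw_cost: torch.Tensor,
                    preference: float) -> torch.Tensor:
    """preference in [0, 1]: 0 = accuracy only, 1 = hardware cost only.
    Both terms stay differentiable, so one gradient step serves both."""
    return (1.0 - preference) * task_loss + preference * hw_cost

# Sweeping the preference traces out an approximate Pareto front:
for pref in (0.1, 0.5, 0.9):
    loss = scalarized_loss(torch.tensor(2.3), torch.tensor(0.8), pref)
    print(f"preference={pref}: loss={loss.item():.3f}")
```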
- Using the Abstract Computer Architecture Description Language to Model AI Hardware Accelerators [77.89070422157178]
Manufacturers of AI-integrated products face a critical challenge: selecting an accelerator that aligns with their product's performance requirements.
The Abstract Computer Architecture Description Language (ACADL) is a concise formalization of computer architecture block diagrams.
In this paper, we demonstrate how to use the ACADL to model AI hardware accelerators, use their ACADL description to map DNNs onto them, and explain the timing simulation semantics to gather performance results.
arXiv Detail & Related papers (2024-01-30T19:27:16Z)
- Search-time Efficient Device Constraints-Aware Neural Architecture Search [6.527454079441765]
Deep learning techniques like computer vision and natural language processing can be computationally expensive and memory-intensive.
We automate the construction of task-specific deep learning architectures optimized for device constraints through Neural Architecture Search (NAS).
We present DCA-NAS, a principled method of fast neural network architecture search that incorporates edge-device constraints.
arXiv Detail & Related papers (2023-07-10T09:52:28Z)
- MAPLE-X: Latency Prediction with Explicit Microprocessor Prior Knowledge [87.41163540910854]
Deep neural network (DNN) latency characterization is a time-consuming process.
We propose MAPLE-X which extends MAPLE by incorporating explicit prior knowledge of hardware devices and DNN architecture latency.
arXiv Detail & Related papers (2022-05-25T11:08:20Z)
- MAPLE-Edge: A Runtime Latency Predictor for Edge Devices [80.01591186546793]
We propose MAPLE-Edge, an edge device-oriented extension of MAPLE, the state-of-the-art latency predictor for general purpose hardware.
Compared to MAPLE, MAPLE-Edge can describe the runtime and target device platform using a much smaller set of CPU performance counters.
We also demonstrate that, unlike MAPLE, which performs best when trained on a pool of devices sharing a common runtime, MAPLE-Edge can effectively generalize across runtimes.
arXiv Detail & Related papers (2022-04-27T14:00:48Z)
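Counter-based latency prediction might be used roughly as in the sketch below, which regresses latency from concatenated architecture features and hypothetical counter readings on synthetic data; the counter names and the random-forest regressor are our assumptions, not the paper's exact setup.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

# Hypothetical device descriptor: a small set of CPU performance counters.
COUNTERS = ["instructions", "cache_misses", "branch_misses", "cpu_cycles"]

rng = np.random.default_rng(0)
n = 200
arch_feats = rng.random((n, 16))              # e.g. op/channel encodings
dev_feats = rng.random((n, len(COUNTERS)))    # counter readings per device
X = np.hstack([arch_feats, dev_feats])
y = X @ rng.random(X.shape[1]) + rng.normal(0, 0.05, n)  # synthetic latency

model = RandomForestRegressor(n_estimators=100, random_state=0)
model.fit(X[:150], y[:150])
print("held-out R^2:", model.score(X[150:], y[150:]))
```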
- FPGA-optimized Hardware acceleration for Spiking Neural Networks [69.49429223251178]
This work presents the development of a hardware accelerator for an SNN, with off-line training, applied to an image recognition task.
The design targets a Xilinx Artix-7 FPGA, using around 40% of the available hardware resources in total.
It reduces classification time by three orders of magnitude compared to its full-precision software counterpart, with a small 4.5% impact on accuracy.
arXiv Detail & Related papers (2022-01-18T13:59:22Z)
- MAPLE: Microprocessor A Priori for Latency Estimation [81.91509153539566]
Modern deep neural networks must demonstrate state-of-the-art accuracy while exhibiting low latency and energy consumption.
Measuring the latency of every evaluated architecture adds a significant amount of time to the NAS process.
We propose Microprocessor A Priori for Latency Estimation (MAPLE), which does not rely on transfer learning or domain adaptation.
arXiv Detail & Related papers (2021-11-30T03:52:15Z)
- HSCoNAS: Hardware-Software Co-Design of Efficient DNNs via Neural Architecture Search [6.522258468923919]
We present a novel hardware-aware neural architecture search (NAS) framework, namely HSCoNAS, to automate the design of deep neural networks (DNNs).
To accomplish this goal, we first propose an effective hardware performance modeling method to approximate the runtime latency of DNNs on target hardware.
We also propose two novel techniques: dynamic channel scaling, which maximizes accuracy under a specified latency, and progressive space shrinking, which refines the search space towards the target hardware.
arXiv Detail & Related papers (2021-03-11T12:21:21Z)
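Progressive space shrinking can be pictured with a small sketch: periodically keep only the top-scoring operators on each edge so the search narrows towards hardware-friendly candidates. The shrinking schedule and the top-k criterion here are assumptions for illustration.

```python
import torch

def shrink_search_space(alpha: torch.Tensor,
                        active: torch.Tensor,
                        keep_per_edge: int) -> torch.Tensor:
    """Keep the top-k operators per edge.
    alpha:  (num_edges, num_ops) architecture logits
    active: (num_edges, num_ops) boolean mask of still-active operators
    """
    masked = alpha.masked_fill(~active, float("-inf"))
    topk = masked.topk(keep_per_edge, dim=-1).indices
    new_active = torch.zeros_like(active)
    new_active.scatter_(-1, topk, True)
    return new_active

alpha = torch.randn(6, 8)                         # 6 edges, 8 operators
active = torch.ones(6, 8, dtype=torch.bool)
for keep in (4, 2):                               # 8 -> 4 -> 2 per edge
    active = shrink_search_space(alpha, active, keep)
    print("ops remaining per edge:", active.sum(dim=-1).tolist())
```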
- Rethinking Co-design of Neural Architectures and Hardware Accelerators [31.342964958282092]
We systematically study the importance and strategies of co-designing neural architectures and hardware accelerators.
Our experiments show that the joint search method consistently outperforms previous platform-aware neural architecture search.
Our method can reduce energy consumption of an edge accelerator by up to 2x under the same accuracy constraint.
arXiv Detail & Related papers (2021-02-17T07:55:58Z)
- DANCE: Differentiable Accelerator/Network Co-Exploration [8.540518473228078]
This work presents a differentiable approach towards the co-exploration of the hardware accelerator and network architecture design.
By modeling the hardware evaluation software with a neural network, DANCE makes the relation between the accelerator architecture and the hardware metrics differentiable.
Compared to naive existing approaches, our method performs co-exploration in significantly less time while achieving superior accuracy and hardware cost metrics.
arXiv Detail & Related papers (2020-09-14T07:43:27Z)
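DANCE's central move, substituting a neural surrogate for the hardware evaluation software so that accelerator and network parameters can be optimized jointly by gradient descent, can be sketched as follows. The surrogate shape, both parameterizations, and the loss weights are all illustrative assumptions.

```python
import torch
import torch.nn as nn

# Neural surrogate standing in for the hardware evaluation software:
# (accelerator config, architecture encoding) -> (latency, energy).
surrogate = nn.Sequential(nn.Linear(4 + 8, 32), nn.ReLU(), nn.Linear(32, 2))
for p in surrogate.parameters():
    p.requires_grad_(False)      # assume it was already fit to tool outputs

hw_params = nn.Parameter(torch.rand(4))    # e.g. PE array / buffer sizing
arch_params = nn.Parameter(torch.rand(8))  # relaxed operator choices

opt = torch.optim.Adam([hw_params, arch_params], lr=1e-2)
for step in range(100):
    latency, energy = surrogate(torch.cat([hw_params, arch_params]))
    task_loss = torch.tensor(1.0)          # placeholder for accuracy loss
    loss = task_loss + 0.1 * latency + 0.05 * energy
    opt.zero_grad()
    loss.backward()                        # gradients reach both designs
    opt.step()
print(hw_params.detach(), arch_params.detach())
```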