U-Boost NAS: Utilization-Boosted Differentiable Neural Architecture
Search
- URL: http://arxiv.org/abs/2203.12412v1
- Date: Wed, 23 Mar 2022 13:44:15 GMT
- Title: U-Boost NAS: Utilization-Boosted Differentiable Neural Architecture
Search
- Authors: Ahmet Caner Yüzügüler, Nikolaos Dimitriadis, Pascal Frossard
- Abstract summary: Optimizing resource utilization in target platforms is key to achieving high performance during DNN inference.
We propose a novel hardware-aware NAS framework that optimizes not only for task accuracy and inference latency but also for resource utilization.
We achieve a 2.8-4x speedup for DNN inference compared to prior hardware-aware NAS methods.
- Score: 50.33956216274694
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Optimizing resource utilization in target platforms is key to achieving high
performance during DNN inference. While optimizations have been proposed for
inference latency, memory footprint, and energy consumption, prior
hardware-aware neural architecture search (NAS) methods have omitted resource
utilization, preventing DNNs from taking full advantage of the target inference
platforms. Modeling resource utilization efficiently and accurately is
challenging, especially for widely used array-based inference accelerators such
as the Google TPU. In this work, we propose a novel hardware-aware NAS framework
that optimizes not only for task accuracy and inference latency but also
for resource utilization. We also propose and validate a new computational
model for resource utilization in inference accelerators. Using the proposed
NAS framework and resource utilization model, we achieve a 2.8-4x
speedup for DNN inference compared to prior hardware-aware NAS methods while
attaining similar or improved accuracy in image classification on the CIFAR-10 and
ImageNet-100 datasets.
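To illustrate why utilization matters on array-based accelerators, here is a minimal Python sketch of spatial utilization when a matrix multiplication is tiled onto a rows x cols systolic array. This is a simplified illustration, not the paper's actual computational model: it assumes the full array is occupied for every tile, so layer dimensions that do not divide the array size evenly leave processing elements idle in the edge tiles.

```python
import math

def systolic_utilization(m, k, n, rows, cols):
    """Fraction of MAC capacity doing useful work when an (m x k) @ (k x n)
    matmul is tiled onto a rows x cols systolic array (simplified model:
    every tile occupies the whole array, edge tiles included)."""
    tiles = math.ceil(m / rows) * math.ceil(n / cols)
    useful_macs = m * k * n                  # MACs the layer actually needs
    provisioned = tiles * rows * cols * k    # MAC slots the array provides
    return useful_macs / provisioned

# A 100x100 output tiled onto a 128x128 (TPU-like) array idles ~39% of it:
print(systolic_utilization(100, 64, 100, 128, 128))  # 0.6103515625
```

Under this toy model, a NAS search that rounds layer widths toward multiples of the array dimensions recovers the idle fraction, which is the intuition behind optimizing utilization jointly with accuracy and latency.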
Related papers
- DCP-NAS: Discrepant Child-Parent Neural Architecture Search for 1-bit
CNNs [53.82853297675979]
1-bit convolutional neural networks (CNNs) with binary weights and activations show their potential for resource-limited embedded devices.
One natural approach is to use 1-bit CNNs to reduce the computation and memory cost of NAS.
We introduce Discrepant Child-Parent Neural Architecture Search (DCP-NAS) to efficiently search 1-bit CNNs.
arXiv Detail & Related papers (2023-06-27T11:28:29Z)
- DiffusionNAG: Predictor-guided Neural Architecture Generation with Diffusion Models [56.584561770857306]
We propose a novel conditional Neural Architecture Generation (NAG) framework based on diffusion models, dubbed DiffusionNAG.
Specifically, we consider the neural architectures as directed graphs and propose a graph diffusion model for generating them.
We validate the effectiveness of DiffusionNAG through extensive experiments in two predictor-based NAS scenarios: Transferable NAS and Bayesian Optimization (BO)-based NAS.
When integrated into a BO-based algorithm, DiffusionNAG outperforms existing BO-based NAS approaches, particularly in the large MobileNetV3 search space on the ImageNet 1K dataset.
arXiv Detail & Related papers (2023-05-26T13:58:18Z)
- Data Aware Neural Architecture Search [0.12891210250935145]
In machine learning, a single metric is not enough to evaluate a NN architecture.
Recent works on NAS for resource constrained systems have investigated various approaches to optimize for multiple metrics.
We name such a system "Data Aware NAS", and we provide experimental evidence of its benefits.
arXiv Detail & Related papers (2023-04-04T14:20:36Z)
- Lightweight Neural Architecture Search for Temporal Convolutional Networks at
  the Edge [21.72253397805102]
This work focuses in particular on Temporal Convolutional Networks (TCNs), a convolutional model for time-series processing.
We propose the first NAS tool that explicitly targets the optimization of the most peculiar architectural parameters of TCNs.
We test the proposed NAS on four real-world, edge-relevant tasks, involving audio and bio-signals.
arXiv Detail & Related papers (2023-01-24T19:47:40Z)
- MAPLE-X: Latency Prediction with Explicit Microprocessor Prior Knowledge [87.41163540910854]
Deep neural network (DNN) latency characterization is a time-consuming process.
We propose MAPLE-X which extends MAPLE by incorporating explicit prior knowledge of hardware devices and DNN architecture latency.
arXiv Detail & Related papers (2022-05-25T11:08:20Z)
- FLASH: Fast Neural Architecture Search with Hardware Optimization [7.263481020106725]
Neural architecture search (NAS) is a promising technique to design efficient and high-performance deep neural networks (DNNs).
This paper proposes FLASH, a very fast NAS methodology that co-optimizes the DNN accuracy and performance on a real hardware platform.
arXiv Detail & Related papers (2021-08-01T23:46:48Z)
- MS-RANAS: Multi-Scale Resource-Aware Neural Architecture Search [94.80212602202518]
We propose Multi-Scale Resource-Aware Neural Architecture Search (MS-RANAS).
We employ a one-shot architecture search approach in order to obtain a reduced search cost.
We achieve state-of-the-art results in terms of accuracy-speed trade-off.
arXiv Detail & Related papers (2020-09-29T11:56:01Z)
- Binarized Neural Architecture Search for Efficient Object Recognition [120.23378346337311]
Binarized neural architecture search (BNAS) produces extremely compressed models to reduce huge computational cost on embedded devices for edge computing.
An accuracy of 96.53% vs. 97.22% is achieved on the CIFAR-10 dataset, but with a significantly compressed model and a 40% faster search than the state-of-the-art PC-DARTS.
arXiv Detail & Related papers (2020-09-08T15:51:23Z)
- NASCaps: A Framework for Neural Architecture Search to Optimize the Accuracy
  and Hardware Efficiency of Convolutional Capsule Networks [10.946374356026679]
We propose NASCaps, an automated framework for the hardware-aware NAS of different types of Deep Neural Networks (DNNs).
We study the efficacy of deploying a multi-objective Genetic Algorithm (e.g., based on the NSGA-II algorithm).
Our framework is the first to model and support the specialized capsule layers and dynamic routing in the NAS flow.
arXiv Detail & Related papers (2020-08-19T14:29:36Z)
- BRP-NAS: Prediction-based NAS using GCNs [21.765796576990137]
BRP-NAS is an efficient hardware-aware NAS enabled by an accurate performance predictor based on a graph convolutional network (GCN).
We show that our proposed method outperforms all prior methods on NAS-Bench-101 and NAS-Bench-201.
We also release LatBench -- a latency dataset of NAS-Bench-201 models running on a broad range of devices.
arXiv Detail & Related papers (2020-07-16T21:58:43Z)
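The GCN-based predictor idea behind BRP-NAS can be sketched in plain Python: encode an architecture as a graph, propagate node features over normalized adjacency, pool, and read out a scalar latency estimate. The single convolution layer, feature sizes, and mean-pool readout below are illustrative assumptions, not BRP-NAS's actual architecture.

```python
import math

def matmul(a, b):
    """Plain-Python matrix product for lists of lists."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]

def gcn_latency_predictor(adj, feats, w1, w_out):
    """One graph-convolution layer + mean-pool readout (illustrative only).

    adj:   n x n 0/1 adjacency of the architecture graph (no self-loops)
    feats: n x f per-node operation features (e.g., one-hot op type)
    w1:    f x h layer weights; w_out: length-h readout weights
    """
    n = len(adj)
    # Add self-loops and symmetrically normalize: D^-1/2 (A + I) D^-1/2
    a = [[adj[i][j] + (1 if i == j else 0) for j in range(n)] for i in range(n)]
    deg = [sum(row) for row in a]
    a_hat = [[a[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)]
             for i in range(n)]
    # Propagate features over edges, then apply ReLU
    h = [[max(v, 0.0) for v in row] for row in matmul(matmul(a_hat, feats), w1)]
    # Mean-pool node embeddings into one graph embedding, then linear readout
    g = [sum(col) / n for col in zip(*h)]
    return sum(x * w for x, w in zip(g, w_out))

# Toy 3-node chain graph with 2-dim node features and hypothetical weights:
adj = [[0, 1, 0], [1, 0, 1], [0, 1, 0]]
feats = [[1, 0], [0, 1], [1, 0]]
w1 = [[0.5, -0.2], [0.1, 0.3]]
print(gcn_latency_predictor(adj, feats, w1, [1.0, 1.0]))
```

In the actual method, such a predictor is trained on measured latencies (e.g., the released LatBench dataset) and then queried during search instead of running the hardware.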
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.