Profiling Neural Blocks and Design Spaces for Mobile Neural Architecture Search
- URL: http://arxiv.org/abs/2109.12426v1
- Date: Sat, 25 Sep 2021 19:34:45 GMT
- Title: Profiling Neural Blocks and Design Spaces for Mobile Neural Architecture Search
- Authors: Keith G. Mills, Fred X. Han, Jialin Zhang, Seyed Saeed Changiz Rezaei,
Fabian Chudak, Wei Lu, Shuo Lian, Shangling Jui and Di Niu
- Abstract summary: We analyze the neural blocks used to build Once-for-All (MobileNetV3), ProxylessNAS and ResNet families.
We show that searching in the reduced search space generates better accuracy-latency frontiers than searching in the original search spaces.
- Score: 21.48915618572691
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Neural architecture search automates neural network design and has achieved
state-of-the-art results in many deep learning applications. While recent
literature has focused on designing networks to maximize accuracy, little work
has been conducted to understand the compatibility of architecture design
spaces to varying hardware. In this paper, we analyze the neural blocks used to
build Once-for-All (MobileNetV3), ProxylessNAS and ResNet families, in order to
understand their predictive power and inference latency on various devices,
including Huawei Kirin 9000 NPU, RTX 2080 Ti, AMD Threadripper 2990WX, and
Samsung Note10. We introduce a methodology to quantify the friendliness of
neural blocks to hardware and the impact of their placement in a macro network
on overall network performance via only end-to-end measurements. Based on
extensive profiling results, we derive design insights and apply them to
hardware-specific search space reduction. We show that searching in the reduced
search space generates better accuracy-latency Pareto frontiers than searching
in the original search spaces, customizing architecture search according to the
hardware. Moreover, insights derived from measurements lead to notably higher
ImageNet top-1 scores on all search spaces investigated.
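The core methodological idea, quantifying a block's hardware friendliness purely from end-to-end latency measurements and then pruning hardware-unfriendly blocks from the search space, can be illustrated with a small sketch. The example below is hypothetical and not the authors' code: the two candidate blocks, the single searchable slot, and the latency budget are illustrative assumptions; the point is that each candidate is evaluated by swapping it into a fixed macro network, timing whole-network inference, and keeping only candidates that fit the device budget.

```python
# Minimal sketch (not the paper's exact procedure): rank candidate blocks by the
# end-to-end latency of a macro network that contains them, then reduce the
# search space to blocks that fit a per-device latency budget.
# Block definitions, the single slot, and the budget are illustrative assumptions.
import time
import torch
import torch.nn as nn

def mbconv_block(c, expand=4):
    # Inverted-bottleneck style block (MobileNetV3-like search spaces).
    hidden = c * expand
    return nn.Sequential(
        nn.Conv2d(c, hidden, 1, bias=False), nn.BatchNorm2d(hidden), nn.ReLU(),
        nn.Conv2d(hidden, hidden, 3, padding=1, groups=hidden, bias=False),
        nn.BatchNorm2d(hidden), nn.ReLU(),
        nn.Conv2d(hidden, c, 1, bias=False), nn.BatchNorm2d(c),
    )

def plain_conv_block(c):
    # Plain 3x3 convolution block (ResNet-like search spaces).
    return nn.Sequential(nn.Conv2d(c, c, 3, padding=1, bias=False),
                         nn.BatchNorm2d(c), nn.ReLU())

CANDIDATE_BLOCKS = {"mbconv_e4": mbconv_block, "conv3x3": plain_conv_block}

def macro_network(block_fn, c=32):
    # Fixed stem and head; only the middle slot varies between candidates.
    return nn.Sequential(
        nn.Conv2d(3, c, 3, stride=2, padding=1), nn.ReLU(),          # stem
        block_fn(c),                                                  # searchable slot
        nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(c, 1000),    # head
    )

@torch.no_grad()
def end_to_end_latency_ms(model, x, warmup=5, runs=20):
    # Only whole-network forward passes are timed; per-block cost is inferred
    # from how the end-to-end number changes when the slot's block changes.
    model.eval()
    for _ in range(warmup):
        model(x)
    start = time.perf_counter()
    for _ in range(runs):
        model(x)
    return (time.perf_counter() - start) / runs * 1e3

if __name__ == "__main__":
    x = torch.randn(1, 3, 224, 224)
    budget_ms = 15.0  # hypothetical end-to-end budget for the target device
    kept = []
    for name, block_fn in CANDIDATE_BLOCKS.items():
        ms = end_to_end_latency_ms(macro_network(block_fn), x)
        print(f"{name}: {ms:.2f} ms end-to-end")
        if ms <= budget_ms:
            kept.append(name)
    print("reduced search space:", kept)
```

On a real target (Kirin 9000 NPU, RTX 2080 Ti, Threadripper 2990WX, Note10) the CPU timer would be replaced by the device's own measurement path, and the sweep would be repeated per slot position as well, since the abstract notes that a block's placement in the macro network also affects overall network performance.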
Related papers
- DNA Family: Boosting Weight-Sharing NAS with Block-Wise Supervisions [121.05720140641189]
We develop a family of models with the distilling neural architecture (DNA) techniques.
Our proposed DNA models can rate all architecture candidates, as opposed to previous works that can only access a sub-search space using heuristic algorithms.
Our models achieve state-of-the-art top-1 accuracy of 78.9% and 83.6% on ImageNet for a mobile convolutional network and a small vision transformer, respectively.
arXiv Detail & Related papers (2024-03-02T22:16:47Z) - DONNAv2 -- Lightweight Neural Architecture Search for Vision tasks [6.628409795264665]
We present the next-generation neural architecture design for computationally efficient neural architecture distillation - DONNAv2.
DONNAv2 reduces the computational cost of DONNA by 10x for the larger datasets.
To improve the quality of NAS search space, DONNAv2 leverages a block knowledge distillation filter to remove blocks with high inference costs.
arXiv Detail & Related papers (2023-09-26T04:48:50Z) - The Larger The Fairer? Small Neural Networks Can Achieve Fairness for
Edge Devices [16.159547410954602]
Fairness concerns gradually emerge in many applications, such as face recognition and mobile medical diagnosis.
This work proposes a novel Fairness- and Hardware-aware Neural architecture search framework, namely FaHaNa.
We show that FaHaNa can identify a series of neural networks with higher fairness and accuracy on a dermatology dataset.
arXiv Detail & Related papers (2022-02-23T05:26:22Z) - ISyNet: Convolutional Neural Networks design for AI accelerator [0.0]
Current state-of-the-art architectures are found with neural architecture search (NAS) taking model complexity into account.
We propose a measure of the hardware efficiency of a neural architecture search space, the matrix efficiency measure (MEM); a search space comprising hardware-efficient operations; and a latency-aware scaling method.
We show the advantage of the designed architectures for the NPU devices on ImageNet and the generalization ability for the downstream classification and detection tasks.
arXiv Detail & Related papers (2021-09-04T20:57:05Z) - Does Form Follow Function? An Empirical Exploration of the Impact of
Deep Neural Network Architecture Design on Hardware-Specific Acceleration [76.35307867016336]
This study investigates the impact of deep neural network architecture design on the degree of inference speedup.
We show that while leveraging hardware-specific acceleration achieved an average inference speed-up of 380%, the degree of inference speed-up varied drastically depending on the macro-architecture design pattern.
arXiv Detail & Related papers (2021-07-08T23:05:39Z) - BossNAS: Exploring Hybrid CNN-transformers with Block-wisely
Self-supervised Neural Architecture Search [100.28980854978768]
We present Block-wisely Self-supervised Neural Architecture Search (BossNAS).
We factorize the search space into blocks and utilize a novel self-supervised training scheme, named ensemble bootstrapping, to train each block separately.
We also present HyTra search space, a fabric-like hybrid CNN-transformer search space with searchable down-sampling positions.
arXiv Detail & Related papers (2021-03-23T10:05:58Z) - Distilling Optimal Neural Networks: Rapid Search in Diverse Spaces [16.920328058816338]
DONNA (Distilling Optimal Neural Network Architectures) is a novel pipeline for rapid neural architecture search and search space exploration.
In ImageNet classification, architectures found by DONNA are 20% faster than EfficientNet-B0 and MobileNetV2 on an Nvidia V100 GPU at similar accuracy, and 10% faster with 0.5% higher accuracy than MobileNetV2-1.4x on a Samsung S20 smartphone.
arXiv Detail & Related papers (2020-12-16T11:00:19Z) - AttendNets: Tiny Deep Image Recognition Neural Networks for the Edge via
Visual Attention Condensers [81.17461895644003]
We introduce AttendNets, low-precision, highly compact deep neural networks tailored for on-device image recognition.
AttendNets possess deep self-attention architectures based on visual attention condensers.
Results show AttendNets have significantly lower architectural and computational complexity when compared to several deep neural networks.
arXiv Detail & Related papers (2020-09-30T01:53:17Z) - MS-RANAS: Multi-Scale Resource-Aware Neural Architecture Search [94.80212602202518]
We propose Multi-Scale Resource-Aware Neural Architecture Search (MS-RANAS).
We employ a one-shot architecture search approach in order to obtain a reduced search cost.
We achieve state-of-the-art results in terms of accuracy-speed trade-off.
arXiv Detail & Related papers (2020-09-29T11:56:01Z) - NAS-Navigator: Visual Steering for Explainable One-Shot Deep Neural
Network Synthesis [53.106414896248246]
We present a framework that allows analysts to effectively build the solution sub-graph space and guide the network search by injecting their domain knowledge.
Applying this technique in an iterative manner allows analysts to converge to the best performing neural network architecture for a given application.
arXiv Detail & Related papers (2020-09-28T01:48:45Z)
This list is automatically generated from the titles and abstracts of the papers on this site.