Profiling Neural Blocks and Design Spaces for Mobile Neural Architecture Search
- URL: http://arxiv.org/abs/2109.12426v1
- Date: Sat, 25 Sep 2021 19:34:45 GMT
- Title: Profiling Neural Blocks and Design Spaces for Mobile Neural Architecture Search
- Authors: Keith G. Mills, Fred X. Han, Jialin Zhang, Seyed Saeed Changiz Rezaei,
Fabian Chudak, Wei Lu, Shuo Lian, Shangling Jui and Di Niu
- Abstract summary: We analyze the neural blocks used to build Once-for-All (MobileNetV3), ProxylessNAS and ResNet families.
We show that searching in the reduced search space generates better accuracy-latency frontiers than searching in the original search spaces.
- Score: 21.48915618572691
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Neural architecture search automates neural network design and has achieved
state-of-the-art results in many deep learning applications. While recent
literature has focused on designing networks to maximize accuracy, little work
has been conducted to understand the compatibility of architecture design
spaces to varying hardware. In this paper, we analyze the neural blocks used to
build Once-for-All (MobileNetV3), ProxylessNAS and ResNet families, in order to
understand their predictive power and inference latency on various devices,
including Huawei Kirin 9000 NPU, RTX 2080 Ti, AMD Threadripper 2990WX, and
Samsung Note10. We introduce a methodology to quantify the friendliness of
neural blocks to hardware and the impact of their placement in a macro network
on overall network performance via only end-to-end measurements. Based on
extensive profiling results, we derive design insights and apply them to
hardware-specific search space reduction. We show that searching in the reduced
search space generates better accuracy-latency Pareto frontiers than searching
in the original search spaces, customizing architecture search according to the
hardware. Moreover, insights derived from measurements lead to notably higher
ImageNet top-1 scores on all search spaces investigated.
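The core methodological idea, quantifying a block's hardware friendliness purely from end-to-end latency measurements and then pruning hardware-unfriendly blocks from the search space, can be illustrated with a small sketch. The example below is hypothetical and not the authors' code: the two candidate blocks, the single searchable slot, and the latency budget are illustrative assumptions; the point is that each candidate is evaluated by swapping it into a fixed macro network, timing whole-network inference, and keeping only candidates that fit the device budget.

```python
# Minimal sketch (not the paper's exact procedure): rank candidate blocks by the
# end-to-end latency of a macro network that contains them, then reduce the
# search space to blocks that fit a per-device latency budget.
# Block definitions, the single slot, and the budget are illustrative assumptions.
import time
import torch
import torch.nn as nn

def mbconv_block(c, expand=4):
    # Inverted-bottleneck style block (MobileNetV3-like search spaces).
    hidden = c * expand
    return nn.Sequential(
        nn.Conv2d(c, hidden, 1, bias=False), nn.BatchNorm2d(hidden), nn.ReLU(),
        nn.Conv2d(hidden, hidden, 3, padding=1, groups=hidden, bias=False),
        nn.BatchNorm2d(hidden), nn.ReLU(),
        nn.Conv2d(hidden, c, 1, bias=False), nn.BatchNorm2d(c),
    )

def plain_conv_block(c):
    # Plain 3x3 convolution block (ResNet-like search spaces).
    return nn.Sequential(nn.Conv2d(c, c, 3, padding=1, bias=False),
                         nn.BatchNorm2d(c), nn.ReLU())

CANDIDATE_BLOCKS = {"mbconv_e4": mbconv_block, "conv3x3": plain_conv_block}

def macro_network(block_fn, c=32):
    # Fixed stem and head; only the middle slot varies between candidates.
    return nn.Sequential(
        nn.Conv2d(3, c, 3, stride=2, padding=1), nn.ReLU(),          # stem
        block_fn(c),                                                  # searchable slot
        nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(c, 1000),    # head
    )

@torch.no_grad()
def end_to_end_latency_ms(model, x, warmup=5, runs=20):
    # Only whole-network forward passes are timed; per-block cost is inferred
    # from how the end-to-end number changes when the slot's block changes.
    model.eval()
    for _ in range(warmup):
        model(x)
    start = time.perf_counter()
    for _ in range(runs):
        model(x)
    return (time.perf_counter() - start) / runs * 1e3

if __name__ == "__main__":
    x = torch.randn(1, 3, 224, 224)
    budget_ms = 15.0  # hypothetical end-to-end budget for the target device
    kept = []
    for name, block_fn in CANDIDATE_BLOCKS.items():
        ms = end_to_end_latency_ms(macro_network(block_fn), x)
        print(f"{name}: {ms:.2f} ms end-to-end")
        if ms <= budget_ms:
            kept.append(name)
    print("reduced search space:", kept)
```

On a real target (Kirin 9000 NPU, RTX 2080 Ti, Threadripper 2990WX, Note10) the CPU timer would be replaced by the device's own measurement path, and the sweep would be repeated per slot position as well, since the abstract notes that a block's placement in the macro network also affects overall network performance.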
Related papers
- DNA Family: Boosting Weight-Sharing NAS with Block-Wise Supervisions [121.05720140641189]
We develop a family of models with the distilling neural architecture (DNA) techniques.
Our proposed DNA models can rate all architecture candidates, as opposed to previous works that can only access a sub-search space using heuristic algorithms.
Our models achieve state-of-the-art top-1 accuracy of 78.9% and 83.6% on ImageNet for a mobile convolutional network and a small vision transformer, respectively.
arXiv Detail & Related papers (2024-03-02T22:16:47Z) - DONNAv2 -- Lightweight Neural Architecture Search for Vision tasks [6.628409795264665]
We present the next-generation neural architecture design for computationally efficient neural architecture distillation - DONNAv2.
DONNAv2 reduces the computational cost of DONNA by 10x for the larger datasets.
To improve the quality of NAS search space, DONNAv2 leverages a block knowledge distillation filter to remove blocks with high inference costs.
arXiv Detail & Related papers (2023-09-26T04:48:50Z) - The Larger The Fairer? Small Neural Networks Can Achieve Fairness for
Edge Devices [16.159547410954602]
Fairness concerns gradually emerge in many applications, such as face recognition and mobile medical diagnosis.
This work proposes a novel Fairness- and Hardware-aware Neural architecture search framework, namely FaHaNa.
We show that FaHaNa can identify a series of neural networks with higher fairness and accuracy on a dermatology dataset.
arXiv Detail & Related papers (2022-02-23T05:26:22Z) - ISyNet: Convolutional Neural Networks design for AI accelerator [0.0]
Current state-of-the-art architectures are found with neural architecture search (NAS) taking model complexity into account.
We propose a measure of the hardware efficiency of a neural architecture search space, the matrix efficiency measure (MEM); a search space comprising hardware-efficient operations; and a latency-aware scaling method.
We show the advantage of the designed architectures for the NPU devices on ImageNet and the generalization ability for the downstream classification and detection tasks.
arXiv Detail & Related papers (2021-09-04T20:57:05Z) - Does Form Follow Function? An Empirical Exploration of the Impact of
Deep Neural Network Architecture Design on Hardware-Specific Acceleration [76.35307867016336]
This study investigates the impact of deep neural network architecture design on the degree of inference speedup.
We show that while leveraging hardware-specific acceleration achieved an average inference speed-up of 380%, the degree of inference speed-up varied drastically depending on the macro-architecture design pattern.
arXiv Detail & Related papers (2021-07-08T23:05:39Z) - BossNAS: Exploring Hybrid CNN-transformers with Block-wisely
Self-supervised Neural Architecture Search [100.28980854978768]
We present Block-wisely Self-supervised Neural Architecture Search (BossNAS).
We factorize the search space into blocks and utilize a novel self-supervised training scheme, named ensemble bootstrapping, to train each block separately.
We also present HyTra search space, a fabric-like hybrid CNN-transformer search space with searchable down-sampling positions.
arXiv Detail & Related papers (2021-03-23T10:05:58Z) - Distilling Optimal Neural Networks: Rapid Search in Diverse Spaces [16.920328058816338]
DONNA (Distilling Optimal Neural Network Architectures) is a novel pipeline for rapid neural architecture search and search space exploration.
In ImageNet classification, architectures found by DONNA are 20% faster than EfficientNet-B0 and MobileNetV2 on an Nvidia V100 GPU at similar accuracy, and 10% faster with 0.5% higher accuracy than MobileNetV2-1.4x on a Samsung S20 smartphone.
arXiv Detail & Related papers (2020-12-16T11:00:19Z) - AttendNets: Tiny Deep Image Recognition Neural Networks for the Edge via
Visual Attention Condensers [81.17461895644003]
We introduce AttendNets, low-precision, highly compact deep neural networks tailored for on-device image recognition.
AttendNets possess deep self-attention architectures based on visual attention condensers.
Results show AttendNets have significantly lower architectural and computational complexity when compared to several deep neural networks.
arXiv Detail & Related papers (2020-09-30T01:53:17Z) - MS-RANAS: Multi-Scale Resource-Aware Neural Architecture Search [94.80212602202518]
We propose Multi-Scale Resource-Aware Neural Architecture Search (MS-RANAS).
We employ a one-shot architecture search approach in order to obtain a reduced search cost.
We achieve state-of-the-art results in terms of accuracy-speed trade-off.
arXiv Detail & Related papers (2020-09-29T11:56:01Z) - NAS-Navigator: Visual Steering for Explainable One-Shot Deep Neural
Network Synthesis [53.106414896248246]
We present a framework that allows analysts to effectively build the solution sub-graph space and guide the network search by injecting their domain knowledge.
Applying this technique in an iterative manner allows analysts to converge to the best performing neural network architecture for a given application.
arXiv Detail & Related papers (2020-09-28T01:48:45Z)
This list is automatically generated from the titles and abstracts of the papers on this site.