Automating Neural Architecture Design without Search
- URL: http://arxiv.org/abs/2204.11838v1
- Date: Thu, 21 Apr 2022 14:41:05 GMT
- Title: Automating Neural Architecture Design without Search
- Authors: Zixuan Liang, Yanan Sun
- Abstract summary: We study automated architecture design from a new perspective that eliminates the need to sequentially evaluate each neural architecture generated during algorithm execution.
We implement the proposed approach using a graph neural network for link prediction and acquire the knowledge from NAS-Bench-101.
In addition, we also utilized the learned knowledge from NAS-Bench-101 to automate architecture design in the DARTS search space, and achieved 97.82% accuracy on CIFAR10 and 76.51% top-1 accuracy on ImageNet, consuming only $2\times10^{-4}$ GPU days.
- Score: 3.651848964235307
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Neural architecture search (NAS), as the mainstream approach to automating deep
neural architecture design, has achieved much success in recent years. However,
the performance estimation component inherent to NAS is often prohibitively
costly, which leads to enormous computational demand. Although many efforts have
been dedicated to alleviating this pain point, no consensus has yet been reached
on which approach is optimal. In this paper, we study automated architecture
design from a new perspective that eliminates the need to sequentially evaluate
each neural architecture generated during algorithm execution. Specifically, the
proposed approach learns the knowledge of high-level experts in designing
state-of-the-art architectures, and a new architecture is then generated
directly from the learned knowledge.
We implemented the proposed approach using a graph neural network for link
prediction and acquired the knowledge from NAS-Bench-101. Compared with existing
peer competitors, the proposed approach found a competitive network at minimal
cost. In addition, we utilized the knowledge learned from NAS-Bench-101 to
automate architecture design in the DARTS search space, achieving 97.82%
accuracy on CIFAR10 and 76.51% top-1 accuracy on ImageNet while consuming only
$2\times10^{-4}$ GPU days. This also demonstrates the high transferability of
the proposed approach, which can potentially lead to a new, more computationally
efficient paradigm in this research direction.
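The abstract describes the method only at a high level, so the following is a minimal, hedged sketch rather than the authors' implementation: a vanilla GCN-style encoder over a NAS-Bench-101 cell graph (adjacency matrix plus one-hot operation labels) and a dot-product scorer that rates candidate edges. The class and function names (CellLinkPredictor, score_edge) and the greedy generation note are illustrative assumptions.

```python
# Minimal sketch (not the authors' exact model): a GCN-style encoder over a
# NAS-Bench-101 cell graph plus a dot-product link scorer. All names here are
# illustrative assumptions.
import torch
import torch.nn as nn

NUM_OPS = 5      # NAS-Bench-101 op vocabulary: input, output, conv3x3, conv1x1, maxpool
MAX_NODES = 7    # NAS-Bench-101 cells have at most 7 nodes

class CellLinkPredictor(nn.Module):
    def __init__(self, hidden=64):
        super().__init__()
        self.embed = nn.Linear(NUM_OPS, hidden)
        self.gc1 = nn.Linear(hidden, hidden)
        self.gc2 = nn.Linear(hidden, hidden)

    def encode(self, adj, ops):
        # adj: (N, N) 0/1 adjacency of the cell DAG; ops: (N, NUM_OPS) float one-hot labels.
        # Symmetrize and normalize A + I as in a vanilla GCN layer (treating the
        # DAG as undirected is a simplification for this sketch).
        a = adj + adj.t() + torch.eye(adj.size(0))
        d_inv_sqrt = torch.diag(a.sum(dim=1).pow(-0.5))
        a_norm = d_inv_sqrt @ a @ d_inv_sqrt
        h = torch.relu(self.gc1(a_norm @ self.embed(ops)))
        return a_norm @ self.gc2(h)          # (N, hidden) node embeddings

    def score_edge(self, adj, ops, src, dst):
        # Score for the hypothesis that edge src -> dst belongs to a well-performing cell.
        z = self.encode(adj, ops)
        return torch.sigmoid((z[src] * z[dst]).sum())

# Toy usage on a 7-node cell with a small DAG and random op labels.
adj = torch.zeros(MAX_NODES, MAX_NODES)
adj[0, 1] = adj[1, 2] = adj[2, 6] = 1.0
ops = nn.functional.one_hot(torch.randint(NUM_OPS, (MAX_NODES,)), NUM_OPS).float()
model = CellLinkPredictor()
print(model.score_edge(adj, ops, src=1, dst=6))
```

In this reading, "eliminating sequential evaluation" means the link predictor is trained once on edges of well-performing NAS-Bench-101 cells (with sampled non-edges as negatives), and a new cell is then assembled from the highest-scoring candidate edges without training any candidate network.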
Related papers
- DNA Family: Boosting Weight-Sharing NAS with Block-Wise Supervisions [121.05720140641189]
We develop a family of models with the distilling neural architecture (DNA) techniques.
Our proposed DNA models can rate all architecture candidates, as opposed to previous works that can only access a sub-search space using algorithms.
Our models achieve state-of-the-art top-1 accuracy of 78.9% and 83.6% on ImageNet for a mobile convolutional network and a small vision transformer, respectively.
arXiv Detail & Related papers (2024-03-02T22:16:47Z) - A General-Purpose Transferable Predictor for Neural Architecture Search [22.883809911265445]
We propose a general-purpose neural predictor for Neural Architecture Search (NAS) that can transfer across search spaces.
Experimental results on NAS-Bench-101, 201 and 301 demonstrate the efficacy of our scheme.
arXiv Detail & Related papers (2023-02-21T17:28:05Z) - Neural Architecture Search: Insights from 1000 Papers [50.27255667347091]
We provide an organized and comprehensive guide to neural architecture search.
We give a taxonomy of search spaces, algorithms, and speedup techniques.
We discuss resources such as benchmarks, best practices, other surveys, and open-source libraries.
arXiv Detail & Related papers (2023-01-20T18:47:24Z) - NAAP-440 Dataset and Baseline for Neural Architecture Accuracy
Prediction [1.2183405753834562]
We introduce the NAAP-440 dataset of 440 neural architectures, which were trained on CIFAR10 using a fixed recipe.
Experiments indicate that, by using off-the-shelf regression algorithms and running up to 10% of the training process, it is possible to predict an architecture's accuracy rather precisely.
This approach may serve as a powerful tool for accelerating NAS-based studies and thus dramatically increase their efficiency; a minimal regression sketch appears after this related-papers list.
arXiv Detail & Related papers (2022-09-14T13:21:39Z) - BossNAS: Exploring Hybrid CNN-transformers with Block-wisely
Self-supervised Neural Architecture Search [100.28980854978768]
We present Block-wisely Self-supervised Neural Architecture Search (BossNAS).
We factorize the search space into blocks and utilize a novel self-supervised training scheme, named ensemble bootstrapping, to train each block separately.
We also present HyTra search space, a fabric-like hybrid CNN-transformer search space with searchable down-sampling positions.
arXiv Detail & Related papers (2021-03-23T10:05:58Z) - Neural Architecture Performance Prediction Using Graph Neural Networks [17.224223176258334]
We propose a surrogate model for neural architecture performance prediction built upon Graph Neural Networks (GNN).
We demonstrate the effectiveness of this surrogate model on neural architecture performance prediction for structurally unknown architectures.
arXiv Detail & Related papers (2020-10-19T09:33:57Z) - MS-RANAS: Multi-Scale Resource-Aware Neural Architecture Search [94.80212602202518]
We propose Multi-Scale Resource-Aware Neural Architecture Search (MS-RANAS).
We employ a one-shot architecture search approach in order to obtain a reduced search cost.
We achieve state-of-the-art results in terms of accuracy-speed trade-off.
arXiv Detail & Related papers (2020-09-29T11:56:01Z) - A Semi-Supervised Assessor of Neural Architectures [157.76189339451565]
We employ an auto-encoder to discover meaningful representations of neural architectures.
A graph convolutional neural network is introduced to predict the performance of architectures.
arXiv Detail & Related papers (2020-05-14T09:02:33Z) - Stage-Wise Neural Architecture Search [65.03109178056937]
Modern convolutional networks such as ResNet and NASNet have achieved state-of-the-art results in many computer vision applications.
These networks consist of stages, which are sets of layers that operate on representations in the same resolution.
It has been demonstrated that increasing the number of layers in each stage improves the prediction ability of the network.
However, the resulting architecture becomes computationally expensive in terms of floating point operations, memory requirements and inference time.
arXiv Detail & Related papers (2020-04-23T14:16:39Z)
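As referenced in the NAAP-440 entry above, here is a hedged sketch of that style of accuracy prediction: an off-the-shelf regressor fitted on early-training signals to estimate final accuracy. The feature layout (three early-epoch accuracies plus a normalized parameter count) and the synthetic data are illustrative assumptions, not the dataset's official baseline.

```python
# Hedged sketch of regression-based accuracy prediction: fit an off-the-shelf
# regressor on early-training signals to predict final accuracy. The features
# and targets below are synthetic stand-ins for illustration only.
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_absolute_error

rng = np.random.default_rng(0)

# Stand-in data: for each of 440 architectures, accuracies after the first
# 3 epochs (roughly 10% of a 30-epoch recipe) and a normalized parameter count.
n_archs = 440
early_acc = np.sort(rng.uniform(0.3, 0.8, size=(n_archs, 3)), axis=1)
param_count = rng.uniform(0.0, 1.0, size=(n_archs, 1))
X = np.hstack([early_acc, param_count])
# Synthetic target: final accuracy correlates with the early learning curve.
y = 0.15 + 0.9 * early_acc[:, -1] + 0.02 * rng.standard_normal(n_archs)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, random_state=0)
reg = GradientBoostingRegressor(random_state=0).fit(X_tr, y_tr)
print("MAE on held-out architectures:", mean_absolute_error(y_te, reg.predict(X_te)))
```

With real learning-curve features in place of the synthetic ones, the same few lines give a ranking signal over candidate architectures at a small fraction of full-training cost.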
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this list (including all information) and is not responsible for any consequences.