AttentiveNAS: Improving Neural Architecture Search via Attentive
Sampling
- URL: http://arxiv.org/abs/2011.09011v2
- Date: Tue, 13 Apr 2021 19:17:16 GMT
- Title: AttentiveNAS: Improving Neural Architecture Search via Attentive
Sampling
- Authors: Dilin Wang, Meng Li, Chengyue Gong, Vikas Chandra
- Abstract summary: Two-stage Neural Architecture Search (NAS) achieves remarkable accuracy and efficiency, but it requires sampling from the search space during training, and this sampling directly impacts the accuracy of the final searched models.
We propose AttentiveNAS, which focuses on improving the sampling strategy to achieve a better performance Pareto front.
Our discovered model family, AttentiveNAS models, achieves top-1 accuracy from 77.3% to 80.7% on ImageNet, and outperforms SOTA models, including BigNAS and Once-for-All networks.
- Score: 39.58754758581108
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Neural architecture search (NAS) has shown great promise in designing
state-of-the-art (SOTA) models that are both accurate and efficient. Recently,
two-stage NAS, e.g. BigNAS, decouples the model training and searching process
and achieves remarkable search efficiency and accuracy. Two-stage NAS requires
sampling from the search space during training, which directly impacts the
accuracy of the final searched models. While uniform sampling has been widely
used for its simplicity, it is agnostic of the model performance Pareto front,
which is the main focus in the search process, and thus, misses opportunities
to further improve the model accuracy. In this work, we propose AttentiveNAS,
which focuses on improving the sampling strategy to achieve a better
performance Pareto front. We also propose algorithms to efficiently and
effectively identify the networks on the Pareto front during training. Without extra re-training or
post-processing, we can simultaneously obtain a large number of networks across
a wide range of FLOPs. Our discovered model family, AttentiveNAS models,
achieves top-1 accuracy from 77.3% to 80.7% on ImageNet, and outperforms SOTA
models, including BigNAS and Once-for-All networks. We also achieve ImageNet
accuracy of 80.1% with only 491 MFLOPs. Our training code and pretrained models
are available at https://github.com/facebookresearch/AttentiveNAS.
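The sampling idea admits a short illustration. The following is a minimal sketch, not the released implementation: it assumes a weight-sharing supernet exposing a hypothetical sample_subnet(flops_target) method and a cheap proxy_score callable (e.g., a pretrained accuracy predictor or the negated minibatch loss). Rather than drawing one subnetwork uniformly at random, each training step draws several candidates at a single FLOPs target and trains the one the proxy ranks best (or worst), concentrating training effort near the accuracy/FLOPs Pareto front.

```python
import random

def attentive_sample(supernet, flops_targets, proxy_score, k=5, mode="best"):
    """Pareto-focused sampling sketch (illustrative; not the released code).

    Draw k candidate subnetworks at one resource target and keep the one a
    cheap proxy ranks best (or worst), instead of a single uniform sample.
    proxy_score: higher = better (e.g., predicted accuracy or -minibatch loss).
    """
    target = random.choice(flops_targets)                             # pick a FLOPs level for this step
    candidates = [supernet.sample_subnet(target) for _ in range(k)]   # assumed supernet API
    ranked = sorted(candidates, key=proxy_score)                      # cheap proxy ranking
    return ranked[-1] if mode == "best" else ranked[0]

# Usage inside an ordinary weight-sharing training step (sketch):
#   subnet = attentive_sample(supernet, flops_targets, proxy_score, mode="best")
#   loss = criterion(subnet(images), labels); loss.backward(); optimizer.step()
```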
Related papers
- Multi-Objective Neural Architecture Search by Learning Search Space Partitions [8.4553113915588]
We implement a novel meta-algorithm called LaMOO on neural architecture search (NAS) tasks.
LaMOO speeds up the search process by learning a model from observed samples to partition the search space and then focusing on promising regions.
For real-world tasks, LaMOO achieves 97.36% accuracy with only 1.62M #Params on CIFAR10 using only 600 search samples.
arXiv Detail & Related papers (2024-06-01T03:51:34Z)
- Searching Efficient Model-guided Deep Network for Image Denoising [61.65776576769698]
We present a novel approach, MoD-NAS, that connects model-guided design with NAS.
MoD-NAS employs a highly reusable width search strategy and a densely connected search block to automatically select the operations of each layer.
Experimental results on several popular datasets show that our MoD-NAS has achieved even better PSNR performance than current state-of-the-art methods.
arXiv Detail & Related papers (2021-04-06T14:03:01Z)
- BossNAS: Exploring Hybrid CNN-transformers with Block-wisely Self-supervised Neural Architecture Search [100.28980854978768]
We present Block-wisely Self-supervised Neural Architecture Search (BossNAS).
We factorize the search space into blocks and utilize a novel self-supervised training scheme, named ensemble bootstrapping, to train each block separately.
We also present HyTra search space, a fabric-like hybrid CNN-transformer search space with searchable down-sampling positions.
arXiv Detail & Related papers (2021-03-23T10:05:58Z)
- PEng4NN: An Accurate Performance Estimation Engine for Efficient Automated Neural Network Architecture Search [0.0]
Neural network (NN) models are increasingly used in scientific simulations, AI, and other high performance computing fields.
NAS attempts to find well-performing NN models for specialized datasets, where performance is measured by key metrics that capture the NN capabilities.
We propose a performance estimation strategy that reduces the resources for training NNs and increases NAS throughput without jeopardizing accuracy.
arXiv Detail & Related papers (2021-01-11T20:49:55Z)
- PV-NAS: Practical Neural Architecture Search for Video Recognition [83.77236063613579]
Designing deep neural networks for video tasks is highly customized and requires domain experts and costly trial-and-error tests.
Recent advances in network architecture search have boosted image recognition performance by a large margin.
In this study, we propose a practical solution, namely Practical Video Neural Architecture Search (PV-NAS).
arXiv Detail & Related papers (2020-11-02T08:50:23Z)
- Progressive Automatic Design of Search Space for One-Shot Neural Architecture Search [15.017964136568061]
It has been observed that a model with higher one-shot model accuracy does not necessarily perform better when stand-alone trained.
We propose Progressive Automatic Design of search space, named PAD-NAS.
In this way, PAD-NAS can automatically design the operations for each layer and achieve a trade-off between search space quality and model diversity.
arXiv Detail & Related papers (2020-05-15T14:21:07Z)
- BigNAS: Scaling Up Neural Architecture Search with Big Single-Stage Models [59.95091850331499]
We propose BigNAS, an approach that challenges the conventional wisdom that post-processing of the weights is necessary to get good prediction accuracies.
Our discovered model family, BigNASModels, achieves top-1 accuracies ranging from 76.5% to 80.9%.
arXiv Detail & Related papers (2020-03-24T23:00:49Z)
- DDPNAS: Efficient Neural Architecture Search via Dynamic Distribution Pruning [135.27931587381596]
We propose an efficient and unified NAS framework termed DDPNAS via dynamic distribution pruning.
In particular, we first sample architectures from a joint categorical distribution. Then the search space is dynamically pruned and its distribution is updated every few epochs.
With the proposed efficient network generation method, we directly obtain the optimal neural architectures under given constraints (a minimal sketch of this sample-prune-update loop follows the list).
arXiv Detail & Related papers (2019-05-28T06:35:52Z)
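The DDPNAS entry above describes a sample-prune-update loop over a joint categorical distribution. The sketch below illustrates that loop under assumed interfaces (the evaluate(arch) callable stands in for whatever proxy evaluation the method uses, e.g., one-shot validation accuracy); it is illustrative only, not the authors' algorithm verbatim.

```python
import numpy as np

def ddpnas_sketch(num_layers, num_ops, evaluate,
                  epochs=30, samples_per_epoch=8, prune_every=5):
    """Dynamic distribution pruning, as a rough sketch (not the authors' code).

    Keep a categorical distribution over candidate operations for every layer,
    sample architectures from it, track a running reward per (layer, op), and
    every few epochs drop the weakest surviving op in each layer before
    re-estimating the distribution over the survivors.
    """
    probs = np.full((num_layers, num_ops), 1.0 / num_ops)   # joint categorical distribution
    reward_sum = np.zeros((num_layers, num_ops))
    reward_cnt = np.zeros((num_layers, num_ops))

    for epoch in range(1, epochs + 1):
        for _ in range(samples_per_epoch):
            arch = [np.random.choice(num_ops, p=probs[layer]) for layer in range(num_layers)]
            reward = evaluate(arch)                          # assumed proxy quality measure
            for layer, op in enumerate(arch):
                reward_sum[layer, op] += reward
                reward_cnt[layer, op] += 1

        if epoch % prune_every == 0:
            # Mean reward per (layer, op); unsampled ops default to 0 (a simplification).
            mean = np.divide(reward_sum, reward_cnt,
                             out=np.zeros_like(reward_sum), where=reward_cnt > 0)
            for layer in range(num_layers):
                alive = probs[layer] > 0
                if alive.sum() > 1:                          # always keep at least one op per layer
                    worst = np.argmin(np.where(alive, mean[layer], np.inf))
                    alive[worst] = False
                # Update the distribution: softmax of mean reward over surviving ops.
                logits = np.where(alive, mean[layer], -np.inf)
                exp = np.exp(logits - logits[alive].max())
                probs[layer] = exp / exp.sum()

    # Final architecture: the most probable surviving op in each layer.
    return [int(np.argmax(probs[layer])) for layer in range(num_layers)]
```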
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences arising from its use.