Siamese-NAS: Using Trained Samples Efficiently to Find Lightweight
Neural Architecture by Prior Knowledge
- URL: http://arxiv.org/abs/2210.00546v1
- Date: Sun, 2 Oct 2022 15:04:08 GMT
- Title: Siamese-NAS: Using Trained Samples Efficiently to Find Lightweight
Neural Architecture by Prior Knowledge
- Authors: Yu-Ming Zhang, Jun-Wei Hsieh, Chun-Chieh Lee, Kuo-Chin Fan
- Abstract summary: In recent works, the Neural Predictor has improved significantly even with only a few trained architectures as training samples.
In this paper, the proposed Siamese-Predictor is inspired by past work on predictor-based NAS.
It is constructed with the proposed Estimation Code, which encodes prior knowledge about the training procedure.
We also propose Tiny-NanoBench, a search space for lightweight CNN architectures.
- Score: 6.117917355232904
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: In the past decade, many convolutional neural network architectures were
designed by hand, such as VGG16, ResNet, and DenseNet, and each achieved
state-of-the-art results on different tasks in its time. However, handcrafted design
still relies on human intuition and experience, and trial and error consumes a great
deal of time. Neural Architecture Search (NAS) addresses this issue. In recent works,
the Neural Predictor has improved significantly even with only a few trained
architectures as training samples, but the sampling cost is still considerable. In
this paper, the proposed Siamese-Predictor is inspired by past work on
predictor-based NAS. It is constructed with the proposed Estimation Code, which
encodes prior knowledge about the training procedure. The Siamese-Predictor benefits
significantly from this idea, which allows it to surpass the current SOTA predictor
on NASBench-201. To explore the impact of the Estimation Code, we analyze the
relationship between it and accuracy. We also propose Tiny-NanoBench, a search space
for lightweight CNN architectures; in this well-designed search space, it is easier
to find good architectures with few FLOPs than in NASBench-201. In summary, the
proposed Siamese-Predictor is a predictor-based NAS method that achieves the SOTA
level, especially under limited computation budgets. Applied to the proposed
Tiny-NanoBench, it can find extremely lightweight CNN architectures from just a few
trained samples.
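The abstract describes a predictor that is trained in a Siamese fashion on architecture encodings together with an Estimation Code. As a rough illustration only (not the authors' exact design), the following minimal PyTorch sketch shows one way such a pairwise predictor could be wired up; the encoding dimensions, hidden widths, and the margin-ranking objective are assumptions made for this example.

```python
# Hypothetical sketch of a Siamese accuracy predictor for predictor-based NAS.
# Sizes, layer counts, and the ranking objective are illustrative assumptions.
import torch
import torch.nn as nn

class SiamesePredictor(nn.Module):
    def __init__(self, arch_dim=30, est_dim=8, hidden=64):
        super().__init__()
        # Shared branch: maps [architecture encoding, estimation code] to a score.
        self.branch = nn.Sequential(
            nn.Linear(arch_dim + est_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, 1),
        )

    def forward(self, arch_a, est_a, arch_b, est_b):
        # Both inputs pass through the same weights (the "Siamese" part).
        score_a = self.branch(torch.cat([arch_a, est_a], dim=-1))
        score_b = self.branch(torch.cat([arch_b, est_b], dim=-1))
        return score_a, score_b

def pairwise_ranking_loss(score_a, score_b, acc_a, acc_b, margin=0.05):
    # Encourage the predicted ordering to match the true accuracy ordering.
    target = torch.sign(acc_a - acc_b)
    return nn.functional.margin_ranking_loss(
        score_a.squeeze(-1), score_b.squeeze(-1), target, margin=margin)

if __name__ == "__main__":
    model = SiamesePredictor()
    arch_a, arch_b = torch.rand(16, 30), torch.rand(16, 30)
    est_a, est_b = torch.rand(16, 8), torch.rand(16, 8)
    acc_a, acc_b = torch.rand(16), torch.rand(16)
    sa, sb = model(arch_a, est_a, arch_b, est_b)
    loss = pairwise_ranking_loss(sa, sb, acc_a, acc_b)
    loss.backward()
```

In a search loop, a trained branch of this kind would score every candidate in the space, and only the top-ranked candidates would actually be trained.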
Related papers
- NASiam: Efficient Representation Learning using Neural Architecture
Search for Siamese Networks [76.8112416450677]
Siamese networks are among the most popular approaches to self-supervised visual representation learning (SSL).
NASiam is a novel approach that, for the first time, uses differentiable NAS to improve the multilayer perceptron projector and predictor (encoder/predictor pair) of Siamese networks.
NASiam reaches competitive performance in both small-scale (i.e., CIFAR-10/CIFAR-100) and large-scale (i.e., ImageNet) image classification datasets while costing only a few GPU hours.
arXiv Detail & Related papers (2023-01-31T19:48:37Z)
- PRE-NAS: Predictor-assisted Evolutionary Neural Architecture Search [34.06028035262884]
We propose a novel evolution-based NAS strategy, Predictor-assisted E-NAS (PRE-NAS).
PRE-NAS leverages new evolutionary search strategies and integrates high-fidelity weight inheritance over generations.
Experiments on NAS-Bench-201 and DARTS search spaces show that PRE-NAS can outperform state-of-the-art NAS methods.
arXiv Detail & Related papers (2022-04-27T06:40:39Z)
- AceNAS: Learning to Rank Ace Neural Architectures with Weak Supervision of Weight Sharing [6.171090327531059]
We introduce Learning to Rank methods to select the best (ace) architectures from a space.
We also propose to leverage weak supervision from weight sharing by pretraining architecture representation on weak labels obtained from the super-net.
Experiments on NAS benchmarks and large-scale search spaces demonstrate that our approach outperforms SOTA with a significantly reduced search cost.
arXiv Detail & Related papers (2021-08-06T08:31:42Z)
- Neural Architecture Search on ImageNet in Four GPU Hours: A Theoretically Inspired Perspective [88.39981851247727]
We propose a novel framework called training-free neural architecture search (TE-NAS).
TE-NAS ranks architectures by analyzing the spectrum of the neural tangent kernel (NTK) and the number of linear regions in the input space.
We show that: (1) these two measurements imply the trainability and expressivity of a neural network; (2) they strongly correlate with the network's test accuracy.
arXiv Detail & Related papers (2021-02-23T07:50:44Z)
- Weak NAS Predictors Are All You Need [91.11570424233709]
Recent predictor-based NAS approaches attempt to solve the problem with two key steps: sampling some architecture-performance pairs and fitting a proxy accuracy predictor.
We shift the paradigm from finding a complicated predictor that covers the whole architecture space to a set of weaker predictors that progressively move towards the high-performance sub-space.
Our method requires fewer samples to find the top-performing architectures on NAS-Bench-101 and NAS-Bench-201, and it achieves state-of-the-art ImageNet performance on the NASNet search space (a toy sketch of this progressive scheme appears after this list).
arXiv Detail & Related papers (2021-02-21T01:58:43Z)
- Hierarchical Neural Architecture Search for Deep Stereo Matching [131.94481111956853]
We propose the first end-to-end hierarchical NAS framework for deep stereo matching.
Our framework incorporates task-specific human knowledge into the neural architecture search framework.
It ranks first in accuracy on the KITTI stereo 2012, 2015, and Middlebury benchmarks, as well as on the SceneFlow dataset.
arXiv Detail & Related papers (2020-10-26T11:57:37Z)
- Accuracy Prediction with Non-neural Model for Neural Architecture Search [185.0651567642238]
We study an alternative approach which uses a non-neural model for accuracy prediction.
We leverage gradient boosting decision tree (GBDT) as the predictor for Neural Architecture Search (NAS).
Experiments on NASBench-101 and ImageNet demonstrate the effectiveness of using GBDT as a predictor for NAS (a minimal sketch appears after this list).
arXiv Detail & Related papers (2020-07-09T13:28:49Z)
- Neural Architecture Search without Training [8.067283219068832]
In this work, we examine the overlap of activations between datapoints in untrained networks.
We motivate how this can give a measure which is usefully indicative of a network's trained performance.
We incorporate this measure into a simple algorithm that allows us to search for powerful networks without any training in a matter of seconds on a single GPU (a sketch of this activation-overlap score appears after this list).
arXiv Detail & Related papers (2020-06-08T14:53:56Z)
- FBNetV3: Joint Architecture-Recipe Search using Predictor Pretraining [65.39532971991778]
We present an accuracy predictor that scores architecture and training recipes jointly, guiding both sample selection and ranking.
We run fast evolutionary searches in just CPU minutes to generate architecture-recipe pairs for a variety of resource constraints.
FBNetV3 yields a family of state-of-the-art compact neural networks that outperform both automatically designed and manually designed competitors.
arXiv Detail & Related papers (2020-06-03T05:20:21Z)
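The "Weak NAS Predictors Are All You Need" entry above describes fitting a sequence of weak predictors that progressively narrow the candidate pool toward the high-performance sub-space. The toy sketch below, which uses a random-forest regressor as a stand-in weak learner, a synthetic search space, and a surrogate accuracy function, only illustrates that progressive narrowing loop; it is not the paper's exact procedure.

```python
# Toy illustration of progressive weak predictors for NAS (assumed details).
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(0)

def true_accuracy(x):
    # Stand-in for actually training an architecture (the expensive step).
    return float(1.0 - np.sum((x - 0.7) ** 2))

space = rng.random((5000, 12))            # toy candidate architecture encodings
evaluated_x, evaluated_y = [], []
candidates = space

for round_idx in range(4):
    # Evaluate a small batch drawn from the current candidate pool.
    batch = candidates[rng.choice(len(candidates), size=10, replace=False)]
    evaluated_x.extend(batch)
    evaluated_y.extend(true_accuracy(x) for x in batch)

    # Fit a cheap ("weak") predictor on everything evaluated so far.
    predictor = RandomForestRegressor(n_estimators=30, random_state=0)
    predictor.fit(np.array(evaluated_x), np.array(evaluated_y))

    # Shrink the pool toward the predicted high-performance sub-space.
    scores = predictor.predict(space)
    keep_n = max(50, len(space) // (10 ** (round_idx + 1)))
    candidates = space[np.argsort(scores)[-keep_n:]]

best_idx = int(np.argmax(evaluated_y))
print("best surrogate accuracy found:", round(evaluated_y[best_idx], 4))
```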
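The "Accuracy Prediction with Non-neural Model" entry above uses a gradient boosting decision tree (GBDT) as the accuracy predictor. A minimal sketch with scikit-learn and synthetic architecture encodings might look like the following; the encoding scheme, surrogate accuracies, and hyperparameters are assumptions for illustration.

```python
# Hypothetical sketch: GBDT as a non-neural accuracy predictor for NAS.
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor

rng = np.random.default_rng(1)

# Toy one-hot style operation encodings for already-trained architectures.
X_train = rng.integers(0, 2, size=(200, 24)).astype(float)
y_train = X_train @ rng.random(24) / 24.0          # surrogate "accuracies"

gbdt = GradientBoostingRegressor(n_estimators=200, max_depth=3, learning_rate=0.05)
gbdt.fit(X_train, y_train)

# Rank a large pool of unseen candidates; only the top few would be trained.
X_pool = rng.integers(0, 2, size=(10000, 24)).astype(float)
top_k = np.argsort(gbdt.predict(X_pool))[-10:][::-1]
print("indices of top predicted candidates:", top_k)
```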
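The "Neural Architecture Search without Training" entry above scores untrained networks by how much their ReLU activation patterns overlap across a minibatch. The sketch below illustrates the general idea on a toy MLP using binary activation codes and a log-determinant score; the exact scoring function and networks used in the paper differ in detail.

```python
# Illustrative training-free score based on activation-pattern overlap.
import torch
import torch.nn as nn

def activation_overlap_score(net, x):
    codes = []

    def hook(_m, _inp, out):
        # Binary code: which units are active for each datapoint.
        codes.append((out > 0).flatten(1).float())

    handles = [m.register_forward_hook(hook) for m in net.modules()
               if isinstance(m, nn.ReLU)]
    with torch.no_grad():
        net(x)
    for h in handles:
        h.remove()

    c = torch.cat(codes, dim=1)                    # (batch, total ReLU units)
    n_units = c.shape[1]
    hamming = n_units - (c @ c.T + (1 - c) @ (1 - c).T)
    kernel = n_units - hamming                     # similarity of binary codes
    sign, logdet = torch.slogdet(kernel)
    return logdet.item()                           # higher = less overlap

if __name__ == "__main__":
    torch.manual_seed(0)
    net = nn.Sequential(nn.Linear(32, 64), nn.ReLU(),
                        nn.Linear(64, 64), nn.ReLU(),
                        nn.Linear(64, 10))
    batch = torch.randn(16, 32)
    print("score:", activation_overlap_score(net, batch))
```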