Predict NAS Multi-Task by Stacking Ensemble Models using GP-NAS
- URL: http://arxiv.org/abs/2305.01667v1
- Date: Tue, 2 May 2023 13:59:58 GMT
- Title: Predict NAS Multi-Task by Stacking Ensemble Models using GP-NAS
- Authors: Ke Zhang
- Abstract summary: How to analyze and train on the dataset to overcome overfitting is the core problem to address.
Our stacking model ranked 1st in CVPR 2022 Track 2 Challenge.
- Score: 1.819714933798177
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Accurately predicting the performance of an architecture from small-sample
training is important but not easy. The core problem is how to analyze and train on the
dataset so as to overcome overfitting. When the problem is multi-task, we should also
consider whether we can exploit the correlation between tasks and estimate as quickly
as possible. In this track, the Super Network builds a search space based on ViT-Base.
The search space contains depth, num-heads, mlp-ratio and embed-dim. We first
pre-process the data based on our understanding of the problem, which reduces the
complexity of the problem and the probability of overfitting. We then tried different
kinds of models and different ways of combining them. Finally, we chose stacking
ensemble models using GP-NAS with cross validation. Our stacking model
ranked 1st in the CVPR 2022 Track 2 Challenge.
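The pipeline described above (encode each sampled ViT-Base sub-network as a feature vector, fit several base regressors with K-fold cross validation, and train a meta-model on their out-of-fold predictions) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the helper names and the architecture encoding are hypothetical, and the GP-NAS predictor is replaced here by a generic Gaussian-process regressor as a stand-in base model.

    # Minimal stacking-ensemble sketch with K-fold cross validation (assumed
    # scikit-learn-style regressors; GP-NAS itself is not reproduced here).
    import numpy as np
    from sklearn.ensemble import GradientBoostingRegressor
    from sklearn.gaussian_process import GaussianProcessRegressor
    from sklearn.linear_model import Ridge
    from sklearn.model_selection import KFold

    def encode_arch(arch, max_depth=12):
        """Flatten a ViT-Base-style architecture dict into a fixed-length vector.
        Expected keys (hypothetical encoding): depth, embed_dim, num_heads, mlp_ratio."""
        heads = list(arch["num_heads"]) + [0] * (max_depth - len(arch["num_heads"]))
        mlp = list(arch["mlp_ratio"]) + [0] * (max_depth - len(arch["mlp_ratio"]))
        return np.array([arch["depth"], arch["embed_dim"], *heads, *mlp], dtype=float)

    def stack_predict(X, y, X_test, n_splits=5):
        """Stacking: out-of-fold predictions of base models feed a meta-model."""
        base_models = [GradientBoostingRegressor(),
                       Ridge(alpha=1.0),
                       GaussianProcessRegressor()]  # stand-in for the GP-NAS predictor
        kf = KFold(n_splits=n_splits, shuffle=True, random_state=0)
        oof = np.zeros((len(X), len(base_models)))
        test_meta = np.zeros((len(X_test), len(base_models)))
        for j, model in enumerate(base_models):
            for train_idx, val_idx in kf.split(X):
                model.fit(X[train_idx], y[train_idx])
                oof[val_idx, j] = model.predict(X[val_idx])
            model.fit(X, y)  # refit on all training data for test predictions
            test_meta[:, j] = model.predict(X_test)
        meta = Ridge(alpha=1.0).fit(oof, y)  # meta-model on out-of-fold predictions
        return meta.predict(test_meta)

A rank correlation such as Kendall's tau between predicted and measured accuracies is a typical way to evaluate this kind of performance predictor.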
Related papers
- Aux-NAS: Exploiting Auxiliary Labels with Negligibly Extra Inference Cost [73.28626942658022]
We aim at exploiting additional auxiliary labels from an independent (auxiliary) task to boost the primary task performance.
Our method is architecture-based with a flexible asymmetric structure for the primary and auxiliary tasks.
Experiments with six tasks on NYU v2, CityScapes, and Taskonomy datasets using VGG, ResNet, and ViT backbones validate the promising performance.
arXiv Detail & Related papers (2024-05-09T11:50:19Z) - DNA Family: Boosting Weight-Sharing NAS with Block-Wise Supervisions [121.05720140641189]
We develop a family of models with the distilling neural architecture (DNA) techniques.
Our proposed DNA models can rate all architecture candidates, as opposed to previous works that can only access a sub-search space using algorithms.
Our models achieve state-of-the-art top-1 accuracy of 78.9% and 83.6% on ImageNet for a mobile convolutional network and a small vision transformer, respectively.
arXiv Detail & Related papers (2024-03-02T22:16:47Z) - Arch-Graph: Acyclic Architecture Relation Predictor for
Task-Transferable Neural Architecture Search [96.31315520244605]
Arch-Graph is a transferable NAS method that predicts task-specific optimal architectures.
We show Arch-Graph's transferability and high sample efficiency across numerous tasks.
It is able to find top 0.16% and 0.29% architectures on average on two search spaces under the budget of only 50 models.
arXiv Detail & Related papers (2022-04-12T16:46:06Z) - Generalization Guarantees for Neural Architecture Search with
Train-Validation Split [48.265305046655996]
This paper explores the statistical aspects of such problems with train-validation splits.
We show that refined properties of the validation loss such as risk and hyper-gradients are indicative of those of the true test loss.
We also highlight rigorous connections between NAS, multiple kernel learning, and low-rank matrix learning.
arXiv Detail & Related papers (2021-04-29T06:11:00Z) - MTL-NAS: Task-Agnostic Neural Architecture Search towards
General-Purpose Multi-Task Learning [71.90902837008278]
We propose to incorporate neural architecture search (NAS) into general-purpose multi-task learning (GP-MTL)
In order to adapt to different task combinations, we disentangle the GP-MTL networks into single-task backbones.
We also propose a novel single-shot gradient-based search algorithm that closes the performance gap between the searched architectures and the final evaluation architecture.
arXiv Detail & Related papers (2020-03-31T09:49:14Z) - DA-NAS: Data Adapted Pruning for Efficient Neural Architecture Search [76.9225014200746]
Efficient search is a core issue in Neural Architecture Search (NAS)
We present DA-NAS that can directly search the architecture for large-scale target tasks while allowing a large candidate set in a more efficient manner.
It is 2x faster than previous methods while the accuracy is currently state-of-the-art, at 76.2% under small FLOPs constraint.
arXiv Detail & Related papers (2020-03-27T17:55:21Z) - BigNAS: Scaling Up Neural Architecture Search with Big Single-Stage
Models [59.95091850331499]
We propose BigNAS, an approach that challenges the conventional wisdom that post-processing of the weights is necessary to get good prediction accuracies.
Our discovered model family, BigNASModels, achieve top-1 accuracies ranging from 76.5% to 80.9%.
arXiv Detail & Related papers (2020-03-24T23:00:49Z)