Predict NAS Multi-Task by Stacking Ensemble Models using GP-NAS
- URL: http://arxiv.org/abs/2305.01667v1
- Date: Tue, 2 May 2023 13:59:58 GMT
- Title: Predict NAS Multi-Task by Stacking Ensemble Models using GP-NAS
- Authors: Ke Zhang
- Abstract summary: How to analyze and train on the dataset to overcome overfitting is the core problem to address.
Our stacking model ranked 1st in CVPR 2022 Track 2 Challenge.
- Score: 1.819714933798177
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Accurately predicting the performance of an architecture from small-sample
training is important but not easy. The core problem is how to analyze and train on the
dataset so as to overcome overfitting. When the problem is multi-task, we should also
consider whether we can exploit the correlation between tasks and estimate as quickly
as possible. In this track, the Super Network builds a search space based on ViT-Base.
The search space contains depth, num-heads, mlp-ratio and embed-dim. We first
pre-process the data based on our understanding of the problem, which reduces the
complexity of the problem and the probability of overfitting. We then tried different
kinds of models and different ways of combining them. Finally, we chose stacking
ensemble models using GP-NAS with cross validation. Our stacking model
ranked 1st in the CVPR 2022 Track 2 Challenge.
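The pipeline described above (encode each sampled ViT-Base sub-network as a feature vector, fit several base regressors with K-fold cross validation, and train a meta-model on their out-of-fold predictions) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the helper names and the architecture encoding are hypothetical, and the GP-NAS predictor is replaced here by a generic Gaussian-process regressor as a stand-in base model.

    # Minimal stacking-ensemble sketch with K-fold cross validation (assumed
    # scikit-learn-style regressors; GP-NAS itself is not reproduced here).
    import numpy as np
    from sklearn.ensemble import GradientBoostingRegressor
    from sklearn.gaussian_process import GaussianProcessRegressor
    from sklearn.linear_model import Ridge
    from sklearn.model_selection import KFold

    def encode_arch(arch, max_depth=12):
        """Flatten a ViT-Base-style architecture dict into a fixed-length vector.
        Expected keys (hypothetical encoding): depth, embed_dim, num_heads, mlp_ratio."""
        heads = list(arch["num_heads"]) + [0] * (max_depth - len(arch["num_heads"]))
        mlp = list(arch["mlp_ratio"]) + [0] * (max_depth - len(arch["mlp_ratio"]))
        return np.array([arch["depth"], arch["embed_dim"], *heads, *mlp], dtype=float)

    def stack_predict(X, y, X_test, n_splits=5):
        """Stacking: out-of-fold predictions of base models feed a meta-model."""
        base_models = [GradientBoostingRegressor(),
                       Ridge(alpha=1.0),
                       GaussianProcessRegressor()]  # stand-in for the GP-NAS predictor
        kf = KFold(n_splits=n_splits, shuffle=True, random_state=0)
        oof = np.zeros((len(X), len(base_models)))
        test_meta = np.zeros((len(X_test), len(base_models)))
        for j, model in enumerate(base_models):
            for train_idx, val_idx in kf.split(X):
                model.fit(X[train_idx], y[train_idx])
                oof[val_idx, j] = model.predict(X[val_idx])
            model.fit(X, y)  # refit on all training data for test predictions
            test_meta[:, j] = model.predict(X_test)
        meta = Ridge(alpha=1.0).fit(oof, y)  # meta-model on out-of-fold predictions
        return meta.predict(test_meta)

A rank correlation such as Kendall's tau between predicted and measured accuracies is a typical way to evaluate this kind of performance predictor.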
Related papers
- Aux-NAS: Exploiting Auxiliary Labels with Negligibly Extra Inference Cost [73.28626942658022]
We aim at exploiting additional auxiliary labels from an independent (auxiliary) task to boost the primary task performance.
Our method is architecture-based with a flexible asymmetric structure for the primary and auxiliary tasks.
Experiments with six tasks on NYU v2, CityScapes, and Taskonomy datasets using VGG, ResNet, and ViT backbones validate the promising performance.
arXiv Detail & Related papers (2024-05-09T11:50:19Z) - DNA Family: Boosting Weight-Sharing NAS with Block-Wise Supervisions [121.05720140641189]
We develop a family of models with the distilling neural architecture (DNA) techniques.
Our proposed DNA models can rate all architecture candidates, as opposed to previous works that can only access a sub-search space using algorithms.
Our models achieve state-of-the-art top-1 accuracy of 78.9% and 83.6% on ImageNet for a mobile convolutional network and a small vision transformer, respectively.
arXiv Detail & Related papers (2024-03-02T22:16:47Z) - Arch-Graph: Acyclic Architecture Relation Predictor for
Task-Transferable Neural Architecture Search [96.31315520244605]
Arch-Graph is a transferable NAS method that predicts task-specific optimal architectures.
We show Arch-Graph's transferability and high sample efficiency across numerous tasks.
It is able to find top 0.16% and 0.29% architectures on average on two search spaces under the budget of only 50 models.
arXiv Detail & Related papers (2022-04-12T16:46:06Z) - Generalization Guarantees for Neural Architecture Search with
Train-Validation Split [48.265305046655996]
This paper explores the statistical aspects of such problems with train-validation splits.
We show that refined properties of the validation loss such as risk and hyper-gradients are indicative of those of the true test loss.
We also highlight rigorous connections between NAS, multiple kernel learning, and low-rank matrix learning.
arXiv Detail & Related papers (2021-04-29T06:11:00Z) - MTL-NAS: Task-Agnostic Neural Architecture Search towards
General-Purpose Multi-Task Learning [71.90902837008278]
We propose to incorporate neural architecture search (NAS) into general-purpose multi-task learning (GP-MTL)
In order to adapt to different task combinations, we disentangle the GP-MTL networks into single-task backbones.
We also propose a novel single-shot gradient-based search algorithm that closes the performance gap between the searched architectures and the final evaluation architecture.
arXiv Detail & Related papers (2020-03-31T09:49:14Z) - DA-NAS: Data Adapted Pruning for Efficient Neural Architecture Search [76.9225014200746]
Efficient search is a core issue in Neural Architecture Search (NAS)
We present DA-NAS that can directly search the architecture for large-scale target tasks while allowing a large candidate set in a more efficient manner.
It is 2x faster than previous methods while the accuracy is currently state-of-the-art, at 76.2% under small FLOPs constraint.
arXiv Detail & Related papers (2020-03-27T17:55:21Z) - BigNAS: Scaling Up Neural Architecture Search with Big Single-Stage
Models [59.95091850331499]
We propose BigNAS, an approach that challenges the conventional wisdom that post-processing of the weights is necessary to get good prediction accuracies.
Our discovered model family, BigNASModels, achieve top-1 accuracies ranging from 76.5% to 80.9%.
arXiv Detail & Related papers (2020-03-24T23:00:49Z)