Fugu-MT Paper Translation (Abstract): DNA Family: Boosting Weight-Sharing NAS with Block-Wise Supervisions

Paper abstract: DNA Family: Boosting Weight-Sharing NAS with Block-Wise Supervisions

arxiv url: http://arxiv.org/abs/2403.01326v1
Date: Sat, 2 Mar 2024 22:16:47 GMT
Status: Translation complete
Last updated in system: 2024-03-05 14:21:58.521365
Title: DNA Family: Boosting Weight-Sharing NAS with Block-Wise Supervisions
Title (reference translation): DNA Family: Strengthening Weight-Sharing NAS Block-Wise
Authors: Guangrun Wang, Changlin Li, Liuchun Yuan, Jiefeng Peng, Xiaoyu Xian, Xiaodan Liang, Xiaojun Chang, and Liang Lin
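Abstract summary: We developed a family of models using distilling neural architecture (DNA) techniques. The proposed DNA models can rate all architecture candidates, in contrast to previous methods that can only access a sub-search space using heuristic algorithms. Our models achieve state-of-the-art top-1 accuracy of 78.9% and 83.6% on ImageNet for a mobile convolutional network and a small vision transformer, respectively.
Reference score (independently computed attention score): 121.05720140641189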
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Neural Architecture Search (NAS), aiming at automatically designing neural architectures by machines, has been considered a key step toward automatic machine learning. One notable NAS branch is weight-sharing NAS, which significantly improves search efficiency and allows NAS algorithms to run on ordinary computers. Despite receiving high expectations, this category of methods suffers from low search effectiveness. By employing a generalization boundedness tool, we demonstrate that the devil behind this drawback is the untrustworthy architecture rating caused by the oversized search space of possible architectures. Addressing this problem, we modularize a large search space into blocks with small search spaces and develop a family of models with the distilling neural architecture (DNA) techniques. These proposed models, namely a DNA family, are capable of resolving multiple dilemmas of weight-sharing NAS, such as scalability, efficiency, and multi-modal compatibility. Our proposed DNA models can rate all architecture candidates, as opposed to previous works that can only access a sub-search space using heuristic algorithms. Moreover, under a certain computational complexity constraint, our method can seek architectures with different depths and widths. Extensive experimental evaluations show that our models achieve state-of-the-art top-1 accuracy of 78.9% and 83.6% on ImageNet for a mobile convolutional network and a small vision transformer, respectively. Additionally, we provide in-depth empirical analysis and insights into neural architecture ratings. Code is available at https://github.com/changlin31/DNA.
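Abstract (reference translation): Neural architecture search (NAS), which aims to have machines design neural architectures automatically, is considered an important step toward automatic machine learning. A notable branch of NAS is weight-sharing NAS, which greatly improves search efficiency and lets NAS algorithms run on ordinary computers. Despite high expectations, this class of methods suffers from low search effectiveness. Using a generalization boundedness tool, we show that the devil behind this drawback is unreliable architecture rating caused by an oversized search space of possible architectures. To address this problem, we modularize a large search space into blocks with small search spaces and develop a family of models using distilling neural architecture (DNA) techniques. These models, the DNA family, can resolve several dilemmas of weight-sharing NAS, such as scalability, efficiency, and multi-modal compatibility. The proposed DNA models can rate all architecture candidates, in contrast to previous work that can only access a sub-search space using heuristic algorithms. Furthermore, under a given computational complexity constraint, the method can search for architectures with different depths and widths. Extensive experiments show state-of-the-art top-1 accuracy of 78.9% and 83.6% on ImageNet for a mobile convolutional network and a small vision transformer, respectively. In addition, we provide in-depth empirical analysis and insights into neural architecture rating. Code: https://github.com/changlin31/DNA.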
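
The abstract's idea of modularizing a large search space into blocks with small search spaces can be illustrated with simple counting. The sketch below is not taken from the paper or its repository: the layer count, the number of candidate operations per layer, the number of blocks, and the function names are all hypothetical, and it assumes that each block can be rated independently of the others.

```python
import math


def full_space_size(num_layers: int, ops_per_layer: int) -> int:
    """Number of architectures when every layer independently picks one operation."""
    return ops_per_layer ** num_layers


def blockwise_space_sizes(num_layers: int, ops_per_layer: int, num_blocks: int) -> list[int]:
    """Per-block candidate counts when the layers are split evenly into blocks."""
    if num_layers % num_blocks != 0:
        raise ValueError("num_layers must be divisible by num_blocks")
    layers_per_block = num_layers // num_blocks
    return [ops_per_layer ** layers_per_block] * num_blocks


# Hypothetical numbers chosen only for illustration (not from the paper).
full = full_space_size(num_layers=18, ops_per_layer=4)
blocks = blockwise_space_sizes(num_layers=18, ops_per_layer=4, num_blocks=6)

# Assumption: blocks are rated independently, so the number of ratings
# needed is the sum of the per-block counts, not their product.
ratings_needed = sum(blocks)

print(f"full search space: {full} architectures")
print(f"per-block search spaces: {blocks}")
print(f"block-level ratings needed: {ratings_needed}")
# Combining one choice per block still spans the whole space.
print(f"combinations covered: {math.prod(blocks)}")
```

Under these assumed numbers, each block holds few enough candidates to rate exhaustively, while the combinations of per-block choices still cover the full space. That is one way to read the claim that all architecture candidates can be rated.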
