Fugu-MT Paper Translation (Abstract): FP-NAS: Fast Probabilistic Neural Architecture Search

Paper overview: FP-NAS: Fast Probabilistic Neural Architecture Search

arxiv url: http://arxiv.org/abs/2011.10949v3
Date: Wed, 31 Mar 2021 17:21:49 GMT
Status: Translation complete
Last updated in system: 2022-09-22 09:00:10.270971
Title: FP-NAS: Fast Probabilistic Neural Architecture Search
Authors: Zhicheng Yan, Xiaoliang Dai, Peizhao Zhang, Yuandong Tian, Bichen Wu, Matt Feiszli
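Abstract summary: Probabilistic NAS, such as PARSEC, learns a distribution over high-performing architectures and uses only as much memory as needed to train a single model. This paper proposes a sampling method adaptive to the distribution entropy. Fast Probabilistic NAS (FP-NAS) samples 64% fewer architectures and searches 2.1x faster than PARSEC.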
Reference score (independently computed attention score): 49.21560787752714
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Differential Neural Architecture Search (NAS) requires all layer choices to be held in memory simultaneously; this limits the size of both search space and final architecture. In contrast, Probabilistic NAS, such as PARSEC, learns a distribution over high-performing architectures, and uses only as much memory as needed to train a single model. Nevertheless, it needs to sample many architectures, making it computationally expensive for searching in an extensive space. To solve these problems, we propose a sampling method adaptive to the distribution entropy, drawing more samples to encourage explorations at the beginning, and reducing samples as learning proceeds. Furthermore, to search fast in the multi-variate space, we propose a coarse-to-fine strategy by using a factorized distribution at the beginning which can reduce the number of architecture parameters by over an order of magnitude. We call this method Fast Probabilistic NAS (FP-NAS). Compared with PARSEC, it can sample 64% fewer architectures and search 2.1x faster. Compared with FBNetV2, FP-NAS is 1.9x - 3.5x faster, and the searched models outperform FBNetV2 models on ImageNet. FP-NAS allows us to expand the giant FBNetV2 space to be wider (i.e. larger channel choices) and deeper (i.e. more blocks), while adding Split-Attention block and enabling the search over the number of splits. When searching a model of size 0.4G FLOPS, FP-NAS is 132x faster than EfficientNet, and the searched FP-NAS-L0 model outperforms EfficientNet-B0 by 0.7% accuracy. Without using any architecture surrogate or scaling tricks, we directly search large models up to 1.0G FLOPS. Our FP-NAS-L2 model with simple distillation outperforms BigNAS-XL with advanced in-place distillation by 0.7% accuracy using similar FLOPS.
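The two ideas in the abstract, entropy-adaptive sampling and a factorized architecture distribution, can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes the number of sampled architectures is simply proportional to the total entropy of per-layer categorical distributions; the function names, the `scale` constant, and the example cardinalities are hypothetical and chosen only for illustration.

```python
import math


def entropy(probs):
    """Shannon entropy (in nats) of one categorical distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0)


def adaptive_sample_count(layer_probs, scale=0.25, min_samples=1):
    """Pick how many architectures to sample in the current epoch.

    Assumption (not from the source): the sample count grows linearly
    with the summed entropy of independent per-layer distributions,
    so a near-uniform distribution early in the search draws many
    samples and a peaked distribution later draws few.
    """
    total_entropy = sum(entropy(p) for p in layer_probs)
    return max(min_samples, math.ceil(scale * total_entropy))


def param_counts(cardinalities):
    """Architecture parameters per layer: joint vs. factorized.

    A joint distribution needs one parameter per combination of
    choices (the product); a factorized one needs one parameter per
    choice of each variable (the sum).
    """
    return math.prod(cardinalities), sum(cardinalities)


# Early in the search: 10 layers, each uniform over 8 choices.
early = [[1 / 8] * 8 for _ in range(10)]
# Later: each layer concentrates most of its mass on one choice.
late = [[0.93] + [0.01] * 7 for _ in range(10)]

print(adaptive_sample_count(early), adaptive_sample_count(late))

# Hypothetical per-layer search variables with 3, 2, 2, 4 and 8 options.
print(param_counts([3, 2, 2, 4, 8]))
```

With these hypothetical cardinalities the joint distribution needs 384 parameters per layer and the factorized one needs 19, which is the kind of order-of-magnitude reduction the abstract refers to.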


Related papers