Fugu-MT Paper Translation (Summary): Dual Lottery Ticket Hypothesis

Paper summary: Dual Lottery Ticket Hypothesis

arxiv url: http://arxiv.org/abs/2203.04248v1
Date: Tue, 8 Mar 2022 18:06:26 GMT
Status: Translation complete
Last updated in system: 2022-03-09 15:14:23.518853
Title: Dual Lottery Ticket Hypothesis
Title (reference translation): 二重抽選券仮説 (Japanese rendering of "Dual Lottery Ticket Hypothesis")
Authors: Yue Bai, Huan Wang, Zhiqiang Tao, Kunpeng Li, Yun Fu
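Abstract summary: The Lottery Ticket Hypothesis (LTH) offers a new perspective for studying sparse network training while maintaining network capacity. This paper treats the winning ticket from LTH as a trainable subnetwork and uses its performance as the benchmark. It proposes a simple sparse network training strategy, Random Sparse Network Transformation (RST), to substantiate the DLTH.
Reference score (independently computed attention score): 71.95937879869334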
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Fully exploiting the learning capacity of neural networks requires overparameterized dense networks. On the other side, directly training sparse neural networks typically results in unsatisfactory performance. Lottery Ticket Hypothesis (LTH) provides a novel view to investigate sparse network training and maintain its capacity. Concretely, it claims there exist winning tickets from a randomly initialized network found by iterative magnitude pruning and preserving promising trainability (or we say being in trainable condition). In this work, we regard the winning ticket from LTH as the subnetwork which is in trainable condition and its performance as our benchmark, then go from a complementary direction to articulate the Dual Lottery Ticket Hypothesis (DLTH): Randomly selected subnetworks from a randomly initialized dense network can be transformed into a trainable condition and achieve admirable performance compared with LTH -- random tickets in a given lottery pool can be transformed into winning tickets. Specifically, by using uniform-randomly selected subnetworks to represent the general cases, we propose a simple sparse network training strategy, Random Sparse Network Transformation (RST), to substantiate our DLTH. Concretely, we introduce a regularization term to borrow learning capacity and realize information extrusion from the weights which will be masked. After finishing the transformation for the randomly selected subnetworks, we conduct the regular finetuning to evaluate the model using fair comparisons with LTH and other strong baselines. Extensive experiments on several public datasets and comparisons with competitive approaches validate our DLTH as well as the effectiveness of the proposed model RST. Our work is expected to pave a way for inspiring new research directions of sparse network training in the future. Our code is available at https://github.com/yueb17/DLTH.
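Abstract (reference translation): Fully exploiting the learning capacity of neural networks requires overparameterized dense networks. Directly training sparse neural networks, on the other hand, usually yields unsatisfactory performance. The Lottery Ticket Hypothesis (LTH) offers a new perspective for studying sparse network training while maintaining its capacity. Specifically, it claims that a randomly initialized network contains winning tickets, which are found by iterative magnitude pruning and retain promising trainability (that is, they are in a trainable condition). This work treats the winning ticket from LTH as a subnetwork in a trainable condition, takes its performance as the benchmark, and approaches the problem from the complementary direction to state the Dual Lottery Ticket Hypothesis (DLTH): randomly selected subnetworks of a randomly initialized dense network can be transformed into a trainable condition and achieve admirable performance compared with LTH. Using uniformly randomly selected subnetworks to represent the general case, the authors propose a simple sparse network training strategy, Random Sparse Network Transformation (RST), to substantiate the DLTH. RST introduces a regularization term that borrows learning capacity and realizes information extrusion from the weights that will be masked. After the transformation of the randomly selected subnetworks is complete, the model is finetuned in the regular way and evaluated in fair comparisons with LTH and other strong baselines. Extensive experiments on several public datasets and comparisons with competitive approaches validate the DLTH as well as the effectiveness of the proposed RST model. The work is expected to pave the way for new research directions in sparse network training. The code is available at https://github.com/yueb17/DLTH.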
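
The transformation described in the abstract can be illustrated with a small, hedged sketch. The abstract states only that a regularization term is applied to the weights that will be masked; the L2 form of the penalty, the linearly growing coefficient, the toy loss, and every function name below are assumptions made for illustration and are not taken from the authors' code.

```python
import random

def random_mask(num_weights, sparsity, seed=0):
    """Uniformly pick which weights to keep (1) and which to mask out (0)."""
    rng = random.Random(seed)
    num_masked = int(round(num_weights * sparsity))
    masked = set(rng.sample(range(num_weights), num_masked))
    return [0 if i in masked else 1 for i in range(num_weights)]

def toy_grad(weights, target=5.0):
    """Gradient of the toy loss 0.5 * (sum(w) - target)^2 (hypothetical task)."""
    residual = sum(weights) - target
    return [residual for _ in weights]

def transform(weights, mask, lr=0.05, lam_step=0.05, steps=200):
    """Gradient descent on the toy loss plus an assumed L2 penalty
    0.5 * lam * w^2 applied only to the weights that will be masked.
    lam grows by lam_step every step (assumed schedule)."""
    w = list(weights)
    lam = 0.0
    for _ in range(steps):
        lam += lam_step
        grad = toy_grad(w)
        w = [wi - lr * (gi + lam * wi * (1 - mi))
             for wi, gi, mi in zip(w, grad, mask)]
    return w

mask = random_mask(num_weights=10, sparsity=0.7, seed=0)
initial = [0.5] * 10                      # dense start: sum(w) already equals 5
transformed = transform(initial, mask)
sparse = [wi * mi for wi, mi in zip(transformed, mask)]  # apply the mask
```

In this toy setting the penalized weights shrink toward zero while the kept weights grow to preserve the sum, so applying the mask afterwards changes the output very little. That is only an analogy for the "information extrusion" the abstract mentions; a real network would then go through the regular finetuning step.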

Related papers