Fugu-MT Paper Translation (Abstract): Self-discipline on multiple channels

Paper abstract: Self-discipline on multiple channels

arxiv url: http://arxiv.org/abs/2304.14224v1
Date: Thu, 27 Apr 2023 14:34:41 GMT
Status: Translation complete
Last updated in system: 2023-04-28 13:10:22.956414
Title: Self-discipline on multiple channels
Title (reference translation): Self-discipline on multiple channels
Authors: Jiutian Zhao, Liang Luo, Hao Wang
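Abstract summary: Existing self-distillation methods require additional models, model modification, or batch size expansion for training. This paper develops Self-discipline on multiple channels (SMC), which combines consistency regularization with self-distillation. SMC uses consistency regularization and self-distillation to improve the generalization ability of the model.
Reference score (independently computed attention score): 3.9860001037346264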
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Self-distillation improves the generalization ability of a model using only the model's own information, which makes it a promising direction. Existing self-distillation methods require additional models, model modification, or batch size expansion for training, which increases the difficulty of use, memory consumption, and computational cost. This paper develops Self-discipline on multiple channels (SMC), which combines consistency regularization with self-distillation using the concept of multiple channels. Conceptually, SMC consists of two steps: 1) the data of each channel is simultaneously passed through the model to obtain its corresponding soft label, and 2) the soft label saved in the previous step is read together with the soft label obtained by passing the current channel data through the model, and both are used to calculate the loss function. SMC uses consistency regularization and self-distillation to improve the generalization ability of the model and its robustness to noisy labels. The SMC variant with only two channels is named SMC-2. Comparative experimental results on both datasets show that SMC-2 outperforms Label Smoothing Regularization and Self-distillation From The Last Mini-batch on all models, and outperforms the state-of-the-art Sharpness-Aware Minimization method on 83% of the models. Experiments on the compatibility of SMC-2 with data augmentation show that using both improves the generalization ability of the model by between 0.28% and 1.80% compared to using data augmentation alone. Finally, the label noise interference experiments show that SMC-2 curbs the tendency of the model's generalization ability to decrease late in training due to label noise. The code is available at https://github.com/JiuTiannn/SMC-Self-discipline-on-multiple-channels.
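Abstract (reference translation): Self-distillation relies on the model's own information to improve its generalization ability and has a bright future. Existing self-distillation methods require additional models, model modification, or batch size expansion for training, which increases the difficulty of use, memory consumption, and computational cost. This paper develops Self-discipline on multiple channels (SMC), which uses the concept of multiple channels to combine consistency regularization with self-distillation. Conceptually, SMC consists of two steps: 1) the data of each channel is passed through the model simultaneously to obtain the corresponding soft label, and 2) the soft label saved in the previous step is read together with the soft label obtained from the current channel data through the model to calculate the loss function. SMC uses consistency regularization and self-distillation to improve the generalization ability of the model and its robustness to noisy labels. SMC with only two channels is named SMC-2. Comparative experiments on both datasets show that SMC-2 outperforms Label Smoothing Regularization and Self-distillation From The Last Mini-batch on all models, and outperforms the state-of-the-art Sharpness-Aware Minimization method on 83% of the models. Experiments combining SMC-2 with data augmentation show that using both improves generalization ability by 0.28% to 1.80% compared to data augmentation alone. The label noise interference experiments show that SMC-2 curbs the tendency of the model's generalization ability to decline late in training due to label noise. The code is available at https://github.com/jiutiannn/smc-self-discipline-on-multiple-channels.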
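The two conceptual steps in the abstract can be illustrated with a small sketch for the two-channel case. This is not the authors' implementation: the abstract does not give the loss formula, so the symmetric KL consistency term, the weight `alpha`, the `temperature` parameter, and the function name `smc2_loss` are all assumptions made for illustration. The two logit vectors stand in for the model outputs on two channels (for example, two differently augmented views of the same sample).

```python
import math


def softmax(logits, temperature=1.0):
    """Numerically stable softmax with an optional temperature."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)
    exps = [math.exp(z - peak) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(probs, label):
    """Negative log-likelihood of the true class index."""
    return -math.log(probs[label])


def kl_divergence(target, pred):
    """KL(target || pred) for two discrete distributions."""
    return sum(t * math.log(t / p) for t, p in zip(target, pred) if t > 0)


def smc2_loss(logits_a, logits_b, label, alpha=1.0, temperature=2.0):
    """Hypothetical two-channel loss (assumed form, not taken from the paper)."""
    # Step 1: each channel's output is turned into a soft label.
    soft_a = softmax(logits_a, temperature)
    soft_b = softmax(logits_b, temperature)

    # Step 2: the saved soft labels are combined with the hard label.
    # Supervised term: ordinary cross-entropy on both channels.
    supervised = (cross_entropy(softmax(logits_a), label)
                  + cross_entropy(softmax(logits_b), label))
    # Consistency term (assumption): each channel's soft label acts as
    # the distillation target for the other channel.
    consistency = (kl_divergence(soft_b, soft_a)
                   + kl_divergence(soft_a, soft_b))
    return supervised + alpha * consistency


if __name__ == "__main__":
    channel_a = [2.0, 0.5, -1.0]   # made-up logits for channel 1
    channel_b = [1.5, 1.0, -0.5]   # made-up logits for channel 2
    print(smc2_loss(channel_a, channel_b, label=0))
```

When both channels produce identical logits the consistency term vanishes and only the supervised term remains, so in this sketch the extra penalty grows only as the two channels disagree.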


Related papers