Fugu-MT Paper Translation (Abstract): CL-MAE: Curriculum-Learned Masked Autoencoders

Paper overview: CL-MAE: Curriculum-Learned Masked Autoencoders

arxiv url: http://arxiv.org/abs/2308.16572v2
Date: Wed, 25 Oct 2023 11:10:57 GMT
Status: translation complete
Last updated in system: 2023-10-26 19:49:14.205483
Title: CL-MAE: Curriculum-Learned Masked Autoencoders
Title (reference translation): CL-MAE: Curriculum-Learned Masked Autoencoder
Authors: Neelu Madan, Nicolae-Catalin Ristea, Kamal Nasrollahi, Thomas B. Moeslund, Radu Tudor Ionescu
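Abstract summary: This paper proposes a curriculum learning approach that updates the masking strategy to continually increase the complexity of the self-supervised reconstruction task. We train our Curriculum-Learned Masked Autoencoder (CL-MAE) on ImageNet and show that it exhibits superior representation learning capabilities compared to MAE.
Reference score (independently computed attention score): 49.24994655813455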
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Masked image modeling has been demonstrated as a powerful pretext task for generating robust representations that can be effectively generalized across multiple downstream tasks. Typically, this approach involves randomly masking patches (tokens) in input images, with the masking strategy remaining unchanged during training. In this paper, we propose a curriculum learning approach that updates the masking strategy to continually increase the complexity of the self-supervised reconstruction task. We conjecture that, by gradually increasing the task complexity, the model can learn more sophisticated and transferable representations. To facilitate this, we introduce a novel learnable masking module that possesses the capability to generate masks of different complexities, and integrate the proposed module into masked autoencoders (MAE). Our module is jointly trained with the MAE, while adjusting its behavior during training, transitioning from a partner to the MAE (optimizing the same reconstruction loss) to an adversary (optimizing the opposite loss), while passing through a neutral state. The transition between these behaviors is smooth, being regulated by a factor that is multiplied with the reconstruction loss of the masking module. The resulting training procedure generates an easy-to-hard curriculum. We train our Curriculum-Learned Masked Autoencoder (CL-MAE) on ImageNet and show that it exhibits superior representation learning capabilities compared to MAE. The empirical results on five downstream tasks confirm our conjecture, demonstrating that curriculum learning can be successfully used to self-supervise masked autoencoders. We release our code at https://github.com/ristea/cl-mae.
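Abstract (reference translation): Masked image modeling has been shown to be a powerful pretext task for producing robust representations that generalize effectively across multiple downstream tasks. This approach usually masks patches (tokens) of the input image at random, and the masking strategy stays fixed throughout training. In this paper, we propose a curriculum learning method that updates the masking strategy so that the self-supervised reconstruction task becomes steadily harder. We conjecture that gradually raising the task complexity lets the model learn more sophisticated and transferable representations. To this end, we introduce a new learnable masking module that can generate masks of varying complexity, and we integrate it into masked autoencoders (MAE). Our module is trained jointly with the MAE and changes its behavior over the course of training: it starts as a partner of the MAE (optimizing the same reconstruction loss), passes through a neutral state, and ends as an adversary (optimizing the opposite loss). The transition between these behaviors is smooth and is controlled by a factor that multiplies the reconstruction loss of the masking module. The resulting training procedure yields an easy-to-hard curriculum. We train our Curriculum-Learned Masked Autoencoder (CL-MAE) on ImageNet and show that it learns better representations than MAE. Empirical results on five downstream tasks confirm our conjecture, showing that curriculum learning can be used successfully to self-supervise masked autoencoders. The code is released at https://github.com/ristea/cl-mae.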
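
The partner-to-neutral-to-adversary transition described in the abstract can be illustrated with a small sketch. The abstract only says that a factor multiplying the masking module's reconstruction loss regulates the transition smoothly; it does not give the schedule. The linear decay from +1 to -1 below, and the names `curriculum_factor` and `masking_module_loss`, are assumptions made for illustration and are not taken from the released code.

```python
def curriculum_factor(step: int, total_steps: int) -> float:
    """Hypothetical linear schedule (an assumption, not the paper's formula).

    Returns +1.0 at the first step (partner: same loss as the MAE),
    0.0 at the midpoint (neutral), and -1.0 at the last step
    (adversary: opposite loss).
    """
    if total_steps < 2:
        raise ValueError("total_steps must be at least 2")
    # Clamp the step into [0, total_steps - 1] and map it to [0, 1].
    progress = min(max(step, 0), total_steps - 1) / (total_steps - 1)
    return 1.0 - 2.0 * progress


def masking_module_loss(reconstruction_loss: float, step: int, total_steps: int) -> float:
    """Loss minimized by the masking module: factor times the MAE reconstruction loss."""
    return curriculum_factor(step, total_steps) * reconstruction_loss


if __name__ == "__main__":
    # Show the factor and the resulting masking-module loss at three points in training.
    for step in (0, 50, 100):
        factor = curriculum_factor(step, 101)
        print(step, factor, masking_module_loss(0.8, step, 101))
```

With a positive factor, minimizing this loss pushes the masking module toward masks that are easy to reconstruct. With a negative factor, the same minimization rewards masks that raise the reconstruction loss, which is what makes the curriculum run from easy to hard.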
