Fugu-MT Paper Translation (Abstract): EquiMod: An Equivariance Module to Improve Self-Supervised Learning

Paper overview: EquiMod: An Equivariance Module to Improve Self-Supervised Learning

arxiv url: http://arxiv.org/abs/2211.01244v1
Date: Wed, 2 Nov 2022 16:25:54 GMT
Status: Translation complete
Last updated in system: 2022-11-03 12:34:14.486281
Title: EquiMod: An Equivariance Module to Improve Self-Supervised Learning
Authors: Alexandre Devillers and Mathieu Lefort
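Abstract summary: Self-supervised visual representation methods are closing the gap with supervised learning performance. These methods rely on maximizing the similarity between embeddings of related synthetic inputs created through data augmentations. The paper introduces EquiMod, a generic equivariance module that structures the learned latent space.
Reference score (attention metric computed independently by this site): 77.34726150561087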
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Self-supervised visual representation methods are closing the gap with supervised learning performance. These methods rely on maximizing the similarity between embeddings of related synthetic inputs created through data augmentations. This can be seen as a task that encourages embeddings to leave out factors modified by these augmentations, i.e. to be invariant to them. However, this only considers one side of the trade-off in the choice of the augmentations: they need to strongly modify the images to avoid shortcut learning of simple solutions (e.g. using only color histograms), but on the other hand, augmentation-related information may be lacking in the representations for some downstream tasks (e.g. color is important for bird and flower classification). A few recent works have proposed to mitigate the problem of using only an invariance task by exploring some form of equivariance to augmentations. This has been done by learning additional embedding space(s), where some augmentation(s) cause embeddings to differ, yet in a non-controlled way. In this work, we introduce EquiMod, a generic equivariance module that structures the learned latent space, in the sense that our module learns to predict the displacement in the embedding space caused by the augmentations. We show that applying that module to state-of-the-art invariance models, such as SimCLR and BYOL, improves performance on the CIFAR10 and ImageNet datasets. Moreover, while our model could collapse to a trivial equivariance, i.e. invariance, we observe that it instead automatically learns to keep some augmentation-related information that is beneficial to the representations.
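Illustrative sketch (not from the paper): the abstract describes two objectives, an invariance task that pulls together the embeddings of augmented views, and an equivariance task in which a module predicts how an augmentation displaces an embedding. The NumPy snippet below is only a minimal sketch of that idea under stated assumptions. The linear predictor `W`, the function names, the encoding of augmentation parameters as a plain vector, and the use of negative cosine similarity as the loss are all hypothetical choices made for illustration; they are not the authors' implementation, and the losses actually used by SimCLR or BYOL differ.

```python
import numpy as np


def l2_normalize(x, eps=1e-12):
    """Scale each row of x to (approximately) unit Euclidean norm."""
    return x / (np.linalg.norm(x, axis=-1, keepdims=True) + eps)


def invariance_loss(z_a, z_b):
    """Mean negative cosine similarity between embeddings of two views.

    Stand-in for an invariance objective; minimized when the two views
    map to embeddings pointing in the same direction.
    """
    return -np.mean(np.sum(l2_normalize(z_a) * l2_normalize(z_b), axis=-1))


def predict_displaced(z_orig, aug_params, W):
    """Hypothetical linear predictor (an assumption, not the paper's module).

    Takes the embedding of the original image together with a vector of
    augmentation parameters and predicts the embedding of the augmented image.
    """
    return np.concatenate([z_orig, aug_params], axis=-1) @ W


def equivariance_loss(z_orig, z_aug, aug_params, W):
    """Mean negative cosine similarity between predicted and actual
    embeddings of the augmented image."""
    z_pred = predict_displaced(z_orig, aug_params, W)
    return -np.mean(np.sum(l2_normalize(z_pred) * l2_normalize(z_aug), axis=-1))


# Toy data standing in for encoder outputs (random, for illustration only).
rng = np.random.default_rng(0)
batch, dim, n_params = 8, 16, 4
z_orig = rng.normal(size=(batch, dim))           # embeddings of original images
z_aug = rng.normal(size=(batch, dim))            # embeddings of augmented images
aug_params = rng.normal(size=(batch, n_params))  # e.g. crop or color-jitter settings
W = 0.1 * rng.normal(size=(dim + n_params, dim))

total_loss = invariance_loss(z_orig, z_aug) + equivariance_loss(
    z_orig, z_aug, aug_params, W
)
```

In this toy setup, a predictor that ignores `aug_params` and copies `z_orig` makes the equivariance term coincide with the invariance term, which corresponds to the trivial equivariance (invariance) mentioned in the abstract.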

Related papers