Fugu-MT Paper Translation (Summary): Understanding Masked Autoencoders From a Local Contrastive Perspective

Paper summary: Understanding Masked Autoencoders From a Local Contrastive Perspective

arxiv url: http://arxiv.org/abs/2310.01994v2
Date: Fri, 8 Dec 2023 08:07:29 GMT
Status: Translation complete
Last updated in system: 2023-12-11 18:14:53.328854
Title: Understanding Masked Autoencoders From a Local Contrastive Perspective
Authors: Xiaoyu Yue, Lei Bai, Meng Wei, Jiangmiao Pang, Xihui Liu, Luping Zhou, Wanli Ouyang
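Abstract summary: Masked AutoEncoder (MAE) has revolutionized the field of self-supervised learning with its simple yet effective masking and reconstruction strategies. We introduce a new empirical framework, called Local Contrastive MAE (LC-MAE), to analyze both the reconstructive and contrastive aspects of MAE.
Reference score (independently computed attention score): 80.57196495601826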
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Masked AutoEncoder (MAE) has revolutionized the field of self-supervised learning with its simple yet effective masking and reconstruction strategies. However, despite achieving state-of-the-art performance across various downstream vision tasks, the underlying mechanisms that drive MAE's efficacy are less well-explored compared to the canonical contrastive learning paradigm. In this paper, we first propose a local perspective to explicitly extract a local contrastive form from MAE's reconstructive objective at the patch level. And then we introduce a new empirical framework, called Local Contrastive MAE (LC-MAE), to analyze both reconstructive and contrastive aspects of MAE. LC-MAE reveals that MAE learns invariance to random masking and ensures distribution consistency between the learned token embeddings and the original images. Furthermore, we dissect the contribution of the decoder and random masking to MAE's success, revealing both the decoder's learning mechanism and the dual role of random masking as data augmentation and effective receptive field restriction. Our experimental analysis sheds light on the intricacies of MAE and summarizes some useful design methodologies, which can inspire more powerful visual self-supervised methods.
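The two ingredients the abstract refers to, random masking and patch-level reconstruction, can be illustrated with a minimal NumPy sketch. This is not the authors' code and it does not implement LC-MAE. The helper names (`patchify`, `random_mask`, `masked_mse`), the 8-pixel patch size, the 0.75 mask ratio, and the choice to score only masked patches are all assumptions made for illustration. The "prediction" is a hypothetical stand-in for an encoder and decoder.

```python
import numpy as np

def patchify(image, patch):
    """Split an (H, W, C) image into (N, patch*patch*C) flattened patches."""
    h, w, c = image.shape
    assert h % patch == 0 and w % patch == 0, "image must tile evenly"
    gh, gw = h // patch, w // patch
    x = image.reshape(gh, patch, gw, patch, c)
    x = x.transpose(0, 2, 1, 3, 4)  # (gh, gw, patch, patch, c)
    return x.reshape(gh * gw, patch * patch * c)

def random_mask(num_patches, mask_ratio, rng):
    """Return a boolean array where True marks a masked (hidden) patch."""
    num_masked = int(round(num_patches * mask_ratio))
    perm = rng.permutation(num_patches)
    mask = np.zeros(num_patches, dtype=bool)
    mask[perm[:num_masked]] = True
    return mask

def masked_mse(pred, target, mask):
    """Mean squared error per patch, averaged over masked patches only."""
    per_patch = ((pred - target) ** 2).mean(axis=1)
    return per_patch[mask].mean()

rng = np.random.default_rng(0)
image = rng.random((32, 32, 3))               # toy image, assumed size
patches = patchify(image, patch=8)            # 16 patches, 192 values each
mask = random_mask(len(patches), 0.75, rng)   # assumed mask ratio

# Hypothetical stand-in for encoder + decoder: predict every patch as the
# mean of the visible patches. A real model would be learned.
pred = np.tile(patches[~mask].mean(axis=0), (len(patches), 1))
loss = masked_mse(pred, patches, mask)
print(patches.shape, int(mask.sum()), float(loss))
```

Because the loss is computed per patch, each masked patch contributes its own reconstruction term. That patch-level view is the granularity at which the abstract says a local contrastive form is extracted.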

Related papers

Learning Mask Invariant Mutual Information for Masked Image Modeling [35.63719638508299]
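Masked autoencoders (MAEs) are a prominent self-supervised learning paradigm in computer vision. Recent studies have attempted to elucidate how MAEs work through contrastive learning and feature representation analysis. In this paper, we propose a new perspective for understanding MAEs by leveraging the information bottleneck principle from information theory.
Reference translation (metadata) (2025-02-27T03:19:05Z)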
Bringing Masked Autoencoders Explicit Contrastive Properties for Point Cloud Self-Supervised Learning [116.75939193785143]
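Contrastive learning (CL) for Vision Transformers (ViTs) in the image domain has achieved performance comparable to CL for traditional convolutional backbones. In 3D point cloud pre-training with ViTs, however, masked autoencoder (MAE) modeling remains dominant.
Reference translation (metadata) (2024-07-08T12:28:56Z)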
Masking Improves Contrastive Self-Supervised Learning for ConvNets, and Saliency Tells You Where [63.61248884015162]
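We aim to ease the burden of incorporating masking operations into contrastive learning frameworks for convolutional neural networks. We propose to explicitly take a saliency constraint into account so that the masked regions are distributed evenly between foreground and background.
Reference translation (metadata) (2023-09-22T09:58:38Z)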
Understanding Masked Autoencoders via Hierarchical Latent Variable Models [109.35382136147349]
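Masked Autoencoder (MAE) has recently achieved remarkable success in a variety of vision tasks. Despite the emergence of intriguing empirical observations about MAE, a theoretically principled understanding is still lacking.
Reference translation (metadata) (2023-06-08T03:00:10Z)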
i-MAE: Are Latent Representations in Masked Autoencoders Linearly Separable? [26.146459754995597]
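Masked image modeling (MIM) has been recognized as a powerful approach to self-supervised pre-training in the vision domain. In this paper, we propose an interactive Masked Autoencoders (i-MAE) framework to enhance representation capability. In addition to qualitatively analyzing the characteristics of the latent representations, we examine the existence of linear separability and the degree of semantics in the latent space.
Reference translation (metadata) (2022-10-20T17:59:54Z)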
How Mask Matters: Towards Theoretical Understandings of Masked Autoencoders [21.849681446573257]
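Masked Autoencoders (MAE), which are based on a reconstruction task, have become a promising paradigm for self-supervised learning (SSL). In this paper, we provide a theoretical understanding of how masking matters for MAE to learn meaningful features.
Reference translation (metadata) (2022-10-15T17:36:03Z)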
Exploring The Role of Mean Teachers in Self-supervised Masked Auto-Encoders [64.03000385267339]
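Masked image modeling (MIM) has become a popular strategy for self-supervised learning (SSL) of visual representations with Vision Transformers. We propose a simple SSL method, the Reconstruction-Consistent Masked Auto-Encoder (RC-MAE). During pre-training, RC-MAE converges faster and requires less memory than state-of-the-art self-distillation methods.
Reference translation (metadata) (2022-10-05T08:08:55Z)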
MAML is a Noisy Contrastive Learner [72.04430033118426]
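Model-agnostic meta-learning (MAML) is one of the most popular and widely adopted meta-learning algorithms today. We provide a new perspective on the working mechanism of MAML and show that MAML is analogous to a meta-learner using a supervised contrastive objective function. To mitigate the interference that arises in this setting, we propose a simple but effective technique, the zeroing trick.
Reference translation (metadata) (2021-06-29T12:52:26Z)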

The related papers list is generated automatically from the titles and abstracts of the papers on this site.

The operator of this site does not guarantee the quality of this site (including all information and translations) and accepts no responsibility for any results arising from its use.