Fugu-MT Paper Translation (Summary): Trash to Treasure: Harvesting OOD Data with Cross-Modal Matching for Open-Set Semi-Supervised Learning

Paper summary: Trash to Treasure: Harvesting OOD Data with Cross-Modal Matching for Open-Set Semi-Supervised Learning

arxiv url: http://arxiv.org/abs/2108.05617v1
Date: Thu, 12 Aug 2021 09:14:44 GMT
Status: Translation complete
Last updated in system: 2021-08-13 14:33:55.786146
Title: Trash to Treasure: Harvesting OOD Data with Cross-Modal Matching for Open-Set Semi-Supervised Learning
Authors: Junkai Huang, Chaowei Fang, Weikai Chen, Zhenhua Chai, Xiaolin Wei, Pengxu Wei, Liang Lin, Guanbin Li
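Abstract summary: Open-set semi-supervised learning (open-set SSL) investigates a challenging but practical scenario in which the unlabeled data contain out-of-distribution (OOD) samples. We propose a novel training mechanism that effectively exploits the presence of OOD data to enhance feature learning. Our approach substantially improves performance on open-set SSL and outperforms the state of the art by a large margin.
Reference score (attention score computed independently by the site): 101.28281124670647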
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Open-set semi-supervised learning (open-set SSL) investigates a challenging but practical scenario where out-of-distribution (OOD) samples are contained in the unlabeled data. While the mainstream technique seeks to completely filter out the OOD samples for semi-supervised learning (SSL), we propose a novel training mechanism that could effectively exploit the presence of OOD data for enhanced feature learning while avoiding its adverse impact on the SSL. We achieve this goal by first introducing a warm-up training that leverages all the unlabeled data, including both the in-distribution (ID) and OOD samples. Specifically, we perform a pretext task that enforces our feature extractor to obtain a high-level semantic understanding of the training images, leading to more discriminative features that can benefit the downstream tasks. Since the OOD samples are inevitably detrimental to SSL, we propose a novel cross-modal matching strategy to detect OOD samples. Instead of directly applying binary classification, we train the network to predict whether the data sample is matched to an assigned one-hot class label. The appeal of the proposed cross-modal matching over binary classification is the ability to generate a compatible feature space that aligns with the core classification task. Extensive experiments show that our approach substantially lifts the performance on open-set SSL and outperforms the state-of-the-art by a large margin.
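The cross-modal matching idea in the abstract (predicting whether a sample matches an assigned one-hot class label, rather than running a binary ID/OOD classifier) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration and is not taken from the paper: the helper names `one_hot`, `build_matching_pairs`, and `is_ood`, the choice of one mismatched label per sample, and the rule that a sample is treated as OOD when no class reaches a matching score of 0.5. The matching scores themselves would come from a trained network, which is not shown.

```python
import random


def one_hot(label, num_classes):
    """Return a one-hot list encoding of an integer class label."""
    return [1.0 if i == label else 0.0 for i in range(num_classes)]


def build_matching_pairs(labels, num_classes, rng):
    """Build (sample_index, one_hot_label, target) training triples.

    Assumption: each labeled sample yields one matched pair (target 1,
    its true label) and one mismatched pair (target 0, a random wrong label).
    """
    pairs = []
    for idx, y in enumerate(labels):
        pairs.append((idx, one_hot(y, num_classes), 1))
        wrong = rng.choice([c for c in range(num_classes) if c != y])
        pairs.append((idx, one_hot(wrong, num_classes), 0))
    return pairs


def is_ood(match_scores, threshold=0.5):
    """Hypothetical decision rule: flag a sample as OOD when no class
    label reaches the matching-score threshold."""
    return max(match_scores) < threshold


if __name__ == "__main__":
    rng = random.Random(0)
    for idx, label_vec, target in build_matching_pairs([0, 2, 1], 3, rng):
        print(idx, label_vec, target)
    print(is_ood([0.10, 0.20, 0.05]))  # no label matches well
    print(is_ood([0.10, 0.90, 0.00]))  # matches class 1
```

Because the matching head is conditioned on class labels, it is trained on the same label space as the classifier, which is the compatibility with the core classification task that the abstract describes.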

Related papers