Fugu-MT Paper Translation (Summary): Dual-Perspective Semantic-Aware Representation Blending for Multi-Label Image Recognition with Partial Labels

Paper summary: Dual-Perspective Semantic-Aware Representation Blending for Multi-Label Image Recognition with Partial Labels

arxiv url: http://arxiv.org/abs/2205.13092v3
Date: Sat, 27 Jan 2024 09:24:46 GMT
Status: Translation complete
Last updated in system: 2024-01-31 01:19:37.960039
Title: Dual-Perspective Semantic-Aware Representation Blending for Multi-Label Image Recognition with Partial Labels
Authors: Tao Pu, Tianshui Chen, Hefeng Wu, Yukai Shi, Zhijing Yang, Liang Lin
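Abstract summary: This paper proposes dual-perspective semantic-aware representation blending (DSRB), which blends multi-granularity, category-specific semantic representations across different images. The proposed DSRB consistently outperforms state-of-the-art algorithms across all known label proportion settings.
Reference score (independently computed attention score): 70.36722026729859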
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Despite achieving impressive progress, current multi-label image recognition (MLR) algorithms depend heavily on large-scale datasets with complete labels, which are extremely time-consuming and labor-intensive to collect. Training multi-label image recognition models with partial labels (MLR-PL) is an alternative, in which only some labels are known for each image while the others are unknown. However, current MLR-PL algorithms rely on pre-trained image similarity models or on iteratively updating the image classification models to generate pseudo labels for the unknown labels. Thus, they depend on a certain amount of annotations and inevitably suffer from obvious performance drops, especially when the known label proportion is low. To address this dilemma, we propose a dual-perspective semantic-aware representation blending (DSRB) that blends multi-granularity category-specific semantic representations across different images, from the instance and prototype perspectives respectively, to transfer information of known labels to complement unknown labels. Specifically, an instance-perspective representation blending (IPRB) module is designed to blend the representations of the known labels in an image with the representations of the corresponding unknown labels in another image to complement these unknown labels. Meanwhile, a prototype-perspective representation blending (PPRB) module is introduced to learn more stable representation prototypes for each category and to blend the representations of unknown labels with the prototypes of the corresponding labels, in a location-sensitive manner, to complement these unknown labels. Extensive experiments on the MS-COCO, Visual Genome, and Pascal VOC 2007 datasets show that the proposed DSRB consistently outperforms current state-of-the-art algorithms on all known label proportion settings.
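
The two blending ideas in the abstract can be illustrated with a minimal NumPy sketch. The abstract does not give the blending formula, so everything here is an assumption made for illustration: the function names `instance_blend` and `prototype_blend`, the label encoding (1 = known positive, -1 = known negative, 0 = unknown), the fixed convex-combination weight `alpha`, and the choice to mark blended categories as positive. The location-sensitive aspect of PPRB is not modeled.

```python
import numpy as np


def instance_blend(feat_a, labels_a, feat_b, labels_b, alpha=0.5):
    """Instance-perspective sketch (hypothetical, not the paper's exact method).

    feat_a, feat_b: (C, D) arrays of per-category representations for two images.
    labels_a, labels_b: (C,) arrays with 1 = known positive, -1 = known negative,
    0 = unknown (assumed encoding).
    """
    # Categories that are known positive in image A but unknown in image B.
    mask = (labels_a == 1) & (labels_b == 0)
    blended = feat_b.copy()
    # Assumed convex combination of the two category representations.
    blended[mask] = alpha * feat_a[mask] + (1.0 - alpha) * feat_b[mask]
    new_labels = labels_b.copy()
    new_labels[mask] = 1  # assumption: blended categories are treated as positive
    return blended, new_labels


def prototype_blend(feat, labels, prototypes, alpha=0.5):
    """Prototype-perspective sketch (hypothetical; ignores location sensitivity).

    prototypes: (C, D) array, one assumed positive prototype per category.
    """
    mask = labels == 0  # every unknown category in this image
    blended = feat.copy()
    blended[mask] = alpha * prototypes[mask] + (1.0 - alpha) * feat[mask]
    new_labels = labels.copy()
    new_labels[mask] = 1  # assumption: blended categories are treated as positive
    return blended, new_labels


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    num_categories, dim = 4, 8
    feat_a = rng.normal(size=(num_categories, dim))
    feat_b = rng.normal(size=(num_categories, dim))
    labels_a = np.array([1, -1, 1, 0])
    labels_b = np.array([0, 0, 1, -1])
    _, labels_after = instance_blend(feat_a, labels_a, feat_b, labels_b)
    print(labels_after)
```

In this sketch, only category 0 is transferred from image A to image B, because it is the only category that is known positive in A and unknown in B. A learned or randomly sampled blending weight could replace the fixed `alpha`; that choice is also left open by the abstract.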

Related papers