Fugu-MT Paper Translation (Summary): Progressive Semantic-Visual Mutual Adaption for Generalized Zero-Shot Learning

Paper Summary: Progressive Semantic-Visual Mutual Adaption for Generalized Zero-Shot Learning

arxiv url: http://arxiv.org/abs/2303.15322v1
Date: Mon, 27 Mar 2023 15:21:43 GMT
Status: Translation complete
Last updated in system: 2023-03-28 14:36:52.141520
Title: Progressive Semantic-Visual Mutual Adaption for Generalized Zero-Shot Learning
Authors: Man Liu, Feng Li, Chunjie Zhang, Yunchao Wei, Huihui Bai, Yao Zhao
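Abstract summary: Generalized Zero-Shot Learning (GZSL) identifies unseen categories using knowledge transferred from the seen domain. To progressively model the correspondences between attribute prototypes and visual features, the authors deploy a dual semantic-visual transformer module (DSVTM). DSVTM devises an instance-motivated semantic encoder that learns instance-centric prototypes to adapt to different images.
Reference score (independently computed prominence score): 74.48337375174297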
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Generalized Zero-Shot Learning (GZSL) identifies unseen categories by knowledge transferred from the seen domain, relying on the intrinsic interactions between visual and semantic information. Prior works mainly localize regions corresponding to the sharing attributes. When various visual appearances correspond to the same attribute, the sharing attributes inevitably introduce semantic ambiguity, hampering the exploration of accurate semantic-visual interactions. In this paper, we deploy the dual semantic-visual transformer module (DSVTM) to progressively model the correspondences between attribute prototypes and visual features, constituting a progressive semantic-visual mutual adaption (PSVMA) network for semantic disambiguation and knowledge transferability improvement. Specifically, DSVTM devises an instance-motivated semantic encoder that learns instance-centric prototypes to adapt to different images, enabling the recast of the unmatched semantic-visual pair into the matched one. Then, a semantic-motivated instance decoder strengthens accurate cross-domain interactions between the matched pair for semantic-related instance adaption, encouraging the generation of unambiguous visual representations. Moreover, to mitigate the bias towards seen classes in GZSL, a debiasing loss is proposed to pursue response consistency between seen and unseen predictions. The PSVMA consistently yields superior performances against other state-of-the-art methods. Code will be available at: https://github.com/ManLiuCoder/PSVMA.
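The encoder-then-decoder flow described in the abstract can be pictured with a rough sketch. The code below is not the authors' implementation (the repository linked in the abstract is the place to look for that). It is a minimal illustration that assumes plain single-head cross-attention with a residual connection and no learned projections. The names `cross_attend` and `mutual_adaption_step`, the toy dimensions, and the choice of two repetitions are all hypothetical.

```python
import math
import random


def softmax(row):
    """Numerically stable softmax over a list of floats."""
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attend(queries, keys_values):
    """Single-head scaled dot-product cross-attention with a residual add.

    Assumption: no learned projection matrices; keys and values are the
    same vectors. This only illustrates the direction of information flow.
    """
    d = len(queries[0])
    out = []
    for q in queries:
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(d)
                  for k in keys_values]
        weights = softmax(scores)
        attended = [sum(w * kv[j] for w, kv in zip(weights, keys_values))
                    for j in range(d)]
        out.append([qi + ai for qi, ai in zip(q, attended)])
    return out


def mutual_adaption_step(prototypes, patches):
    """One hypothetical semantic-visual adaption round."""
    # Encoder-like step: attribute prototypes query the image patches,
    # yielding prototypes adapted to this particular image.
    adapted_prototypes = cross_attend(prototypes, patches)
    # Decoder-like step: image patches query the adapted prototypes,
    # yielding visual features informed by the matched semantics.
    adapted_patches = cross_attend(patches, adapted_prototypes)
    return adapted_prototypes, adapted_patches


random.seed(0)
num_attributes, num_patches, dim = 4, 6, 8  # toy sizes, chosen arbitrarily
prototypes = [[random.gauss(0, 1) for _ in range(dim)]
              for _ in range(num_attributes)]
patches = [[random.gauss(0, 1) for _ in range(dim)]
           for _ in range(num_patches)]

# "Progressive" is modelled here simply as repeating the round twice.
for _ in range(2):
    prototypes, patches = mutual_adaption_step(prototypes, patches)
```

Each round leaves the number of prototypes and patches unchanged and only updates their contents, so rounds can be stacked. The debiasing loss mentioned in the abstract is not sketched here because the abstract does not give its form.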

