Fugu-MT Paper Translation (Abstract): Mutually Causal Semantic Distillation Network for Zero-Shot Learning

Paper summary: Mutually Causal Semantic Distillation Network for Zero-Shot Learning

arxiv url: http://arxiv.org/abs/2603.17412v1
Date: Wed, 18 Mar 2026 06:44:54 GMT
Status: Translation complete
Last updated in system: 2026-03-19 18:32:57.549631
Title: Mutually Causal Semantic Distillation Network for Zero-Shot Learning
Authors: Shiming Chen, Shuhuang Chen, Guo-Sen Xie, Xinge You
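Abstract summary: Zero-shot learning (ZSL) aims to recognize unseen classes in the open world, guided by side information (e.g., attributes). Its key task is to infer the latent semantic knowledge between visual and attribute features. To distill intrinsic and sufficient semantic representations for ZSL, we propose a mutually causal semantic distillation network (termed MSDN++).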
Reference score (independently computed attention score): 32.25476851030761
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Zero-shot learning (ZSL) aims to recognize unseen classes in the open world, guided by side information (e.g., attributes). Its key task is to infer the latent semantic knowledge between visual and attribute features on seen classes, and thereby transfer that semantic knowledge from seen classes to unseen ones. Prior works simply use unidirectional attention in a weakly supervised manner and so learn spurious and limited latent semantic representations, which fail to effectively discover the intrinsic semantic knowledge (e.g., attribute semantics) between visual and attribute features. To address these challenges, we propose a mutually causal semantic distillation network (termed MSDN++) to distill the intrinsic and sufficient semantic representations for ZSL. MSDN++ consists of an attribute$\rightarrow$visual causal attention sub-net that learns attribute-based visual features, and a visual$\rightarrow$attribute causal attention sub-net that learns visual-based attribute features. The causal attentions encourage the two sub-nets to learn causal vision-attribute associations, so that they represent reliable features through causal visual/attribute learning. Guided by a semantic distillation loss, the two mutual attention sub-nets learn collaboratively and teach each other throughout training. Extensive experiments on widely used benchmark datasets (CUB, SUN, AWA2, and FLO) show that MSDN++ yields significant improvements over strong baselines, leading to new state-of-the-art performance.
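The abstract says the two sub-nets "teach each other" through a semantic distillation loss but does not give its form. The following is a minimal sketch of one plausible mutual-distillation objective, assuming each sub-net outputs per-class scores and that the loss combines a Jensen-Shannon divergence between the two softmax distributions with a squared L2 distance between the raw scores. The function names, the example scores, and the loss form itself are illustrative assumptions, not the authors' implementation.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions of equal length."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))

def mutual_distillation_loss(logits_a2v, logits_v2a):
    """Symmetric loss that pulls the two sub-nets' class predictions together.

    Assumed form (not taken from the abstract): Jensen-Shannon divergence
    between the two softmax distributions plus the squared L2 distance
    between the raw class scores.
    """
    p = softmax(logits_a2v)
    q = softmax(logits_v2a)
    mid = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    jsd = 0.5 * kl_divergence(p, mid) + 0.5 * kl_divergence(q, mid)
    l2 = sum((a - b) ** 2 for a, b in zip(logits_a2v, logits_v2a))
    return jsd + l2

# Hypothetical class scores from the two sub-nets for one image, three classes.
scores_a2v = [2.0, 0.5, -1.0]  # attribute->visual sub-net
scores_v2a = [1.5, 0.7, -0.8]  # visual->attribute sub-net
loss = mutual_distillation_loss(scores_a2v, scores_v2a)
```

Because the loss is symmetric in its two arguments, neither sub-net acts as a fixed teacher: each is penalized for disagreeing with the other, and the loss is zero when their predictions coincide.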

Related papers