Fugu-MT Paper Translation (Abstract): Plug-and-play Class-aware Knowledge Injection for Prompt Learning with Visual-Language Model

Paper overview: Plug-and-play Class-aware Knowledge Injection for Prompt Learning with Visual-Language Model

arxiv url: http://arxiv.org/abs/2605.05910v1
Date: Thu, 07 May 2026 09:20:42 GMT
Status: Translation complete
Last updated in system: 2026-05-08 22:27:11.659661
Title: Plug-and-play Class-aware Knowledge Injection for Prompt Learning with Visual-Language Model
Title (reference translation): Plug-and-play class-aware knowledge injection for prompt learning with vision-language models
Authors: Junhui Yin, Nan Pu, Xinyu Zhang, Lingfeng Yang, Lin Wu, Xiaojie Wang, Zhun Zhong
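Abstract summary: We propose the Class-Aware Knowledge Injection (CAKI) framework. CAKI comprises two key components: class-specific prompt generation and query-key prompt matching. CAKI effectively improves existing methods on both base and novel classes.
Reference score (independently computed attention score): 46.286026005937565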
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Prompt learning has become an effective and widely used technique in enhancing vision-language models (VLMs) such as CLIP for various downstream tasks, particularly in zero-shot classification within specific domains. Existing methods typically focus on either learning class-shared prompts for a given domain or generating instance-specific prompts through conditional prompt learning. While these methods have achieved promising performance, they often overlook class-specific knowledge in prompt design, leading to suboptimal outcomes. The underlying reasons are: 1) class-specific prompts offer more fine-grained supervision compared to coarse class-shared prompts, which helps prevent misclassification of data from different classes into a single class; 2) compared to class-specific prompts, instance-specific prompts neglect the richer class-level information across multiple instances, potentially causing data from the same class to be divided into multiple classes. To effectively supplement the class-specific knowledge into existing methods, we propose a plug-and-play Class-Aware Knowledge Injection (CAKI) framework. CAKI comprises two key components, i.e., class-specific prompt generation and query-key prompt matching. The former encodes class-specific knowledge into prompts from few-shot samples that belong to the same class and stores the learned prompts in a class-level knowledge bank. The latter provides a plug-and-play mechanism for each test instance to retrieve relevant class-level knowledge from the knowledge bank and inject such knowledge to refine model predictions. Extensive experiments demonstrate that our CAKI effectively improves the performance of existing methods on base and novel classes. Code is publicly available at https://github.com/yjh576/CAKI.
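Abstract (reference translation): Prompt learning has become an effective and widely used technique for enhancing vision-language models (VLMs) such as CLIP on various downstream tasks, particularly zero-shot classification within specific domains. Existing methods typically focus either on learning class-shared prompts for a given domain or on generating instance-specific prompts through conditional prompt learning. Although these methods achieve promising performance, they often overlook class-specific knowledge in prompt design, which leads to suboptimal results. The underlying reasons are as follows. 1) Class-specific prompts provide finer-grained supervision than coarse class-shared prompts. 2) Compared with class-specific prompts, instance-specific prompts neglect the richer class-level information that spans multiple instances. To effectively supplement existing methods with class-specific knowledge, we propose the CAKI framework. CAKI comprises two key components: class-specific prompt generation and query-key prompt matching. The former encodes class-specific knowledge into prompts from the few-shot samples that belong to the same class and stores the learned prompts in a class-level knowledge bank. The latter provides a plug-and-play mechanism by which each test instance retrieves relevant class-level knowledge from the knowledge bank and injects that knowledge to refine model predictions. Extensive experiments confirm that CAKI effectively improves the performance of existing methods. Code is available at https://github.com/yjh576/CAKI.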
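
The two components named in the abstract (a class-level knowledge bank built from few-shot samples, and query-key matching at test time) can be illustrated with a toy sketch. This is not the authors' implementation, which is available at the repository linked above. Everything here is an assumption made for illustration: the function names (`build_knowledge_bank`, `retrieve`, `inject`), the use of a mean few-shot feature as a stand-in for a learned class-specific prompt, cosine similarity as the matching score, and the additive `alpha` weighting used to refine the scores of an existing method.

```python
import math


def normalize(v):
    """Scale a vector to unit length (returned unchanged if it is all zeros)."""
    n = math.sqrt(sum(x * x for x in v))
    return [x / n for x in v] if n > 0 else list(v)


def cosine(a, b):
    """Cosine similarity between two vectors."""
    return sum(x * y for x, y in zip(normalize(a), normalize(b)))


def build_knowledge_bank(few_shot):
    """Map each class to one key vector.

    ASSUMPTION: the key is the mean of the normalized few-shot features.
    It stands in for the learned class-specific prompt of the paper.
    """
    bank = {}
    for cls, feats in few_shot.items():
        unit = [normalize(f) for f in feats]
        dim = len(unit[0])
        mean = [sum(f[i] for f in unit) / len(unit) for i in range(dim)]
        bank[cls] = normalize(mean)
    return bank


def retrieve(bank, query, top_k=2):
    """Return the top_k (class, similarity) pairs whose keys best match the query."""
    sims = sorted(((cosine(query, key), cls) for cls, key in bank.items()),
                  reverse=True)
    return [(cls, sim) for sim, cls in sims[:top_k]]


def inject(base_scores, retrieved, alpha=0.5):
    """Refine the scores of an existing method with retrieved class knowledge.

    ASSUMPTION: a simple additive update, alpha * similarity, per retrieved class.
    """
    refined = dict(base_scores)
    for cls, sim in retrieved:
        refined[cls] = refined.get(cls, 0.0) + alpha * sim
    return refined


# Hypothetical 3-dimensional image features, two shots per class.
few_shot = {
    "cat": [[1.0, 0.1, 0.0], [0.9, 0.2, 0.0]],
    "dog": [[0.1, 1.0, 0.0], [0.2, 0.9, 0.1]],
    "car": [[0.0, 0.1, 1.0], [0.1, 0.0, 0.9]],
}
bank = build_knowledge_bank(few_shot)

query = [0.8, 0.3, 0.0]                                  # test-instance feature
base_scores = {"cat": 0.40, "dog": 0.45, "car": 0.15}    # existing method's scores

retrieved = retrieve(bank, query, top_k=2)
refined = inject(base_scores, retrieved, alpha=0.5)
prediction = max(refined, key=refined.get)
print(prediction)
```

In this toy setup the existing method slightly prefers "dog", while the query feature lies closest to the "cat" key in the bank, so the injected class-level score changes the final prediction. The sketch leaves the base scorer untouched and only post-processes its scores, which is one simple reading of "plug-and-play"; the paper's actual injection mechanism may differ.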

Related papers