Fugu-MT Paper Translation (Abstract): Prompt-Based Continual Compositional Zero-Shot Learning

Paper summary: Prompt-Based Continual Compositional Zero-Shot Learning

arxiv url: http://arxiv.org/abs/2512.09172v2
Date: Wed, 17 Dec 2025 12:41:30 GMT
Status: Translation complete
Last updated in system: 2025-12-18 15:03:26.905246
Title: Prompt-Based Continual Compositional Zero-Shot Learning
Title (reference translation): Prompt-Based Continual Compositional Zero-Shot Learning
Authors: Sauda Maryam, Sara Nadeem, Faisal Qureshi, Mohsen Ali
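Abstract summary: We tackle continual adaptation of vision-language models to new attributes, objects, and their compositions in Compositional Zero-Shot Learning (CZSL). Unlike classical continual learning, where classes are disjoint, in CCZSL attributes and objects may reoccur across sessions while compositions remain unique. We propose a Prompt-based Continual Compositional Zero-Shot Learning framework built on a frozen VLM backbone.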
Reference score (independently computed attention score): 4.672326975246762
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: We tackle continual adaptation of vision-language models to new attributes, objects, and their compositions in Compositional Zero-Shot Learning (CZSL), while preventing forgetting of prior knowledge. Unlike classical continual learning where classes are disjoint, CCZSL is more complex as attributes and objects may reoccur across sessions while compositions remain unique. Built on a frozen VLM backbone, we propose the first Prompt-based Continual Compositional Zero-Shot Learning (PromptCCZSL) framework that retains prior knowledge through recency-weighted multi-teacher distillation. It employs session-aware compositional prompts to fuse multimodal features for new compositions, while attribute and object prompts are learned through session-agnostic fusion to maintain global semantic consistency, which is further stabilized by a Cosine Anchor Loss (CAL) to preserve prior knowledge. To enhance adaptation in the current session, an Orthogonal Projection Loss (OPL) ensures that new attribute and object embeddings remain distinct from previous ones, preventing overlap, while an Intra-Session Diversity Loss (IDL) promotes variation among current-session embeddings for richer, more discriminative representations. We also introduce a comprehensive protocol that jointly measures catastrophic forgetting and compositional generalization. Extensive experiments on UT-Zappos and C-GQA benchmarks demonstrate that PromptCCZSL achieves substantial improvements over prior VLM-based and non-VLM baselines, setting a new benchmark for CCZSL in closed-world settings.
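Abstract (reference translation): We tackle continual adaptation of vision-language models to new attributes, objects, and their compositions in CZSL, while preventing forgetting of prior knowledge. Unlike classical continual learning, where classes are disjoint, in CCZSL attributes and objects may reoccur across sessions while compositions remain unique. We propose the Prompt-based Continual Compositional Zero-Shot Learning (PromptCCZSL) framework, built on a frozen VLM backbone. It uses session-aware compositional prompts to fuse multimodal features for new compositions, while attribute and object prompts are learned through session-agnostic fusion to maintain global semantic consistency. To enhance adaptation in the current session, an Orthogonal Projection Loss (OPL) ensures that new attribute and object embeddings remain distinct from previous ones, preventing overlap, while an Intra-Session Diversity Loss (IDL) promotes variation among current-session embeddings for richer, more discriminative representations. We also introduce a comprehensive protocol that jointly measures catastrophic forgetting and compositional generalization. Extensive experiments on the UT-Zappos and C-GQA benchmarks show that PromptCCZSL improves substantially over VLM-based and non-VLM baselines, setting a new benchmark for CCZSL in closed-world settings.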
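The abstract names several loss terms without giving their formulas. The sketch below is one plausible reading, not the authors' implementation: every function name, the squared-cosine form of the orthogonality and diversity penalties, and the geometric `decay` parameter for recency weighting are assumptions made for illustration. Embeddings are plain lists of floats so the block has no dependencies.

```python
import math

def _dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def _cos(u, v):
    # Cosine similarity; assumes neither vector is all zeros.
    return _dot(u, v) / (math.sqrt(_dot(u, u)) * math.sqrt(_dot(v, v)))

def cosine_anchor_loss(current, anchors):
    # Hypothetical CAL: mean (1 - cos) between each re-learned embedding
    # and the copy stored from an earlier session (paired by index).
    return sum(1.0 - _cos(c, a) for c, a in zip(current, anchors)) / len(current)

def orthogonal_projection_loss(new, old):
    # Hypothetical OPL: mean squared cosine over all (new, old) pairs;
    # zero when every new embedding is orthogonal to every old one.
    pairs = [(n, o) for n in new for o in old]
    return sum(_cos(n, o) ** 2 for n, o in pairs) / len(pairs)

def intra_session_diversity_loss(new):
    # Hypothetical IDL: mean squared cosine over distinct pairs of
    # current-session embeddings; needs at least two embeddings.
    pairs = [(new[i], new[j])
             for i in range(len(new)) for j in range(i + 1, len(new))]
    return sum(_cos(u, v) ** 2 for u, v in pairs) / len(pairs)

def recency_weights(num_teachers, decay=0.5):
    # Assumed weighting scheme: geometric decay with age, normalized to
    # sum to 1, so the most recent teacher (last index) weighs the most.
    raw = [decay ** (num_teachers - 1 - k) for k in range(num_teachers)]
    total = sum(raw)
    return [r / total for r in raw]
```

Under these assumptions, a distillation term would be the `recency_weights`-weighted sum of per-teacher losses, and the OPL and IDL penalties would be added to the current session's training objective. How the paper actually combines or scales these terms is not stated in the abstract.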

Related papers