Prototype-Guided Curriculum Learning for Zero-Shot Learning
- URL: http://arxiv.org/abs/2508.07771v1
- Date: Mon, 11 Aug 2025 08:56:21 GMT
- Title: Prototype-Guided Curriculum Learning for Zero-Shot Learning
- Authors: Lei Wang, Shiming Chen, Guo-Sen Xie, Ziming Hong, Chaojian Yu, Qinmu Peng, Xinge You
- Abstract summary: We propose a prototype-guided curriculum learning framework (dubbed as CLZSL). The PCL module prioritizes samples with high cosine similarity between their visual mappings and the class-level semantic prototypes. The PUP module dynamically updates the class-level semantic prototypes by leveraging the visual mappings learned from instances.
- Score: 25.632658478653855
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In Zero-Shot Learning (ZSL), embedding-based methods enable knowledge transfer from seen to unseen classes by learning a visual-semantic mapping from seen-class images to class-level semantic prototypes (e.g., attributes). However, these semantic prototypes are manually defined and may introduce noisy supervision for two main reasons: (i) instance-level mismatch: variations in perspective, occlusion, and annotation bias will cause discrepancies between individual samples and the class-level semantic prototypes; and (ii) class-level imprecision: the manually defined semantic prototypes may not accurately reflect the true semantics of the class. Consequently, the visual-semantic mapping will be misled, reducing the effectiveness of knowledge transfer to unseen classes. In this work, we propose a prototype-guided curriculum learning framework (dubbed as CLZSL), which mitigates instance-level mismatches through a Prototype-Guided Curriculum Learning (PCL) module and addresses class-level imprecision via a Prototype Update (PUP) module. Specifically, the PCL module prioritizes samples with high cosine similarity between their visual mappings and the class-level semantic prototypes, and progressively advances to less-aligned samples, thereby reducing the interference of instance-level mismatches to achieve accurate visual-semantic mapping. Besides, the PUP module dynamically updates the class-level semantic prototypes by leveraging the visual mappings learned from instances, thereby reducing class-level imprecision and further improving the visual-semantic mapping. Experiments were conducted on standard benchmark datasets (AWA2, SUN, and CUB) to verify the effectiveness of our method.
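The two modules described in the abstract can be illustrated with a rough NumPy sketch. This is not the authors' implementation: the abstract does not give the pacing schedule or the update rule, so the linear pacing (`start_fraction`), the exponential-moving-average update (`momentum`), and all function names here are assumptions chosen only to make the idea concrete.

```python
import numpy as np


def cosine_to_own_prototype(mapped, prototypes, labels):
    """Cosine similarity between each mapped sample and its own class prototype."""
    own = prototypes[labels]  # (n_samples, dim)
    num = np.sum(mapped * own, axis=1)
    den = np.linalg.norm(mapped, axis=1) * np.linalg.norm(own, axis=1) + 1e-12
    return num / den


def curriculum_mask(similarity, epoch, total_epochs, start_fraction=0.5):
    """Keep the best-aligned samples first; the kept fraction grows to 1.0.

    Assumption: linear pacing from start_fraction to 1.0 over training.
    """
    progress = min(1.0, epoch / max(1, total_epochs - 1))
    fraction = start_fraction + (1.0 - start_fraction) * progress
    keep = max(1, int(np.ceil(fraction * similarity.shape[0])))
    order = np.argsort(-similarity)  # most similar first
    mask = np.zeros(similarity.shape[0], dtype=bool)
    mask[order[:keep]] = True
    return mask


def update_prototypes(prototypes, mapped, labels, momentum=0.9):
    """Move each class prototype toward the mean mapping of its samples.

    Assumption: an exponential moving average; classes with no samples
    in this batch are left unchanged.
    """
    updated = np.array(prototypes, dtype=float)
    for c in np.unique(labels):
        class_mean = mapped[labels == c].mean(axis=0)
        updated[c] = momentum * updated[c] + (1.0 - momentum) * class_mean
    return updated


# Toy usage: 3 classes, 4-dimensional semantic space, 6 mapped samples.
rng = np.random.default_rng(0)
prototypes = rng.normal(size=(3, 4))
labels = np.array([0, 0, 1, 1, 2, 2])
mapped = prototypes[labels] + 0.1 * rng.normal(size=(6, 4))

similarity = cosine_to_own_prototype(mapped, prototypes, labels)
early = curriculum_mask(similarity, epoch=0, total_epochs=10)
late = curriculum_mask(similarity, epoch=9, total_epochs=10)
prototypes = update_prototypes(prototypes, mapped[early], labels[early])
```

In this sketch the early-epoch mask admits only the half of the samples that agree best with their prototypes, while the final-epoch mask admits all of them; the prototype update then uses only the currently admitted samples.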
Related papers
- Class-Aware Prototype Learning with Negative Contrast for Test-Time Adaptation of Vision-Language Models [48.61795272482598]
Vision-Language Models (VLMs) demonstrate impressive zero-shot generalization through large-scale image-text pretraining. But their performance can drop once the deployment distribution diverges from the training distribution. Test-Time Adaptation (TTA) methods update models using unlabeled target data. We propose Class-Aware Prototype Learning with Negative Contrast (CPL-NC), a lightweight TTA framework.
arXiv Detail & Related papers (2025-10-22T17:38:35Z)
- Hunting Attributes: Context Prototype-Aware Learning for Weakly Supervised Semantic Segmentation [22.591512454923883]
We argue that the knowledge bias between instances and contexts affects the capability of the prototype to sufficiently understand instance semantics.
Inspired by prototype learning theory, we propose leveraging prototype awareness to capture diverse and fine-grained feature attributes of instances.
We present a Context Prototype-Aware Learning (CPAL) strategy, which leverages semantic context to enrich instance comprehension.
arXiv Detail & Related papers (2024-03-12T13:11:58Z)
- Evolving Semantic Prototype Improves Generative Zero-Shot Learning [73.07035277030573]
In zero-shot learning (ZSL), generative methods synthesize class-related sample features based on predefined semantic prototypes.
We observe that each class's predefined semantic prototype does not accurately match its real semantic prototype.
We propose a dynamic semantic prototype evolving (DSP) method to align the empirically predefined semantic prototypes and the real prototypes for class-related feature synthesis.
arXiv Detail & Related papers (2023-06-12T08:11:06Z)
- Learning Prototype via Placeholder for Zero-shot Recognition [18.204927316433448]
We propose to learn prototypes via placeholders, termed LPL, to eliminate the domain shift between seen and unseen classes.
We exploit a novel semantic-oriented fine-tuning to guarantee the semantic reliability of placeholders.
Experiments on five benchmark datasets demonstrate the significant performance gain of LPL over the state-of-the-art methods.
arXiv Detail & Related papers (2022-07-29T09:56:44Z)
- Cross-modal Representation Learning for Zero-shot Action Recognition [67.57406812235767]
We present a cross-modal Transformer-based framework, which jointly encodes video data and text labels for zero-shot action recognition (ZSAR).
Our model employs a conceptually new pipeline by which visual representations are learned in conjunction with visual-semantic associations in an end-to-end manner.
Experimental results show that our model considerably improves upon the state of the art in ZSAR, reaching encouraging top-1 accuracy on the UCF101, HMDB51, and ActivityNet benchmark datasets.
arXiv Detail & Related papers (2022-05-03T17:39:27Z)
- Learning Semantic Ambiguities for Zero-Shot Learning [0.0]
We propose a regularization method that can be applied to any conditional generative-based ZSL method.
It learns to synthesize discriminative features for possible semantic descriptions that are not available at training time, that is, the unseen ones.
The approach is evaluated for ZSL and GZSL on four datasets commonly used in the literature.
arXiv Detail & Related papers (2022-01-05T21:08:29Z)
- Dual Prototypical Contrastive Learning for Few-shot Semantic Segmentation [55.339405417090084]
We propose a dual prototypical contrastive learning approach tailored to the few-shot semantic segmentation (FSS) task.
The main idea is to make the prototypes more discriminative by increasing inter-class distance while reducing intra-class distance in the prototype feature space.
We demonstrate that the proposed dual contrastive learning approach outperforms state-of-the-art FSS methods on the PASCAL-5i and COCO-20i datasets.
arXiv Detail & Related papers (2021-11-09T08:14:50Z)
- Contrastive Prototype Learning with Augmented Embeddings for Few-Shot Learning [58.2091760793799]
We propose a novel contrastive prototype learning with augmented embeddings (CPLAE) model.
With a class prototype as an anchor, CPL aims to pull the query samples of the same class closer and those of different classes further away.
Extensive experiments on several benchmarks demonstrate that our proposed CPLAE achieves a new state of the art.
arXiv Detail & Related papers (2021-01-23T13:22:44Z)
- Information Bottleneck Constrained Latent Bidirectional Embedding for Zero-Shot Learning [59.58381904522967]
We propose a novel embedding-based generative model with a tight visual-semantic coupling constraint.
We learn a unified latent space that calibrates the embedded parametric distributions of both visual and semantic spaces.
Our method can be easily extended to the transductive ZSL setting by generating labels for unseen images.
arXiv Detail & Related papers (2020-09-16T03:54:12Z)
- Prototypical Contrastive Learning of Unsupervised Representations [171.3046900127166]
Prototypical Contrastive Learning (PCL) is an unsupervised representation learning method.
PCL implicitly encodes semantic structures of the data into the learned embedding space.
PCL outperforms state-of-the-art instance-wise contrastive learning methods on multiple benchmarks.
arXiv Detail & Related papers (2020-05-11T09:53:36Z)
- Generative Model-driven Structure Aligning Discriminative Embeddings for Transductive Zero-shot Learning [21.181715602603436]
We propose a neural network-based model for learning a projection function which aligns the visual and semantic data in the latent space.
We show superior performance on standard benchmark datasets AWA1, AWA2, CUB, SUN, FLO, and.
We also show the efficacy of our model in a regime with extremely little labelled data.
arXiv Detail & Related papers (2020-05-09T18:48:20Z)
This list is automatically generated from the titles and abstracts of the papers in this site.