Related papers: Learning Clustering-based Prototypes for Compositional Zero-shot Learning

URL: http://arxiv.org/abs/2502.06501v1
Date: Mon, 10 Feb 2025 14:20:01 GMT
Title: Learning Clustering-based Prototypes for Compositional Zero-shot Learning
Authors: Hongyu Qu, Jianan Wei, Xiangbo Shu, Wenguan Wang
Abstract summary: ClusPro is a robust clustering-based prototype mining framework for Compositional Zero-Shot Learning. It defines the conceptual boundaries of primitives through a set of diversified prototypes. ClusPro efficiently performs prototype clustering in a non-parametric fashion without the introduction of additional learnable parameters.
Score: 56.57299428499455
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Learning primitive (i.e., attribute and object) concepts from seen compositions is the primary challenge of Compositional Zero-Shot Learning (CZSL). Existing CZSL solutions typically rely on oversimplified data assumptions, e.g., modeling each primitive with a single centroid primitive representation, ignoring the natural diversities of the attribute (resp. object) when coupled with different objects (resp. attribute). In this work, we develop ClusPro, a robust clustering-based prototype mining framework for CZSL that defines the conceptual boundaries of primitives through a set of diversified prototypes. Specifically, ClusPro conducts within-primitive clustering on the embedding space for automatically discovering and dynamically updating prototypes. These representative prototypes are subsequently used to repaint a well-structured and independent primitive embedding space, ensuring intra-primitive separation and inter-primitive decorrelation through prototype-based contrastive learning and decorrelation learning. Moreover, ClusPro efficiently performs prototype clustering in a non-parametric fashion without the introduction of additional learnable parameters or computational budget during testing. Experiments on three benchmarks demonstrate ClusPro outperforms various top-leading CZSL solutions under both closed-world and open-world settings.
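The idea of within-primitive clustering described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: it assumes plain k-means as the clustering step, a fixed number of prototypes per primitive, and cosine similarity to the best-matching prototype as the scoring rule. The function names (kmeans, mine_prototypes, classify) are hypothetical and chosen only for this example.

```python
import numpy as np


def kmeans(x, k, iters=20, seed=0):
    """Plain Lloyd's k-means (an assumed stand-in); returns (k, d) centroids."""
    rng = np.random.default_rng(seed)
    # Initialise centroids from k distinct data points (indexing makes a copy).
    centroids = x[rng.choice(len(x), size=k, replace=False)]
    for _ in range(iters):
        # Squared Euclidean distance from every point to every centroid: (n, k).
        d2 = ((x[:, None, :] - centroids[None, :, :]) ** 2).sum(-1)
        assign = d2.argmin(axis=1)
        for j in range(k):
            members = x[assign == j]
            if len(members) > 0:  # keep the old centroid if a cluster empties
                centroids[j] = members.mean(axis=0)
    return centroids


def mine_prototypes(features, labels, k):
    """Cluster the features of each primitive separately into k prototypes."""
    return {int(c): kmeans(features[labels == c], k) for c in np.unique(labels)}


def classify(feature, prototypes):
    """Pick the primitive whose best-matching prototype is most similar."""
    f = feature / np.linalg.norm(feature)
    scores = {}
    for c, protos in prototypes.items():
        p = protos / np.linalg.norm(protos, axis=1, keepdims=True)
        scores[c] = float((p @ f).max())  # max cosine similarity over prototypes
    return max(scores, key=scores.get)
```

Because the prototypes are computed from the features alone, this step adds no learnable parameters, which matches the non-parametric property claimed in the abstract. The contrastive and decorrelation losses mentioned there are not covered by this sketch.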

Related papers

Probabilistic Prototype Calibration of Vision-Language Models for Generalized Few-shot Semantic Segmentation [75.18058114915327]
Generalized Few-Shot Semantic Segmentation (GFSS) aims to extend a segmentation model to novel classes with only a few annotated examples. We propose FewCLIP, a probabilistic prototype calibration framework over multi-modal prototypes from the pretrained CLIP. We show FewCLIP significantly outperforms state-of-the-art approaches across both GFSS and class-incremental settings.
arXiv Detail & Related papers (2025-06-28T18:36:22Z)
EVA: Mixture-of-Experts Semantic Variant Alignment for Compositional Zero-Shot Learning [31.95599022275838]
We propose EVA, a Mixture-of-Experts Semantic Variant Alignment framework for Compositional Zero-Shot Learning (CZSL). Specifically, we introduce domain-expert adaption, leveraging multiple experts to achieve token-aware learning and model high-quality primitive representations. Our method significantly outperforms other state-of-the-art CZSL methods on three popular benchmarks in both closed- and open-world settings.
arXiv Detail & Related papers (2025-06-26T04:00:55Z)
Few-Shot Inspired Generative Zero-Shot Learning [14.66239393852298]
Generative zero-shot learning (ZSL) methods typically synthesize visual features for unseen classes. We propose FSIGenZ, a few-shot-inspired generative ZSL framework that reduces reliance on large-scale feature synthesis. Experiments on SUN, AwA2, and CUB benchmarks demonstrate that FSIGenZ achieves competitive performance using far fewer synthetic features.
arXiv Detail & Related papers (2025-06-18T02:39:36Z)
Dual-Modal Prototype Joint Learning for Compositional Zero-Shot Learning [15.183106475115583]
Compositional Zero-Shot Learning (CZSL) aims to recognize novel compositions of attributes and objects by leveraging knowledge learned from seen compositions. We propose a novel Dual-Modal Prototype Joint Learning framework for the CZSL task.
arXiv Detail & Related papers (2025-01-23T17:30:27Z)
FedSA: A Unified Representation Learning via Semantic Anchors for Prototype-based Federated Learning [4.244188591221394]
We propose a novel framework named Federated Learning via Semantic Anchors (FedSA) to decouple the generation of prototypes from local representation learning. FedSA significantly outperforms existing prototype-based FL methods on various classification tasks.
arXiv Detail & Related papers (2025-01-09T16:10:03Z)
Cross-composition Feature Disentanglement for Compositional Zero-shot Learning [49.919635694894204]
Disentanglement of visual features of primitives (i.e., attributes and objects) has shown exceptional results in Compositional Zero-shot Learning (CZSL). We propose the solution of cross-composition feature disentanglement, which takes multiple primitive-sharing compositions as inputs and constrains the disentangled primitive features to be general across these compositions.
arXiv Detail & Related papers (2024-08-19T08:23:09Z)
Rethinking Few-shot 3D Point Cloud Semantic Segmentation [62.80639841429669]
This paper revisits few-shot 3D point cloud semantic segmentation (FS-PCS). We focus on two significant issues in the state-of-the-art: foreground leakage and sparse point distribution. To address these issues, we introduce a standardized FS-PCS setting, upon which a new benchmark is built.
arXiv Detail & Related papers (2024-03-01T15:14:47Z)
ProCC: Progressive Cross-primitive Compatibility for Open-World Compositional Zero-Shot Learning [29.591615811894265]
Open-World Compositional Zero-shot Learning (OW-CZSL) aims to recognize novel compositions of state and object primitives in images with no priors on the compositional space. We propose a novel method, termed Progressive Cross-primitive Compatibility (ProCC), to mimic the human learning process for OW-CZSL tasks.
arXiv Detail & Related papers (2022-11-19T10:09:46Z)
Exploring Non-Contrastive Representation Learning for Deep Clustering [23.546602131801205]
Non-contrastive representation learning for deep clustering, termed NCC, is based on BYOL, a representative method without negative examples. NCC forms an embedding space where all clusters are well-separated and within-cluster examples are compact. Experimental results on several clustering benchmark datasets including ImageNet-1K demonstrate that NCC outperforms the state-of-the-art methods by a significant margin.
arXiv Detail & Related papers (2021-11-23T12:21:53Z)
Dual Prototypical Contrastive Learning for Few-shot Semantic Segmentation [55.339405417090084]
We propose a dual prototypical contrastive learning approach tailored to the few-shot semantic segmentation (FSS) task. The main idea is to make the prototypes more discriminative by increasing inter-class distance while reducing intra-class distance in the prototype feature space. We demonstrate that the proposed dual contrastive learning approach outperforms state-of-the-art FSS methods on PASCAL-5i and COCO-20i datasets.
arXiv Detail & Related papers (2021-11-09T08:14:50Z)
Structure-Aware Feature Generation for Zero-Shot Learning [108.76968151682621]
We introduce a novel structure-aware feature generation scheme, termed SA-GAN, to account for the topological structure in learning both the latent space and the generative networks. Our method significantly enhances the generalization capability on unseen classes and consequently improves the classification performance.
arXiv Detail & Related papers (2021-08-16T11:52:08Z)
Zero-Shot Learning from Adversarial Feature Residual to Compact Visual Feature [26.89763840782029]
We propose a novel adversarial network to synthesize compact semantic visual features for zero-shot learning (ZSL). The residual generator is to generate the visual feature residual, which is integrated with a visual prototype predicted via the prototype predictor. The discriminator is to distinguish the synthetic visual features from the real ones extracted from an existing categorization CNN.
arXiv Detail & Related papers (2020-08-29T11:16:11Z)
Prototypical Contrastive Learning of Unsupervised Representations [171.3046900127166]
Prototypical Contrastive Learning (PCL) is an unsupervised representation learning method. PCL implicitly encodes semantic structures of the data into the learned embedding space. PCL outperforms state-of-the-art instance-wise contrastive learning methods on multiple benchmarks.
arXiv Detail & Related papers (2020-05-11T09:53:36Z)

This list is automatically generated from the titles and abstracts of the papers in this site.