Generalized Fine-Grained Category Discovery with Multi-Granularity Conceptual Experts
- URL: http://arxiv.org/abs/2509.26227v1
- Date: Tue, 30 Sep 2025 13:25:11 GMT
- Title: Generalized Fine-Grained Category Discovery with Multi-Granularity Conceptual Experts
- Authors: Haiyang Zheng, Nan Pu, Wenjing Li, Nicu Sebe, Zhun Zhong
- Abstract summary: Generalized Category Discovery is an open-world problem that clusters unlabeled data by leveraging knowledge from partially labeled categories. Existing approaches fail to exploit multi-granularity conceptual information in visual data. We propose a Multi-Granularity Conceptual Experts framework that integrates multi-granularity knowledge for accurate category discovery.
- Score: 81.68203255687051
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Generalized Category Discovery (GCD) is an open-world problem that clusters unlabeled data by leveraging knowledge from partially labeled categories. A key challenge is that unlabeled data may contain both known and novel categories. Existing approaches suffer from two main limitations. First, they fail to exploit multi-granularity conceptual information in visual data, which limits representation quality. Second, most assume that the number of unlabeled categories is known during training, which is impractical in real-world scenarios. To address these issues, we propose a Multi-Granularity Conceptual Experts (MGCE) framework that adaptively mines visual concepts and integrates multi-granularity knowledge for accurate category discovery. MGCE consists of two modules: (1) Dynamic Conceptual Contrastive Learning (DCCL), which alternates between concept mining and dual-level representation learning to jointly optimize feature learning and category discovery; and (2) Multi-Granularity Experts Collaborative Learning (MECL), which extends the single-expert paradigm by introducing additional experts at different granularities and by employing a concept alignment matrix for effective cross-expert collaboration. Importantly, MGCE can automatically estimate the number of categories in unlabeled data, making it suitable for practical open-world settings. Extensive experiments on nine fine-grained visual recognition benchmarks demonstrate that MGCE achieves state-of-the-art results, particularly in novel-class accuracy. Notably, even without prior knowledge of category numbers, MGCE outperforms parametric approaches that require knowing the exact number of categories, with an average improvement of 3.6%. Code is available at https://github.com/HaiyangZheng/MGCE.
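The abstract describes the MECL concept alignment matrix only at a high level. As a hedged illustration of the general idea (not the authors' implementation; function names, shapes, and the cross-expert loss are assumptions), the alignment between two experts' soft concept assignments could be sketched in NumPy as:

```python
import numpy as np

def concept_alignment_matrix(p_coarse, p_fine):
    """Estimate a mapping between the concepts of two experts.

    p_coarse: (N, Kc) soft concept assignments from a coarse-granularity expert.
    p_fine:   (N, Kf) soft concept assignments from a fine-granularity expert.
    Returns a row-stochastic (Kc, Kf) matrix whose entry (i, j) estimates the
    probability that a sample in coarse concept i falls into fine concept j.
    """
    joint = p_coarse.T @ p_fine  # (Kc, Kf) soft co-occurrence counts
    return joint / np.clip(joint.sum(axis=1, keepdims=True), 1e-12, None)

def cross_expert_loss(p_coarse, p_fine):
    """Cross-entropy between the fine expert's assignments and the coarse
    expert's assignments mapped into the fine concept space (an assumed
    collaboration objective, not the paper's exact loss)."""
    A = concept_alignment_matrix(p_coarse, p_fine)  # (Kc, Kf)
    p_mapped = p_coarse @ A                          # (N, Kf)
    return -np.mean(np.sum(p_fine * np.log(p_mapped + 1e-12), axis=1))
```

In the paper's setting such a matrix would let experts at different granularities supervise one another; the exact normalization and loss are not specified in the abstract.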
Related papers
- MacNet: An End-to-End Manifold-Constrained Adaptive Clustering Network for Interpretable Whole Slide Image Classification [9.952997875404634]
Clustering-based approaches can provide an explainable decision-making process but suffer from high-dimensional features and semantically ambiguous centroids. We propose an end-to-end MIL framework that integrates Grassmann re-embedding and manifold-adaptive clustering. Experiments on multicentre WSI datasets demonstrate that: 1) our cluster-incorporated model achieves superior performance in both grading accuracy and interpretability; 2) end-to-end learning yields better feature representations at an acceptable resource cost.
arXiv Detail & Related papers (2026-02-16T06:43:36Z)
- Generalized Category Discovery via Token Manifold Capacity Learning [11.529179734339365]
Generalized category discovery (GCD) is essential for improving deep learning models' robustness in open-world scenarios. Traditional GCD methods focus on minimizing intra-cluster variations, often sacrificing manifold capacity. We propose a novel approach that prioritizes the manifold capacity of class tokens to preserve the diversity and complexity of data.
arXiv Detail & Related papers (2025-05-20T07:40:31Z)
- Prototypical Hash Encoding for On-the-Fly Fine-Grained Category Discovery [65.16724941038052]
Category-aware Prototype Generation (CPG) and Discriminative Category Encoding (DCE) are proposed. CPG enables the model to fully capture the intra-category diversity by representing each category with multiple prototypes. DCE boosts the discrimination ability of hash codes under the guidance of the generated category prototypes.
arXiv Detail & Related papers (2024-10-24T23:51:40Z)
- Contextuality Helps Representation Learning for Generalized Category Discovery [5.885208652383516]
This paper introduces a novel approach to Generalized Category Discovery (GCD) by leveraging the concept of contextuality.
Our model integrates two levels of contextuality: instance-level, where nearest-neighbor contexts are used for contrastive learning, and cluster-level, where contrastive learning operates over cluster structure.
Integrating this contextual information improves feature learning and thereby the classification accuracy of all categories.
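As a rough, hedged illustration of the instance-level idea (nearest-neighbor positives in an InfoNCE-style loss; a generic sketch under assumed names, not the paper's implementation):

```python
import numpy as np

def nn_contrastive_loss(features, temperature=0.07):
    """Use each sample's nearest neighbor (excluding itself) as the positive
    in an InfoNCE-style loss; all other samples act as negatives."""
    f = features / np.linalg.norm(features, axis=1, keepdims=True)
    sim = f @ f.T / temperature
    np.fill_diagonal(sim, -np.inf)          # exclude self-similarity
    positives = sim.argmax(axis=1)          # index of each sample's nearest neighbor
    # log-softmax over each row; exp(-inf) contributes 0 to the denominator
    log_prob = sim - np.log(np.exp(sim).sum(axis=1, keepdims=True))
    return -log_prob[np.arange(len(f)), positives].mean()
```

The cluster-level counterpart would replace neighbor indices with cluster assignments; the abstract does not specify either loss exactly.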
arXiv Detail & Related papers (2024-07-29T07:30:41Z)
- RAR: Retrieving And Ranking Augmented MLLMs for Visual Recognition [78.97487780589574]
Multimodal Large Language Models (MLLMs) excel at classifying fine-grained categories.
This paper introduces a Retrieving And Ranking augmented method for MLLMs.
Our proposed approach not only addresses the inherent limitations in fine-grained recognition but also preserves the model's comprehensive knowledge base.
arXiv Detail & Related papers (2024-03-20T17:59:55Z)
- Textual Knowledge Matters: Cross-Modality Co-Teaching for Generalized Visual Class Discovery [65.16724941038052]
Generalized Category Discovery (GCD) aims to cluster unlabeled data from both known and unknown categories. Current GCD methods rely only on visual cues, neglecting the multi-modality perceptive nature of human cognitive processes in discovering novel visual categories. We propose a two-phase TextGCD framework that accomplishes multi-modality GCD by exploiting powerful Visual-Language Models.
arXiv Detail & Related papers (2024-03-12T07:06:50Z)
- Dynamic Conceptional Contrastive Learning for Generalized Category Discovery [76.82327473338734]
Generalized category discovery (GCD) aims to automatically cluster partially labeled data.
Unlabeled data contain instances that are not only from known categories of the labeled data but also from novel categories.
One effective approach for GCD is applying self-supervised learning to learn discriminative representations for unlabeled data.
We propose a Dynamic Conceptional Contrastive Learning framework, which can effectively improve clustering accuracy.
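The alternation between concept mining and representation learning is only named here. A minimal sketch of the general idea, assuming k-means for concept mining and a center-based contrastive objective (scikit-learn and NumPy; all names are assumptions, not the authors' code):

```python
import numpy as np
from sklearn.cluster import KMeans

def mine_concepts(features, n_concepts):
    """Concept mining step: cluster current features into pseudo-concepts."""
    km = KMeans(n_clusters=n_concepts, n_init=10, random_state=0).fit(features)
    return km.labels_, km.cluster_centers_

def concept_contrastive_loss(features, labels, centers, temperature=0.1):
    """Concept-level contrastive objective: pull each feature toward its
    assigned concept center and away from the other centers."""
    f = features / np.linalg.norm(features, axis=1, keepdims=True)
    c = centers / np.linalg.norm(centers, axis=1, keepdims=True)
    logits = f @ c.T / temperature   # (N, K) cosine similarities to centers
    log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return -log_prob[np.arange(len(labels)), labels].mean()
```

In a full training loop the two steps would alternate: re-mine concepts from the updated features, then minimize the loss to refine the features.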
arXiv Detail & Related papers (2023-03-30T14:04:39Z)
- Parametric Information Maximization for Generalized Category Discovery [20.373038652827788]
We introduce a Parametric Information Maximization (PIM) model for the Generalized Category Discovery (GCD) problem.
We show that our PIM model consistently sets new state-of-the-art performances in GCD across six different datasets.
arXiv Detail & Related papers (2022-12-01T07:41:48Z)
- Semantic Representation and Dependency Learning for Multi-Label Image Recognition [76.52120002993728]
We propose a novel and effective semantic representation and dependency learning (SRDL) framework to learn category-specific semantic representation for each category.
Specifically, we design a category-specific attentional regions (CAR) module that generates channel- and spatial-wise attention matrices to guide the model.
We also design an object erasing (OE) module to implicitly learn semantic dependency among categories by erasing semantic-aware regions.
arXiv Detail & Related papers (2022-04-08T00:55:15Z)
- Novel Visual Category Discovery with Dual Ranking Statistics and Mutual Knowledge Distillation [16.357091285395285]
We tackle the problem of grouping unlabelled images from new classes into different semantic partitions.
This is a more realistic and challenging setting than conventional semi-supervised learning.
We propose a two-branch learning framework for this problem, with one branch focusing on local part-level information and the other branch focusing on overall characteristics.
arXiv Detail & Related papers (2021-07-07T17:14:40Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.