Related papers: Generalized Category Discovery via Token Manifold Capacity Learning

URL: http://arxiv.org/abs/2505.14044v1
Date: Tue, 20 May 2025 07:40:31 GMT
Title: Generalized Category Discovery via Token Manifold Capacity Learning
Authors: Luyao Tang, Kunze Huang, Chaoqi Chen, Cheng Chen
Abstract summary: Generalized category discovery (GCD) is essential for improving deep learning models' robustness in open-world scenarios. Traditional GCD methods focus on minimizing intra-cluster variations, often sacrificing manifold capacity. We propose a novel approach that prioritizes the manifold capacity of class tokens to preserve the diversity and complexity of data.
Score: 11.529179734339365
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Generalized category discovery (GCD) is essential for improving deep learning models' robustness in open-world scenarios by clustering unlabeled data containing both known and novel categories. Traditional GCD methods focus on minimizing intra-cluster variations, often sacrificing manifold capacity, which limits the richness of intra-class representations. In this paper, we propose a novel approach, Maximum Token Manifold Capacity (MTMC), that prioritizes maximizing the manifold capacity of class tokens to preserve the diversity and complexity of data. MTMC leverages the nuclear norm of singular values as a measure of manifold capacity, ensuring that the representation of samples remains informative and well-structured. This method enhances the discriminability of clusters, allowing the model to capture detailed semantic features and avoid the loss of critical information during clustering. Through theoretical analysis and extensive experiments on coarse- and fine-grained datasets, we demonstrate that MTMC outperforms existing GCD methods, improving both clustering accuracy and the estimation of category numbers. The integration of MTMC leads to more complete representations, better inter-class separability, and a reduction in dimensional collapse, establishing MTMC as a vital component for robust open-world learning. Code is available at github.com/lytang63/MTMC.
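The abstract's use of the nuclear norm as a capacity measure can be illustrated with a small NumPy sketch. This is not the authors' implementation: the function names, the L2 normalization step, and the synthetic "class tokens" are all assumptions made for illustration. It only shows that a batch of tokens spread over many directions has a larger nuclear norm (sum of singular values) than a batch collapsed onto one direction.

```python
import numpy as np


def l2_normalize(x: np.ndarray, eps: float = 1e-12) -> np.ndarray:
    """Scale each row to (approximately) unit length."""
    return x / (np.linalg.norm(x, axis=1, keepdims=True) + eps)


def nuclear_norm(tokens: np.ndarray) -> float:
    """Sum of singular values of an (n_samples, dim) token matrix."""
    singular_values = np.linalg.svd(tokens, compute_uv=False)
    return float(singular_values.sum())


# Hypothetical stand-ins for a batch of class tokens (not real model outputs).
rng = np.random.default_rng(0)
n, d = 64, 16

# Tokens that spread over many directions of the feature space.
diverse = l2_normalize(rng.normal(size=(n, d)))

# Tokens that nearly collapse onto a single direction.
direction = rng.normal(size=(1, d))
collapsed = l2_normalize(direction + 0.01 * rng.normal(size=(n, d)))

nn_diverse = nuclear_norm(diverse)
nn_collapsed = nuclear_norm(collapsed)

print(f"nuclear norm, diverse tokens:   {nn_diverse:.3f}")
print(f"nuclear norm, collapsed tokens: {nn_collapsed:.3f}")
```

For unit-norm rows the nuclear norm lies between sqrt(n) (all tokens on one direction) and sqrt(n * min(n, d)) (equal singular values), so a training objective that rewards a larger value pushes against dimensional collapse. How the paper weights or combines such a term with other losses is not stated in the abstract.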

Related papers

Adversarial Fair Multi-View Clustering [7.650076926241037]
We propose an adversarial fair multi-view clustering (AFMVC) framework that integrates fairness learning into the representation learning process. Our framework achieves superior fairness and competitive clustering performance compared to existing multi-view clustering and fairness-aware clustering methods.
arXiv Detail & Related papers (2025-08-06T04:07:08Z)
Unbiased Max-Min Embedding Classification for Transductive Few-Shot Learning: Clustering and Classification Are All You Need [83.10178754323955]
Few-shot learning enables models to generalize from only a few labeled examples. We propose the Unbiased Max-Min Embedding Classification (UMMEC) Method, which addresses the key challenges in few-shot learning. Our method significantly improves classification performance with minimal labeled data, advancing the state of the art.
arXiv Detail & Related papers (2025-03-28T07:23:07Z)
Revisiting Self-Supervised Heterogeneous Graph Learning from Spectral Clustering Perspective [52.662463893268225]
Self-supervised heterogeneous graph learning (SHGL) has shown promising potential in diverse scenarios. Existing SHGL methods encounter two significant limitations. We introduce a novel framework enhanced by rank and dual consistency constraints.
arXiv Detail & Related papers (2024-12-01T09:33:20Z)
Self-Supervised Graph Embedding Clustering [70.36328717683297]
The K-means one-step dimensionality reduction clustering method has made some progress in addressing the curse of dimensionality in clustering tasks. We propose a unified framework that integrates manifold learning with K-means, resulting in the self-supervised graph embedding framework.
arXiv Detail & Related papers (2024-09-24T08:59:51Z)
Cross-Modality Clustering-based Self-Labeling for Multimodal Data Classification [2.666791490663749]
Cross-Modality Clustering-based Self-Labeling (CMCSL) groups instances belonging to each modality in the deep feature space and then propagates known labels within the resulting clusters. Experimental evaluation was conducted on 20 datasets derived from the MM-IMDb dataset.
arXiv Detail & Related papers (2024-08-05T15:43:56Z)
CDIMC-net: Cognitive Deep Incomplete Multi-view Clustering Network [53.72046586512026]
We propose a novel incomplete multi-view clustering network, called Cognitive Deep Incomplete Multi-view Clustering Network (CDIMC-net). It captures the high-level features and local structure of each view by incorporating the view-specific deep encoders and graph embedding strategy into a framework. Based on human cognition, i.e., learning from easy to hard, it introduces a self-paced strategy to select the most confident samples for model training.
arXiv Detail & Related papers (2024-03-28T15:45:03Z)
Memetic Differential Evolution Methods for Semi-Supervised Clustering [0.8681835475119588]
We propose an extension of MDEClust for semi-supervised Minimum Sum-of-Squares Clustering (MSSC) problems. Our new framework, called S-MDEClust, represents the first memetic methodology designed to generate an optimal feasible solution.
arXiv Detail & Related papers (2024-03-07T08:37:36Z)
Sampling-enabled scalable manifold learning unveils discriminative cluster structure of high-dimensional data [8.507955301076633]
We propose a sampling-based Scalable manifold learning technique that enables Uniform and Discriminative Embedding, namely SUDE, for large-scale and high-dimensional data. We empirically validated the effectiveness of SUDE on synthetic datasets and real-world benchmarks, and applied it to analyze single-cell data and detect anomalies in electrocardiogram (ECG) signals.
arXiv Detail & Related papers (2024-01-02T08:43:06Z)
Multi-View Clustering via Semi-non-negative Tensor Factorization [120.87318230985653]
We develop a novel multi-view clustering method based on semi-non-negative tensor factorization (Semi-NTF). Our model directly considers the between-view relationship and exploits the between-view complementary information. In addition, we provide an optimization algorithm for the proposed method and prove mathematically that the algorithm always converges to the stationary KKT point.
arXiv Detail & Related papers (2023-03-29T14:54:19Z)
Deep Attention-guided Graph Clustering with Dual Self-supervision [49.040136530379094]
We propose a novel method, namely deep attention-guided graph clustering with dual self-supervision (DAGC). We develop a dual self-supervision solution consisting of a soft self-supervision strategy with a triplet Kullback-Leibler divergence loss and a hard self-supervision strategy with a pseudo supervision loss. Our method consistently outperforms state-of-the-art methods on six benchmark datasets.
arXiv Detail & Related papers (2021-11-10T06:53:03Z)
Deep Conditional Gaussian Mixture Model for Constrained Clustering [7.070883800886882]
Constrained clustering can leverage prior information on a growing amount of only partially labeled data. We propose a novel framework for constrained clustering that is intuitive, interpretable, and can be trained efficiently in the framework of gradient variational inference.
arXiv Detail & Related papers (2021-06-11T13:38:09Z)
Joint Optimization of an Autoencoder for Clustering and Embedding [22.16059261437617]
We present an alternative where the autoencoder and the clustering are learned simultaneously. That simple neural network, referred to as the clustering module, can be integrated into a deep autoencoder resulting in a deep clustering model.
arXiv Detail & Related papers (2020-12-07T14:38:10Z)
TCGM: An Information-Theoretic Framework for Semi-Supervised Multi-Modality Learning [35.76792527025377]
We propose a novel information-theoretic approach, namely Total Correlation Gain Maximization (TCGM), for semi-supervised multi-modal learning. We apply our method to various tasks and achieve state-of-the-art results, including news classification, emotion recognition and disease prediction.
arXiv Detail & Related papers (2020-07-14T03:32:03Z)

This list is automatically generated from the titles and abstracts of the papers on this site.