ImbaGCD: Imbalanced Generalized Category Discovery
- URL: http://arxiv.org/abs/2401.05353v1
- Date: Mon, 4 Dec 2023 09:46:09 GMT
- Title: ImbaGCD: Imbalanced Generalized Category Discovery
- Authors: Ziyun Li, Ben Dai, Furkan Simsek, Christoph Meinel, Haojin Yang
- Abstract summary: Generalized class discovery (GCD) aims to infer known and unknown categories in an unlabeled dataset.
In nature, we are more likely to encounter known/common classes than unknown/uncommon ones.
We present ImbaGCD, a novel optimal transport-based expectation maximization framework that accomplishes generalized category discovery.
- Score: 8.905027373521213
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Generalized class discovery (GCD) aims to infer known and unknown categories
in an unlabeled dataset leveraging prior knowledge of a labeled set comprising
known classes. Existing research implicitly/explicitly assumes that the
frequency of occurrence for each category, whether known or unknown, is
approximately the same in the unlabeled data. However, in nature, we are more
likely to encounter known/common classes than unknown/uncommon ones, according
to the long-tailed property of visual classes. Therefore, we present a
challenging and practical problem, Imbalanced Generalized Category Discovery
(ImbaGCD), where the distribution of unlabeled data is imbalanced, with known
classes being more frequent than unknown ones. To address these issues, we
propose ImbaGCD, a novel optimal transport-based expectation maximization
framework that accomplishes generalized category discovery by aligning the
marginal class prior distribution. ImbaGCD also incorporates a systematic
mechanism for estimating the imbalanced class prior distribution under the GCD
setup. Our comprehensive experiments reveal that ImbaGCD surpasses previous
state-of-the-art GCD methods by achieving an improvement of approximately 2 -
4% on CIFAR-100 and 15 - 19% on ImageNet-100, indicating its superior
effectiveness in solving the Imbalanced GCD problem.
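To make the abstract's central idea more concrete, below is a minimal, hypothetical sketch (not the authors' released code; the names sinkhorn_assign, class_prior, and logits are assumptions) of how an optimal transport step can align soft pseudo-label assignments with an imbalanced class prior, which is the general flavor of the Sinkhorn-style alignment one would place inside an expectation maximization loop.

```python
# Illustrative sketch only: rebalance soft assignments so that their column
# marginals match an (imbalanced) class prior via Sinkhorn-Knopp iterations.
import numpy as np

def sinkhorn_assign(logits, class_prior, n_iters=50, eps=0.05):
    """Return row-normalized pseudo-label distributions whose column
    marginals approximately follow `class_prior`.

    logits: (n_samples, n_classes) unnormalized model scores.
    class_prior: (n_classes,) estimated class frequencies, summing to 1.
    """
    # Turn scores into a positive kernel (temperature eps) and normalize.
    Q = np.exp(logits / eps)
    Q /= Q.sum()

    n_samples = Q.shape[0]
    row_marginal = np.full(n_samples, 1.0 / n_samples)   # equal mass per sample
    col_marginal = np.asarray(class_prior, dtype=float)  # imbalanced class mass

    for _ in range(n_iters):
        # Alternately scale rows to the sample marginal and columns to the prior.
        Q *= (row_marginal / Q.sum(axis=1))[:, None]
        Q *= (col_marginal / Q.sum(axis=0))[None, :]

    # Each row becomes a pseudo-label distribution over classes.
    return Q / Q.sum(axis=1, keepdims=True)

# Example: 6 unlabeled samples, 3 classes, known classes assumed more frequent.
rng = np.random.default_rng(0)
logits = rng.normal(size=(6, 3))
prior = np.array([0.6, 0.3, 0.1])  # hypothetical imbalanced class prior
pseudo_labels = sinkhorn_assign(logits, prior)
print(pseudo_labels.round(3))
```

In a full EM-style pipeline of the kind the abstract describes, such rebalanced assignments would act as pseudo-labels for the maximization step, and the class prior itself would be re-estimated between rounds rather than fixed as in this sketch.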
Related papers
- Happy: A Debiased Learning Framework for Continual Generalized Category Discovery [54.54153155039062]
This paper explores the underexplored task of Continual Generalized Category Discovery (C-GCD)
C-GCD aims to incrementally discover new classes from unlabeled data while maintaining the ability to recognize previously learned classes.
We introduce a debiased learning framework, namely Happy, characterized by Hardness-aware prototype sampling and soft entropy regularization.
arXiv Detail & Related papers (2024-10-09T04:18:51Z)
- Active Generalized Category Discovery [60.69060965936214]
Generalized Category Discovery (GCD) endeavors to cluster unlabeled samples from both novel and old classes.
We take the spirit of active learning and propose a new setting called Active Generalized Category Discovery (AGCD)
Our method achieves state-of-the-art performance on both generic and fine-grained datasets.
arXiv Detail & Related papers (2024-03-07T07:12:24Z)
- Generalized Categories Discovery for Long-tailed Recognition [8.69033435074757]
Generalized Class Discovery plays a pivotal role in discerning both known and unknown categories from unlabeled datasets.
Our research endeavors to bridge this disconnect by focusing on the long-tailed Generalized Category Discovery (Long-tailed GCD) paradigm.
In response to the unique challenges posed by Long-tailed GCD, we present a robust methodology anchored in two strategic regularizations.
arXiv Detail & Related papers (2023-12-04T09:21:30Z)
- Towards Distribution-Agnostic Generalized Category Discovery [51.52673017664908]
Data imbalance and open-ended distribution are intrinsic characteristics of the real visual world.
We propose a Self-Balanced Co-Advice contrastive framework (BaCon)
BaCon consists of a contrastive-learning branch and a pseudo-labeling branch, working collaboratively to provide interactive supervision to resolve the DA-GCD task.
arXiv Detail & Related papers (2023-10-02T17:39:58Z)
- OpenGCD: Assisting Open World Recognition with Generalized Category Discovery [4.600906853436266]
A desirable open world recognition (OWR) system requires performing three tasks.
We propose OpenGCD that combines three key ideas to solve the above problems sequentially.
Experiments on two standard classification benchmarks and a challenging dataset demonstrate that OpenGCD not only offers excellent compatibility but also substantially outperforms other baselines.
arXiv Detail & Related papers (2023-08-14T04:10:45Z)
- Dynamic Conceptional Contrastive Learning for Generalized Category Discovery [76.82327473338734]
Generalized category discovery (GCD) aims to automatically cluster partially labeled data.
Unlabeled data contain instances that are not only from known categories of the labeled data but also from novel categories.
One effective approach to GCD is applying self-supervised learning to learn discriminative representations for the unlabeled data.
We propose a Dynamic Conceptional Contrastive Learning framework, which can effectively improve clustering accuracy.
arXiv Detail & Related papers (2023-03-30T14:04:39Z)
- Parametric Classification for Generalized Category Discovery: A Baseline Study [70.73212959385387]
Generalized Category Discovery (GCD) aims to discover novel categories in unlabelled datasets using knowledge learned from labelled samples.
We investigate the failure of parametric classifiers, verify the effectiveness of previous design choices when high-quality supervision is available, and identify unreliable pseudo-labels as a key problem.
We propose a simple yet effective parametric classification method that benefits from entropy regularisation, achieves state-of-the-art performance on multiple GCD benchmarks and shows strong robustness to unknown class numbers.
arXiv Detail & Related papers (2022-11-21T18:47:11Z)
- Classify and Generate Reciprocally: Simultaneous Positive-Unlabelled Learning and Conditional Generation with Extra Data [77.31213472792088]
The scarcity of class-labeled data is a ubiquitous bottleneck in many machine learning problems.
We address this problem by leveraging Positive-Unlabeled (PU) classification and conditional generation with extra unlabeled data.
We present a novel training framework to jointly target both PU classification and conditional generation when exposed to extra data.
arXiv Detail & Related papers (2020-06-14T08:27:40Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.