A Closer Look at Novel Class Discovery from the Labeled Set
- URL: http://arxiv.org/abs/2209.09120v2
- Date: Wed, 21 Sep 2022 10:01:53 GMT
- Title: A Closer Look at Novel Class Discovery from the Labeled Set
- Authors: Ziyun Li, Jona Otholt, Ben Dai, Di Hu, Christoph Meinel, Haojin Yang
- Abstract summary: Novel class discovery (NCD) aims to infer novel categories in an unlabeled dataset leveraging prior knowledge of a labeled set comprising disjoint but related classes.
We propose and substantiate the hypothesis that NCD could benefit more from a labeled set with a large degree of semantic similarity to the unlabeled set.
Specifically, we establish an extensive and large-scale benchmark with varying degrees of semantic similarity between labeled/unlabeled datasets on ImageNet.
- Score: 13.31397670697559
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Novel class discovery (NCD) aims to infer novel categories in an unlabeled
dataset leveraging prior knowledge of a labeled set comprising disjoint but
related classes. Existing research focuses primarily on utilizing the labeled
set at the methodological level, with less emphasis on the analysis of the
labeled set itself. Thus, in this paper, we rethink novel class discovery from
the labeled set and focus on two core questions: (i) Given a specific unlabeled
set, what kind of labeled set can best support novel class discovery? (ii) A
fundamental premise of NCD is that the labeled set must be related to the
unlabeled set, but how can we measure this relation? For (i), we propose and
substantiate the hypothesis that NCD could benefit more from a labeled set with
a large degree of semantic similarity to the unlabeled set. Specifically, we
establish an extensive and large-scale benchmark with varying degrees of
semantic similarity between labeled/unlabeled datasets on ImageNet by
leveraging its hierarchical class structure. In sharp contrast, existing
NCD benchmarks are built on labeled sets that differ in the number of
categories and images, and completely ignore the semantic relation. For (ii),
we introduce a mathematical definition for quantifying the semantic similarity
between labeled and unlabeled sets. In addition, we use this metric to confirm
the validity of our proposed benchmark and demonstrate that it highly
correlates with NCD performance. Furthermore, without quantitative analysis,
previous works commonly believe that label information is always beneficial.
However, counterintuitively, our experimental results show that using labels
may lead to sub-optimal outcomes in low-similarity settings.
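
The abstract does not state the paper's similarity definition, so the following is only a minimal sketch of the general idea of scoring how close a labeled class set is to an unlabeled one using a class hierarchy. It swaps in the well-known Wu-Palmer tree similarity as a stand-in, not the metric proposed in the paper. The toy hierarchy and the names `PARENT`, `wu_palmer`, and `set_similarity` are hypothetical and chosen for illustration.

```python
# Toy class hierarchy (child -> parent). Hypothetical; not the ImageNet tree.
PARENT = {
    "animal": "entity",
    "vehicle": "entity",
    "dog": "animal",
    "cat": "animal",
    "husky": "dog",
    "beagle": "dog",
    "tabby": "cat",
    "car": "vehicle",
    "truck": "vehicle",
}


def path_to_root(node):
    """Return the list of nodes from `node` up to the root, inclusive."""
    path = [node]
    while node in PARENT:
        node = PARENT[node]
        path.append(node)
    return path


def depth(node):
    """Depth of a node, counting the root as depth 1."""
    return len(path_to_root(node))


def wu_palmer(a, b):
    """Wu-Palmer similarity: 2 * depth(lca) / (depth(a) + depth(b))."""
    ancestors_a = set(path_to_root(a))
    # Walking upward from b, the first node shared with a is the
    # lowest common ancestor.
    lca = next(n for n in path_to_root(b) if n in ancestors_a)
    return 2 * depth(lca) / (depth(a) + depth(b))


def set_similarity(labeled, unlabeled):
    """Assumed aggregate: for each unlabeled class, take its closest
    labeled class, then average over the unlabeled classes."""
    best = [max(wu_palmer(u, l) for l in labeled) for u in unlabeled]
    return sum(best) / len(best)


high = set_similarity(["husky"], ["beagle"])  # siblings under "dog"
low = set_similarity(["car"], ["beagle"])     # share only the root
```

In this toy setup, a labeled set drawn from the same subtree as the unlabeled classes scores higher than one drawn from a distant subtree, which mirrors the high-similarity versus low-similarity splits the abstract describes.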
Related papers
- Active Generalized Category Discovery [60.69060965936214]
Generalized Category Discovery (GCD) endeavors to cluster unlabeled samples from both novel and old classes.
We take the spirit of active learning and propose a new setting called Active Generalized Category Discovery (AGCD).
Our method achieves state-of-the-art performance on both generic and fine-grained datasets.
arXiv Detail & Related papers (2024-03-07T07:12:24Z)
- Generalized Category Discovery with Clustering Assignment Consistency [56.92546133591019]
Generalized category discovery (GCD) is a recently proposed open-world task.
We propose a co-training-based framework that encourages clustering consistency.
Our method achieves state-of-the-art performance on three generic benchmarks and three fine-grained visual recognition datasets.
arXiv Detail & Related papers (2023-10-30T00:32:47Z)
- Supervised Knowledge May Hurt Novel Class Discovery Performance [13.31397670697559]
Novel class discovery (NCD) aims to infer novel categories in an unlabeled dataset by leveraging prior knowledge of a labeled set comprising disjoint but related classes.
This paper considers the question: Is supervised knowledge always helpful at different levels of semantic relevance?
arXiv Detail & Related papers (2023-06-06T13:04:05Z)
- Ambiguity-Resistant Semi-Supervised Learning for Dense Object Detection [98.66771688028426]
We propose an Ambiguity-Resistant Semi-supervised Learning (ARSL) approach for one-stage detectors.
Joint-Confidence Estimation (JCE) is proposed to quantify the classification and localization quality of pseudo labels.
ARSL effectively mitigates the ambiguities and achieves state-of-the-art SSOD performance on MS COCO and PASCAL VOC.
arXiv Detail & Related papers (2023-03-27T07:46:58Z)
- Exploring Structured Semantic Prior for Multi Label Recognition with Incomplete Labels [60.675714333081466]
Multi-label recognition (MLR) with incomplete labels is very challenging.
Recent works strive to explore the image-to-label correspondence in the vision-language model, i.e., CLIP, to compensate for insufficient annotations.
We advocate remedying the deficiency of label supervision for the MLR with incomplete labels by deriving a structured semantic prior.
arXiv Detail & Related papers (2023-03-23T12:39:20Z)
- Group is better than individual: Exploiting Label Topologies and Label Relations for Joint Multiple Intent Detection and Slot Filling [39.76268402567324]
We construct a Heterogeneous Label Graph (HLG) containing two kinds of topologies.
Label correlations are leveraged to enhance semantic-label interactions.
We also propose the label-aware inter-dependent decoding mechanism to further exploit the label correlations for decoding.
arXiv Detail & Related papers (2022-10-19T08:21:43Z)
- Expert Knowledge-Guided Length-Variant Hierarchical Label Generation for Proposal Classification [21.190465278587045]
Proposal classification aims to classify a proposal into a length-variant sequence of labels.
We develop a new deep proposal classification framework to jointly model the three features.
Our model can automatically identify the best length of label sequence to stop next label prediction.
arXiv Detail & Related papers (2021-09-14T13:09:28Z)
- A Unified Objective for Novel Class Discovery [48.1003877511578]
We introduce a UNified Objective function (UNO) for discovering novel classes.
UNO favors synergy between supervised and unsupervised learning.
Despite its simplicity, UNO outperforms the state of the art on several benchmarks.
arXiv Detail & Related papers (2021-08-19T07:22:29Z)
- Enhancing Label Correlation Feedback in Multi-Label Text Classification via Multi-Task Learning [6.1538971100140145]
We introduce a novel approach with multi-task learning to enhance label correlation feedback.
We propose two auxiliary label co-occurrence prediction tasks to enhance label correlation learning.
arXiv Detail & Related papers (2021-06-06T12:26:14Z)
- Pointwise Binary Classification with Pairwise Confidence Comparisons [97.79518780631457]
We propose pairwise comparison (Pcomp) classification, where we have only pairs of unlabeled data for which we know that one is more likely to be positive than the other.
We link Pcomp classification to noisy-label learning to develop a progressive URE and improve it by imposing consistency regularization.
arXiv Detail & Related papers (2020-10-05T09:23:58Z)