Open-World Semi-Supervised Learning
- URL: http://arxiv.org/abs/2102.03526v1
- Date: Sat, 6 Feb 2021 07:11:07 GMT
- Title: Open-World Semi-Supervised Learning
- Authors: Kaidi Cao, Maria Brbic, Jure Leskovec
- Abstract summary: We introduce a new open-world semi-supervised learning setting in which the model is required to recognize previously seen classes.
We propose ORCA, an approach that learns to simultaneously classify and cluster the data.
We demonstrate that ORCA accurately discovers novel classes and assigns samples to previously seen classes on benchmark image classification datasets.
- Score: 66.90703597468377
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Supervised and semi-supervised learning methods have been traditionally
designed for the closed-world setting based on the assumption that unlabeled
test data contains only classes previously encountered in the labeled training
data. However, the real world is inherently open and dynamic, and thus novel,
previously unseen classes may appear in the test data or during the model
deployment. Here, we introduce a new open-world semi-supervised learning
setting in which the model is required to recognize previously seen classes, as
well as to discover novel classes never seen in the labeled dataset. To tackle
the problem, we propose ORCA, an approach that learns to simultaneously
classify and cluster the data. ORCA classifies examples from the unlabeled
dataset to previously seen classes, or forms a novel class by grouping similar
examples together. The key idea in ORCA is to introduce an uncertainty-based
adaptive margin that effectively circumvents the bias caused by the imbalance
of variance between seen and novel classes/clusters. We demonstrate that ORCA
accurately discovers novel classes and assigns samples to previously seen
classes on benchmark image classification datasets, including CIFAR and
ImageNet. Remarkably, despite solving the harder task, ORCA outperforms
semi-supervised methods on seen classes, as well as novel class discovery
methods on novel classes, achieving 7% and 151% improvements on seen and novel
classes in the ImageNet dataset.
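The abstract names an uncertainty-based adaptive margin but does not give its formula. The following is a minimal sketch of one way such a margin could work, not the authors' implementation. It assumes that uncertainty is the average of (1 - max softmax probability) over unlabeled examples, and that the margin is added to the true-class logit of labeled examples so that the supervised loss is relaxed while the model is still uncertain. The function names, the scaling factor `lam`, and the sign of the margin are all illustrative assumptions.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def mean_uncertainty(unlabeled_logits):
    """Assumed uncertainty estimate: mean of (1 - max probability)
    over the unlabeled examples."""
    scores = [1.0 - max(softmax(row)) for row in unlabeled_logits]
    return sum(scores) / len(scores)


def margin_cross_entropy(logits, label, margin):
    """Cross-entropy computed after adding `margin` to the true-class logit.
    With margin == 0 this is ordinary cross-entropy."""
    adjusted = list(logits)
    adjusted[label] += margin
    return -math.log(softmax(adjusted)[label])


def adaptive_margin_loss(labeled_logits, label, unlabeled_logits, lam=1.0):
    """Hypothetical supervised loss whose margin scales with uncertainty.
    `lam` is an assumed hyperparameter, not a value from the paper."""
    u = mean_uncertainty(unlabeled_logits)
    return margin_cross_entropy(labeled_logits, label, lam * u)


if __name__ == "__main__":
    confident = [[8.0, 0.0, 0.0], [0.0, 9.0, 0.0]]
    uncertain = [[0.1, 0.0, 0.0], [0.0, 0.1, 0.0]]
    sample, label = [2.0, 1.0, 0.0], 0
    print(adaptive_margin_loss(sample, label, confident))
    print(adaptive_margin_loss(sample, label, uncertain))
```

Under these assumptions, the loss on a labeled example is smaller when predictions on unlabeled data are uncertain, so the seen classes are fitted less aggressively early in training, and it approaches plain cross-entropy as uncertainty falls toward zero.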
Related papers
- Happy: A Debiased Learning Framework for Continual Generalized Category Discovery [54.54153155039062]
This paper explores the underexplored task of Continual Generalized Category Discovery (C-GCD).
C-GCD aims to incrementally discover new classes from unlabeled data while maintaining the ability to recognize previously learned classes.
We introduce a debiased learning framework, namely Happy, characterized by Hardness-aware prototype sampling and soft entropy regularization.
arXiv Detail & Related papers (2024-10-09T04:18:51Z)
- Robust Semi-Supervised Learning for Self-learning Open-World Classes [5.714673612282175]
In real-world applications, unlabeled data always contain classes not present in the labeled set.
We propose an open-world SSL method for Self-learning Open-world Classes (SSOC), which can explicitly self-learn multiple unknown classes.
SSOC outperforms the state-of-the-art baselines on multiple popular classification benchmarks.
arXiv Detail & Related papers (2024-01-15T09:27:46Z)
- Generalized Category Discovery with Clustering Assignment Consistency [56.92546133591019]
Generalized category discovery (GCD) is a recently proposed open-world task.
We propose a co-training-based framework that encourages clustering consistency.
Our method achieves state-of-the-art performance on three generic benchmarks and three fine-grained visual recognition datasets.
arXiv Detail & Related papers (2023-10-30T00:32:47Z)
- Bridging the Gap: Learning Pace Synchronization for Open-World Semi-Supervised Learning [44.91863420044712]
In open-world semi-supervised learning, a machine learning model is tasked with uncovering novel categories from unlabeled data.
We introduce 1) the adaptive synchronizing marginal loss which imposes class-specific negative margins to alleviate the model bias towards seen classes, and 2) the pseudo-label contrastive clustering which exploits pseudo-labels predicted by the model to group unlabeled data from the same category together.
Our method balances the learning pace between seen and novel classes, achieving a remarkable 3% average accuracy increase on the ImageNet dataset.
arXiv Detail & Related papers (2023-09-21T09:44:39Z)
- Open-world Semi-supervised Novel Class Discovery [12.910670907071523]
We introduce a new open-world semi-supervised novel class discovery approach named OpenNCD.
The proposed method is composed of two reciprocally enhanced parts. First, a bi-level contrastive learning method is introduced, which maintains the pair-wise similarity of the prototypes.
The results show the effectiveness of the proposed method in open-world scenarios, especially with scarce known classes and labels.
arXiv Detail & Related papers (2023-05-22T14:59:50Z)
- Novel Class Discovery in Semantic Segmentation [104.30729847367104]
We introduce a new setting of Novel Class Discovery in Semantic Segmentation (NCDSS).
It aims at segmenting unlabeled images containing new classes given prior knowledge from a labeled set of disjoint classes.
In NCDSS, we need to distinguish the objects and background, and to handle the existence of multiple classes within an image.
We propose the Entropy-based Uncertainty Modeling and Self-training (EUMS) framework to overcome noisy pseudo-labels.
arXiv Detail & Related papers (2021-12-03T13:31:59Z)
- Bridging Non Co-occurrence with Unlabeled In-the-wild Data for Incremental Object Detection [56.22467011292147]
Several incremental learning methods are proposed to mitigate catastrophic forgetting for object detection.
Despite the effectiveness, these methods require co-occurrence of the unlabeled base classes in the training data of the novel classes.
We propose the use of unlabeled in-the-wild data to bridge the non co-occurrence caused by the missing base classes during the training of additional novel classes.
arXiv Detail & Related papers (2021-10-28T10:57:25Z)
- Automatically Discovering and Learning New Visual Categories with Ranking Statistics [145.89790963544314]
We tackle the problem of discovering novel classes in an image collection given labelled examples of other classes.
We learn a general-purpose clustering model and use the latter to identify the new classes in the unlabelled data.
We evaluate our approach on standard classification benchmarks and outperform current methods for novel category discovery by a significant margin.
arXiv Detail & Related papers (2020-02-13T18:53:32Z)
- Semi-Supervised Class Discovery [7.123519086758813]
We introduce the Dataset Reconstruction Accuracy, a new and important measure of the effectiveness of a model's ability to create labels.
We apply a new heuristic, class learnability, for deciding whether a class is worthy of addition to the training dataset.
We show that our class discovery system can be successfully applied to vision and language.
arXiv Detail & Related papers (2020-02-10T00:29:44Z)
This list is automatically generated from the titles and abstracts of the papers on this site.