IOMatch: Simplifying Open-Set Semi-Supervised Learning with Joint
Inliers and Outliers Utilization
- URL: http://arxiv.org/abs/2308.13168v1
- Date: Fri, 25 Aug 2023 04:14:02 GMT
- Title: IOMatch: Simplifying Open-Set Semi-Supervised Learning with Joint
Inliers and Outliers Utilization
- Authors: Zekun Li, Lei Qi, Yinghuan Shi, Yang Gao
- Abstract summary: In many real-world applications, unlabeled data will inevitably contain unseen-class outliers not belonging to any of the labeled classes.
We introduce a novel open-set SSL framework, IOMatch, which can jointly utilize inliers and outliers, even when it is difficult to distinguish exactly between them.
- Score: 36.102831230805755
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Semi-supervised learning (SSL) aims to leverage massive unlabeled data when
labels are expensive to obtain. Unfortunately, in many real-world applications,
the collected unlabeled data will inevitably contain unseen-class outliers not
belonging to any of the labeled classes. To deal with the challenging open-set
SSL task, the mainstream methods tend to first detect outliers and then filter
them out. However, we observe the surprising fact that such an approach can
result in more severe performance degradation when labels are extremely scarce,
as the unreliable outlier detector may wrongly exclude a considerable portion of
valuable inliers. To tackle this issue, we introduce a novel open-set SSL
framework, IOMatch, which can jointly utilize inliers and outliers, even when
it is difficult to distinguish exactly between them. Specifically, we propose
to employ a multi-binary classifier in combination with the standard closed-set
classifier for producing unified open-set classification targets, which regard
all outliers as a single new class. By adopting these targets as open-set
pseudo-labels, we optimize an open-set classifier with all unlabeled samples
including both inliers and outliers. Extensive experiments have shown that
IOMatch significantly outperforms the baseline methods across different
benchmark datasets and different settings despite its remarkable simplicity.
Our code and models are available at https://github.com/nukezil/IOMatch.
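The unified open-set targets described in the abstract can be sketched as follows: the closed-set softmax probabilities are gated by the per-class inlier scores from the multi-binary (one-vs-all) classifier, and the leftover probability mass is assigned to a single extra "outlier" class. This is an illustrative sketch under our own naming, not the authors' implementation; see their repository for the actual code.

```python
import numpy as np

def open_set_targets(closed_probs, inlier_probs):
    """Illustrative IOMatch-style unified open-set target.

    closed_probs: (K,) softmax output of the closed-set classifier.
    inlier_probs: (K,) per-class inlier scores from the multi-binary
                  (one-vs-all) classifier, each in [0, 1].

    Returns a (K+1,) distribution whose last entry is the probability
    that the sample is an outlier, i.e. all outliers are folded into
    one new class, as the abstract describes.
    """
    inlier_part = closed_probs * inlier_probs       # q_k = p_k * o_k
    outlier_part = 1.0 - inlier_part.sum()          # q_{K+1}, mass not claimed by any inlier class
    return np.append(inlier_part, max(outlier_part, 0.0))

# A confident inlier: high closed-set probability and a high inlier
# score for the same class leave little mass for the outlier class.
q = open_set_targets(np.array([0.9, 0.05, 0.05]), np.array([0.95, 0.5, 0.5]))
```

Such targets can then serve as open-set pseudo-labels for all unlabeled samples, inliers and outliers alike, without requiring a hard decision on which is which.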
Related papers
- OwMatch: Conditional Self-Labeling with Consistency for Open-World Semi-Supervised Learning [4.462726364160216]
Semi-supervised learning (SSL) offers a robust framework for harnessing the potential of unannotated data.
The emergence of open-world SSL (OwSSL) introduces a more practical challenge, wherein unlabeled data may encompass samples from unseen classes.
We propose an effective framework called OwMatch, combining conditional self-labeling and open-world hierarchical thresholding.
arXiv Detail & Related papers (2024-11-04T06:07:43Z)
- Robust Semi-Supervised Learning for Self-learning Open-World Classes [5.714673612282175]
In real-world applications, unlabeled data always contain classes not present in the labeled set.
We propose an open-world SSL method for Self-learning Open-world Classes (SSOC), which can explicitly self-learn multiple unknown classes.
SSOC outperforms the state-of-the-art baselines on multiple popular classification benchmarks.
arXiv Detail & Related papers (2024-01-15T09:27:46Z)
- SSB: Simple but Strong Baseline for Boosting Performance of Open-Set Semi-Supervised Learning [106.46648817126984]
In this paper, we study the challenging and realistic open-set SSL setting.
The goal is both to correctly classify inliers and to detect outliers.
We find that inlier classification performance can be largely improved by incorporating high-confidence pseudo-labeled data.
arXiv Detail & Related papers (2023-11-17T15:14:40Z)
- JointMatch: A Unified Approach for Diverse and Collaborative Pseudo-Labeling to Semi-Supervised Text Classification [65.268245109828]
Semi-supervised text classification (SSTC) has gained increasing attention due to its ability to leverage unlabeled data.
Existing approaches based on pseudo-labeling suffer from the issues of pseudo-label bias and error accumulation.
We propose JointMatch, a holistic approach for SSTC that addresses these challenges by unifying ideas from recent semi-supervised learning.
arXiv Detail & Related papers (2023-10-23T05:43:35Z)
- Semi-Supervised Learning in the Few-Shot Zero-Shot Scenario [14.916971861796384]
Semi-Supervised Learning (SSL) is a framework that utilizes both labeled and unlabeled data to enhance model performance.
We propose a general approach to augment existing SSL methods, enabling them to handle situations where certain classes are missing.
Our experimental results reveal significant improvements in accuracy when compared to state-of-the-art SSL, open-set SSL, and open-world SSL methods.
arXiv Detail & Related papers (2023-08-27T14:25:07Z)
- Adaptive Negative Evidential Deep Learning for Open-set Semi-supervised Learning [69.81438976273866]
Open-set semi-supervised learning (Open-set SSL) considers a more practical scenario, where unlabeled data and test data contain new categories (outliers) not observed in labeled data (inliers).
We introduce evidential deep learning (EDL) as an outlier detector to quantify different types of uncertainty, and design different uncertainty metrics for self-training and inference.
We propose a novel adaptive negative optimization strategy, making EDL more tailored to the unlabeled dataset containing both inliers and outliers.
arXiv Detail & Related papers (2023-03-21T09:07:15Z)
- OpenLDN: Learning to Discover Novel Classes for Open-World Semi-Supervised Learning [110.40285771431687]
Semi-supervised learning (SSL) is one of the dominant approaches to address the annotation bottleneck of supervised learning.
Recent SSL methods can effectively leverage a large repository of unlabeled data to improve performance while relying on a small set of labeled data.
This work introduces OpenLDN that utilizes a pairwise similarity loss to discover novel classes.
arXiv Detail & Related papers (2022-07-05T18:51:05Z)
- OpenCoS: Contrastive Semi-supervised Learning for Handling Open-set Unlabeled Data [65.19205979542305]
Unlabeled data may include out-of-class samples in practice.
OpenCoS is a method for handling this realistic semi-supervised learning scenario.
arXiv Detail & Related papers (2021-06-29T06:10:05Z)
- OpenMatch: Open-set Consistency Regularization for Semi-supervised Learning with Outliers [71.08167292329028]
We propose a novel Open-set Semi-Supervised Learning (OSSL) approach called OpenMatch.
OpenMatch unifies FixMatch with novelty detection based on one-vs-all (OVA) classifiers.
It achieves state-of-the-art performance on three datasets, and even outperforms a fully supervised model in detecting outliers unseen in unlabeled data on CIFAR10.
arXiv Detail & Related papers (2021-05-28T23:57:15Z)
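The one-vs-all (OVA) novelty detection that OpenMatch builds on can be sketched as follows (a minimal illustration under our own function and variable names, not the authors' code): each class gets a binary inlier-vs-outlier head, and a sample is flagged as an outlier when the head of its closed-set predicted class assigns it a low inlier probability.

```python
import numpy as np

def ova_outlier_prob(ova_logits, predicted_class):
    """ova_logits: (K, 2) array, one [inlier, outlier] logit pair per class.

    Returns the outlier probability under the OVA head of the class that
    the closed-set classifier predicted for this sample.
    """
    logits = ova_logits[predicted_class]
    exp = np.exp(logits - logits.max())  # numerically stable softmax
    probs = exp / exp.sum()
    return probs[1]                      # probability of "not this class"

# A sample predicted as class 0 whose class-0 OVA head strongly says "inlier".
score = ova_outlier_prob(np.array([[3.0, -1.0], [0.0, 0.0], [0.0, 0.0]]), 0)
is_outlier = score > 0.5
```

Note the contrast with IOMatch above: here the OVA score is used to make a hard inlier/outlier decision, whereas IOMatch folds the same kind of score softly into its open-set targets.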
This list is automatically generated from the titles and abstracts of the papers in this site.