Class-Imbalanced Semi-Supervised Learning
- URL: http://arxiv.org/abs/2002.06815v1
- Date: Mon, 17 Feb 2020 07:48:47 GMT
- Title: Class-Imbalanced Semi-Supervised Learning
- Authors: Minsung Hyun, Jisoo Jeong and Nojun Kwak
- Abstract summary: Semi-Supervised Learning (SSL) has achieved great success in overcoming the difficulties of labeling and making full use of unlabeled data.
We introduce a task of class-imbalanced semi-supervised learning (CISSL), which refers to semi-supervised learning with class-imbalanced data.
Our method shows better performance than the conventional methods in the CISSL environment.
- Score: 33.94685366079589
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Semi-Supervised Learning (SSL) has achieved great success in overcoming the
difficulties of labeling and making full use of unlabeled data. However, SSL
rests on the restrictive assumption that the numbers of samples in different
classes are balanced, and many SSL algorithms perform worse on datasets with
imbalanced class distributions. In this paper, we introduce a task of
class-imbalanced semi-supervised learning (CISSL), which refers to
semi-supervised learning with class-imbalanced data. In doing so, we consider
class imbalance in both labeled and unlabeled sets. First, we analyze existing
SSL methods in imbalanced environments and examine how the class imbalance
affects SSL methods. Then we propose Suppressed Consistency Loss (SCL), a
regularization method robust to class imbalance. Our method shows better
performance than the conventional methods in the CISSL environment. In
particular, the more severe the class imbalance and the smaller the size of the
labeled data, the better our method performs.
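The abstract does not spell out the form of Suppressed Consistency Loss; as a rough illustration of the underlying idea (scaling down the consistency term for samples predicted as minority classes), a hypothetical per-sample weighting might look like the sketch below. The weighting function `freq[pred] ** beta` and the MSE consistency term are illustrative assumptions, not the paper's definition.

```python
import numpy as np

def suppressed_consistency_loss(p_weak, p_strong, class_counts, beta=0.5):
    """Toy consistency loss (MSE between two predictions of the same sample)
    scaled per sample by a suppression weight that shrinks for samples
    predicted as minority classes. The actual weighting in the paper differs;
    a power of the relative class frequency is used here as a stand-in."""
    pred = p_weak.argmax(axis=1)                 # predicted class per sample
    freq = class_counts / class_counts.max()     # in (0, 1], 1 for the head class
    weight = freq[pred] ** beta                  # minority class -> smaller weight
    per_sample = ((p_weak - p_strong) ** 2).mean(axis=1)
    return float((weight * per_sample).mean())

# toy example: class 0 is the majority (100 labels), class 1 the minority (10)
class_counts = np.array([100.0, 10.0])
p_weak = np.array([[0.9, 0.1], [0.2, 0.8]])      # predictions, weak augmentation
p_strong = np.array([[0.8, 0.2], [0.4, 0.6]])    # predictions, strong augmentation
scl = suppressed_consistency_loss(p_weak, p_strong, class_counts)
plain = float(((p_weak - p_strong) ** 2).mean()) # unsuppressed consistency loss
```

The second sample is predicted as the minority class, so its consistency term is down-weighted and `scl` comes out smaller than the unsuppressed `plain` loss.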
Related papers
- On Pseudo-Labeling for Class-Mismatch Semi-Supervised Learning [50.48888534815361]
In this paper, we empirically analyze Pseudo-Labeling (PL) in class-mismatched SSL.
PL is a simple and representative SSL method that transforms SSL problems into supervised learning by creating pseudo-labels for unlabeled data.
We propose to improve PL in class-mismatched SSL with two components -- Re-balanced Pseudo-Labeling (RPL) and Semantic Exploration Clustering (SEC)
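Pseudo-labeling itself, as the summary describes it, is a standard technique: keep only confident model predictions on unlabeled data and treat them as hard labels for supervised training. A minimal confidence-thresholded sketch (the threshold value is illustrative, not taken from the paper):

```python
import numpy as np

def pseudo_label(probs, threshold=0.95):
    """Select confident unlabeled predictions and turn them into hard labels.

    probs: (N, C) array of softmax outputs on unlabeled data.
    Returns (indices, labels) for samples whose max confidence >= threshold.
    """
    conf = probs.max(axis=1)
    mask = conf >= threshold
    return np.nonzero(mask)[0], probs[mask].argmax(axis=1)

# toy predictions for 4 unlabeled samples over 3 classes
probs = np.array([
    [0.97, 0.02, 0.01],   # confident -> pseudo-label 0
    [0.50, 0.30, 0.20],   # below threshold, discarded
    [0.01, 0.01, 0.98],   # confident -> pseudo-label 2
    [0.40, 0.55, 0.05],   # below threshold, discarded
])
idx, labels = pseudo_label(probs)
print(idx.tolist(), labels.tolist())  # [0, 2] [0, 2]
```

Under class imbalance this selection is biased toward majority classes, which is the failure mode the re-balancing components above are meant to address.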
arXiv Detail & Related papers (2023-01-15T03:21:59Z)
- An Embarrassingly Simple Baseline for Imbalanced Semi-Supervised Learning [103.65758569417702]
Semi-supervised learning (SSL) has shown great promise in leveraging unlabeled data to improve model performance.
We consider a more realistic and challenging setting called imbalanced SSL, where imbalanced class distributions occur in both labeled and unlabeled data.
We study a simple yet overlooked baseline -- SimiS -- which tackles data imbalance by simply supplementing labeled data with pseudo-labels.
arXiv Detail & Related papers (2022-11-20T21:18:41Z)
- OpenLDN: Learning to Discover Novel Classes for Open-World Semi-Supervised Learning [110.40285771431687]
Semi-supervised learning (SSL) is one of the dominant approaches to address the annotation bottleneck of supervised learning.
Recent SSL methods can effectively leverage a large repository of unlabeled data to improve performance while relying on a small set of labeled data.
This work introduces OpenLDN that utilizes a pairwise similarity loss to discover novel classes.
arXiv Detail & Related papers (2022-07-05T18:51:05Z)
- BASIL: Balanced Active Semi-supervised Learning for Class Imbalanced Datasets [14.739359755029353]
Current semi-supervised learning (SSL) methods assume a balance between the number of data points available for each class in both the labeled and the unlabeled data sets.
We propose BASIL, a novel algorithm that optimizes submodular mutual information (SMI) functions in a per-class fashion to gradually select a balanced dataset in an active learning loop.
arXiv Detail & Related papers (2022-03-10T21:34:08Z)
- CoSSL: Co-Learning of Representation and Classifier for Imbalanced Semi-Supervised Learning [98.89092930354273]
We propose a novel co-learning framework (CoSSL) with decoupled representation learning and classifier learning for imbalanced SSL.
To handle the data imbalance, we devise Tail-class Feature Enhancement (TFE) for classifier learning.
In experiments, we show that our approach outperforms other methods over a large range of shifted distributions.
arXiv Detail & Related papers (2021-12-08T20:13:13Z)
- ABC: Auxiliary Balanced Classifier for Class-imbalanced Semi-supervised Learning [6.866717993664787]
Existing semi-supervised learning (SSL) algorithms assume class-balanced datasets.
We propose a scalable class-imbalanced SSL algorithm that can effectively use unlabeled data.
The proposed algorithm achieves state-of-the-art performance in various class-imbalanced SSL experiments using four benchmark datasets.
arXiv Detail & Related papers (2021-10-20T04:07:48Z)
- Self-supervised Learning is More Robust to Dataset Imbalance [65.84339596595383]
We investigate self-supervised learning under dataset imbalance.
Off-the-shelf self-supervised representations are already more robust to class imbalance than supervised representations.
We devise a re-weighted regularization technique that consistently improves the SSL representation quality on imbalanced datasets.
arXiv Detail & Related papers (2021-10-11T06:29:56Z)
- Distribution Aligning Refinery of Pseudo-label for Imbalanced Semi-supervised Learning [126.31716228319902]
We develop Distribution Aligning Refinery of Pseudo-label (DARP) algorithm.
We show that DARP is provably and efficiently compatible with state-of-the-art SSL schemes.
arXiv Detail & Related papers (2020-07-17T09:16:05Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.