Distribution-Aware Semantics-Oriented Pseudo-label for Imbalanced
Semi-Supervised Learning
- URL: http://arxiv.org/abs/2106.05682v1
- Date: Thu, 10 Jun 2021 11:58:25 GMT
- Title: Distribution-Aware Semantics-Oriented Pseudo-label for Imbalanced
Semi-Supervised Learning
- Authors: Youngtaek Oh, Dong-Jin Kim, In So Kweon
- Abstract summary: This paper addresses imbalanced semi-supervised learning, where heavily biased pseudo-labels can harm model performance.
Motivated by the complementary behavior of semantic and linear pseudo-labels, we propose a general pseudo-labeling framework to address this bias.
We term this novel pseudo-labeling framework for imbalanced SSL Distribution-Aware Semantics-Oriented (DASO) Pseudo-label.
- Score: 80.05441565830726
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: The capability of traditional semi-supervised learning (SSL) methods falls
far short of real-world applications, since they do not consider (1) class imbalance
and (2) class distribution mismatch between labeled and unlabeled data. This
paper addresses such a relatively under-explored problem, imbalanced
semi-supervised learning, where heavily biased pseudo-labels can harm model
performance. Interestingly, we find that the semantic pseudo-labels from a
similarity-based classifier in feature space and the traditional pseudo-labels
from the linear classifier are complementary. Motivated by this observation, we
propose a general pseudo-labeling framework to address the bias. The key idea
is to class-adaptively blend the semantic pseudo-label with the linear one,
depending on the current pseudo-label distribution. In this way, the increased
semantic pseudo-label component suppresses false positives in the majority
classes, and vice versa. We term this novel pseudo-labeling framework for
imbalanced SSL Distribution-Aware Semantics-Oriented (DASO) Pseudo-label.
Extensive evaluation on CIFAR10/100-LT and STL10-LT shows that DASO
consistently outperforms recently proposed re-balancing methods for both labels
and pseudo-labels. Moreover, we demonstrate that typical SSL algorithms can
effectively benefit from unlabeled data with DASO, especially when (1) class
imbalance and (2) class distribution mismatch exist, and even on the recent
real-world Semi-Aves benchmark.
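A minimal sketch of the class-adaptive blending described above is given below. It is an illustration only, not the authors' implementation: the prototype-based semantic classifier, the running pseudo-label counter, and the frequency-to-weight rule are simplifying assumptions made for this example.

```python
import torch
import torch.nn.functional as F

def blend_pseudo_labels(linear_logits, feats, prototypes, class_counts, temperature=0.1):
    """Illustrative class-adaptive blend of linear and semantic pseudo-labels.

    linear_logits: (B, C) logits from the linear classifier.
    feats:         (B, D) features of unlabeled samples.
    prototypes:    (C, D) per-class feature prototypes (assumed here to be
                   running means of labeled features).
    class_counts:  (C,) running count of pseudo-labels per class, used as a
                   proxy for the current pseudo-label distribution.
    """
    # Traditional pseudo-label from the linear classifier.
    p_linear = F.softmax(linear_logits, dim=1)

    # Semantic pseudo-label from similarity to class prototypes in feature space.
    sims = F.normalize(feats, dim=1) @ F.normalize(prototypes, dim=1).t()
    p_semantic = F.softmax(sims / temperature, dim=1)

    # Class-adaptive blend weight: classes that currently receive many
    # pseudo-labels (majority classes) lean more on the semantic label, which
    # suppresses their false positives; minority classes keep more of the
    # linear label. This normalization is one simple choice, not necessarily
    # the paper's exact rule.
    freq = class_counts.float() / class_counts.sum().clamp(min=1)
    blend = (freq / freq.max().clamp(min=1e-12)).unsqueeze(0)  # (1, C) in [0, 1]

    p_blended = (1.0 - blend) * p_linear + blend * p_semantic
    return p_blended / p_blended.sum(dim=1, keepdim=True)
```

The blended distribution would then replace the usual pseudo-label target in whatever base SSL objective is being trained (e.g. a FixMatch-style consistency loss); the paper itself defines the exact blending rule and prototype construction.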
Related papers
- Generalized Semi-Supervised Learning via Self-Supervised Feature Adaptation [87.17768598044427]
Traditional semi-supervised learning assumes that the feature distributions of labeled and unlabeled data are consistent.
We propose Self-Supervised Feature Adaptation (SSFA), a generic framework for improving SSL performance when labeled and unlabeled data come from different distributions.
Our proposed SSFA is applicable to various pseudo-label-based SSL learners and significantly improves performance in labeled, unlabeled, and even unseen distributions.
arXiv Detail & Related papers (2024-05-31T03:13:45Z)
- Class-Distribution-Aware Pseudo Labeling for Semi-Supervised Multi-Label Learning [97.88458953075205]
Pseudo-labeling has emerged as a popular and effective approach for utilizing unlabeled data.
This paper proposes a novel solution called Class-Aware Pseudo-Labeling (CAP) that performs pseudo-labeling in a class-aware manner.
arXiv Detail & Related papers (2023-05-04T12:52:18Z)
- Dist-PU: Positive-Unlabeled Learning from a Label Distribution Perspective [89.5370481649529]
We propose a label distribution perspective for PU learning in this paper.
Motivated by this, we pursue consistency between the predicted and the ground-truth label distributions.
Experiments on three benchmark datasets validate the effectiveness of the proposed method.
arXiv Detail & Related papers (2022-12-06T07:38:29Z)
- CLS: Cross Labeling Supervision for Semi-Supervised Learning [9.929229055862491]
Cross Labeling Supervision (CLS) is a framework that generalizes the typical pseudo-labeling process.
CLS allows the creation of both pseudo and complementary labels to support both positive and negative learning.
arXiv Detail & Related papers (2022-02-17T08:09:40Z)
- In Defense of Pseudo-Labeling: An Uncertainty-Aware Pseudo-label Selection Framework for Semi-Supervised Learning [53.1047775185362]
Pseudo-labeling (PL) is a general SSL approach that does not have this constraint but performs relatively poorly in its original formulation.
We argue that PL underperforms due to erroneous high-confidence predictions from poorly calibrated models.
We propose an uncertainty-aware pseudo-label selection (UPS) framework which improves pseudo-labeling accuracy by drastically reducing the amount of noise encountered in training (see the sketch after this list).
arXiv Detail & Related papers (2021-01-15T23:29:57Z)
- Distribution Aligning Refinery of Pseudo-label for Imbalanced Semi-supervised Learning [126.31716228319902]
We develop the Distribution Aligning Refinery of Pseudo-label (DARP) algorithm.
We show that DARP is provably and efficiently compatible with state-of-the-art SSL schemes.
arXiv Detail & Related papers (2020-07-17T09:16:05Z)
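For the uncertainty-aware pseudo-label selection referenced in the UPS entry above, a rough sketch of the general idea follows. The MC-dropout uncertainty estimate and the thresholds tau_conf and tau_unc are illustrative assumptions, not the paper's exact procedure.

```python
import torch
import torch.nn.functional as F

def select_pseudo_labels(model, x, n_passes=10, tau_conf=0.9, tau_unc=0.05):
    """Rough sketch of uncertainty-aware pseudo-label selection.

    Runs several stochastic forward passes (dropout left active) and keeps
    only predictions that are both confident and low-variance. The threshold
    values and the use of MC-dropout here are illustrative assumptions.
    """
    model.train()  # keep dropout active for Monte Carlo sampling
    with torch.no_grad():
        probs = torch.stack([F.softmax(model(x), dim=1) for _ in range(n_passes)])
    mean_p = probs.mean(dim=0)            # (B, C) averaged prediction
    conf, pseudo = mean_p.max(dim=1)      # per-sample confidence and hard label
    unc = probs.var(dim=0).gather(1, pseudo.unsqueeze(1)).squeeze(1)
    mask = (conf >= tau_conf) & (unc <= tau_unc)
    return pseudo[mask], mask
```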