Class-Distribution-Aware Pseudo Labeling for Semi-Supervised Multi-Label
Learning
- URL: http://arxiv.org/abs/2305.02795v2
- Date: Sat, 20 May 2023 06:18:28 GMT
- Title: Class-Distribution-Aware Pseudo Labeling for Semi-Supervised Multi-Label
Learning
- Authors: Ming-Kun Xie, Jia-Hao Xiao, Hao-Zhe Liu, Gang Niu, Masashi Sugiyama,
Sheng-Jun Huang
- Abstract summary: Pseudo-labeling has emerged as a popular and effective approach for utilizing unlabeled data.
This paper proposes a novel solution called Class-Aware Pseudo-Labeling (CAP) that performs pseudo-labeling in a class-aware manner.
- Score: 97.88458953075205
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Pseudo-labeling has emerged as a popular and effective approach for utilizing
unlabeled data. However, in the context of semi-supervised multi-label learning
(SSMLL), conventional pseudo-labeling methods encounter difficulties when
dealing with instances associated with multiple labels and an unknown label
count. These limitations often result in the introduction of false positive
labels or the neglect of true positive ones. To overcome these challenges, this
paper proposes a novel solution called Class-Aware Pseudo-Labeling (CAP) that
performs pseudo-labeling in a class-aware manner. The proposed approach
introduces a regularized learning framework incorporating class-aware
thresholds, which effectively control the assignment of positive and negative
pseudo-labels for each class. Notably, even with a small proportion of labeled
examples, our observations demonstrate that the estimated class distribution
serves as a reliable approximation of the true one. Motivated by this finding, we develop a
class-distribution-aware thresholding strategy to ensure the alignment of
pseudo-label distribution with the true distribution. The correctness of the
estimated class distribution is theoretically verified, and a generalization
error bound is provided for our proposed method. Extensive experiments on
multiple benchmark datasets confirm the efficacy of CAP in addressing the
challenges of SSMLL problems.
Related papers
- AllMatch: Exploiting All Unlabeled Data for Semi-Supervised Learning [5.0823084858349485]
We present a novel SSL algorithm named AllMatch, which achieves improved pseudo-label accuracy and a 100% utilization ratio for the unlabeled data.
The results demonstrate that AllMatch consistently outperforms existing state-of-the-art methods.
arXiv Detail & Related papers (2024-06-22T06:59:52Z)
- Learning with Complementary Labels Revisited: The Selected-Completely-at-Random Setting Is More Practical [66.57396042747706]
Complementary-label learning is a weakly supervised learning problem.
We propose a consistent approach that does not rely on the uniform distribution assumption.
We find that complementary-label learning can be expressed as a set of negative-unlabeled binary classification problems.
arXiv Detail & Related papers (2023-11-27T02:59:17Z)
- Dist-PU: Positive-Unlabeled Learning from a Label Distribution Perspective [89.5370481649529]
We propose a label distribution perspective for PU learning.
Motivated by this view, we train the model to pursue consistency between the predicted and ground-truth label distributions.
Experiments on three benchmark datasets validate the effectiveness of the proposed method.
arXiv Detail & Related papers (2022-12-06T07:38:29Z)
- One Positive Label is Sufficient: Single-Positive Multi-Label Learning with Label Enhancement [71.9401831465908]
We investigate single-positive multi-label learning (SPMLL) where each example is annotated with only one relevant label.
A novel method, Single-positive MultI-label learning with Label Enhancement (SMILE), is proposed.
Experiments on benchmark datasets validate the effectiveness of the proposed method.
arXiv Detail & Related papers (2022-06-01T14:26:30Z)
- Multi-class Probabilistic Bounds for Self-learning [13.875239300089861]
Pseudo-labeling is prone to error and runs the risk of adding noisy labels into unlabeled training data.
We present a probabilistic framework for analyzing self-learning in the multi-class classification scenario with partially labeled data.
arXiv Detail & Related papers (2021-09-29T13:57:37Z)
- Distribution-Aware Semantics-Oriented Pseudo-label for Imbalanced Semi-Supervised Learning [80.05441565830726]
This paper addresses imbalanced semi-supervised learning, where heavily biased pseudo-labels can harm the model performance.
Motivated by this observation, we propose a general pseudo-labeling framework to address such bias.
We term the novel pseudo-labeling framework for imbalanced SSL as Distribution-Aware Semantics-Oriented (DASO) Pseudo-label.
arXiv Detail & Related papers (2021-06-10T11:58:25Z)
- In Defense of Pseudo-Labeling: An Uncertainty-Aware Pseudo-label Selection Framework for Semi-Supervised Learning [53.1047775185362]
Pseudo-labeling (PL) is a general SSL approach that does not rely on domain-specific data augmentations, but it performs relatively poorly in its original formulation.
We argue that PL underperforms due to the erroneous high confidence predictions from poorly calibrated models.
We propose an uncertainty-aware pseudo-label selection (UPS) framework which improves pseudo labeling accuracy by drastically reducing the amount of noise encountered in the training process.
arXiv Detail & Related papers (2021-01-15T23:29:57Z)