Towards Semi-supervised Learning with Non-random Missing Labels
- URL: http://arxiv.org/abs/2308.08872v1
- Date: Thu, 17 Aug 2023 09:09:36 GMT
- Title: Towards Semi-supervised Learning with Non-random Missing Labels
- Authors: Yue Duan, Zhen Zhao, Lei Qi, Luping Zhou, Lei Wang, Yinghuan Shi
- Abstract summary: Class transition tracking based Pseudo-Rectifying Guidance (PRG) is devised for the label Missing Not At Random (MNAR) setting.
PRG unifies the historical information of class distribution and class transitions caused by the pseudo-rectifying procedure.
We show the superior performance of PRG across a variety of MNAR scenarios, outperforming the latest SSL approaches.
- Score: 42.71454054383897
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Semi-supervised learning (SSL) tackles the missing-label problem by enabling
the effective use of unlabeled data. While existing SSL methods focus on the
traditional setting, a practical and challenging scenario called label Missing
Not At Random (MNAR) is usually ignored. In MNAR, the labeled and unlabeled
data fall into different class distributions, resulting in biased label
imputation, which deteriorates the performance of SSL models. In this work,
class transition tracking based Pseudo-Rectifying Guidance (PRG) is devised for
MNAR. We explore the class-level guidance information obtained by the Markov
random walk, which is modeled on a dynamically created graph built over the
class tracking matrix. PRG unifies the historical information of the class
distribution and the class transitions caused by the pseudo-rectifying procedure
to maintain the model's unbiased enthusiasm for assigning pseudo-labels to all
classes, so that the quality of pseudo-labels on both popular and rare classes
under MNAR is improved. Finally, we show the superior performance of PRG across
a variety of MNAR scenarios, outperforming the latest SSL approaches combined
with bias-removal solutions by a large margin. Code and model weights are
available at https://github.com/NJUyued/PRG4SSL-MNAR.
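To make the mechanism concrete, the core loop described above can be read as: record how pseudo-labels transition between classes across epochs in a class tracking matrix, row-normalize that matrix into a Markov transition graph, and take a random walk step on the graph to rectify each sample's class probabilities. The following is a minimal NumPy sketch of that idea, not the authors' implementation (see the linked repository for that); the momentum decay, blending factor `alpha`, and all function names here are illustrative assumptions.

```python
import numpy as np

def update_class_tracking_matrix(C, prev_labels, curr_labels, momentum=0.999):
    """Track class transitions: C[i, j] accumulates how often a sample's
    pseudo-label moved from class i to class j between epochs."""
    C = C * momentum  # decay old statistics (illustrative choice)
    for i, j in zip(prev_labels, curr_labels):
        C[i, j] += 1.0
    return C

def random_walk_rectify(probs, C, alpha=0.5):
    """Rectify pseudo-label probabilities with a one-step Markov random walk
    on the graph defined by the row-normalized class tracking matrix."""
    # Row-normalize C into a Markov transition matrix H (class i -> class j).
    H = C / np.clip(C.sum(axis=1, keepdims=True), 1e-12, None)
    # One-step walk: redistribute each sample's class probability mass
    # along historically observed class transitions.
    walked = probs @ H
    # Blend the original and walked distributions, then renormalize.
    rectified = (1 - alpha) * probs + alpha * walked
    return rectified / rectified.sum(axis=1, keepdims=True)

# Toy usage: 3 classes, 4 unlabeled samples.
rng = np.random.default_rng(0)
C = np.ones((3, 3))                        # uniform prior over transitions
prev = rng.integers(0, 3, size=4)          # last epoch's pseudo-labels
curr = rng.integers(0, 3, size=4)          # this epoch's pseudo-labels
C = update_class_tracking_matrix(C, prev, curr)
probs = rng.dirichlet(np.ones(3), size=4)  # model's class probabilities
print(random_walk_rectify(probs, C))
```

The intuition is that transitions observed during pseudo-rectification carry class-level information the model's per-sample predictions lack, so rare classes keep receiving probability mass; the specific decay and blending scheme above are placeholder choices, not the paper's.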
Related papers
- Generalized Semi-Supervised Learning via Self-Supervised Feature Adaptation [87.17768598044427]
Traditional semi-supervised learning assumes that the feature distributions of labeled and unlabeled data are consistent.
We propose Self-Supervised Feature Adaptation (SSFA), a generic framework for improving SSL performance when labeled and unlabeled data come from different distributions.
Our proposed SSFA is applicable to various pseudo-label-based SSL learners and significantly improves performance in labeled, unlabeled, and even unseen distributions.
arXiv Detail & Related papers (2024-05-31T03:13:45Z)
- Semi-Supervised Learning with Multiple Imputations on Non-Random Missing Labels [0.0]
Semi-Supervised Learning (SSL) trains models on both labeled and unlabeled data.
This paper proposes two new methods of combining multiple imputation models to achieve higher accuracy and less bias.
arXiv Detail & Related papers (2023-08-15T04:09:53Z)
- Contrastive Credibility Propagation for Reliable Semi-Supervised Learning [6.014538614447467]
We propose Contrastive Credibility Propagation (CCP) for deep SSL via iterative transductive pseudo-label refinement.
CCP unifies semi-supervised learning and noisy label learning for the goal of reliably outperforming a supervised baseline in any data scenario.
arXiv Detail & Related papers (2022-11-17T23:01:47Z)
- NorMatch: Matching Normalizing Flows with Discriminative Classifiers for Semi-Supervised Learning [8.749830466953584]
Semi-Supervised Learning (SSL) aims to learn a model using a tiny labeled set and massive amounts of unlabeled data.
In this work we introduce a new framework for SSL named NorMatch.
We demonstrate, through numerical and visual results, that NorMatch achieves state-of-the-art performance on several datasets.
arXiv Detail & Related papers (2022-11-17T15:39:18Z)
- On Non-Random Missing Labels in Semi-Supervised Learning [114.62655062520425]
Semi-Supervised Learning (SSL) is fundamentally a missing label problem.
We explicitly incorporate "class" into SSL.
Our method not only significantly outperforms existing baselines but also surpasses other label bias removal SSL methods.
arXiv Detail & Related papers (2022-06-29T22:01:29Z)
- Distribution-Aware Semantics-Oriented Pseudo-label for Imbalanced Semi-Supervised Learning [80.05441565830726]
This paper addresses imbalanced semi-supervised learning, where heavily biased pseudo-labels can harm the model performance.
Motivated by this observation, we propose a general pseudo-labeling framework to address the bias.
We term the novel pseudo-labeling framework for imbalanced SSL as Distribution-Aware Semantics-Oriented (DASO) Pseudo-label.
arXiv Detail & Related papers (2021-06-10T11:58:25Z)
- PLM: Partial Label Masking for Imbalanced Multi-label Classification [59.68444804243782]
Neural networks trained on real-world datasets with long-tailed label distributions are biased towards frequent classes and perform poorly on infrequent classes.
We propose a method, Partial Label Masking (PLM), which utilizes the ratio of positive to negative labels for each class during training.
Our method achieves strong performance when compared to existing methods on both multi-label (MultiMNIST and MSCOCO) and single-label (imbalanced CIFAR-10 and CIFAR-100) image classification datasets.
arXiv Detail & Related papers (2021-05-22T18:07:56Z)
- Distribution Aligning Refinery of Pseudo-label for Imbalanced Semi-supervised Learning [126.31716228319902]
We develop the Distribution Aligning Refinery of Pseudo-label (DARP) algorithm.
We show that DARP is provably and efficiently compatible with state-of-the-art SSL schemes.
arXiv Detail & Related papers (2020-07-17T09:16:05Z)