Towards Self-Adaptive Pseudo-Label Filtering for Semi-Supervised Learning
- URL: http://arxiv.org/abs/2309.09774v1
- Date: Mon, 18 Sep 2023 13:57:16 GMT
- Title: Towards Self-Adaptive Pseudo-Label Filtering for Semi-Supervised Learning
- Authors: Lei Zhu, Zhanghan Ke, Rynson Lau
- Abstract summary: We propose a Self-Adaptive Pseudo-Label Filter (SPF) to improve the quality of pseudo labels.
With an online mixture model, we weight each pseudo-labeled sample by the posterior probability that its pseudo label is correct, taking the current confidence distribution into account.
Our SPF evolves together with the deep neural network without manual tuning.
- Score: 13.02771721554445
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Recent semi-supervised learning (SSL) methods typically include a filtering
strategy to improve the quality of pseudo labels. However, these filtering
strategies are usually hand-crafted and do not change as the model is updated,
so many correct pseudo labels are discarded and many incorrect
pseudo labels are selected during training. In this work, we
observe that the distribution gap between the confidence values of correct and
incorrect pseudo labels emerges at the very beginning of the training, which
can be utilized to filter pseudo labels. Based on this observation, we propose
a Self-Adaptive Pseudo-Label Filter (SPF), which automatically filters noise in
pseudo labels as the model evolves, by modeling the confidence
distribution throughout the training process. Specifically, with an online
mixture model, we weight each pseudo-labeled sample by the posterior
probability that its pseudo label is correct, taking into account the
confidence distribution at that point in training. Unlike previous
hand-crafted filters, our SPF evolves together with
the deep neural network without manual tuning. Extensive experiments
demonstrate that incorporating SPF into the existing SSL methods can help
improve the performance of SSL, especially when the labeled data is extremely
scarce.
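The core mechanism, as described, is to weight each pseudo-labeled sample by the posterior that its label is correct under an online mixture fit to the confidence values. A minimal sketch of that idea, assuming a two-component 1-D Gaussian mixture fit by EM (the component family and all names below are illustrative, not taken from the authors' code):

```python
import numpy as np

def fit_confidence_mixture(conf, n_iters=50, eps=1e-8):
    """Fit a two-component 1-D Gaussian mixture to confidence values via EM.

    The lower-mean component is meant to capture incorrect pseudo labels,
    the higher-mean component the correct ones.
    """
    conf = np.asarray(conf, dtype=np.float64)
    mu = np.array([np.quantile(conf, 0.25), np.quantile(conf, 0.75)])
    var = np.array([conf.var() + eps, conf.var() + eps])
    w = np.array([0.5, 0.5])  # mixing weights

    for _ in range(n_iters):
        # E-step: posterior responsibility of each component for each sample.
        diff = conf[:, None] - mu[None, :]
        log_p = -0.5 * diff**2 / var - 0.5 * np.log(2.0 * np.pi * var) + np.log(w)
        log_p -= log_p.max(axis=1, keepdims=True)  # stabilize before exp
        resp = np.exp(log_p)
        resp /= resp.sum(axis=1, keepdims=True)

        # M-step: re-estimate means, variances, and mixing weights.
        nk = resp.sum(axis=0) + eps
        mu = (resp * conf[:, None]).sum(axis=0) / nk
        var = (resp * (conf[:, None] - mu) ** 2).sum(axis=0) / nk + eps
        w = nk / nk.sum()

    return mu, resp

def correctness_weights(conf):
    """Per-sample weight: posterior of the higher-mean ("correct") component,
    used in place of a hard confidence threshold."""
    mu, resp = fit_confidence_mixture(conf)
    return resp[:, int(np.argmax(mu))]
```

In use, these weights would multiply the per-sample unsupervised loss instead of a fixed confidence cutoff, and the mixture would be refit periodically so the filter adapts as the model's confidence distribution shifts.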
Related papers
- Less is More: Pseudo-Label Filtering for Continual Test-Time Adaptation [13.486951040331899]
Continual Test-Time Adaptation (CTTA) aims to adapt a pre-trained model to a sequence of target domains during the test phase without accessing the source data.
Existing methods rely on constructing pseudo-labels for all samples and updating the model through self-training.
We propose Pseudo Labeling Filter (PLF) to improve the quality of pseudo-labels.
arXiv Detail & Related papers (2024-06-03T04:09:36Z)
- Generalized Semi-Supervised Learning via Self-Supervised Feature Adaptation [87.17768598044427]
Traditional semi-supervised learning assumes that the feature distributions of labeled and unlabeled data are consistent.
We propose Self-Supervised Feature Adaptation (SSFA), a generic framework for improving SSL performance when labeled and unlabeled data come from different distributions.
Our proposed SSFA is applicable to various pseudo-label-based SSL learners and significantly improves performance in labeled, unlabeled, and even unseen distributions.
arXiv Detail & Related papers (2024-05-31T03:13:45Z)
- Uncertainty-Aware Pseudo-Label Filtering for Source-Free Unsupervised Domain Adaptation [45.53185386883692]
Source-free unsupervised domain adaptation (SFUDA) aims to enable the utilization of a pre-trained source model in an unlabeled target domain without access to source data.
We propose a method called Uncertainty-aware Pseudo-label-filtering Adaptation (UPA) to efficiently address this issue in a coarse-to-fine manner.
arXiv Detail & Related papers (2024-03-17T16:19:40Z)
- Semi-Supervised Class-Agnostic Motion Prediction with Pseudo Label Regeneration and BEVMix [59.55173022987071]
We study the potential of semi-supervised learning for class-agnostic motion prediction.
Our framework adopts a consistency-based self-training paradigm, enabling the model to learn from unlabeled data.
Our method achieves performance comparable to weakly supervised and even some fully supervised methods.
arXiv Detail & Related papers (2023-12-13T09:32:50Z)
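The entry above names a consistency-based self-training paradigm. As a point of reference, here is a generic sketch of that pattern (FixMatch-style pseudo-labeling on augmented views, not the paper's BEVMix-specific method; the function and augmentation names are illustrative):

```python
import torch
import torch.nn.functional as F

def consistency_self_training_loss(model, x_unlabeled, weak_aug, strong_aug, tau=0.95):
    """Generic consistency-based self-training step: pseudo-label a weakly
    augmented view, then train the model to match that label on a strongly
    augmented view. `weak_aug` and `strong_aug` are assumed augmentation
    callables."""
    with torch.no_grad():
        probs = F.softmax(model(weak_aug(x_unlabeled)), dim=1)
        conf, pseudo = probs.max(dim=1)
        mask = (conf >= tau).float()  # keep only confident pseudo labels

    logits_strong = model(strong_aug(x_unlabeled))
    per_sample = F.cross_entropy(logits_strong, pseudo, reduction="none")
    return (mask * per_sample).mean()
```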
- InPL: Pseudo-labeling the Inliers First for Imbalanced Semi-supervised Learning [34.062061310242385]
We present a new perspective on pseudo-labeling for imbalanced semi-supervised learning (SSL).
We measure whether an unlabeled sample is likely to be "in-distribution" or "out-of-distribution".
Experiments demonstrate that our energy-based pseudo-labeling method, InPL, significantly outperforms confidence-based methods on imbalanced SSL benchmarks.
arXiv Detail & Related papers (2023-03-13T16:45:41Z)
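The energy-based selection described above can be illustrated with the standard free-energy score computed from classifier logits (Liu et al., 2020); the threshold value and function names below are illustrative assumptions, not taken from the InPL paper:

```python
import torch

def energy_score(logits, temperature=1.0):
    """Free-energy score from classifier logits; lower energy suggests the
    sample is more likely in-distribution."""
    return -temperature * torch.logsumexp(logits / temperature, dim=1)

def select_inlier_pseudo_labels(logits, energy_threshold=-8.0):
    """Admit a pseudo label when the sample's energy falls below a threshold,
    rather than thresholding the softmax confidence."""
    pseudo = logits.argmax(dim=1)
    mask = energy_score(logits) < energy_threshold
    return pseudo, mask
```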
- Filter and evolve: progressive pseudo label refining for semi-supervised automatic speech recognition [5.735000563764309]
Low quality pseudo labels can misguide decision boundaries and degrade performance.
We propose a simple yet effective strategy to filter low quality pseudo labels.
Experiments on LibriSpeech show that these filtered samples enable the refined model to yield more correct predictions.
arXiv Detail & Related papers (2022-10-28T16:15:58Z)
- Self-Tuning for Data-Efficient Deep Learning [75.34320911480008]
Self-Tuning is a novel approach to enable data-efficient deep learning.
It unifies the exploration of labeled and unlabeled data and the transfer of a pre-trained model.
It outperforms its SSL and TL counterparts on five tasks by sharp margins.
arXiv Detail & Related papers (2021-02-25T14:56:19Z)
- In Defense of Pseudo-Labeling: An Uncertainty-Aware Pseudo-label Selection Framework for Semi-Supervised Learning [53.1047775185362]
Pseudo-labeling (PL) is a general SSL approach that does not rely on domain-specific data augmentations, but it performs relatively poorly in its original formulation.
We argue that PL underperforms due to the erroneous high confidence predictions from poorly calibrated models.
We propose an uncertainty-aware pseudo-label selection (UPS) framework which improves pseudo labeling accuracy by drastically reducing the amount of noise encountered in the training process.
arXiv Detail & Related papers (2021-01-15T23:29:57Z)
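Selecting pseudo labels by both confidence and uncertainty, as UPS does, can be sketched as follows; MC-dropout is one common way to estimate the uncertainty, and the thresholds here are illustrative assumptions rather than the paper's values:

```python
import torch

def ups_style_selection(model, x, n_passes=10, conf_thresh=0.9, unc_thresh=0.05):
    """Keep a pseudo label only if the averaged prediction is confident AND
    its MC-dropout uncertainty is low."""
    model.train()  # keep dropout layers active for MC sampling
    with torch.no_grad():
        probs = torch.stack(
            [torch.softmax(model(x), dim=1) for _ in range(n_passes)]
        )  # shape: (n_passes, batch, classes)
    mean_probs = probs.mean(dim=0)
    conf, pseudo = mean_probs.max(dim=1)
    # Uncertainty: std of the chosen class's probability across passes.
    unc = probs.std(dim=0).gather(1, pseudo.unsqueeze(1)).squeeze(1)
    mask = (conf > conf_thresh) & (unc < unc_thresh)
    return pseudo, mask
```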
- Two-phase Pseudo Label Densification for Self-training based Domain Adaptation [93.03265290594278]
We propose a novel Two-phase Pseudo Label Densification framework, referred to as TPLD.
In the first phase, we use sliding window voting to propagate the confident predictions, utilizing intrinsic spatial-correlations in the images.
In the second phase, we perform a confidence-based easy-hard classification.
To ease the training process and avoid noisy predictions, we introduce the bootstrapping mechanism to the original self-training loss.
arXiv Detail & Related papers (2020-12-09T02:35:25Z)
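The first-phase sliding-window voting described above exploits spatial correlation in dense prediction maps. A simplified sketch (the parameter values and names are illustrative; the second phase and the bootstrapped loss are not shown):

```python
import numpy as np

def sliding_window_vote(pred, conf, conf_thresh=0.9, window=5):
    """Densification sketch: a low-confidence pixel adopts the majority label
    among confident pixels in its local window. `pred` is an (H, W) integer
    label map and `conf` an (H, W) confidence map."""
    H, W = pred.shape
    out = pred.copy()
    r = window // 2
    confident = conf >= conf_thresh
    for i in range(H):
        for j in range(W):
            if confident[i, j]:
                continue  # confident predictions are kept as-is
            win = confident[max(i - r, 0):i + r + 1, max(j - r, 0):j + r + 1]
            labels = pred[max(i - r, 0):i + r + 1, max(j - r, 0):j + r + 1][win]
            if labels.size:  # densify only where confident neighbours exist
                out[i, j] = np.bincount(labels).argmax()
    return out
```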
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of the information presented and is not responsible for any consequences arising from its use.