Pseudo Label Selection is a Decision Problem
- URL: http://arxiv.org/abs/2309.13926v2
- Date: Tue, 26 Sep 2023 07:43:09 GMT
- Title: Pseudo Label Selection is a Decision Problem
- Authors: Julian Rodemann
- Abstract summary: Pseudo-Labeling is a simple and effective approach to semi-supervised learning.
It requires criteria that guide the selection of pseudo-labeled data.
Overfitting can be propagated to the final model by choosing instances with overconfident but wrong predictions.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Pseudo-Labeling is a simple and effective approach to semi-supervised
learning. It requires criteria that guide the selection of pseudo-labeled data.
The latter have been shown to crucially affect pseudo-labeling's generalization
performance. Several such criteria exist and were proven to work reasonably
well in practice. However, their performance often depends on the initial model
fit on labeled data. Early overfitting can be propagated to the final model by
choosing instances with overconfident but wrong predictions, often called
confirmation bias. In two recent works, we demonstrate that pseudo-label
selection (PLS) can be naturally embedded into decision theory. This paves the
way for BPLS, a Bayesian framework for PLS that mitigates the issue of
confirmation bias. At its heart is a novel selection criterion: an analytical
approximation of the posterior predictive of pseudo-samples and labeled data.
We derive this selection criterion by proving the Bayes-optimality of this
"pseudo posterior predictive". We empirically assess BPLS for generalized
linear models, non-parametric generalized additive models, and Bayesian neural
networks on
simulated and real-world data. When faced with data prone to overfitting and
thus a high chance of confirmation bias, BPLS outperforms traditional PLS
methods. The decision-theoretic embedding further allows us to render PLS more
robust towards the involved modeling assumptions. To achieve this goal, we
introduce a multi-objective utility function. We demonstrate that the latter
can be constructed to account for different sources of uncertainty and explore
three examples: model selection, accumulation of errors and covariate shift.
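To make the decision-theoretic embedding concrete, the following is a minimal sketch of one PLS iteration in this spirit. It is not the paper's implementation: the BIC-style `approx_log_evidence` merely stands in for the analytical approximation of the pseudo posterior predictive, and logistic regression stands in for whichever model is being self-trained.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

def approx_log_evidence(X, y):
    # BIC-style stand-in for the paper's analytical approximation of the
    # "pseudo posterior predictive": maximized log-likelihood minus a
    # complexity penalty. Assumes integer class labels 0..C-1.
    model = LogisticRegression(max_iter=1000).fit(X, y)
    n, k = X.shape
    probs = model.predict_proba(X)
    log_lik = np.sum(np.log(probs[np.arange(n), y]))
    return log_lik - 0.5 * (k + 1) * np.log(n)

def pls_decision(X_lab, y_lab, X_unlab, model):
    # One selection decision: among all unlabeled points (the available
    # "actions"), pick the one whose predicted pseudo-label maximizes the
    # approximate evidence of labeled data plus that single pseudo-sample.
    y_pseudo = model.predict(X_unlab)
    scores = [
        approx_log_evidence(np.vstack([X_lab, X_unlab[i:i + 1]]),
                            np.append(y_lab, y_pseudo[i]))
        for i in range(len(X_unlab))
    ]
    best = int(np.argmax(scores))
    return best, y_pseudo[best]
```

The selected point would then be moved from the unlabeled pool into the labeled set with its pseudo-label, and the model refit, until a stopping rule triggers.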
Related papers
- Exploring Beyond Logits: Hierarchical Dynamic Labeling Based on Embeddings for Semi-Supervised Classification [49.09505771145326]
We propose a Hierarchical Dynamic Labeling (HDL) algorithm that does not depend on model predictions and utilizes image embeddings to generate sample labels.
Our approach has the potential to change the paradigm of pseudo-label generation in semi-supervised learning.
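The summary leaves the labeling mechanism abstract. As a purely illustrative sketch of the general idea of generating labels from embeddings rather than from model logits (this is plain nearest-centroid labeling, not the hierarchical, dynamic HDL procedure itself):

```python
import numpy as np

def centroid_pseudo_labels(emb_lab, y_lab, emb_unlab):
    # One centroid per class from labeled embeddings; each unlabeled
    # embedding is labeled by its most cosine-similar class centroid.
    classes = np.unique(y_lab)
    centroids = np.stack([emb_lab[y_lab == c].mean(axis=0) for c in classes])
    centroids /= np.linalg.norm(centroids, axis=1, keepdims=True)
    emb = emb_unlab / np.linalg.norm(emb_unlab, axis=1, keepdims=True)
    sims = emb @ centroids.T  # (n_unlabeled, n_classes)
    return classes[np.argmax(sims, axis=1)]
```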
arXiv Detail & Related papers (2024-04-26T06:00:27Z)
- IBADR: an Iterative Bias-Aware Dataset Refinement Framework for Debiasing NLU models [52.03761198830643]
We propose IBADR, an Iterative Bias-Aware dataset Refinement framework.
We first train a shallow model to quantify the bias degree of samples in the pool.
Then, we pair each sample with a bias indicator representing its bias degree, and use these extended samples to train a sample generator.
In this way, the generator can effectively learn the correspondence between bias indicators and samples.
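Read as a pipeline, the summary suggests a loop roughly like the skeleton below. All interfaces here (`bias_degree`, `fit`, `sample`) are hypothetical placeholders for illustration, not the authors' API:

```python
def ibadr_round(pool, shallow_model, generator, n_new=1000):
    # Step 1: a shallow model quantifies each pooled sample's bias degree.
    bias_scores = [shallow_model.bias_degree(x) for x in pool]  # hypothetical
    # Step 2: pair each sample with its bias indicator.
    extended = list(zip(pool, bias_scores))
    # Step 3: the generator learns the correspondence between bias
    # indicators and samples.
    generator.fit(extended)  # hypothetical interface
    # Step 4: synthesize low-bias samples and grow the pool for the
    # next debiasing iteration.
    pool.extend(generator.sample(bias_indicator=0.0, n=n_new))
    return pool
```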
arXiv Detail & Related papers (2023-11-01T04:50:38Z)
- Leveraging Ensemble Diversity for Robust Self-Training in the Presence of Sample Selection Bias [5.698050337128548]
Self-training is a well-known approach for semi-supervised learning. It consists of iteratively assigning pseudo-labels to unlabeled data for which the model is confident and treating them as labeled examples.
For neural networks, softmax prediction probabilities are often used as a confidence measure, although they are known to be overconfident, even for wrong predictions.
We propose a novel confidence measure, called $\mathcal{T}$-similarity, built upon the prediction diversity of an ensemble of linear classifiers.
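The exact $\mathcal{T}$-similarity definition is in the paper; as a sketch of the underlying agreement idea, one can score confidence by the average pairwise inner product of the ensemble heads' predicted distributions, which is high only when the heads agree:

```python
import numpy as np

def agreement_confidence(probs):
    # probs: (n_heads, n_samples, n_classes) softmax outputs of an ensemble
    # of linear heads. Average pairwise dot product of head predictions,
    # in contrast to a single head's (often overconfident) max probability.
    m = probs.shape[0]
    total = np.zeros(probs.shape[1])
    for i in range(m):
        for j in range(m):
            if i != j:
                total += np.einsum("nc,nc->n", probs[i], probs[j])
    return total / (m * (m - 1))
```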
arXiv Detail & Related papers (2023-10-23T11:30:06Z)
- Large Language Models Are Not Robust Multiple Choice Selectors [117.72712117510953]
Multiple choice questions (MCQs) serve as a common yet important task format in the evaluation of large language models (LLMs).
This work shows that modern LLMs are vulnerable to option position changes due to their inherent "selection bias".
We propose a label-free, inference-time debiasing method, called PriDe, which separates the model's prior bias for option IDs from the overall prediction distribution.
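The summary describes separating the model's prior bias over option IDs from its content preference. A simplified sketch of that factorization, where `pred_fn` is a hypothetical callable returning the model's probability distribution over option IDs for a given ordering of option contents:

```python
import numpy as np

def estimate_id_prior(pred_fn, question, options, n_perm=8, seed=0):
    # Average the option-ID distribution over random permutations of the
    # option contents; content effects wash out, leaving the ID bias.
    rng = np.random.default_rng(seed)
    acc = np.zeros(len(options))
    for _ in range(n_perm):
        perm = rng.permutation(len(options))
        acc += pred_fn(question, [options[k] for k in perm])
    return acc / acc.sum()

def debias(p_observed, prior):
    # Divide out the prior option-ID bias and renormalize.
    scores = np.asarray(p_observed) / prior
    return scores / scores.sum()
```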
arXiv Detail & Related papers (2023-09-07T17:44:56Z)
- In all LikelihoodS: How to Reliably Select Pseudo-Labeled Data for Self-Training in Semi-Supervised Learning [0.0]
Self-training is a simple yet effective method within semi-supervised learning.
In this paper, we aim to render PLS more robust towards the involved modeling assumptions.
Results suggest that, in particular, robustness w.r.t. model choice can lead to substantial accuracy gains.
arXiv Detail & Related papers (2023-03-02T10:00:37Z)
- Approximately Bayes-Optimal Pseudo Label Selection [0.5249805590164901]
Semi-supervised learning by self-training heavily relies on pseudo-label selection (PLS).
Early overfitting might thus be propagated to the final model by selecting instances with overconfident but erroneous predictions.
This paper introduces BPLS, a Bayesian framework for PLS that aims to mitigate this issue.
arXiv Detail & Related papers (2023-02-17T14:07:32Z)
- Correcting Model Bias with Sparse Implicit Processes [0.9187159782788579]
We show that Sparse Implicit Processes (SIP) is capable of correcting model bias when the data generating mechanism differs strongly from the one implied by the model.
We use synthetic datasets to show that SIP is capable of providing predictive distributions that reflect the data better than the exact predictions of the initial, but wrongly assumed model.
arXiv Detail & Related papers (2022-07-21T18:00:01Z)
- LOPS: Learning Order Inspired Pseudo-Label Selection for Weakly Supervised Text Classification [28.37907856670151]
Pseudo-labels are inherently noisy, so selecting the correct ones offers substantial potential for performance gains.
We propose LOPS, a novel pseudo-label selection method that takes the learning order of samples into consideration.
LOPS can be viewed as a strong performance-boosting plug-in to most existing weakly supervised text classification methods.
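As a hedged sketch of the learning-order idea (not the exact LOPS criterion): record the first epoch from which the model's prediction agrees with the pseudo-label and stays that way; samples learned earlier tend to be correctly labeled, so keep the earliest-learned fraction.

```python
import numpy as np

def learning_order_select(pred_history, pseudo_labels, keep_frac=0.5):
    # pred_history: (n_epochs, n_samples) predicted labels per epoch.
    n_epochs, n_samples = pred_history.shape
    learned_at = np.full(n_samples, n_epochs)  # sentinel: never learned
    for i in range(n_samples):
        for t in range(n_epochs):
            if np.all(pred_history[t:, i] == pseudo_labels[i]):
                learned_at[i] = t
                break
    order = np.argsort(learned_at)
    return order[: int(keep_frac * n_samples)]  # indices of kept samples
```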
arXiv Detail & Related papers (2022-05-25T06:46:48Z)
- Training on Test Data with Bayesian Adaptation for Covariate Shift [96.3250517412545]
Deep neural networks often make inaccurate predictions with unreliable uncertainty estimates.
We derive a Bayesian model that provides for a well-defined relationship between unlabeled inputs under distributional shift and model parameters.
We show that our method improves both accuracy and uncertainty estimation.
arXiv Detail & Related papers (2021-09-27T01:09:08Z)
- Dash: Semi-Supervised Learning with Dynamic Thresholding [72.74339790209531]
We propose a semi-supervised learning (SSL) approach that uses unlabeled examples to train models.
Our proposed approach, Dash, is adaptive in how it selects unlabeled data.
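A hedged sketch of the dynamic-thresholding idea: an unlabeled example is kept only while its loss under the current pseudo-label falls below a threshold that shrinks over training. The constants below are illustrative, not the paper's tuning:

```python
import numpy as np

def dash_keep_mask(losses, step, rho_init, gamma=1.3, c=1.0001):
    # Threshold decays geometrically with the training step (gamma > 1),
    # so selection becomes stricter as the model improves.
    rho_t = c * gamma ** (-step) * rho_init
    return np.asarray(losses) < rho_t
```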
arXiv Detail & Related papers (2021-09-01T23:52:29Z)
- Scalable Marginal Likelihood Estimation for Model Selection in Deep Learning [78.83598532168256]
Marginal-likelihood-based model selection is rarely used in deep learning due to estimation difficulties.
Our work shows that marginal likelihoods can improve generalization and be useful when validation data is unavailable.
arXiv Detail & Related papers (2021-04-11T09:50:24Z)
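For context, a standard route to such estimates is the Laplace approximation of the log marginal likelihood, $\log p(\mathcal{D} \mid M) \approx \log p(\mathcal{D} \mid \theta_*, M) + \log p(\theta_* \mid M) + \frac{d}{2}\log(2\pi) - \frac{1}{2}\log\lvert H_{\theta_*}\rvert$, where $\theta_*$ is a MAP estimate, $d$ the number of parameters, and $H_{\theta_*}$ the Hessian of the negative log joint at $\theta_*$. Whether this is the estimator used in the paper cannot be read off the summary alone.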