Generating Unbiased Pseudo-labels via a Theoretically Guaranteed Chebyshev Constraint to Unify Semi-supervised Classification and Regression
- URL: http://arxiv.org/abs/2311.01782v1
- Date: Fri, 3 Nov 2023 08:39:35 GMT
- Title: Generating Unbiased Pseudo-labels via a Theoretically Guaranteed Chebyshev Constraint to Unify Semi-supervised Classification and Regression
- Authors: Jiaqi Wu, Junbiao Pang, Qingming Huang
- Abstract summary: The threshold-to-pseudo-label process (T2L) in classification uses confidence to determine the quality of a label.
By its nature, regression also requires unbiased methods to generate high-quality labels.
We propose a theoretically guaranteed constraint for generating unbiased labels based on Chebyshev's inequality.
- Score: 57.17120203327993
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Both semi-supervised classification and regression are practically
challenging tasks for computer vision. However, semi-supervised classification
methods are barely applied to regression tasks, because the
threshold-to-pseudo-label process (T2L) in classification uses confidence to
determine the quality of a label: it is successful for classification tasks but
inefficient for regression tasks. By its nature, regression also requires
unbiased methods to generate high-quality labels. On the other hand, T2L for
classification often fails if the confidence is generated by a biased method.
To address this issue, we propose a theoretically guaranteed constraint for
generating unbiased labels based on Chebyshev's inequality, combining multiple
predictions to generate superior-quality labels from several inferior ones. In
terms of high-quality labels, the unbiased method naturally avoids the drawback
of T2L. Specifically, we propose an Unbiased Pseudo-labels network (UBPL
network) with multiple branches to combine multiple predictions as
pseudo-labels, where a Feature Decorrelation loss (FD loss) is proposed based
on the Chebyshev constraint. In principle, our method can be used for both
classification and regression and can easily be extended to any semi-supervised
framework, e.g., Mean Teacher, FixMatch, or DualPose. Our approach achieves
superior performance over state-of-the-art methods on the pose estimation
datasets Mouse, FLIC, and LSP, as well as on the classification datasets
CIFAR-10/100 and SVHN.
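To make the Chebyshev constraint concrete: for any distribution, P(|X - E[X]| >= k*sigma) <= 1/k^2, so when a pseudo-label is formed by averaging several branch predictions, shrinking the disagreement (variance) across branches tightens the guaranteed radius around the combined label. The minimal NumPy sketch below illustrates only this combination step; the function names and the variance-style penalty are assumptions for illustration, not the paper's exact UBPL architecture or FD loss.

```python
import numpy as np

def combine_pseudo_labels(branch_preds, k=2.0):
    """Average per-branch predictions into one pseudo-label and report the
    Chebyshev bound P(|X - mu| >= k*sigma) <= 1/k^2 on its deviation.

    branch_preds: (num_branches, ...) array, one prediction per branch.
    """
    mu = branch_preds.mean(axis=0)      # combined pseudo-label
    sigma = branch_preds.std(axis=0)    # disagreement across branches
    bound = 1.0 / (k ** 2)              # distribution-free guarantee
    return mu, sigma, bound

def disagreement_penalty(branch_preds):
    """Illustrative surrogate for a decorrelation-style loss: minimizing the
    cross-branch variance tightens the k*sigma radius in the bound above."""
    return branch_preds.var(axis=0).mean()

# Toy usage: three branches predicting one 2-D pose coordinate.
preds = np.array([[10.2, 5.1], [9.8, 5.3], [10.0, 4.9]])
label, spread, bound = combine_pseudo_labels(preds, k=2.0)
print(label, spread, bound)  # with k=2 the deviation bound is 0.25
```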
Related papers
- Partial-Label Regression [54.74984751371617]
Partial-label learning is a weakly supervised learning setting that allows each training example to be annotated with a set of candidate labels.
Previous studies on partial-label learning only focused on the classification setting where candidate labels are all discrete.
In this paper, we provide the first attempt to investigate partial-label regression, where each training example is annotated with a set of real-valued candidate labels.
arXiv Detail & Related papers (2023-06-15T09:02:24Z)
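As a concrete illustration of the partial-label regression setting above, one simple identification-style objective scores the prediction against every real-valued candidate and trains on the closest one. This is a sketch of the problem setup under that assumption, not necessarily the estimator the paper proposes; the function name is hypothetical.

```python
import torch

def min_candidate_mse(pred, candidates):
    """Partial-label regression toy loss: for each example, take the smallest
    squared error over its set of real-valued candidate labels.

    pred:       (batch,) model outputs
    candidates: (batch, num_candidates) candidate labels per example
    """
    errs = (candidates - pred.unsqueeze(1)) ** 2  # (batch, num_candidates)
    return errs.min(dim=1).values.mean()

# Toy usage: two examples, each annotated with three candidate labels.
pred = torch.tensor([1.0, 2.5], requires_grad=True)
cands = torch.tensor([[0.9, 3.0, 5.0], [2.4, 0.0, 7.0]])
min_candidate_mse(pred, cands).backward()  # pulls toward nearest candidates
```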
- Lifting Weak Supervision To Structured Prediction [12.219011764895853]
Weak supervision (WS) is a rich set of techniques that produce pseudolabels by aggregating easily obtained but potentially noisy label estimates.
We introduce techniques new to weak supervision based on pseudo-Euclidean embeddings and tensor decompositions.
Several of our results, which can be viewed as robustness guarantees in structured prediction with noisy labels, may be of independent interest.
arXiv Detail & Related papers (2022-11-24T02:02:58Z)
- Semi-supervised Contrastive Outlier removal for Pseudo Expectation Maximization (SCOPE) [2.33877878310217]
We present a new approach to suppress confounding errors through a method we describe as Semi-supervised Contrastive Outlier removal for Pseudo Expectation Maximization (SCOPE).
Our results show that SCOPE greatly improves semi-supervised classification accuracy over a baseline and, when combined with consistency regularization, achieves the highest reported accuracy for the semi-supervised CIFAR-10 classification task using 250 and 4000 labeled samples.
arXiv Detail & Related papers (2022-06-28T19:32:50Z)
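The SCOPE summary above is high-level; as a rough, generic illustration of outlier removal inside a pseudo-labeling loop, one can prune unlabeled samples whose features sit far from the centroid of their pseudo-class. The distance rule and names below are assumptions, not the actual SCOPE procedure.

```python
import numpy as np

def remove_pseudo_label_outliers(features, pseudo_labels, z_thresh=2.0):
    """Keep unlabeled samples that lie close to their pseudo-class centroid
    in feature space; far-away points are treated as likely confounders.

    features:      (n, d) feature vectors of unlabeled samples
    pseudo_labels: (n,) integer pseudo-classes
    Returns a boolean mask of samples to keep.
    """
    keep = np.zeros(len(features), dtype=bool)
    for c in np.unique(pseudo_labels):
        idx = np.where(pseudo_labels == c)[0]
        dists = np.linalg.norm(features[idx] - features[idx].mean(axis=0), axis=1)
        z = (dists - dists.mean()) / (dists.std() + 1e-8)  # within-class z-score
        keep[idx] = z < z_thresh                           # prune the far tail
    return keep
```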
- Optimizing Diffusion Rate and Label Reliability in a Graph-Based Semi-supervised Classifier [2.4366811507669124]
The Local and Global Consistency (LGC) algorithm is one of the most well-known graph-based semi-supervised (GSSL) classifiers.
We discuss how removing the self-influence of a labeled instance may be beneficial, and how it relates to leave-one-out error.
Within this framework, we propose methods to estimate label reliability and diffusion rate.
arXiv Detail & Related papers (2022-01-10T16:58:52Z)
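For reference, the LGC classifier mentioned above propagates labels with the well-known iteration F_{t+1} = alpha * S * F_t + (1 - alpha) * Y, where S is the symmetrically normalized affinity matrix and alpha acts as the diffusion rate; a compact sketch follows. The paper's reliability and diffusion-rate estimators are not reproduced here.

```python
import numpy as np

def lgc_propagate(W, Y, alpha=0.9, iters=100):
    """Local and Global Consistency (LGC) label propagation.

    W: (n, n) symmetric affinity matrix with zero diagonal
    Y: (n, c) one-hot rows for labeled nodes, zero rows for unlabeled ones
    alpha: diffusion rate, trading neighbor agreement against initial labels
    """
    d = W.sum(axis=1)
    D_inv_sqrt = np.diag(1.0 / np.sqrt(d + 1e-12))
    S = D_inv_sqrt @ W @ D_inv_sqrt           # normalized graph
    F = Y.astype(float).copy()
    for _ in range(iters):
        F = alpha * S @ F + (1 - alpha) * Y   # converges to (I - alpha*S)^(-1) Y
    return F.argmax(axis=1)                   # predicted class per node
```

Removing a labeled node's self-influence amounts to discounting its own diagonal contribution in (I - alpha*S)^(-1), which is how the idea connects to leave-one-out error.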
- Unbiased Loss Functions for Multilabel Classification with Missing Labels [2.1549398927094874]
Missing labels are a ubiquitous phenomenon in extreme multi-label classification (XMC) tasks.
This paper derives the unique unbiased estimators for the different multilabel reductions.
arXiv Detail & Related papers (2021-09-23T10:39:02Z)
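A standard route to the kind of unbiased estimator discussed above is inverse-propensity weighting: if a true positive for label j is observed only with probability p_j, reweighting observed terms by 1/p_j makes the expected loss equal the fully labeled one. The sketch below shows this generic construction only; the paper derives estimators specifically for the different multilabel reductions, which is not reproduced here.

```python
import numpy as np

def ips_bce(scores, observed, propensity):
    """Inverse-propensity-scored binary cross-entropy for missing positives.

    scores:     (n, L) predicted probabilities per label
    observed:   (n, L) 0/1 matrix of observed positives (missing ones are 0)
    propensity: (L,) probability that a true positive of label j is observed
    Taking the expectation over the observation process recovers the BCE on
    the fully observed label matrix, i.e. the estimator is unbiased.
    """
    eps = 1e-12
    w = observed / propensity                       # 1/p_j on observed positives
    pos = w * -np.log(scores + eps)
    neg = (1.0 - w) * -np.log(1.0 - scores + eps)   # corrects missing positives
    return (pos + neg).mean()
```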
- Distribution-Aware Semantics-Oriented Pseudo-label for Imbalanced Semi-Supervised Learning [80.05441565830726]
This paper addresses imbalanced semi-supervised learning, where heavily biased pseudo-labels can harm the model performance.
Motivated by this observation, we propose a general pseudo-labeling framework to address the bias.
We term the novel pseudo-labeling framework for imbalanced SSL as Distribution-Aware Semantics-Oriented (DASO) Pseudo-label.
arXiv Detail & Related papers (2021-06-10T11:58:25Z)
- Rethinking Pseudo Labels for Semi-Supervised Object Detection [84.697097472401]
We introduce certainty-aware pseudo labels tailored for object detection.
We dynamically adjust the thresholds used to generate pseudo labels and reweight loss functions for each category to alleviate the class imbalance problem.
Our approach improves supervised baselines by up to 10% AP using only 1-10% labeled data from COCO.
arXiv Detail & Related papers (2021-06-01T01:32:03Z)
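A minimal sketch of the dynamic per-category thresholding mentioned above: each class keeps its own threshold, lowered when too few of its candidates are accepted so that rare categories still yield pseudo-labels. The update rule and parameters here are illustrative assumptions, not the paper's exact schedule.

```python
import numpy as np

def update_class_thresholds(thresholds, confidences, labels,
                            target_rate=0.5, step=0.01, lo=0.5, hi=0.95):
    """Nudge per-class pseudo-label thresholds toward a target accept rate.

    thresholds:  (C,) current threshold per class
    confidences: (n,) confidence of each candidate pseudo-label
    labels:      (n,) predicted class of each candidate
    """
    for c in range(len(thresholds)):
        conf_c = confidences[labels == c]
        if conf_c.size == 0:
            continue                      # no candidates for this class
        accept_rate = (conf_c >= thresholds[c]).mean()
        if accept_rate < target_rate:     # rare class: make it easier to pass
            thresholds[c] = max(lo, thresholds[c] - step)
        else:                             # frequent class: be stricter
            thresholds[c] = min(hi, thresholds[c] + step)
    return thresholds
```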
- PLM: Partial Label Masking for Imbalanced Multi-label Classification [59.68444804243782]
Neural networks trained on real-world datasets with long-tailed label distributions are biased towards frequent classes and perform poorly on infrequent classes.
We propose a method, Partial Label Masking (PLM), which utilizes the ratio between positive and negative labels of each class during training.
Our method achieves strong performance when compared to existing methods on both multi-label (MultiMNIST and MSCOCO) and single-label (imbalanced CIFAR-10 and CIFAR-100) image classification datasets.
arXiv Detail & Related papers (2021-05-22T18:07:56Z)
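A rough sketch of ratio-based masking in the spirit of PLM: when a class's observed positive-to-negative ratio exceeds a target, some of its positives are masked out of the loss (and conversely for negatives). The target ratio and the stochastic keep rule are illustrative assumptions rather than the paper's exact procedure.

```python
import numpy as np

def partial_label_mask(targets, target_ratio, rng):
    """Stochastically mask labels so each class's effective positive:negative
    ratio moves toward target_ratio; masked entries are excluded from the loss.

    targets:      (n, C) 0/1 multi-label matrix
    target_ratio: (C,) desired positives-per-negative for each class
    Returns a 0/1 mask of the same shape (1 = keep this label in the loss).
    """
    n, C = targets.shape
    mask = np.ones_like(targets, dtype=float)
    for c in range(C):
        pos = targets[:, c].sum()
        ratio = pos / max(n - pos, 1)
        if ratio > target_ratio[c]:       # over-represented: thin out positives
            p_keep = target_ratio[c] / ratio
            drop = (targets[:, c] == 1) & (rng.random(n) > p_keep)
        else:                             # under-represented: thin out negatives
            p_keep = ratio / max(target_ratio[c], 1e-12)
            drop = (targets[:, c] == 0) & (rng.random(n) > p_keep)
        mask[drop, c] = 0.0
    return mask

# Toy usage: 100 samples, 2 classes, aiming for one positive per negative.
rng = np.random.default_rng(0)
y = (rng.random((100, 2)) < [0.8, 0.1]).astype(int)
m = partial_label_mask(y, np.array([1.0, 1.0]), rng)
```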
- Pointwise Binary Classification with Pairwise Confidence Comparisons [97.79518780631457]
We propose pairwise comparison (Pcomp) classification, where we have only pairs of unlabeled data for which we know one instance is more likely to be positive than the other.
We link Pcomp classification to noisy-label learning to develop a progressive unbiased risk estimator (URE) and improve it by imposing consistency regularization.
arXiv Detail & Related papers (2020-10-05T09:23:58Z)
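To illustrate the Pcomp supervision signal, the sketch below applies a plain pairwise logistic (RankNet-style) loss that pushes the "more likely positive" example's score above its partner's. This generic ranking baseline is an assumption for illustration, not the progressive unbiased risk estimator the paper develops.

```python
import torch
import torch.nn.functional as F

def pcomp_pairwise_loss(score_more_pos, score_other):
    """Pairwise logistic loss on Pcomp data: each pair (x, x') carries only
    the annotation 'x is more likely to be positive than x''; the loss
    encourages f(x) > f(x'). Not the paper's unbiased risk estimator.
    """
    return F.softplus(score_other - score_more_pos).mean()

# Toy usage: scores for five comparison pairs.
s_more = torch.randn(5, requires_grad=True)  # the 'more positive' items
s_other = torch.randn(5)
pcomp_pairwise_loss(s_more, s_other).backward()
```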