Boosting Semi-Supervised Learning with Contrastive Complementary
Labeling
- URL: http://arxiv.org/abs/2212.06643v1
- Date: Tue, 13 Dec 2022 15:25:49 GMT
- Title: Boosting Semi-Supervised Learning with Contrastive Complementary
Labeling
- Authors: Qinyi Deng, Yong Guo, Zhibang Yang, Haolin Pan, Jian Chen
- Abstract summary: A popular approach is pseudo-labeling that generates pseudo labels only for those unlabeled data with high-confidence predictions.
We highlight that data with low-confidence pseudo labels can still be beneficial to the training process.
Inspired by this, we propose a novel Contrastive Complementary Labeling (CCL) method that constructs a large number of reliable negative pairs.
- Score: 11.851898765002334
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Semi-supervised learning (SSL) has achieved great success in leveraging a
large amount of unlabeled data to learn a promising classifier. A popular
approach is pseudo-labeling that generates pseudo labels only for those
unlabeled data with high-confidence predictions. As for the low-confidence
ones, existing methods often simply discard them because these unreliable
pseudo labels may mislead the model. Nevertheless, we highlight that these data
with low-confidence pseudo labels can still be beneficial to the training
process. Specifically, although the class with the highest probability in the
prediction is unreliable, we can assume that this sample is very unlikely to
belong to the classes with the lowest probabilities. In this way, these data
can also be very informative if we can effectively exploit these complementary
labels, i.e., the classes that a sample does not belong to. Inspired by this,
we propose a novel Contrastive Complementary Labeling (CCL) method that
constructs a large number of reliable negative pairs based on the complementary
labels and adopts contrastive learning to make use of all the unlabeled data.
Extensive experiments demonstrate that CCL significantly improves performance
on top of existing methods. More critically, our CCL is particularly effective
under label-scarce settings. For example, we obtain an improvement of 2.43%
over FixMatch on CIFAR-10 with only 40 labeled samples.
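
The abstract describes the core mechanism only at a high level: treat each sample's lowest-probability classes as complementary labels (classes it almost surely does not belong to), build reliable negative pairs from them, and feed those pairs into a contrastive objective so low-confidence data are used rather than discarded. The sketch below is a minimal illustration of that idea under assumptions not stated in the abstract (two augmented views of each image as the positive pair, the k lowest-probability classes as the complementary set, an InfoNCE-style loss); the function names, defaults, and loss form are hypothetical and do not reproduce the authors' implementation.

import torch
import torch.nn.functional as F


def complementary_mask(probs: torch.Tensor, k: int = 3) -> torch.Tensor:
    # Boolean [B, C] mask marking each sample's k lowest-probability classes,
    # i.e. the classes the sample is assumed NOT to belong to. The choice of k
    # is an illustrative assumption, not a value from the paper.
    _, lowest = probs.topk(k, dim=1, largest=False)
    mask = torch.zeros_like(probs, dtype=torch.bool)
    mask.scatter_(1, lowest, True)
    return mask


def ccl_style_loss(feat_a: torch.Tensor,
                   feat_b: torch.Tensor,
                   probs: torch.Tensor,
                   k: int = 3,
                   temperature: float = 0.1) -> torch.Tensor:
    # feat_a, feat_b: [B, D] projections of two augmented views of one
    # unlabeled batch; probs: [B, C] predicted class probabilities.
    # Positive pair: the two views of the same image (standard contrastive
    # setup). Negative pairs: image i vs. image j whenever j's tentative class
    # falls inside i's complementary set, so even low-confidence samples
    # contribute reliable negatives instead of being discarded.
    za = F.normalize(feat_a, dim=1)                      # [B, D]
    zb = F.normalize(feat_b, dim=1)                      # [B, D]
    sim = za @ zb.t() / temperature                      # [B, B] similarities

    pseudo = probs.argmax(dim=1)                         # [B] tentative classes
    comp = complementary_mask(probs, k)                  # [B, C]
    # neg[i, j] = True  <=>  j's tentative class is one that i surely lacks
    neg = comp.gather(1, pseudo.unsqueeze(0).expand(probs.size(0), -1))
    neg.fill_diagonal_(False)

    pos = sim.diag()                                     # two views of one image
    neg_exp = (sim.exp() * neg.float()).sum(dim=1)       # reliable negatives only
    loss = -(pos.exp() / (pos.exp() + neg_exp)).log()
    return loss.mean()

In a FixMatch-style pipeline this term would presumably be added, with some weight, to the supervised cross-entropy and the confidence-thresholded pseudo-label loss; the abstract does not specify that weighting, so it is left open here.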
Related papers
- Continuous Contrastive Learning for Long-Tailed Semi-Supervised Recognition [50.61991746981703]
Current state-of-the-art LTSSL approaches rely on high-quality pseudo-labels for large-scale unlabeled data.
This paper introduces a novel probabilistic framework that unifies various recent proposals in long-tail learning.
We introduce a continuous contrastive learning method, CCL, extending our framework to unlabeled data using reliable and smoothed pseudo-labels.
arXiv Detail & Related papers (2024-10-08T15:06:10Z) - Learning with Confidence: Training Better Classifiers from Soft Labels [0.0]
In supervised machine learning, models are typically trained using data with hard labels, i.e., definite assignments of class membership.
We investigate whether incorporating label uncertainty, represented as discrete probability distributions over the class labels, improves the predictive performance of classification models.
arXiv Detail & Related papers (2024-09-24T13:12:29Z) - FlatMatch: Bridging Labeled Data and Unlabeled Data with Cross-Sharpness
for Semi-Supervised Learning [73.13448439554497]
Semi-Supervised Learning (SSL) has been an effective way to leverage abundant unlabeled data with extremely scarce labeled data.
Most SSL methods are commonly based on instance-wise consistency between different data transformations.
We propose FlatMatch which minimizes a cross-sharpness measure to ensure consistent learning performance between the two datasets.
arXiv Detail & Related papers (2023-10-25T06:57:59Z) - Boosting Semi-Supervised Learning by bridging high and low-confidence
predictions [4.18804572788063]
Pseudo-labeling is a crucial technique in semi-supervised learning (SSL).
We propose a new method called ReFixMatch, which aims to utilize all of the unlabeled data during training.
arXiv Detail & Related papers (2023-08-15T00:27:18Z) - Class-Distribution-Aware Pseudo Labeling for Semi-Supervised Multi-Label
Learning [97.88458953075205]
Pseudo-labeling has emerged as a popular and effective approach for utilizing unlabeled data.
This paper proposes a novel solution called Class-Aware Pseudo-Labeling (CAP) that performs pseudo-labeling in a class-aware manner.
arXiv Detail & Related papers (2023-05-04T12:52:18Z) - SoftMatch: Addressing the Quantity-Quality Trade-off in Semi-supervised
Learning [101.86916775218403]
This paper revisits the popular pseudo-labeling methods via a unified sample weighting formulation.
We propose SoftMatch to overcome the trade-off by maintaining both high quantity and high quality of pseudo-labels during training.
In experiments, SoftMatch shows substantial improvements across a wide variety of benchmarks, including image, text, and imbalanced classification.
arXiv Detail & Related papers (2023-01-26T03:53:25Z) - Dist-PU: Positive-Unlabeled Learning from a Label Distribution
Perspective [89.5370481649529]
We propose a label distribution perspective for PU learning in this paper.
Motivated by this, we pursue consistency between the predicted and ground-truth label distributions.
Experiments on three benchmark datasets validate the effectiveness of the proposed method.
arXiv Detail & Related papers (2022-12-06T07:38:29Z) - In Defense of Pseudo-Labeling: An Uncertainty-Aware Pseudo-label
Selection Framework for Semi-Supervised Learning [53.1047775185362]
Pseudo-labeling (PL) is a general SSL approach that does not have this constraint but performs relatively poorly in its original formulation.
We argue that PL underperforms due to the erroneous high confidence predictions from poorly calibrated models.
We propose an uncertainty-aware pseudo-label selection (UPS) framework which improves pseudo labeling accuracy by drastically reducing the amount of noise encountered in the training process.
arXiv Detail & Related papers (2021-01-15T23:29:57Z)