FreeMatch: Self-adaptive Thresholding for Semi-supervised Learning
- URL: http://arxiv.org/abs/2205.07246v1
- Date: Sun, 15 May 2022 10:07:52 GMT
- Title: FreeMatch: Self-adaptive Thresholding for Semi-supervised Learning
- Authors: Yidong Wang, Hao Chen, Qiang Heng, Wenxin Hou, Marios Savvides,
Takahiro Shinozaki, Bhiksha Raj, Zhen Wu, Jindong Wang
- Abstract summary: We propose FreeMatch to define and adjust the confidence threshold in a self-adaptive manner according to the model's learning status.
FreeMatch achieves 5.78%, 13.59%, and 1.28% error rate reductions over the latest state-of-the-art method FlexMatch on CIFAR-10 with 1 label per class, STL-10 with 4 labels per class, and ImageNet with 100k labels, respectively.
- Score: 46.95063831057502
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Pseudo labeling and consistency regularization approaches with
confidence-based thresholding have made great progress in semi-supervised
learning (SSL). In this paper, we theoretically and empirically analyze the
relationship between the unlabeled data distribution and the desirable
confidence threshold. Our analysis shows that previous methods might fail to
define a favorable threshold, since they either require a pre-defined/fixed
threshold or an ad-hoc threshold-adjusting scheme that does not reflect the
learning effect well, resulting in inferior performance and slow convergence,
especially for complicated unlabeled data distributions. We hence propose
\emph{FreeMatch} to define and adjust the confidence threshold in a
self-adaptive manner according to the model's learning status. To handle
complicated unlabeled data distributions more effectively, we further propose a
self-adaptive class fairness regularization method that encourages the model to
produce diverse predictions during training. Extensive experimental results
indicate the superiority of FreeMatch especially when the labeled data are
extremely rare. FreeMatch achieves \textbf{5.78}\%, \textbf{13.59}\%, and
\textbf{1.28}\% error rate reduction over the latest state-of-the-art method
FlexMatch on CIFAR-10 with 1 label per class, STL-10 with 4 labels per class,
and ImageNet with 100k labels respectively.
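The abstract's central idea, a confidence threshold that adapts to the model's learning status rather than staying fixed (as in FixMatch) or following an ad-hoc schedule, can be illustrated with a short sketch. The snippet below is a hedged, minimal NumPy illustration of one natural instantiation, assuming the threshold tracks the model's average prediction confidence via an exponential moving average; the function names, update rule, and initialization are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def update_global_threshold(tau, probs, ema_decay=0.999):
    """EMA update of a global confidence threshold from one batch of
    unlabeled softmax outputs (probs: [batch, num_classes]).
    Assumed update rule: threshold drifts toward the batch's mean confidence."""
    batch_confidence = probs.max(axis=1).mean()  # average max-probability in the batch
    return ema_decay * tau + (1.0 - ema_decay) * batch_confidence

def pseudo_label_mask(probs, tau):
    """Select unlabeled samples whose confidence exceeds the current
    self-adaptive threshold; return a boolean mask and the pseudo-labels."""
    confidence = probs.max(axis=1)
    return confidence >= tau, probs.argmax(axis=1)

# Toy usage with random "softmax" outputs standing in for model predictions.
rng = np.random.default_rng(0)
probs = rng.dirichlet(np.ones(10), size=64)  # 64 fake predictions over 10 classes
tau = 1.0 / 10                               # hypothetical init: uniform confidence
tau = update_global_threshold(tau, probs)
mask, pseudo_labels = pseudo_label_mask(probs, tau)
print(f"threshold={tau:.3f}, kept {mask.sum()}/{mask.size} pseudo-labels")
```

Early in training the model is unconfident, so the threshold stays low and many pseudo-labels pass; as confidence grows, the threshold rises and filtering becomes stricter. Per the abstract, the full method additionally uses a self-adaptive class-fairness regularizer that encourages diverse predictions on complicated unlabeled distributions.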
Related papers
- Enhancing Zero-Shot Vision Models by Label-Free Prompt Distribution Learning and Bias Correcting [55.361337202198925]
Vision-language models, such as CLIP, have shown impressive generalization capacities when using appropriate text descriptions.
We propose a label-Free prompt distribution learning and bias correction framework, dubbed **Frolic**, which boosts zero-shot performance without the need for labeled data.
arXiv Detail & Related papers (2024-10-25T04:00:45Z) - A Channel-ensemble Approach: Unbiased and Low-variance Pseudo-labels is Critical for Semi-supervised Classification [61.473485511491795]
Semi-supervised learning (SSL) is a practical challenge in computer vision.
Pseudo-label (PL) methods, e.g., FixMatch and FreeMatch, achieve state-of-the-art (SOTA) performance in SSL.
We propose a lightweight channel-based ensemble method to consolidate multiple inferior PLs into a single, theoretically guaranteed unbiased and low-variance one.
arXiv Detail & Related papers (2024-03-27T09:49:37Z) - Boosting Semi-Supervised Learning by bridging high and low-confidence
predictions [4.18804572788063]
Pseudo-labeling is a crucial technique in semi-supervised learning (SSL).
We propose a new method called ReFixMatch, which aims to utilize all of the unlabeled data during training.
arXiv Detail & Related papers (2023-08-15T00:27:18Z) - Boosting Semi-Supervised Learning by Exploiting All Unlabeled Data [21.6350640726058]
Semi-supervised learning (SSL) has attracted enormous attention due to its vast potential of mitigating the dependence on large labeled datasets.
We propose two novel techniques: Entropy Meaning Loss (EML) and Adaptive Negative Learning (ANL).
We integrate these techniques with FixMatch, and develop a simple yet powerful framework called FullMatch.
arXiv Detail & Related papers (2023-03-20T12:44:11Z) - InPL: Pseudo-labeling the Inliers First for Imbalanced Semi-supervised
Learning [34.062061310242385]
We present a new perspective on pseudo-labeling for imbalanced semi-supervised learning (SSL).
We measure whether an unlabeled sample is likely to be "in-distribution" or "out-of-distribution".
Experiments demonstrate that our energy-based pseudo-labeling method, InPL, significantly outperforms confidence-based methods on imbalanced SSL benchmarks.
arXiv Detail & Related papers (2023-03-13T16:45:41Z) - SoftMatch: Addressing the Quantity-Quality Trade-off in Semi-supervised
Learning [101.86916775218403]
This paper revisits the popular pseudo-labeling methods via a unified sample weighting formulation.
We propose SoftMatch to overcome the trade-off by maintaining both high quantity and high quality of pseudo-labels during training.
In experiments, SoftMatch shows substantial improvements across a wide variety of benchmarks, including image, text, and imbalanced classification.
arXiv Detail & Related papers (2023-01-26T03:53:25Z) - MutexMatch: Semi-supervised Learning with Mutex-based Consistency
Regularization [36.019086181632005]
We propose a mutex-based consistency regularization method, namely MutexMatch, to utilize low-confidence samples in a novel way.
MutexMatch achieves superior performance on multiple benchmark datasets, i.e., CIFAR-10, CIFAR-100, SVHN, STL-10, and mini-ImageNet.
arXiv Detail & Related papers (2022-03-27T14:28:16Z) - OpenMatch: Open-set Consistency Regularization for Semi-supervised
Learning with Outliers [71.08167292329028]
We propose a novel Open-set Semi-Supervised Learning (OSSL) approach called OpenMatch.
OpenMatch unifies FixMatch with novelty detection based on one-vs-all (OVA) classifiers.
It achieves state-of-the-art performance on three datasets, and even outperforms a fully supervised model in detecting outliers unseen in unlabeled data on CIFAR10.
arXiv Detail & Related papers (2021-05-28T23:57:15Z) - In Defense of Pseudo-Labeling: An Uncertainty-Aware Pseudo-label
Selection Framework for Semi-Supervised Learning [53.1047775185362]
Pseudo-labeling (PL) is a general SSL approach that does not rely on domain-specific augmentations, but it performs relatively poorly in its original formulation.
We argue that PL underperforms due to the erroneous high confidence predictions from poorly calibrated models.
We propose an uncertainty-aware pseudo-label selection (UPS) framework which improves pseudo labeling accuracy by drastically reducing the amount of noise encountered in the training process.
arXiv Detail & Related papers (2021-01-15T23:29:57Z)