CaliMatch: Adaptive Calibration for Improving Safe Semi-supervised Learning
- URL: http://arxiv.org/abs/2508.00922v1
- Date: Wed, 30 Jul 2025 07:00:56 GMT
- Title: CaliMatch: Adaptive Calibration for Improving Safe Semi-supervised Learning
- Authors: Jinsoo Bae, Seoung Bum Kim, Hyungrok Do,
- Abstract summary: Semi-supervised learning (SSL) uses unlabeled data to improve the performance of machine learning models when labeled data is scarce.<n>Recent studies, referred to as safe SSL, have addressed this issue by using both classification and out-of-distribution (OOD) detection.<n>We propose a novel method, CaliMatch, which calibrates both the classifier and the OOD detector to foster safe SSL.
- Score: 3.864321514889099
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Semi-supervised learning (SSL) uses unlabeled data to improve the performance of machine learning models when labeled data is scarce. However, its real-world applications often face the label distribution mismatch problem, in which the unlabeled dataset includes instances whose ground-truth labels are absent from the labeled training dataset. Recent studies, referred to as safe SSL, have addressed this issue by using both classification and out-of-distribution (OOD) detection. However, the existing methods may suffer from overconfidence in deep neural networks, leading to increased SSL errors because of high confidence in incorrect pseudo-labels or OOD detection. To address this, we propose a novel method, CaliMatch, which calibrates both the classifier and the OOD detector to foster safe SSL. CaliMatch presents adaptive label smoothing and temperature scaling, which eliminates the need to manually tune the smoothing degree for effective calibration. We give a theoretical justification for why improving the calibration of both the classifier and the OOD detector is crucial in safe SSL. Extensive evaluations on CIFAR-10, CIFAR-100, SVHN, TinyImageNet, and ImageNet demonstrate that CaliMatch outperforms the existing methods in safe SSL tasks.
Related papers
- Learning Label Refinement and Threshold Adjustment for Imbalanced Semi-Supervised Learning [6.904448748214652]
Semi-supervised learning algorithms struggle to perform well when exposed to imbalanced training data.
We introduce SEmi-supervised learning with pseudo-label optimization based on VALidation data (SEVAL)
SEVAL adapts to specific tasks with improved pseudo-labels accuracy and ensures pseudo-labels correctness on a per-class basis.
arXiv Detail & Related papers (2024-07-07T13:46:22Z) - Generalized Semi-Supervised Learning via Self-Supervised Feature Adaptation [87.17768598044427]
Traditional semi-supervised learning assumes that the feature distributions of labeled and unlabeled data are consistent.
We propose Self-Supervised Feature Adaptation (SSFA), a generic framework for improving SSL performance when labeled and unlabeled data come from different distributions.
Our proposed SSFA is applicable to various pseudo-label-based SSL learners and significantly improves performance in labeled, unlabeled, and even unseen distributions.
arXiv Detail & Related papers (2024-05-31T03:13:45Z) - Adaptive Hierarchical Certification for Segmentation using Randomized Smoothing [87.48628403354351]
certification for machine learning is proving that no adversarial sample can evade a model within a range under certain conditions.
Common certification methods for segmentation use a flat set of fine-grained classes, leading to high abstain rates due to model uncertainty.
We propose a novel, more practical setting, which certifies pixels within a multi-level hierarchy, and adaptively relaxes the certification to a coarser level for unstable components.
arXiv Detail & Related papers (2024-02-13T11:59:43Z) - Semi-Supervised Learning in the Few-Shot Zero-Shot Scenario [14.916971861796384]
Semi-Supervised Learning (SSL) is a framework that utilizes both labeled and unlabeled data to enhance model performance.
We propose a general approach to augment existing SSL methods, enabling them to handle situations where certain classes are missing.
Our experimental results reveal significant improvements in accuracy when compared to state-of-the-art SSL, open-set SSL, and open-world SSL methods.
arXiv Detail & Related papers (2023-08-27T14:25:07Z) - Model Calibration in Dense Classification with Adaptive Label
Perturbation [44.62722402349157]
Existing dense binary classification models are prone to being over-confident.
We propose Adaptive Label Perturbation (ASLP) which learns a unique label perturbation level for each training image.
ASLP can significantly improve calibration degrees of dense binary classification models on both in-distribution and out-of-distribution data.
arXiv Detail & Related papers (2023-07-25T14:40:11Z) - Fix-A-Step: Semi-supervised Learning from Uncurated Unlabeled Data [4.779633174910461]
In real applications like medical imaging, unlabeled data will be collected for expediency and thus uncurated.
We introduce Fix-A-Step, a procedure that views all uncurated unlabeled images as potentially helpful.
On a new medical SSL benchmark called Heart2Heart, Fix-A-Step can learn from 353,500 truly uncurated ultrasound images to deliver gains that generalize across hospitals.
arXiv Detail & Related papers (2022-08-25T04:52:21Z) - Radio Galaxy Zoo: Using semi-supervised learning to leverage large
unlabelled data-sets for radio galaxy classification under data-set shift [0.0]
State-of-the-art semi-supervised learning algorithm applied to morphological classification of radio galaxies.
We test if SSL with fewer labels can achieve test accuracies comparable to the supervised state-of-the-art.
Improvement is limited to a narrow range of label volumes, with performance falling off rapidly at low label volumes.
arXiv Detail & Related papers (2022-04-19T11:38:22Z) - Class-Aware Contrastive Semi-Supervised Learning [51.205844705156046]
We propose a general method named Class-aware Contrastive Semi-Supervised Learning (CCSSL) to improve pseudo-label quality and enhance the model's robustness in the real-world setting.
Our proposed CCSSL has significant performance improvements over the state-of-the-art SSL methods on the standard datasets CIFAR100 and STL10.
arXiv Detail & Related papers (2022-03-04T12:18:23Z) - Robust Deep Semi-Supervised Learning: A Brief Introduction [63.09703308309176]
Semi-supervised learning (SSL) aims to improve learning performance by leveraging unlabeled data when labels are insufficient.
SSL with deep models has proven to be successful on standard benchmark tasks.
However, they are still vulnerable to various robustness threats in real-world applications.
arXiv Detail & Related papers (2022-02-12T04:16:41Z) - OpenMatch: Open-set Consistency Regularization for Semi-supervised
Learning with Outliers [71.08167292329028]
We propose a novel Open-set Semi-Supervised Learning (OSSL) approach called OpenMatch.
OpenMatch unifies FixMatch with novelty detection based on one-vs-all (OVA) classifiers.
It achieves state-of-the-art performance on three datasets, and even outperforms a fully supervised model in detecting outliers unseen in unlabeled data on CIFAR10.
arXiv Detail & Related papers (2021-05-28T23:57:15Z) - In Defense of Pseudo-Labeling: An Uncertainty-Aware Pseudo-label
Selection Framework for Semi-Supervised Learning [53.1047775185362]
Pseudo-labeling (PL) is a general SSL approach that does not have this constraint but performs relatively poorly in its original formulation.
We argue that PL underperforms due to the erroneous high confidence predictions from poorly calibrated models.
We propose an uncertainty-aware pseudo-label selection (UPS) framework which improves pseudo labeling accuracy by drastically reducing the amount of noise encountered in the training process.
arXiv Detail & Related papers (2021-01-15T23:29:57Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.