Sampling Control for Imbalanced Calibration in Semi-Supervised Learning
- URL: http://arxiv.org/abs/2511.18773v1
- Date: Mon, 24 Nov 2025 05:15:58 GMT
- Title: Sampling Control for Imbalanced Calibration in Semi-Supervised Learning
- Authors: Senmao Tian, Xiang Wei, Shunli Zhang
- Abstract summary: Class imbalance remains a critical challenge in semi-supervised learning (SSL). We propose a unified framework, SC-SSL, which suppresses model bias through decoupled sampling control.
- Score: 14.563492336625004
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Class imbalance remains a critical challenge in semi-supervised learning (SSL), especially when distributional mismatches between labeled and unlabeled data lead to biased classification. Although existing methods address this issue by adjusting logits based on the estimated class distribution of unlabeled data, they often handle model imbalance in a coarse-grained manner, conflating data imbalance with bias arising from varying class-specific learning difficulties. To address this issue, we propose a unified framework, SC-SSL, which suppresses model bias through decoupled sampling control. During training, we identify the key variables for sampling control under ideal conditions. By introducing a classifier with explicit expansion capability and adaptively adjusting sampling probabilities across different data distributions, SC-SSL mitigates feature-level imbalance for minority classes. In the inference phase, we further analyze the weight imbalance of the linear classifier and apply post-hoc sampling control with an optimization bias vector to directly calibrate the logits. Extensive experiments across various benchmark datasets and distribution settings validate the consistency and state-of-the-art performance of SC-SSL.
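To make the idea of post-hoc logit calibration with a bias vector concrete, here is a minimal sketch in plain Python. It is not the SC-SSL procedure: the abstract does not say how its bias vector is optimized, so this sketch swaps in the generic logit-adjustment rule, where the bias is `-tau * log(prior)`. The function names, the `tau` parameter, and the example class prior are all assumptions made for illustration.

```python
import math

def logit_bias(class_prior, tau=1.0):
    """Per-class additive bias -tau * log(prior); rarer classes get a larger boost.

    Assumption: this is the generic logit-adjustment rule, not the optimized
    bias vector described in the SC-SSL abstract.
    """
    total = sum(class_prior)
    return [-tau * math.log(p / total) for p in class_prior]

def adjust_logits(logits, bias):
    """Add the bias vector to one sample's logits."""
    return [z + b for z, b in zip(logits, bias)]

def argmax(values):
    """Index of the largest value."""
    return max(range(len(values)), key=values.__getitem__)

# Hypothetical imbalanced prior over three classes (head, medium, tail).
prior = [0.8, 0.15, 0.05]
bias = logit_bias(prior)

# Two samples whose raw logits both favor the head class.
batch = [[2.0, 1.8, 0.5],
         [0.3, 0.2, 0.1]]
raw_pred = [argmax(row) for row in batch]
adj_pred = [argmax(adjust_logits(row, bias)) for row in batch]
print(raw_pred, adj_pred)
```

In this toy batch the raw predictions all fall on the head class, while the adjusted predictions move toward the rarer classes. With `tau=0` the bias vanishes and the raw predictions are recovered.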
Related papers
- CalibrateMix: Guided-Mixup Calibration of Image Semi-Supervised Models [49.588973929678765]
CalibrateMix is a mixup-based approach that aims to improve the calibration of SSL models. Our method achieves lower expected calibration error (ECE) and superior accuracy compared to existing SSL approaches.
arXiv Detail & Related papers (2025-11-17T04:43:53Z)
- Rebalancing with Calibrated Sub-classes (RCS): A Statistical Fusion-based Framework for Robust Imbalanced Classification across Modalities [16.993547305381327]
Rebalancing with Calibrated Sub-classes (RCS) is a novel distribution calibration framework for robust imbalanced classification. RCS fuses statistical information from the majority and intermediate class distributions via a weighted mixture of Gaussian components.
arXiv Detail & Related papers (2025-10-10T00:06:13Z)
- SeMi: When Imbalanced Semi-Supervised Learning Meets Mining Hard Examples [54.760757107700755]
Semi-Supervised Learning (SSL) can leverage abundant unlabeled data to boost model performance. The class-imbalanced data distribution in real-world scenarios poses great challenges to SSL, resulting in performance degradation. We propose a method that enhances the performance of Imbalanced Semi-Supervised Learning by Mining Hard Examples (SeMi).
arXiv Detail & Related papers (2025-01-10T14:35:16Z)
- Learning Label Refinement and Threshold Adjustment for Imbalanced Semi-Supervised Learning [6.904448748214652]
Semi-supervised learning algorithms struggle to perform well when exposed to imbalanced training data.
We introduce SEmi-supervised learning with pseudo-label optimization based on VALidation data (SEVAL).
SEVAL adapts to specific tasks with improved pseudo-label accuracy and ensures pseudo-label correctness on a per-class basis.
arXiv Detail & Related papers (2024-07-07T13:46:22Z)
- Gradient-based Sampling for Class Imbalanced Semi-supervised Object Detection [111.0991686509715]
We study the class imbalance problem for semi-supervised object detection (SSOD) under more challenging scenarios.
We propose a simple yet effective gradient-based sampling framework that tackles the class imbalance problem from the perspective of two types of confirmation biases.
Experiments on three proposed sub-tasks, namely MS-COCO, MS-COCO to Object365 and LVIS, suggest that our method outperforms current class imbalanced object detectors by clear margins.
arXiv Detail & Related papers (2024-03-22T11:30:10Z)
- Twice Class Bias Correction for Imbalanced Semi-Supervised Learning [59.90429949214134]
We introduce a novel approach called Twice Class Bias Correction (TCBC).
We estimate the class bias of the model parameters during the training process.
We apply a secondary correction to the model's pseudo-labels for unlabeled samples.
arXiv Detail & Related papers (2023-12-27T15:06:36Z)
- Flexible Distribution Alignment: Towards Long-tailed Semi-supervised Learning with Proper Calibration [18.376601653387315]
Long-tailed semi-supervised learning (LTSSL) represents a practical scenario for semi-supervised applications.
This problem is often aggravated by discrepancies between labeled and unlabeled class distributions.
We introduce Flexible Distribution Alignment (FlexDA), a novel adaptive logit-adjusted loss framework.
arXiv Detail & Related papers (2023-06-07T17:50:59Z)
- An Embarrassingly Simple Baseline for Imbalanced Semi-Supervised Learning [103.65758569417702]
Semi-supervised learning (SSL) has shown great promise in leveraging unlabeled data to improve model performance.
We consider a more realistic and challenging setting called imbalanced SSL, where imbalanced class distributions occur in both labeled and unlabeled data.
We study a simple yet overlooked baseline -- SimiS -- which tackles data imbalance by simply supplementing labeled data with pseudo-labels.
arXiv Detail & Related papers (2022-11-20T21:18:41Z)
- CReST: A Class-Rebalancing Self-Training Framework for Imbalanced Semi-Supervised Learning [15.671523625324388]
We propose Class-Rebalancing Self-Training (CReST) to improve existing SSL methods on class-imbalanced data.
CReST iteratively retrains a baseline SSL model with an expanded labeled set.
We show that CReST and CReST+ improve state-of-the-art SSL algorithms on various class-imbalanced datasets.
arXiv Detail & Related papers (2021-02-18T18:59:57Z)
- Distribution Aligning Refinery of Pseudo-label for Imbalanced Semi-supervised Learning [126.31716228319902]
We develop Distribution Aligning Refinery of Pseudo-label (DARP) algorithm.
We show that DARP is provably and efficiently compatible with state-of-the-art SSL schemes.
arXiv Detail & Related papers (2020-07-17T09:16:05Z)
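Several entries above concern calibration, and the CalibrateMix summary reports results in terms of expected calibration error (ECE). As a reminder of what that metric measures, here is a minimal sketch of the standard equal-width binned ECE in plain Python. The function name, the bin count, and the toy inputs are illustrative assumptions and are not taken from any of the listed papers.

```python
def expected_calibration_error(confidences, correct, n_bins=10):
    """Binned ECE: weighted average of |accuracy - mean confidence| per bin.

    confidences: predicted probability of the chosen class, each in [0, 1].
    correct:     booleans, True where the prediction matched the label.
    """
    n = len(confidences)
    bins = [[] for _ in range(n_bins)]
    for conf, ok in zip(confidences, correct):
        # Equal-width bins; a confidence of exactly 1.0 goes in the last bin.
        idx = min(int(conf * n_bins), n_bins - 1)
        bins[idx].append((conf, ok))

    ece = 0.0
    for members in bins:
        if not members:
            continue
        avg_conf = sum(c for c, _ in members) / len(members)
        accuracy = sum(1 for _, ok in members if ok) / len(members)
        ece += (len(members) / n) * abs(accuracy - avg_conf)
    return ece

# Toy example: two confident correct predictions, two borderline ones.
confs = [0.95, 0.95, 0.55, 0.55]
hits = [True, True, True, False]
ece = expected_calibration_error(confs, hits)
print(round(ece, 4))
```

A model whose confidence matches its accuracy in every bin scores 0; larger values indicate over- or under-confidence.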