Unlocking the Potential of Unlabeled Data in Semi-Supervised Domain Generalization
- URL: http://arxiv.org/abs/2503.13915v2
- Date: Sun, 27 Apr 2025 08:32:33 GMT
- Title: Unlocking the Potential of Unlabeled Data in Semi-Supervised Domain Generalization
- Authors: Dongkwan Lee, Kyomin Hwang, Nojun Kwak,
- Abstract summary: We propose a method for incorporating the unconfident-unlabeled samples that were previously disregarded in the SSDG setting. We show that our approach consistently improves performance when attached to baselines and outperforms competing plug-and-play methods. We also analyze the role of our method in SSDG, showing that it enhances class-level discriminability and mitigates domain gaps.
- Score: 26.240518216121487
- License: http://creativecommons.org/licenses/by-sa/4.0/
- Abstract: We address the problem of semi-supervised domain generalization (SSDG), where the distributions of train and test data differ and only a small amount of labeled data, along with a larger amount of unlabeled data, is available during training. Existing SSDG methods leverage only the unlabeled samples for which the model's predictions are highly confident (confident-unlabeled samples), limiting the full utilization of the available unlabeled data. To the best of our knowledge, we are the first to explore a method for incorporating the unconfident-unlabeled samples that were previously disregarded in the SSDG setting. To this end, we propose UPCSC, which utilizes these unconfident-unlabeled samples and consists of two modules: 1) an Unlabeled Proxy-based Contrastive learning (UPC) module, which treats unconfident-unlabeled samples as additional negative pairs, and 2) a Surrogate Class learning (SC) module, which generates positive pairs for unconfident-unlabeled samples using their confusing class set. These modules are plug-and-play, require no domain labels, and can be easily integrated into existing approaches. Experiments on four widely used SSDG benchmarks demonstrate that our approach consistently improves performance when attached to baselines and outperforms competing plug-and-play methods. We also analyze the role of our method in SSDG, showing that it enhances class-level discriminability and mitigates domain gaps. The code is available at https://github.com/dongkwani/UPCSC.
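As a rough illustration of the UPC idea, the sketch below implements an InfoNCE-style loss in which an anchor feature is pulled toward its class proxy while unconfident-unlabeled features act as extra negatives. This is a minimal numpy sketch under our own assumptions (the function names, exact loss form, and temperature are illustrative), not the paper's implementation.

```python
import numpy as np

def l2_normalize(x, axis=-1, eps=1e-12):
    """Project features onto the unit sphere, as is standard in contrastive losses."""
    return x / (np.linalg.norm(x, axis=axis, keepdims=True) + eps)

def upc_style_loss(anchor, positive_proxy, unconfident_feats, temperature=0.1):
    """InfoNCE-style loss: pull the anchor toward its class proxy while
    pushing it away from unconfident-unlabeled features used as negatives.

    anchor:            (d,)   feature of a labeled/confident sample
    positive_proxy:    (d,)   learnable proxy of the anchor's class
    unconfident_feats: (n, d) features of unconfident-unlabeled samples
    """
    a = l2_normalize(anchor)
    pos = np.dot(a, l2_normalize(positive_proxy)) / temperature
    negs = l2_normalize(unconfident_feats) @ a / temperature
    logits = np.concatenate([[pos], negs])
    # cross-entropy with the positive at index 0: -log softmax(logits)[0]
    return -pos + np.log(np.sum(np.exp(logits)))
```

With a matching anchor/proxy pair the loss is near zero; when the anchor instead aligns with a negative, the loss grows, which is the push-pull behavior contrastive training exploits.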
Related papers
- Mind the Gap: Confidence Discrepancy Can Guide Federated Semi-Supervised Learning Across Pseudo-Mismatch [50.632535091877706]
Federated Semi-Supervised Learning (FSSL) aims to leverage unlabeled data across clients with limited labeled data to train a global model with strong generalization ability. Most FSSL methods rely on consistency regularization with pseudo-labels, converting predictions from local or global models into hard pseudo-labels as supervisory signals. We show that the quality of pseudo-labels is largely deteriorated by data heterogeneity, an intrinsic facet of federated learning.
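The consistency-regularization recipe this summary refers to, converting confident predictions into hard pseudo-labels, can be sketched as follows; the 0.95 threshold and the function names are illustrative assumptions in the style of FixMatch, not this paper's exact procedure.

```python
import numpy as np

def softmax(logits, axis=-1):
    """Numerically stable softmax over the class axis."""
    z = logits - logits.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def hard_pseudo_labels(logits, threshold=0.95):
    """Convert model predictions into hard pseudo-labels, keeping only
    samples whose maximum class probability exceeds the threshold."""
    probs = softmax(logits)
    conf = probs.max(axis=-1)
    labels = probs.argmax(axis=-1)
    mask = conf >= threshold
    return labels[mask], np.flatnonzero(mask)
```

The returned indices identify which unlabeled samples survived the filter, so the pseudo-labels can be paired with (e.g. strongly augmented versions of) the corresponding inputs for the consistency loss.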
arXiv Detail & Related papers (2025-03-17T14:41:51Z)
- A Unified Framework for Heterogeneous Semi-supervised Learning [26.87757610311636]
We introduce a novel problem setup termed Heterogeneous Semi-Supervised Learning (HSSL)
HSSL presents unique challenges by bridging the semi-supervised learning (SSL) task and the unsupervised domain adaptation (UDA) task.
We propose a novel method, Unified Framework for Heterogeneous Semi-supervised Learning (Uni-HSSL), to address HSSL.
arXiv Detail & Related papers (2025-03-01T01:32:02Z)
- Generalized Semi-Supervised Learning via Self-Supervised Feature Adaptation [87.17768598044427]
Traditional semi-supervised learning assumes that the feature distributions of labeled and unlabeled data are consistent.
We propose Self-Supervised Feature Adaptation (SSFA), a generic framework for improving SSL performance when labeled and unlabeled data come from different distributions.
Our proposed SSFA is applicable to various pseudo-label-based SSL learners and significantly improves performance in labeled, unlabeled, and even unseen distributions.
arXiv Detail & Related papers (2024-05-31T03:13:45Z)
- Towards Generalizing to Unseen Domains with Few Labels [7.002657345547741]
We aim to obtain a model that learns domain-generalizable features by leveraging a limited subset of labelled data.
Existing domain generalization (DG) methods, which are unable to exploit unlabeled data, perform poorly compared to semi-supervised learning (SSL) methods.
arXiv Detail & Related papers (2024-03-18T11:21:52Z)
- Improving Pseudo-labelling and Enhancing Robustness for Semi-Supervised Domain Generalization [7.9776163947539755]
We study the problem of Semi-Supervised Domain Generalization which is crucial for real-world applications like automated healthcare.
We propose a new SSDG approach, which utilizes a novel uncertainty-guided pseudo-labelling with model averaging.
Our uncertainty-guided pseudo-labelling (UPL) uses model uncertainty to improve pseudo-labelling selection, addressing poor model calibration under multi-source unlabelled data.
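One way to picture uncertainty-guided selection is to average softmax outputs over several model snapshots and keep only pseudo-labels whose averaged prediction has low entropy; the entropy threshold and the exact averaging scheme below are illustrative assumptions, not the authors' UPL procedure.

```python
import numpy as np

def predictive_entropy(probs, eps=1e-12):
    """Entropy of a categorical distribution; low entropy = low uncertainty."""
    return -np.sum(probs * np.log(probs + eps), axis=-1)

def select_by_uncertainty(prob_list, max_entropy=0.3):
    """Average softmax outputs from several model snapshots (S, N, C), then
    keep pseudo-labels only where the averaged prediction has low entropy."""
    mean_probs = np.mean(prob_list, axis=0)          # (N, C)
    ent = predictive_entropy(mean_probs)             # (N,)
    keep = ent <= max_entropy
    return mean_probs.argmax(axis=-1)[keep], np.flatnonzero(keep)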
arXiv Detail & Related papers (2024-01-25T05:55:44Z)
- Generalized Category Discovery with Clustering Assignment Consistency [56.92546133591019]
Generalized category discovery (GCD) is a recently proposed open-world task.
We propose a co-training-based framework that encourages clustering consistency.
Our method achieves state-of-the-art performance on three generic benchmarks and three fine-grained visual recognition datasets.
arXiv Detail & Related papers (2023-10-30T00:32:47Z)
- JointMatch: A Unified Approach for Diverse and Collaborative Pseudo-Labeling to Semi-Supervised Text Classification [65.268245109828]
Semi-supervised text classification (SSTC) has gained increasing attention due to its ability to leverage unlabeled data.
Existing approaches based on pseudo-labeling suffer from the issues of pseudo-label bias and error accumulation.
We propose JointMatch, a holistic approach for SSTC that addresses these challenges by unifying ideas from recent semi-supervised learning.
arXiv Detail & Related papers (2023-10-23T05:43:35Z)
- OSSGAN: Open-Set Semi-Supervised Image Generation [26.67298827670573]
We introduce a challenging training scheme of conditional GANs, called open-set semi-supervised image generation.
OSSGAN provides decision clues to the discriminator on the basis of whether an unlabeled image belongs to one or none of the classes of interest.
The results of experiments on Tiny ImageNet and ImageNet show notable improvements over supervised BigGAN and semi-supervised methods.
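The "one or none of the classes" decision described above is a form of open-set scoring. A common stand-in for such a decision (and only an analogous sketch, not OSSGAN's actual discriminator mechanism) is to treat a low maximum softmax probability as evidence that a sample belongs to none of the known classes.

```python
import numpy as np

def softmax(logits, axis=-1):
    """Numerically stable softmax over the class axis."""
    z = logits - logits.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def open_set_split(logits, threshold=0.5):
    """Assign each sample its argmax class when the maximum softmax
    probability is high enough, or -1 ('none of the known classes')
    when it falls below the threshold."""
    probs = softmax(logits)
    conf = probs.max(axis=-1)
    return np.where(conf >= threshold, probs.argmax(axis=-1), -1)
```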
arXiv Detail & Related papers (2022-04-29T17:26:09Z)
- Semi-Supervised Domain Generalization with Stochastic StyleMatch [90.98288822165482]
In real-world applications, we might have only a few labels available from each source domain due to high annotation cost.
In this work, we investigate semi-supervised domain generalization, a more realistic and practical setting.
Our proposed approach, StyleMatch, is inspired by FixMatch, a state-of-the-art semi-supervised learning method based on pseudo-labeling.
arXiv Detail & Related papers (2021-06-01T16:00:08Z)
- Domain Generalization via Semi-supervised Meta Learning [7.722498348924133]
We propose the first method of domain generalization to leverage unlabeled samples.
It is trained by a meta learning approach to mimic the distribution shift between the input source domains and unseen target domains.
Experimental results on benchmark datasets indicate that the proposed method outperforms state-of-the-art domain generalization and semi-supervised learning methods.
arXiv Detail & Related papers (2020-09-26T18:05:04Z)
- Federated Semi-Supervised Learning with Inter-Client Consistency & Disjoint Learning [78.88007892742438]
We study two essential scenarios of Federated Semi-Supervised Learning (FSSL) based on the location of the labeled data.
We propose a novel method to tackle the problems, which we refer to as Federated Matching (FedMatch).
arXiv Detail & Related papers (2020-06-22T09:43:41Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.