Transfer and Share: Semi-Supervised Learning from Long-Tailed Data
- URL: http://arxiv.org/abs/2205.13358v1
- Date: Thu, 26 May 2022 13:37:59 GMT
- Title: Transfer and Share: Semi-Supervised Learning from Long-Tailed Data
- Authors: Tong Wei, Qian-Yu Liu, Jiang-Xin Shi, Wei-Wei Tu, Lan-Zhe Guo
- Abstract summary: We present TRAS (TRAnsfer and Share), a method that effectively utilizes long-tailed semi-supervised data.
TRAS transforms the imbalanced pseudo-label distribution of a traditional SSL model.
It then transfers the transformed distribution to a target model so that minority classes receive greater attention.
- Score: 27.88381366842497
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Long-Tailed Semi-Supervised Learning (LTSSL) aims to learn from
class-imbalanced data in which only a few samples are annotated. Existing
solutions typically either incur substantial cost solving complex
optimization problems, or rely on class-balanced undersampling, which can
result in information loss. In this paper, we present TRAS (TRAnsfer and
Share), a method that effectively utilizes long-tailed semi-supervised data.
TRAS transforms the imbalanced pseudo-label distribution of a traditional
SSL model via a carefully designed function to strengthen the supervisory
signals for minority classes. It then transfers the transformed distribution
to a target model so that minority classes receive greater attention.
Interestingly, TRAS shows that a more balanced pseudo-label distribution can
substantially benefit minority-class training, rather than seeking to
generate accurate pseudo-labels as in previous works. To simplify the
approach, TRAS merges the training of the traditional SSL model and the
target model into a single procedure by sharing the feature extractor, where
both classifiers help improve the representation learning. Extensive
experiments show that TRAS delivers much higher accuracy than
state-of-the-art methods, both over the entire set of classes and on the
minority classes.
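The mechanics the abstract describes can be made concrete with a short
sketch. Below is a minimal PyTorch rendering of the two ideas: a shared
feature extractor with one classifier head for the conventional SSL model
and one for the target model, plus a transform that flattens the SSL head's
imbalanced pseudo-label distribution before distilling it into the target
head. The log-prior subtraction and the temperature are illustrative
stand-ins, not the paper's exact transform function.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class TRASSketch(nn.Module):
    """Shared feature extractor with two classifier heads: one for the
    conventional SSL model and one for the target model."""
    def __init__(self, backbone, feat_dim, num_classes):
        super().__init__()
        self.backbone = backbone                      # shared representation
        self.ssl_head = nn.Linear(feat_dim, num_classes)
        self.target_head = nn.Linear(feat_dim, num_classes)

    def forward(self, x):
        feats = self.backbone(x)
        return self.ssl_head(feats), self.target_head(feats)

def transfer_loss(ssl_logits, target_logits, class_prior, temperature=2.0):
    """Flatten the SSL head's imbalanced pseudo-label distribution, then
    distill it into the target head (hypothetical transform)."""
    # Subtracting the log class prior boosts minority-class probability
    # mass; a temperature > 1 softens the distribution further.
    adjusted = ssl_logits.detach() - torch.log(class_prior + 1e-12)
    soft_targets = F.softmax(adjusted / temperature, dim=-1)
    return F.kl_div(F.log_softmax(target_logits, dim=-1),
                    soft_targets, reduction="batchmean")
```

Here `class_prior` would be the empirical label distribution of the training
set, and in full training both heads would also receive the usual supervised
and consistency losses, which is how the shared backbone ends up benefiting
from both classifiers.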
Related papers
- Towards the Mitigation of Confirmation Bias in Semi-supervised Learning: a Debiased Training Perspective [6.164100243945264]
Semi-supervised learning (SSL) commonly exhibits confirmation bias, where models disproportionately favor certain classes.
We introduce TaMatch, a unified framework for debiased training in SSL.
We show that TaMatch significantly outperforms existing state-of-the-art methods across a range of challenging image classification tasks.
arXiv Detail & Related papers (2024-09-26T21:50:30Z)
- Rethinking Classifier Re-Training in Long-Tailed Recognition: A Simple Logits Retargeting Approach [102.0769560460338]
We develop a simple logits retargeting approach (LORT) that requires no prior knowledge of the number of samples per class.
Our method achieves state-of-the-art performance on various imbalanced datasets, including CIFAR100-LT, ImageNet-LT, and iNaturalist 2018.
arXiv Detail & Related papers (2024-03-01T03:27:08Z)
- Exploring Vacant Classes in Label-Skewed Federated Learning [113.65301899666645]
Label skews, characterized by disparities in local label distribution across clients, pose a significant challenge in federated learning.
This paper introduces FedVLS, a novel approach to label-skewed federated learning that integrates vacant-class distillation and logit suppression simultaneously.
arXiv Detail & Related papers (2024-01-04T16:06:31Z)
- Progressive Feature Adjustment for Semi-supervised Learning from Pretrained Models [39.42802115580677]
Semi-supervised learning (SSL) can leverage both labeled and unlabeled data to build a predictive model.
Recent literature suggests that naively applying state-of-the-art SSL with a pretrained model fails to unleash the full potential of training data.
We propose to use pseudo-labels from the unlabeled data to update the feature extractor in a way that is less sensitive to incorrect labels.
arXiv Detail & Related papers (2023-09-09T01:57:14Z)
- Pruning the Unlabeled Data to Improve Semi-Supervised Learning [17.62242617965356]
We present PruneSSL, a technique for selectively removing examples from the original unlabeled dataset to enhance its separability.
Although PruneSSL reduces the quantity of training data available to the learner, it significantly improves the performance of various competitive SSL algorithms; a sketch of the pruning idea follows this entry.
arXiv Detail & Related papers (2023-08-27T09:45:41Z)
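The summary above does not spell out PruneSSL's separability criterion, so
the following is only a rough, hypothetical sketch of the general recipe:
score each unlabeled example by the purity of its labeled neighborhood and
drop the ambiguous ones before running a standard SSL algorithm.

```python
import numpy as np
from sklearn.neighbors import NearestNeighbors

def prune_unlabeled(x_lab, y_lab, x_unlab, k=10, purity_threshold=0.8):
    """Drop unlabeled examples whose k nearest labeled neighbors disagree,
    keeping only points that sit cleanly inside one class's region
    (illustrative criterion, not the paper's)."""
    index = NearestNeighbors(n_neighbors=k).fit(x_lab)
    _, idx = index.kneighbors(x_unlab)
    neighbor_labels = y_lab[idx]                  # shape (n_unlab, k)
    # Fraction of neighbors agreeing with the majority label, per point.
    purity = np.array([np.bincount(row).max() / k for row in neighbor_labels])
    return x_unlab[purity >= purity_threshold]
```

Any standard SSL method would then be trained on the labeled set plus the
pruned unlabeled set.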
- An Embarrassingly Simple Baseline for Imbalanced Semi-Supervised Learning [103.65758569417702]
Semi-supervised learning (SSL) has shown great promise in leveraging unlabeled data to improve model performance.
We consider a more realistic and challenging setting called imbalanced SSL, where imbalanced class distributions occur in both labeled and unlabeled data.
We study a simple yet overlooked baseline -- SimiS -- which tackles data imbalance by supplementing the labeled data with pseudo-labels; a sketch follows this entry.
arXiv Detail & Related papers (2022-11-20T21:18:41Z)
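A minimal sketch of the pseudo-label supplementation idea in SimiS; the
confidence threshold and the selection rule here are assumptions rather
than the paper's exact procedure.

```python
import torch
import torch.nn.functional as F

@torch.no_grad()
def supplement_with_pseudo_labels(model, x_unlab, threshold=0.95):
    """Pseudo-label confident unlabeled samples so they can be appended
    to the labeled set (hypothetical confidence rule)."""
    probs = F.softmax(model(x_unlab), dim=-1)
    conf, pseudo = probs.max(dim=-1)
    mask = conf >= threshold
    return x_unlab[mask], pseudo[mask]

# Usage: concatenate the selected pairs onto the labeled set, which is
# what makes the baseline "embarrassingly simple".
# x_sel, y_sel = supplement_with_pseudo_labels(model, x_unlab)
# x_lab, y_lab = torch.cat([x_lab, x_sel]), torch.cat([y_lab, y_sel])
```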
- Adaptive Distribution Calibration for Few-Shot Learning with Hierarchical Optimal Transport [78.9167477093745]
We propose a novel distribution calibration method by learning an adaptive weight matrix between novel samples and base classes.
Experimental results on standard benchmarks demonstrate that our proposed plug-and-play model outperforms competing approaches.
arXiv Detail & Related papers (2022-10-09T02:32:57Z)
- Self-supervised Learning is More Robust to Dataset Imbalance [65.84339596595383]
We investigate self-supervised learning under dataset imbalance.
Off-the-shelf self-supervised representations are already more robust to class imbalance than supervised representations.
We devise a re-weighted regularization technique that consistently improves the SSL representation quality on imbalanced datasets.
arXiv Detail & Related papers (2021-10-11T06:29:56Z)
- Rethinking Re-Sampling in Imbalanced Semi-Supervised Learning [26.069534478556527]
Semi-Supervised Learning (SSL) has shown its strong ability in utilizing unlabeled data when labeled data is scarce.
Most SSL algorithms work under the assumption that the class distributions are balanced in both training and test sets.
In this work, we consider the problem of SSL on class-imbalanced data, which better reflects real-world situations.
arXiv Detail & Related papers (2021-06-01T03:58:18Z)
- Imbalanced Data Learning by Minority Class Augmentation using Capsule Adversarial Networks [31.073558420480964]
We propose a method to restore balance in imbalanced image data by coalescing two concurrent methods.
In our model, generative and discriminative networks play a novel competitive game.
The coalesced capsule-GAN is effective at recognizing highly overlapping classes with far fewer parameters than the convolutional GAN.
arXiv Detail & Related papers (2020-04-05T12:36:06Z)
- M2m: Imbalanced Classification via Major-to-minor Translation [79.09018382489506]
In most real-world scenarios, labeled training datasets are highly class-imbalanced, and deep neural networks struggle to generalize to a balanced testing criterion.
In this paper, we explore a novel yet simple way to alleviate this issue by augmenting less-frequent classes via translating samples from more-frequent classes.
Our experimental results on a variety of class-imbalanced datasets show that the proposed method significantly improves generalization on minority classes compared with existing re-sampling or re-weighting methods; a sketch of the translation idea follows this entry.
arXiv Detail & Related papers (2020-04-01T13:21:17Z)
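A bare-bones sketch of the major-to-minor translation idea: perturb
majority-class inputs until a pretrained classifier assigns them to a
chosen minority class, then treat the results as synthetic minority
samples. M2m's rejection sampling and regularization details are omitted.

```python
import torch
import torch.nn.functional as F

def translate_major_to_minor(classifier, x_major, minor_class,
                             steps=10, step_size=0.1):
    """Optimize majority-class inputs toward a target minority class under
    a pretrained classifier (simplified illustration)."""
    target = torch.full((x_major.size(0),), minor_class, dtype=torch.long)
    x = x_major.detach().clone()
    for _ in range(steps):
        x.requires_grad_(True)
        loss = F.cross_entropy(classifier(x), target)
        grad, = torch.autograd.grad(loss, x)
        # Signed gradient descent on the input nudges it toward the
        # minority class's decision region.
        x = (x - step_size * grad.sign()).detach()
    return x
```

The translated samples would be added to the minority class during
training, acting as a learned form of oversampling.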