Flexible Distribution Alignment: Towards Long-tailed Semi-supervised Learning with Proper Calibration
- URL: http://arxiv.org/abs/2306.04621v3
- Date: Mon, 15 Jul 2024 13:07:02 GMT
- Title: Flexible Distribution Alignment: Towards Long-tailed Semi-supervised Learning with Proper Calibration
- Authors: Emanuel Sanchez Aimar, Nathaniel Helgesen, Yonghao Xu, Marco Kuhlmann, Michael Felsberg,
- Abstract summary: Longtailed semi-supervised learning (LTSSL) represents a practical scenario for semi-supervised applications.
This problem is often aggravated by discrepancies between labeled and unlabeled class distributions.
We introduce Flexible Distribution Alignment (FlexDA), a novel adaptive logit-adjusted loss framework.
- Score: 18.376601653387315
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Long-tailed semi-supervised learning (LTSSL) represents a practical scenario for semi-supervised applications, challenged by skewed labeled distributions that bias classifiers. This problem is often aggravated by discrepancies between labeled and unlabeled class distributions, leading to biased pseudo-labels, neglect of rare classes, and poorly calibrated probabilities. To address these issues, we introduce Flexible Distribution Alignment (FlexDA), a novel adaptive logit-adjusted loss framework designed to dynamically estimate and align predictions with the actual distribution of unlabeled data and achieve a balanced classifier by the end of training. FlexDA is further enhanced by a distillation-based consistency loss, promoting fair data usage across classes and effectively leveraging underconfident samples. This method, encapsulated in ADELLO (Align and Distill Everything All at Once), proves robust against label shift, significantly improves model calibration in LTSSL contexts, and surpasses previous state-of-of-art approaches across multiple benchmarks, including CIFAR100-LT, STL10-LT, and ImageNet127, addressing class imbalance challenges in semi-supervised learning. Our code is available at https://github.com/emasa/ADELLO-LTSSL.
Related papers
- Continuous Contrastive Learning for Long-Tailed Semi-Supervised Recognition [50.61991746981703]
Current state-of-the-art LTSSL approaches rely on high-quality pseudo-labels for large-scale unlabeled data.
This paper introduces a novel probabilistic framework that unifies various recent proposals in long-tail learning.
We introduce a continuous contrastive learning method, CCL, extending our framework to unlabeled data using reliable and smoothed pseudo-labels.
arXiv Detail & Related papers (2024-10-08T15:06:10Z) - Towards the Mitigation of Confirmation Bias in Semi-supervised Learning: a Debiased Training Perspective [6.164100243945264]
Semi-supervised learning (SSL) commonly exhibits confirmation bias, where models disproportionately favor certain classes.
We introduce TaMatch, a unified framework for debiased training in SSL.
We show that TaMatch significantly outperforms existing state-of-the-art methods across a range of challenging image classification tasks.
arXiv Detail & Related papers (2024-09-26T21:50:30Z) - Learning Label Refinement and Threshold Adjustment for Imbalanced Semi-Supervised Learning [6.904448748214652]
Semi-supervised learning algorithms struggle to perform well when exposed to imbalanced training data.
We introduce SEmi-supervised learning with pseudo-label optimization based on VALidation data (SEVAL)
SEVAL adapts to specific tasks with improved pseudo-labels accuracy and ensures pseudo-labels correctness on a per-class basis.
arXiv Detail & Related papers (2024-07-07T13:46:22Z) - Twice Class Bias Correction for Imbalanced Semi-Supervised Learning [59.90429949214134]
We introduce a novel approach called textbfTwice textbfClass textbfBias textbfCorrection (textbfTCBC)
We estimate the class bias of the model parameters during the training process.
We apply a secondary correction to the model's pseudo-labels for unlabeled samples.
arXiv Detail & Related papers (2023-12-27T15:06:36Z) - Learning in Imperfect Environment: Multi-Label Classification with
Long-Tailed Distribution and Partial Labels [53.68653940062605]
We introduce a novel task, Partial labeling and Long-Tailed Multi-Label Classification (PLT-MLC)
We find that most LT-MLC and PL-MLC approaches fail to solve the degradation-MLC.
We propose an end-to-end learning framework: textbfCOrrection $rightarrow$ textbfModificattextbfIon $rightarrow$ balantextbfCe.
arXiv Detail & Related papers (2023-04-20T20:05:08Z) - BiSTF: Bilateral-Branch Self-Training Framework for Semi-Supervised
Large-scale Fine-Grained Recognition [28.06659482245647]
Semi-supervised Fine-Grained Recognition is a challenge task due to data imbalance, high interclass similarity and domain mismatch.
We propose Bilateral-Branch Self-Training Framework (BiSTF) to improve existing semi-balanced and domain-shifted fine-grained data.
We show BiSTF outperforms the existing state-of-the-art SSL on Semi-iNat dataset.
arXiv Detail & Related papers (2021-07-14T15:28:54Z) - Distribution-Aware Semantics-Oriented Pseudo-label for Imbalanced
Semi-Supervised Learning [80.05441565830726]
This paper addresses imbalanced semi-supervised learning, where heavily biased pseudo-labels can harm the model performance.
We propose a general pseudo-labeling framework to address the bias motivated by this observation.
We term the novel pseudo-labeling framework for imbalanced SSL as Distribution-Aware Semantics-Oriented (DASO) Pseudo-label.
arXiv Detail & Related papers (2021-06-10T11:58:25Z) - PLM: Partial Label Masking for Imbalanced Multi-label Classification [59.68444804243782]
Neural networks trained on real-world datasets with long-tailed label distributions are biased towards frequent classes and perform poorly on infrequent classes.
We propose a method, Partial Label Masking (PLM), which utilizes this ratio during training.
Our method achieves strong performance when compared to existing methods on both multi-label (MultiMNIST and MSCOCO) and single-label (imbalanced CIFAR-10 and CIFAR-100) image classification datasets.
arXiv Detail & Related papers (2021-05-22T18:07:56Z) - Distribution Aligning Refinery of Pseudo-label for Imbalanced
Semi-supervised Learning [126.31716228319902]
We develop Distribution Aligning Refinery of Pseudo-label (DARP) algorithm.
We show that DARP is provably and efficiently compatible with state-of-the-art SSL schemes.
arXiv Detail & Related papers (2020-07-17T09:16:05Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.