Semi-supervised Long-tailed Recognition using Alternate Sampling
- URL: http://arxiv.org/abs/2105.00133v1
- Date: Sat, 1 May 2021 00:43:38 GMT
- Title: Semi-supervised Long-tailed Recognition using Alternate Sampling
- Authors: Bo Liu, Haoxiang Li, Hao Kang, Nuno Vasconcelos, Gang Hua
- Abstract summary: Main challenges in long-tailed recognition come from the imbalanced data distribution and sample scarcity in its tail classes.
We propose a new recognition setting, namely semi-supervised long-tailed recognition.
We demonstrate significant accuracy improvements over other competitive methods on two datasets.
- Score: 95.93760490301395
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Main challenges in long-tailed recognition come from the imbalanced data
distribution and sample scarcity in its tail classes. While techniques have
been proposed to achieve a more balanced training loss and to enrich the data
variation of tail classes with synthesized samples, we instead leverage
readily available unlabeled data to boost recognition accuracy. The idea
leads to a new
recognition setting, namely semi-supervised long-tailed recognition. We argue
this setting better resembles the real-world data collection and annotation
process and hence can help close the gap to real-world scenarios. To address
the semi-supervised long-tailed recognition problem, we present an alternate
sampling framework combining the intuitions from successful methods in these
two research areas. The classifier and feature embedding are learned separately
and updated iteratively. A class-balanced sampling strategy is employed to
train the classifier so that it is not affected by the quality of the pseudo
labels on the unlabeled data. A consistency loss is introduced to limit the
impact of the unlabeled data while leveraging it to update the feature
embedding. We demonstrate significant accuracy improvements over other
competitive methods on two datasets.
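To make the two ingredients concrete, below is a minimal PyTorch-style sketch of class-balanced sampling and an embedding consistency loss. It illustrates the ideas in the abstract only; the function names and the MSE form of the consistency term are assumptions, not the authors' released code.

```python
import random
from collections import defaultdict

import torch
import torch.nn.functional as F


def class_balanced_indices(labels, num_samples):
    """labels: sequence of int class ids. Draw indices so every class is
    picked with equal probability, regardless of how many training
    examples it has (head or tail)."""
    by_class = defaultdict(list)
    for idx, y in enumerate(labels):
        by_class[y].append(idx)
    classes = list(by_class)
    return [random.choice(by_class[random.choice(classes)])
            for _ in range(num_samples)]


def embedding_consistency_loss(encoder, x_unlabeled, augment):
    """Penalize disagreement between the embeddings of two augmented views
    of the same unlabeled images, limiting how much noisy pseudo-labels
    can distort the learned feature space."""
    z1 = encoder(augment(x_unlabeled))
    z2 = encoder(augment(x_unlabeled))
    return F.mse_loss(z1, z2)
```

In the alternating scheme described above, the classifier stage would consume batches drawn via `class_balanced_indices`, while the embedding stage adds the consistency term to its training loss.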
Related papers
- Continuous Contrastive Learning for Long-Tailed Semi-Supervised Recognition [50.61991746981703]
Current state-of-the-art LTSSL approaches rely on high-quality pseudo-labels for large-scale unlabeled data.
This paper introduces a novel probabilistic framework that unifies various recent proposals in long-tail learning.
We introduce a continuous contrastive learning method, CCL, extending our framework to unlabeled data using reliable and smoothed pseudo-labels.
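As a rough illustration of what a "smoothed" pseudo-label can mean, one can blend the model's soft prediction toward a uniform distribution (or, closer to CCL's spirit, average predictions over time). The blending form below is an assumption for illustration, not the paper's exact recipe.

```python
import torch

def smoothed_pseudo_label(logits, alpha=0.9):
    # Soften the model's prediction toward uniform so unreliable,
    # overconfident predictions make weaker training targets.
    probs = torch.softmax(logits, dim=-1)
    uniform = torch.full_like(probs, 1.0 / probs.size(-1))
    return alpha * probs + (1.0 - alpha) * uniform
```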
arXiv Detail & Related papers (2024-10-08T15:06:10Z)
- Learning from Noisy Labels for Long-tailed Data via Optimal Transport [2.8821062918162146]
We propose a novel approach to manage data characterized by both long-tailed distributions and noisy labels.
We employ optimal transport strategies to generate pseudo-labels for the noise set in a semi-supervised training manner.
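A generic Sinkhorn-style sketch of optimal-transport pseudo-labeling: predictions are rebalanced toward a target class marginal so tail classes still receive labels. The iteration below is the standard Sinkhorn scheme, assumed here for illustration; the paper's exact transport formulation may differ.

```python
import torch

def ot_pseudo_labels(logits, class_marginal, n_iters=50, eps=0.05):
    """logits: (N, C); class_marginal: (C,) target class proportions
    summing to 1. Returns hard pseudo-labels after Sinkhorn balancing."""
    q = torch.softmax(logits / eps, dim=1)
    for _ in range(n_iters):
        q = q * (class_marginal / q.sum(dim=0).clamp_min(1e-8))  # match class mass
        q = q / q.sum(dim=1, keepdim=True).clamp_min(1e-8)       # one unit per sample
    return q.argmax(dim=1)
```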
arXiv Detail & Related papers (2024-08-07T14:15:18Z)
- Fairness Improves Learning from Noisily Labeled Long-Tailed Data [119.0612617460727]
Long-tailed and noisily labeled data frequently appear in real-world applications and impose significant challenges for learning.
We introduce the Fairness Regularizer (FR), which regularizes the performance gap between any two sub-populations.
We show that the introduced fairness regularizer improves the performances of sub-populations on the tail and the overall learning performance.
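In spirit, such a regularizer can be written as a penalty on the gap between the average losses of two sub-populations (say, head vs. tail classes). A hedged sketch follows, with the absolute-gap form assumed rather than taken from the paper.

```python
import torch

def fairness_gap(per_sample_losses, is_tail):
    """per_sample_losses: (N,) losses; is_tail: (N,) bool mask marking
    tail-class samples. Returns the head/tail performance gap."""
    tail = per_sample_losses[is_tail].mean()
    head = per_sample_losses[~is_tail].mean()
    return (tail - head).abs()

# total_loss = per_sample_losses.mean() + lam * fairness_gap(per_sample_losses, is_tail)
```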
arXiv Detail & Related papers (2023-03-22T03:46:51Z)
- Learning with Noisy labels via Self-supervised Adversarial Noisy Masking [33.87292143223425]
We propose a novel training approach termed adversarial noisy masking.
It adaptively modulates the input data and label simultaneously, preventing the model from overfitting noisy samples.
It is tested on both synthetic and real-world noisy datasets.
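As a very rough sketch of modulating input and label together, one can mask part of the input and soften the label by the same ratio; the paper's adversarial, learned masking is more involved than this random variant, which is included only to fix the idea.

```python
import torch

def noisy_masking(x, y_onehot, mask_ratio=0.3):
    # Randomly zero out a fraction of the input and soften the label by
    # the same fraction, so the network sees weaker evidence paired with
    # a less committed target.
    keep = (torch.rand_like(x) > mask_ratio).float()
    x_masked = x * keep
    y_soft = (1.0 - mask_ratio) * y_onehot + mask_ratio / y_onehot.size(-1)
    return x_masked, y_soft
```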
arXiv Detail & Related papers (2023-02-14T03:13:26Z)
- SoftMatch: Addressing the Quantity-Quality Trade-off in Semi-supervised Learning [101.86916775218403]
This paper revisits the popular pseudo-labeling methods via a unified sample weighting formulation.
We propose SoftMatch to overcome the trade-off by maintaining both high quantity and high quality of pseudo-labels during training.
In experiments, SoftMatch shows substantial improvements across a wide variety of benchmarks, including image, text, and imbalanced classification.
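SoftMatch's weighting uses a truncated Gaussian over prediction confidence: samples above the running mean confidence get full weight, while less confident ones are down-weighted smoothly instead of being discarded. A minimal sketch, where mu and sigma are assumed to be tracked as running statistics:

```python
import torch

def soft_sample_weights(probs, mu, sigma):
    """probs: (N, C) softmax outputs on unlabeled data; mu, sigma:
    running mean/std of confidence. Returns per-sample weights in [0, 1]."""
    conf = probs.max(dim=1).values
    gauss = torch.exp(-((conf - mu) ** 2) / (2.0 * sigma ** 2))
    return torch.where(conf >= mu, torch.ones_like(gauss), gauss)
```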
arXiv Detail & Related papers (2023-01-26T03:53:25Z)
- Contrastive Regularization for Semi-Supervised Learning [46.020125061295886]
We propose contrastive regularization to improve both the efficiency and accuracy of consistency regularization by exploiting well-clustered features of unlabeled data.
Our method also shows robust performance on open-set semi-supervised learning where unlabeled data includes out-of-distribution samples.
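For reference, a standard InfoNCE-style contrastive term on two augmented views looks like the sketch below; the paper's contrastive regularization builds on this kind of objective with cluster-aware refinements.

```python
import torch
import torch.nn.functional as F

def info_nce(z1, z2, tau=0.1):
    """z1, z2: (N, D) features of two augmentations of the same batch."""
    z1, z2 = F.normalize(z1, dim=1), F.normalize(z2, dim=1)
    logits = z1 @ z2.t() / tau                       # (N, N) similarities
    targets = torch.arange(z1.size(0), device=z1.device)
    return F.cross_entropy(logits, targets)          # i-th rows match i-th views
```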
arXiv Detail & Related papers (2022-01-17T07:20:11Z)
- S3: Supervised Self-supervised Learning under Label Noise [53.02249460567745]
In this paper we address the problem of classification in the presence of label noise.
In the heart of our method is a sample selection mechanism that relies on the consistency between the annotated label of a sample and the distribution of the labels in its neighborhood in the feature space.
Our method significantly surpasses previous methods on both CIFAR-10 and CIFAR-100 with artificial noise and on real-world noisy datasets such as WebVision and ANIMAL-10N.
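A sketch of the neighborhood-consistency selection described above: keep a sample as clean when its annotated label agrees with its k nearest neighbors in feature space. The value of k and the majority-agreement threshold are illustrative assumptions.

```python
import torch

def select_clean(features, labels, k=10):
    """features: (N, D) embeddings; labels: (N,) long tensor of annotated
    labels. Returns a bool mask of samples judged clean."""
    features = torch.nn.functional.normalize(features, dim=1)
    sims = features @ features.t()
    sims.fill_diagonal_(-float("inf"))              # exclude the sample itself
    knn = sims.topk(k, dim=1).indices               # (N, k) neighbor indices
    neighbor_labels = labels[knn]                   # (N, k)
    agree = (neighbor_labels == labels.unsqueeze(1)).float().mean(dim=1)
    return agree > 0.5                              # majority agreement mask
```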
arXiv Detail & Related papers (2021-11-22T15:49:20Z)
- BiSTF: Bilateral-Branch Self-Training Framework for Semi-Supervised Large-scale Fine-Grained Recognition [28.06659482245647]
Semi-supervised fine-grained recognition is a challenging task due to data imbalance, high inter-class similarity, and domain mismatch.
We propose the Bilateral-Branch Self-Training Framework (BiSTF) to improve existing semi-supervised learning methods on class-imbalanced and domain-shifted fine-grained data.
We show BiSTF outperforms existing state-of-the-art SSL methods on the Semi-iNat dataset.
arXiv Detail & Related papers (2021-07-14T15:28:54Z)
- Exploiting Sample Uncertainty for Domain Adaptive Person Re-Identification [137.9939571408506]
We estimate and exploit the credibility of the assigned pseudo-label of each sample to alleviate the influence of noisy labels.
Our uncertainty-guided optimization brings significant improvement and achieves the state-of-the-art performance on benchmark datasets.
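A hedged sketch of weighting pseudo-labeled samples by credibility: down-weight samples whose pseudo-label is uncertain. The entropy-based estimate below is an assumption for illustration; the paper derives its uncertainty measure differently.

```python
import math
import torch

def credibility_weights(probs):
    """probs: (N, C) softmax outputs. Returns weights in [0, 1], where
    1 means a confident (low-entropy) pseudo-label."""
    entropy = -(probs * probs.clamp_min(1e-8).log()).sum(dim=1)
    return 1.0 - entropy / math.log(probs.size(1))
```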
arXiv Detail & Related papers (2020-12-16T04:09:04Z)