Addressing Long-Tail Noisy Label Learning Problems: a Two-Stage Solution
with Label Refurbishment Considering Label Rarity
- URL: http://arxiv.org/abs/2403.02363v1
- Date: Mon, 4 Mar 2024 08:06:57 GMT
- Authors: Ying-Hsuan Wu, Jun-Wei Hsieh, Li Xin, Shin-You Teng, Yi-Kuan Hsieh,
Ming-Ching Chang
- Abstract summary: We introduce an effective two-stage approach by combining soft-label refurbishing with multi-expert ensemble learning.
In the first stage of robust soft label refurbishing, we acquire unbiased features through contrastive learning.
In the second stage, our label refurbishment method is applied to obtain soft labels for multi-expert ensemble learning.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Real-world datasets commonly exhibit noisy labels and class imbalance, such
as long-tailed distributions. While previous research often addresses this
issue by differentiating between noisy and clean samples, relying on
predictions made from noisy, long-tailed data introduces potential errors. To
overcome the limitations of prior works, we introduce an effective two-stage
approach by combining soft-label refurbishing with multi-expert ensemble
learning. In the first stage of robust soft label refurbishing, we acquire
unbiased features through contrastive learning, making preliminary predictions
using a classifier trained with a carefully designed BAlanced Noise-tolerant
Cross-entropy (BANC) loss. In the second stage, our label refurbishment method
is applied to obtain soft labels for multi-expert ensemble learning, providing
a principled solution to the long-tail noisy label problem. Experiments
conducted across multiple benchmarks validate the superiority of our approach,
Label Refurbishment considering Label Rarity (LR^2), achieving remarkable
accuracies of 94.19% and 77.05% on simulated noisy CIFAR-10 and CIFAR-100
long-tail datasets, as well as 77.74% and 81.40% on real-noise long-tail
datasets, Food-101N and Animal-10N, surpassing existing state-of-the-art
methods.
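The abstract's first stage pairs balancing against long-tailed class priors with tolerance to noisy one-hot labels. The paper does not publish the exact form of its BANC loss here, so the following is only an illustrative sketch of that general idea, combining logit adjustment by log class priors (a standard long-tail technique) with label smoothing (a standard noise-tolerance technique); the function name and every detail are assumptions, not the authors' method.

```python
import numpy as np

def balanced_noise_tolerant_ce(logits, labels, class_counts, smoothing=0.1):
    """Sketch only: NOT the paper's BANC loss.

    Adds log class priors to the logits so that head classes must score
    higher to win (logit adjustment), and softens the one-hot targets
    with label smoothing so a single wrong label cannot dominate.
    """
    priors = class_counts / class_counts.sum()
    adjusted = logits + np.log(priors + 1e-12)      # balance for imbalance
    z = adjusted - adjusted.max(axis=1, keepdims=True)
    log_probs = z - np.log(np.exp(z).sum(axis=1, keepdims=True))
    n_classes = logits.shape[1]
    one_hot = np.eye(n_classes)[labels]
    soft = one_hot * (1.0 - smoothing) + smoothing / n_classes  # noise tolerance
    return float(-(soft * log_probs).sum(axis=1).mean())
```

With uniform class counts and `smoothing=0.0` this reduces to ordinary softmax cross-entropy, which makes the two added mechanisms easy to isolate.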
Related papers
- Co-Learning Meets Stitch-Up for Noisy Multi-label Visual Recognition [70.00984078351927]
This paper focuses on reducing noise by exploiting inherent properties of multi-label classification and long-tailed learning in noisy settings.
We propose a Stitch-Up augmentation to synthesize a cleaner sample, which directly reduces multi-label noise.
A Heterogeneous Co-Learning framework is further designed to leverage the inconsistency between long-tailed and balanced distributions.
arXiv Detail & Related papers (2023-07-03T09:20:28Z)
- Neighborhood Collective Estimation for Noisy Label Identification and Correction [92.20697827784426]
Learning with noisy labels (LNL) aims at designing strategies to improve model performance and generalization by mitigating the effects of model overfitting to noisy labels.
Recent advances employ the predicted label distributions of individual samples to perform noise verification and noisy label correction, easily giving rise to confirmation bias.
We propose Neighborhood Collective Estimation, in which the predictive reliability of a candidate sample is re-estimated by contrasting it against its feature-space nearest neighbors.
arXiv Detail & Related papers (2022-08-05T14:47:22Z)
- Reliable Label Correction is a Good Booster When Learning with Extremely Noisy Labels [65.79898033530408]
We introduce a novel framework, termed as LC-Booster, to explicitly tackle learning under extreme noise.
LC-Booster incorporates label correction into the sample selection, so that more purified samples, through the reliable label correction, can be utilized for training.
Experiments show that LC-Booster advances state-of-the-art results on several noisy-label benchmarks.
arXiv Detail & Related papers (2022-04-30T07:19:03Z)
- PARS: Pseudo-Label Aware Robust Sample Selection for Learning with Noisy Labels [5.758073912084364]
We propose PARS: Pseudo-Label Aware Robust Sample Selection.
PARS exploits all training samples using both the raw/noisy labels and estimated/refurbished pseudo-labels via self-training.
Results show that PARS significantly outperforms the state of the art on extensive studies on the noisy CIFAR-10 and CIFAR-100 datasets.
arXiv Detail & Related papers (2022-01-26T09:31:55Z)
- S3: Supervised Self-supervised Learning under Label Noise [53.02249460567745]
In this paper we address the problem of classification in the presence of label noise.
In the heart of our method is a sample selection mechanism that relies on the consistency between the annotated label of a sample and the distribution of the labels in its neighborhood in the feature space.
Our method significantly surpasses previous methods on both CIFAR-10 and CIFAR-100 with artificial noise and on real-world noisy datasets such as WebVision and ANIMAL-10N.
arXiv Detail & Related papers (2021-11-22T15:49:20Z)
- Robust Long-Tailed Learning under Label Noise [50.00837134041317]
This work investigates the label noise problem under long-tailed label distribution.
We propose a robust framework that realizes noise detection for long-tailed learning.
Our framework can naturally leverage semi-supervised learning algorithms to further improve the generalisation.
arXiv Detail & Related papers (2021-08-26T03:45:00Z)
- Learning From Long-Tailed Data With Noisy Labels [0.0]
Class imbalance and noisy labels are the norm in many large-scale classification datasets.
We present a simple two-stage approach based on recent advances in self-supervised learning.
We find that self-supervised learning approaches are effectively able to cope with severe class imbalance.
arXiv Detail & Related papers (2021-08-25T07:45:40Z)
- Boosting Semi-Supervised Face Recognition with Noise Robustness [54.342992887966616]
This paper presents an effective solution to semi-supervised face recognition that is robust to the label noise arising from auto-labelling.
We develop a semi-supervised face recognition solution, named Noise Robust Learning-Labelling (NRoLL), which is based on the robust training ability empowered by GN.
arXiv Detail & Related papers (2021-05-10T14:43:11Z)
- LongReMix: Robust Learning with High Confidence Samples in a Noisy Label Environment [33.376639002442914]
We propose the new 2-stage noisy-label training algorithm LongReMix.
We test LongReMix on the noisy-label benchmarks CIFAR-10, CIFAR-100, WebVision, Clothing1M, and Food101-N.
Our approach achieves state-of-the-art performance in most datasets.
arXiv Detail & Related papers (2021-03-06T18:48:40Z)
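Several of the papers above (Neighborhood Collective Estimation, S3) judge a label's reliability by its agreement with feature-space nearest neighbors rather than by the model's own prediction, which avoids confirmation bias. A minimal sketch of that shared idea follows; it is a generic illustration, not either paper's exact method, and the function name is an assumption.

```python
import numpy as np

def neighborhood_label_agreement(features, labels, k=3):
    """Generic sketch of neighborhood-based label verification.

    A sample's label is considered more reliable when its k nearest
    neighbors in feature space carry the same label. Low agreement
    flags a likely noisy label for correction or down-weighting.
    """
    # Pairwise Euclidean distances between all samples.
    dists = np.linalg.norm(features[:, None, :] - features[None, :, :], axis=2)
    np.fill_diagonal(dists, np.inf)          # a sample is not its own neighbor
    nn_idx = np.argsort(dists, axis=1)[:, :k]
    # Fraction of the k neighbors whose label matches the sample's label.
    return (labels[nn_idx] == labels[:, None]).mean(axis=1)
```

On a toy two-cluster dataset with one mislabeled point, the mislabeled point gets agreement 0 while correctly labeled points get agreement 1, which is exactly the signal such methods threshold on.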
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.