Related papers: Efficient Adaptive Label Refinement for Label Noise Learning

Efficient Adaptive Label Refinement for Label Noise Learning

URL: http://arxiv.org/abs/2502.00386v1
Date: Sat, 01 Feb 2025 09:58:08 GMT
Title: Efficient Adaptive Label Refinement for Label Noise Learning
Authors: Wenzhen Zhang, Debo Cheng, Guangquan Lu, Bo Zhou, Jiaye Li, Shichao Zhang,
Abstract summary: We propose Adaptive Label Refinement (ALR) to avoid incorrect labels and thoroughly learning clean samples.<n>ALR is simple and efficient, requiring no prior knowledge of noise or auxiliary datasets.<n>We validate ALR's effectiveness through experiments on benchmark datasets with artificial label noise (CIFAR-10/100) and real-world datasets with inherent noise (ANIMAL-10N, Clothing1M, WebVision)
Score: 14.617885790129336
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Deep neural networks are highly susceptible to overfitting noisy labels, which leads to degraded performance. Existing methods address this issue by employing manually defined criteria, aiming to achieve optimal partitioning in each iteration to avoid fitting noisy labels while thoroughly learning clean samples. However, this often results in overly complex and difficult-to-train models. To address this issue, we decouple the tasks of avoiding fitting incorrect labels and thoroughly learning clean samples and propose a simple yet highly applicable method called Adaptive Label Refinement (ALR). First, inspired by label refurbishment techniques, we update the original hard labels to soft labels using the model's predictions to reduce the risk of fitting incorrect labels. Then, by introducing the entropy loss, we gradually `harden' the high-confidence soft labels, guiding the model to better learn from clean samples. This approach is simple and efficient, requiring no prior knowledge of noise or auxiliary datasets, making it more accessible compared to existing methods. We validate ALR's effectiveness through experiments on benchmark datasets with artificial label noise (CIFAR-10/100) and real-world datasets with inherent noise (ANIMAL-10N, Clothing1M, WebVision). The results show that ALR outperforms state-of-the-art methods.

Related papers

Optimal Labeler Assignment and Sampling for Active Learning in the Presence of Imperfect Labels [13.796089124499318]
We propose a novel AL framework to construct a robust classification model by minimizing noise levels.<n>Our approach includes an assignment model that optimally assigns query points to labelers, aiming to minimize the maximum possible noise within each cycle.<n>Our experiments demonstrate that our approach significantly improves classification performance compared to several benchmark methods.
arXiv Detail & Related papers (2025-12-14T23:06:37Z)
Mitigating Instance-Dependent Label Noise: Integrating Self-Supervised Pretraining with Pseudo-Label Refinement [3.272177633069322]
Real-world datasets often contain noisy labels due to human error, ambiguity, or resource constraints during the annotation process.<n>We propose a novel framework that combines self-supervised learning using SimCLR with iterative pseudo-label refinement.<n>Our approach significantly outperforms several state-of-the-art methods, particularly under high noise conditions.
arXiv Detail & Related papers (2024-12-06T09:56:49Z)
Extracting Clean and Balanced Subset for Noisy Long-tailed Classification [66.47809135771698]
We develop a novel pseudo labeling method using class prototypes from the perspective of distribution matching. By setting a manually-specific probability measure, we can reduce the side-effects of noisy and long-tailed data simultaneously. Our method can extract this class-balanced subset with clean labels, which brings effective performance gains for long-tailed classification with label noise.
arXiv Detail & Related papers (2024-04-10T07:34:37Z)
Group Benefits Instances Selection for Data Purification [21.977432359384835]
Existing methods for combating label noise are typically designed and tested on synthetic datasets. We propose a method named GRIP to alleviate the noisy label problem for both synthetic and real-world datasets.
arXiv Detail & Related papers (2024-03-23T03:06:19Z)
Label-Retrieval-Augmented Diffusion Models for Learning from Noisy Labels [61.97359362447732]
Learning from noisy labels is an important and long-standing problem in machine learning for real applications. In this paper, we reformulate the label-noise problem from a generative-model perspective. Our model achieves new state-of-the-art (SOTA) results on all the standard real-world benchmark datasets.
arXiv Detail & Related papers (2023-05-31T03:01:36Z)
Neighborhood Collective Estimation for Noisy Label Identification and Correction [92.20697827784426]
Learning with noisy labels (LNL) aims at designing strategies to improve model performance and generalization by mitigating the effects of model overfitting to noisy labels. Recent advances employ the predicted label distributions of individual samples to perform noise verification and noisy label correction, easily giving rise to confirmation bias. We propose Neighborhood Collective Estimation, in which the predictive reliability of a candidate sample is re-estimated by contrasting it against its feature-space nearest neighbors.
arXiv Detail & Related papers (2022-08-05T14:47:22Z)
Sample Prior Guided Robust Model Learning to Suppress Noisy Labels [8.119439844514973]
We propose PGDF, a novel framework to learn a deep model to suppress noise by generating the samples' prior knowledge. Our framework can save more informative hard clean samples into the cleanly labeled set. We evaluate our method using synthetic datasets based on CIFAR-10 and CIFAR-100, as well as on the real-world datasets WebVision and Clothing1M.
arXiv Detail & Related papers (2021-12-02T13:09:12Z)
S3: Supervised Self-supervised Learning under Label Noise [53.02249460567745]
In this paper we address the problem of classification in the presence of label noise. In the heart of our method is a sample selection mechanism that relies on the consistency between the annotated label of a sample and the distribution of the labels in its neighborhood in the feature space. Our method significantly surpasses previous methods on both CIFARCIFAR100 with artificial noise and real-world noisy datasets such as WebVision and ANIMAL-10N.
arXiv Detail & Related papers (2021-11-22T15:49:20Z)
An Ensemble Noise-Robust K-fold Cross-Validation Selection Method for Noisy Labels [0.9699640804685629]
Large-scale datasets tend to contain mislabeled samples that can be memorized by deep neural networks (DNNs) We present Ensemble Noise-robust K-fold Cross-Validation Selection (E-NKCVS) to effectively select clean samples from noisy data. We evaluate our approach on various image and text classification tasks where the labels have been manually corrupted with different noise ratios.
arXiv Detail & Related papers (2021-07-06T02:14:52Z)
Tackling Instance-Dependent Label Noise via a Universal Probabilistic Model [80.91927573604438]
This paper proposes a simple yet universal probabilistic model, which explicitly relates noisy labels to their instances. Experiments on datasets with both synthetic and real-world label noise verify that the proposed method yields significant improvements on robustness.
arXiv Detail & Related papers (2021-01-14T05:43:51Z)
Learning to Purify Noisy Labels via Meta Soft Label Corrector [49.92310583232323]
Recent deep neural networks (DNNs) can easily overfit to biased training data with noisy labels. Label correction strategy is commonly used to alleviate this issue. We propose a meta-learning model which could estimate soft labels through meta-gradient descent step.
arXiv Detail & Related papers (2020-08-03T03:25:17Z)

This list is automatically generated from the titles and abstracts of the papers in this site.