LaplaceConfidence: a Graph-based Approach for Learning with Noisy Labels
- URL: http://arxiv.org/abs/2307.16614v1
- Date: Mon, 31 Jul 2023 12:44:30 GMT
- Title: LaplaceConfidence: a Graph-based Approach for Learning with Noisy Labels
- Authors: Mingcai Chen, Yuntao Du, Wei Tang, Baoming Zhang, Hao Cheng, Shuwei
Qian, Chongjun Wang
- Abstract summary: We introduce LaplaceConfidence, a method to obtain label confidence (i.e., clean probabilities) utilizing the Laplacian energy.
LaplaceConfidence is embedded into a holistic method for robust training, where a co-training technique generates unbiased label confidence.
Our experiments demonstrate that LaplaceConfidence outperforms state-of-the-art methods on benchmark datasets under both synthetic and real-world noise.
- Score: 17.66525177980075
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In real-world applications, perfect labels are rarely available, making it
challenging to develop robust machine learning algorithms that can handle noisy
labels. Recent methods have focused on filtering noise based on the discrepancy
between model predictions and given noisy labels, assuming that samples with
small classification losses are clean. This work takes a different approach by
leveraging the consistency between the learned model and the entire noisy
dataset using the rich representational and topological information in the
data. We introduce LaplaceConfidence, a method that obtains label confidence
(i.e., clean probabilities) by utilizing the Laplacian energy. Specifically, it
first constructs graphs based on the feature representations of all noisy
samples and minimizes the Laplacian energy to produce a low-energy graph. Clean
labels should fit well into the low-energy graph while noisy ones should not,
allowing our method to determine the data's clean probabilities. Furthermore,
LaplaceConfidence is embedded into a holistic method for robust training, where
a co-training technique generates unbiased label confidence and a label
refurbishment technique makes better use of it. We also explore dimensionality
reduction to accommodate our method on large-scale noisy datasets.
Our experiments demonstrate that LaplaceConfidence outperforms state-of-the-art
methods on benchmark datasets under both synthetic and real-world noise.
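The core idea in the abstract, that clean labels should fit a low-energy graph built from feature representations while noisy ones should not, can be sketched with a small numpy example. This is a minimal illustration of the graph-smoothness principle, not the authors' implementation: the cosine kNN affinity, the per-sample Dirichlet (Laplacian) energy, and the exponential energy-to-confidence mapping are all simplifying assumptions made for this sketch.

```python
import numpy as np

def knn_affinity(features, k=3):
    # Cosine-similarity kNN graph, symmetrized; negative similarities
    # are clipped to zero so the affinity matrix stays non-negative.
    f = features / np.linalg.norm(features, axis=1, keepdims=True)
    sim = f @ f.T
    np.fill_diagonal(sim, -np.inf)          # exclude self-edges
    W = np.zeros_like(sim)
    nbrs = np.argsort(-sim, axis=1)[:, :k]  # k most similar samples per row
    for i in range(len(W)):
        W[i, nbrs[i]] = np.maximum(sim[i, nbrs[i]], 0.0)
    return np.maximum(W, W.T)               # symmetrize

def label_confidence(features, noisy_onehot, k=3):
    # Per-sample Laplacian (Dirichlet) energy of the given labels:
    #   E_i = sum_j W_ij * ||y_i - y_j||^2
    # A label that conflicts with its graph neighborhood accumulates
    # high energy and is therefore assigned a low clean-probability.
    W = knn_affinity(features, k)
    diff = noisy_onehot[:, None, :] - noisy_onehot[None, :, :]
    energy = (W * (diff ** 2).sum(axis=2)).sum(axis=1)
    degree = W.sum(axis=1)
    # Map degree-normalized energy to (0, 1]: low energy -> near 1.
    return np.exp(-energy / (degree + 1e-8))

# Two tight feature clusters; sample 2 sits in cluster A but carries
# cluster B's label, i.e. its label is noisy.
X = np.array([[1.0, 0.0], [0.9, 0.1], [1.1, -0.1],
              [0.0, 1.0], [0.1, 0.9], [-0.1, 1.1]])
Y = np.eye(2)[[0, 0, 1, 1, 1, 1]]
conf = label_confidence(X, Y, k=2)
```

On this toy data the mislabeled sample 2 disagrees with both of its in-cluster neighbors and receives the lowest confidence, while the untouched cluster (samples 3-5) is perfectly smooth and scores near 1, which is the signal the paper uses to estimate clean probabilities.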
Related papers
- Correcting Noisy Multilabel Predictions: Modeling Label Noise through Latent Space Shifts [4.795811957412855]
Noise in data appears to be inevitable in most real-world machine learning applications.
We investigate the less explored area of noisy label learning for multilabel classifications.
Our model posits that label noise arises from a shift in the latent variable, providing a more robust and beneficial means for noisy learning.
arXiv Detail & Related papers (2025-02-20T05:41:52Z)
- Efficient Adaptive Label Refinement for Label Noise Learning [14.617885790129336]
We propose Adaptive Label Refinement (ALR) to avoid incorrect labels and thoroughly learn from clean samples.
ALR is simple and efficient, requiring no prior knowledge of noise or auxiliary datasets.
We validate ALR's effectiveness through experiments on benchmark datasets with artificial label noise (CIFAR-10/100) and real-world datasets with inherent noise (ANIMAL-10N, Clothing1M, WebVision).
arXiv Detail & Related papers (2025-02-01T09:58:08Z)
- Learning from Noisy Labels for Long-tailed Data via Optimal Transport [2.8821062918162146]
We propose a novel approach to manage data characterized by both long-tailed distributions and noisy labels.
We employ optimal transport strategies to generate pseudo-labels for the noise set in a semi-supervised training manner.
arXiv Detail & Related papers (2024-08-07T14:15:18Z)
- Extracting Clean and Balanced Subset for Noisy Long-tailed Classification [66.47809135771698]
We develop a novel pseudo labeling method using class prototypes from the perspective of distribution matching.
By setting a manually-specified probability measure, we can reduce the side-effects of noisy and long-tailed data simultaneously.
Our method can extract this class-balanced subset with clean labels, which brings effective performance gains for long-tailed classification with label noise.
arXiv Detail & Related papers (2024-04-10T07:34:37Z)
- Rethinking Noisy Label Learning in Real-world Annotation Scenarios from the Noise-type Perspective [38.24239397999152]
We propose a novel sample selection-based approach for noisy label learning, called Proto-semi.
Proto-semi divides all samples into confident and unconfident subsets via a warm-up stage.
By leveraging the confident dataset, prototype vectors are constructed to capture class characteristics.
Empirical evaluations on a real-world annotated dataset substantiate the robustness of Proto-semi in handling the problem of learning from noisy labels.
arXiv Detail & Related papers (2023-07-28T10:57:38Z)
- Label-Retrieval-Augmented Diffusion Models for Learning from Noisy Labels [61.97359362447732]
Learning from noisy labels is an important and long-standing problem in machine learning for real applications.
In this paper, we reformulate the label-noise problem from a generative-model perspective.
Our model achieves new state-of-the-art (SOTA) results on all the standard real-world benchmark datasets.
arXiv Detail & Related papers (2023-05-31T03:01:36Z)
- Neighborhood Collective Estimation for Noisy Label Identification and Correction [92.20697827784426]
Learning with noisy labels (LNL) aims at designing strategies to improve model performance and generalization by mitigating the effects of model overfitting to noisy labels.
Recent advances employ the predicted label distributions of individual samples to perform noise verification and noisy label correction, easily giving rise to confirmation bias.
We propose Neighborhood Collective Estimation, in which the predictive reliability of a candidate sample is re-estimated by contrasting it against its feature-space nearest neighbors.
arXiv Detail & Related papers (2022-08-05T14:47:22Z)
- Learning with Neighbor Consistency for Noisy Labels [69.83857578836769]
We present a method for learning from noisy labels that leverages similarities between training examples in feature space.
We evaluate our method on datasets with both synthetic (CIFAR-10, CIFAR-100) and realistic (mini-WebVision, Clothing1M, mini-ImageNet-Red) noise.
arXiv Detail & Related papers (2022-02-04T15:46:27Z)
- Learning with Noisy Labels Revisited: A Study Using Real-World Human Annotations [54.400167806154535]
Existing research on learning with noisy labels mainly focuses on synthetic label noise.
This work presents two new benchmark datasets (CIFAR-10N, CIFAR-100N).
We show that real-world noisy labels follow an instance-dependent pattern rather than the classically adopted class-dependent ones.
arXiv Detail & Related papers (2021-10-22T22:42:11Z)
- Tackling Instance-Dependent Label Noise via a Universal Probabilistic Model [80.91927573604438]
This paper proposes a simple yet universal probabilistic model, which explicitly relates noisy labels to their instances.
Experiments on datasets with both synthetic and real-world label noise verify that the proposed method yields significant improvements on robustness.
arXiv Detail & Related papers (2021-01-14T05:43:51Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences.