Related papers: Early Stopping Against Label Noise Without Validation Data

URL: http://arxiv.org/abs/2502.07551v1
Date: Tue, 11 Feb 2025 13:40:15 GMT
Title: Early Stopping Against Label Noise Without Validation Data
Authors: Suqin Yuan, Lei Feng, Tongliang Liu
Abstract summary: We propose a novel early stopping method called Label Wave, which does not require validation data for selecting the desired model. We show both the effectiveness of the Label Wave method across various settings and its capability to enhance the performance of existing methods for learning with noisy labels.
Score: 54.27621957395026
License: http://creativecommons.org/licenses/by-sa/4.0/
Abstract: Early stopping methods in deep learning face the challenge of balancing the volume of training and validation data, especially in the presence of label noise. Concretely, sparing more data for validation from training data would limit the performance of the learned model, yet insufficient validation data could result in a sub-optimal selection of the desired model. In this paper, we propose a novel early stopping method called Label Wave, which does not require validation data for selecting the desired model in the presence of label noise. It works by tracking the changes in the model's predictions on the training set during the training process, aiming to halt training before the model unduly fits mislabeled data. This method is empirically supported by our observation that minimum fluctuations in predictions typically occur at the training epoch before the model excessively fits mislabeled data. Through extensive experiments, we show both the effectiveness of the Label Wave method across various settings and its capability to enhance the performance of existing methods for learning with noisy labels.
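
The idea in the abstract, tracking how much the model's training-set predictions fluctuate and stopping where that fluctuation is smallest, can be illustrated with a minimal sketch. The abstract does not give the exact metric or stopping rule, so everything here is an assumption: fluctuation is taken to be the number of training examples whose predicted class changes between consecutive epochs, and the names `prediction_changes`, `moving_average`, `select_stopping_epoch`, `window`, and `patience` are hypothetical, not taken from the paper.

```python
from math import inf
from typing import List, Optional, Sequence


def prediction_changes(prev: Sequence[int], curr: Sequence[int]) -> int:
    """Count training examples whose predicted class differs between two epochs."""
    if len(prev) != len(curr):
        raise ValueError("prediction vectors must have the same length")
    return sum(p != c for p, c in zip(prev, curr))


def moving_average(values: Sequence[float], window: int) -> List[float]:
    """Trailing moving average; early entries average over what is available."""
    smoothed = []
    for i in range(len(values)):
        chunk = values[max(0, i - window + 1): i + 1]
        smoothed.append(sum(chunk) / len(chunk))
    return smoothed


def select_stopping_epoch(
    preds_per_epoch: Sequence[Sequence[int]],
    window: int = 1,      # assumed smoothing width, not from the paper
    patience: int = 2,    # assumed number of non-improving epochs to tolerate
) -> Optional[int]:
    """Return the epoch index at which smoothed prediction fluctuation is lowest.

    preds_per_epoch[t] holds the predicted class of every training example
    after epoch t. No validation data is used. Returns None when fewer than
    two epochs are available.
    """
    changes = [
        prediction_changes(a, b)
        for a, b in zip(preds_per_epoch, preds_per_epoch[1:])
    ]
    smoothed = moving_average(changes, window)

    best_epoch, best_value, waited = None, inf, 0
    for i, value in enumerate(smoothed):
        epoch = i + 1  # changes[i] compares epoch i with epoch i + 1
        if value < best_value:
            best_epoch, best_value, waited = epoch, value, 0
        else:
            waited += 1
            if waited >= patience:
                break  # fluctuation has stopped decreasing
    return best_epoch


# Toy history: 4 training examples, predictions recorded after 5 epochs.
history = [
    [0, 0, 0, 0],
    [1, 1, 1, 0],
    [1, 1, 1, 1],
    [0, 0, 1, 1],
    [1, 1, 0, 0],
]
stop_at = select_stopping_epoch(history, window=1, patience=2)
```

In a real training loop, the predictions would be recorded once per epoch over the (noisily labeled) training set, and the checkpoint saved at the selected epoch would be kept as the final model.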

Related papers

Classifying Long-tailed and Label-noise Data via Disentangling and Unlearning [58.052712054684946]
In real-world datasets, the challenges of long-tailed distributions and noisy labels often coexist. We propose a novel method called Disentangling and Unlearning for Long-tailed and Label-noisy data.
arXiv Detail & Related papers (2025-03-14T13:58:27Z)
Uncertainty-aware Sampling for Long-tailed Semi-supervised Learning [89.98353600316285]
We introduce uncertainty into the modeling process for pseudo-label sampling, taking into account that the model performance on the tailed classes varies over different training stages. This approach allows the model to perceive the uncertainty of pseudo-labels at different training stages, thereby adaptively adjusting the selection thresholds for different classes. Compared to other methods such as the baseline method FixMatch, UDTS achieves an increase in accuracy of at least approximately 5.26%, 1.75%, 9.96%, and 1.28% on the natural scene image datasets.
arXiv Detail & Related papers (2024-01-09T08:59:39Z)
Learning in the Wild: Towards Leveraging Unlabeled Data for Effectively Tuning Pre-trained Code Models [38.7352992942213]
We propose a novel approach named HINT to improve pre-trained code models with large-scale unlabeled datasets. HINT includes two main modules: HybrId pseudo-labeled data selection and Noise-tolerant Training. The experimental results show that HINT can better leverage those unlabeled data in a task-specific way.
arXiv Detail & Related papers (2024-01-02T06:39:00Z)
Understanding and Mitigating the Label Noise in Pre-training on Downstream Tasks [91.15120211190519]
This paper aims to understand the nature of noise in pre-training datasets and to mitigate its impact on downstream tasks. We propose a light-weight black-box tuning method (NMTune) to affine the feature space to mitigate the malignant effect of noise.
arXiv Detail & Related papers (2023-09-29T06:18:15Z)
Boosting Semi-Supervised Learning by bridging high and low-confidence predictions [4.18804572788063]
Pseudo-labeling is a crucial technique in semi-supervised learning (SSL). We propose a new method called ReFixMatch, which aims to utilize all of the unlabeled data during training.
arXiv Detail & Related papers (2023-08-15T00:27:18Z)
Soft Curriculum for Learning Conditional GANs with Noisy-Labeled and Uncurated Unlabeled Data [70.25049762295193]
We introduce a novel conditional image generation framework that accepts noisy-labeled and uncurated data during training. We propose soft curriculum learning, which assigns instance-wise weights for adversarial training while assigning new labels for unlabeled data. Our experiments show that our approach outperforms existing semi-supervised and label-noise robust methods in terms of both quantitative and qualitative performance.
arXiv Detail & Related papers (2023-07-17T08:31:59Z)
Rethinking Precision of Pseudo Label: Test-Time Adaptation via Complementary Learning [10.396596055773012]
We propose a novel complementary learning approach to enhance test-time adaptation. In test-time adaptation tasks, information from the source domain is typically unavailable. We highlight that the risk function of complementary labels agrees with their Vanilla loss formula.
arXiv Detail & Related papers (2023-01-15T03:36:33Z)
Self-Supervised Noisy Label Learning for Source-Free Unsupervised Domain Adaptation [87.60688582088194]
We propose a novel Self-Supervised Noisy Label Learning method. Our method can easily achieve state-of-the-art results and surpass other methods by a very large margin.
arXiv Detail & Related papers (2021-02-23T10:51:45Z)
Improving Generalization of Deep Fault Detection Models in the Presence of Mislabeled Data [1.3535770763481902]
We propose a novel two-step framework for robust training with label noise. In the first step, we identify outliers (including the mislabeled samples) based on the update in the hypothesis space. In the second step, we propose different approaches to modifying the training data based on the identified outliers and a data augmentation technique.
arXiv Detail & Related papers (2020-09-30T12:33:25Z)

This list is automatically generated from the titles and abstracts of the papers on this site.