Fine-tuning Pre-trained Models for Robustness Under Noisy Labels
- URL: http://arxiv.org/abs/2310.17668v1
- Date: Tue, 24 Oct 2023 20:28:59 GMT
- Title: Fine-tuning Pre-trained Models for Robustness Under Noisy Labels
- Authors: Sumyeong Ahn, Sihyeon Kim, Jongwoo Ko, Se-Young Yun
- Abstract summary: The presence of noisy labels in a training dataset can significantly impact the performance of machine learning models.
We introduce a novel algorithm called TURN, which robustly and efficiently transfers the prior knowledge of pre-trained models.
- Score: 34.68018860186995
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The presence of noisy labels in a training dataset can significantly impact
the performance of machine learning models. To tackle this issue, researchers
have explored methods for Learning with Noisy Labels to identify clean samples
and reduce the influence of noisy labels. However, constraining the influence
of a certain portion of the training dataset can result in a reduction in
overall generalization performance. To alleviate this, recent studies have
considered the careful utilization of noisy labels by leveraging huge
computational resources. Therefore, the increasing training cost necessitates a
reevaluation of efficiency. In other areas of research, there has been a focus
on developing fine-tuning techniques for large pre-trained models that aim to
achieve both high generalization performance and efficiency. However, these
methods have mainly concentrated on clean datasets, and there has been limited
exploration of the noisy label scenario. In this research, our aim is to find
an appropriate way to fine-tune pre-trained models for noisy labeled datasets.
To achieve this goal, we investigate the characteristics of pre-trained models
when they encounter noisy datasets. Through empirical analysis, we introduce a
novel algorithm called TURN, which robustly and efficiently transfers the prior
knowledge of pre-trained models. The algorithm consists of two main steps: (1)
independently tuning the linear classifier to protect the feature extractor
from being distorted by noisy labels, and (2) reducing the noisy label ratio
and fine-tuning the entire model based on the noise-reduced dataset to adapt it
to the target dataset. The proposed algorithm has been extensively tested and
demonstrates efficient yet improved denoising performance on various benchmarks
compared to previous methods.
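The two-step recipe described in the abstract (tune the linear probe first, then reduce the noise and fine-tune) can be sketched in a few lines of numpy. This is an illustrative reconstruction, not the paper's implementation: the random-projection "feature extractor", the small-loss selection rule, and all function names here are assumptions made for the sketch.

```python
import numpy as np

rng = np.random.default_rng(0)

def extract_features(x, w):
    """Stand-in for a frozen pre-trained feature extractor
    (here just a fixed random projection followed by a ReLU)."""
    return np.maximum(x @ w, 0.0)

def train_linear_probe(feats, labels, n_classes, lr=0.1, epochs=300):
    """Step 1: tune only the linear classifier on top of frozen features,
    so noisy labels cannot distort the feature extractor."""
    n, d = feats.shape
    w = np.zeros((d, n_classes))
    onehot = np.eye(n_classes)[labels]
    for _ in range(epochs):
        logits = feats @ w
        p = np.exp(logits - logits.max(axis=1, keepdims=True))
        p /= p.sum(axis=1, keepdims=True)
        w -= lr * feats.T @ (p - onehot) / n  # softmax cross-entropy gradient
    return w

def select_clean(feats, labels, w, keep_ratio=0.7):
    """Step 2 (selection part): small-loss filtering -- samples the probe
    fits easily are kept as probably clean; the noisiest fraction is
    dropped before fine-tuning the whole model on the reduced dataset."""
    logits = feats @ w
    p = np.exp(logits - logits.max(axis=1, keepdims=True))
    p /= p.sum(axis=1, keepdims=True)
    losses = -np.log(p[np.arange(len(labels)), labels] + 1e-12)
    return losses <= np.quantile(losses, keep_ratio)

# Toy demo: two Gaussian classes with 20% of the labels flipped.
n = 200
y_true = rng.integers(0, 2, n)
x = rng.normal(size=(n, 4))
x[:, 0] += np.where(y_true == 1, 2.0, -2.0)
flip = rng.choice(n, n // 5, replace=False)
y_noisy = y_true.copy()
y_noisy[flip] = 1 - y_noisy[flip]

w_feat = rng.normal(size=(4, 8))            # "pre-trained", kept frozen
feats = extract_features(x, w_feat)
probe = train_linear_probe(feats, y_noisy, n_classes=2)
clean_mask = select_clean(feats, y_noisy, probe)
noise_recall = (~clean_mask[flip]).mean()   # fraction of flipped labels dropped
```

In the full algorithm, step 2 would continue by fine-tuning the entire model (not just the probe) on the samples where `clean_mask` is True, adapting it to the target dataset.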
Related papers
- Learning with Noisy Foundation Models [95.50968225050012]
This paper is the first work to comprehensively understand and analyze the nature of noise in pre-training datasets.
We propose a tuning method (NMTune) to affine the feature space to mitigate the malignant effect of noise and improve generalization.
arXiv Detail & Related papers (2024-03-11T16:22:41Z)
- Stochastic Amortization: A Unified Approach to Accelerate Feature and Data Attribution [62.71425232332837]
We show that training amortized models with noisy labels is inexpensive and surprisingly effective.
This approach significantly accelerates several feature attribution and data valuation methods, often yielding an order of magnitude speedup over existing approaches.
arXiv Detail & Related papers (2024-01-29T03:42:37Z)
- Understanding and Mitigating the Label Noise in Pre-training on Downstream Tasks [91.15120211190519]
This paper aims to understand the nature of noise in pre-training datasets and to mitigate its impact on downstream tasks.
We propose a light-weight black-box tuning method (NMTune) to affine the feature space to mitigate the malignant effect of noise.
arXiv Detail & Related papers (2023-09-29T06:18:15Z)
- Learning with Noisy Labels through Learnable Weighting and Centroid Similarity [5.187216033152917]
Noisy labels are prevalent in domains such as medical diagnosis and autonomous driving.
We introduce a novel method for training machine learning models in the presence of noisy labels.
Our results show that our method consistently outperforms the existing state-of-the-art techniques.
arXiv Detail & Related papers (2023-03-16T16:43:24Z)
- Learning with Noisy Labels via Self-supervised Adversarial Noisy Masking [33.87292143223425]
We propose a novel training approach termed adversarial noisy masking.
It adaptively modulates the input data and labels simultaneously, preventing the model from overfitting noisy samples.
It is tested on both synthetic and real-world noisy datasets.
arXiv Detail & Related papers (2023-02-14T03:13:26Z)
- On-the-fly Denoising for Data Augmentation in Natural Language Understanding [101.46848743193358]
We propose an on-the-fly denoising technique for data augmentation that learns from soft augmented labels provided by an organic teacher model trained on the cleaner original data.
Our method can be applied to general augmentation techniques and consistently improve the performance on both text classification and question-answering tasks.
arXiv Detail & Related papers (2022-12-20T18:58:33Z)
- Neighborhood Collective Estimation for Noisy Label Identification and Correction [92.20697827784426]
Learning with noisy labels (LNL) aims at designing strategies to improve model performance and generalization by mitigating the effects of model overfitting to noisy labels.
Recent advances employ the predicted label distributions of individual samples to perform noise verification and noisy label correction, easily giving rise to confirmation bias.
We propose Neighborhood Collective Estimation, in which the predictive reliability of a candidate sample is re-estimated by contrasting it against its feature-space nearest neighbors.
arXiv Detail & Related papers (2022-08-05T14:47:22Z)
- Towards Harnessing Feature Embedding for Robust Learning with Noisy Labels [44.133307197696446]
The memorization effect of deep neural networks (DNNs) plays a pivotal role in recent label noise learning methods.
We propose a novel feature embedding-based method for deep learning with label noise, termed LabEl NoiseDilution (LEND).
arXiv Detail & Related papers (2022-06-27T02:45:09Z)
- Robust Meta-learning with Sampling Noise and Label Noise via Eigen-Reptile [78.1212767880785]
The meta-learner is prone to overfitting since only a few samples are available per task.
When handling data with noisy labels, the meta-learner can be extremely sensitive to label noise.
We present Eigen-Reptile (ER), which updates the meta-parameters along the main direction of historical task-specific parameters.
arXiv Detail & Related papers (2022-06-04T08:48:02Z)
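Among the entries above, Neighborhood Collective Estimation lends itself to a compact sketch: a sample's reliability is re-estimated by contrasting its predicted label distribution with those of its feature-space nearest neighbors. The brute-force kNN search, the cosine-agreement score, and the function name below are illustrative assumptions, not the paper's exact estimator.

```python
import numpy as np

def neighborhood_reliability(feats, probs, k=3):
    """Score each sample by how well its predicted label distribution
    agrees with the averaged distribution of its k nearest neighbors
    in feature space (cosine similarity of the two distributions)."""
    d2 = ((feats[:, None, :] - feats[None, :, :]) ** 2).sum(-1)
    np.fill_diagonal(d2, np.inf)              # exclude self from neighbors
    nbrs = np.argsort(d2, axis=1)[:, :k]
    consensus = probs[nbrs].mean(axis=1)      # neighborhood consensus
    num = (probs * consensus).sum(axis=1)
    den = np.linalg.norm(probs, axis=1) * np.linalg.norm(consensus, axis=1)
    return num / (den + 1e-12)

# Tight cluster in which one sample's prediction disagrees with its neighbors:
# that sample gets a markedly lower reliability score.
feats = np.array([[0.0, 0.0], [0.1, 0.0], [0.0, 0.1],
                  [0.1, 0.1], [0.05, 0.05], [0.02, 0.08]])
probs = np.array([[0.9, 0.1]] * 5 + [[0.1, 0.9]])
r = neighborhood_reliability(feats, probs)
```

Samples with low scores would then be flagged as likely noisy and have their labels corrected or down-weighted.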
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.