Learning to Detect Noisy Labels Using Model-Based Features
- URL: http://arxiv.org/abs/2212.13767v1
- Date: Wed, 28 Dec 2022 10:12:13 GMT
- Title: Learning to Detect Noisy Labels Using Model-Based Features
- Authors: Zhihao Wang, Zongyu Lin, Peiqi Liu, Guidong ZHeng, Junjie Wen, Xianxin
Chen, Yujun Chen, Zhilin Yang
- Abstract summary: We propose Selection-Enhanced Noisy label Training (SENT)
SENT does not rely on meta learning while having the flexibility of being data-driven.
It improves performance over strong baselines under the settings of self-training and label corruption.
- Score: 16.681748918518075
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Label noise is ubiquitous in various machine learning scenarios such as
self-labeling with model predictions and erroneous data annotation. Many
existing approaches are based on heuristics such as sample losses, which might
not be flexible enough to achieve optimal solutions. Meta learning based
methods address this issue by learning a data selection function, but can be
hard to optimize. In light of these pros and cons, we propose
Selection-Enhanced Noisy label Training (SENT) that does not rely on meta
learning while having the flexibility of being data-driven. SENT transfers the
noise distribution to a clean set and trains a model to distinguish noisy
labels from clean ones using model-based features. Empirically, on a wide range
of tasks including text classification and speech recognition, SENT improves
performance over strong baselines under the settings of self-training and label
corruption.
Related papers
- Vision-Language Models are Strong Noisy Label Detectors [76.07846780815794]
This paper presents a Denoising Fine-Tuning framework, called DeFT, for adapting vision-language models.
DeFT utilizes the robust alignment of textual and visual features pre-trained on millions of auxiliary image-text pairs to sieve out noisy labels.
Experimental results on seven synthetic and real-world noisy datasets validate the effectiveness of DeFT in both noisy label detection and image classification.
arXiv Detail & Related papers (2024-09-29T12:55:17Z) - Foster Adaptivity and Balance in Learning with Noisy Labels [26.309508654960354]
We propose a novel approach named textbfSED to deal with label noise in a textbfSelf-adaptivtextbfE and class-balancetextbfD manner.
A mean-teacher model is then employed to correct labels of noisy samples.
We additionally propose a self-adaptive and class-balanced sample re-weighting mechanism to assign different weights to detected noisy samples.
arXiv Detail & Related papers (2024-07-03T03:10:24Z) - Combating Label Noise With A General Surrogate Model For Sample
Selection [84.61367781175984]
We propose to leverage the vision-language surrogate model CLIP to filter noisy samples automatically.
We validate the effectiveness of our proposed method on both real-world and synthetic noisy datasets.
arXiv Detail & Related papers (2023-10-16T14:43:27Z) - Label-Retrieval-Augmented Diffusion Models for Learning from Noisy
Labels [61.97359362447732]
Learning from noisy labels is an important and long-standing problem in machine learning for real applications.
In this paper, we reformulate the label-noise problem from a generative-model perspective.
Our model achieves new state-of-the-art (SOTA) results on all the standard real-world benchmark datasets.
arXiv Detail & Related papers (2023-05-31T03:01:36Z) - Towards Harnessing Feature Embedding for Robust Learning with Noisy
Labels [44.133307197696446]
The memorization effect of deep neural networks (DNNs) plays a pivotal role in recent label noise learning methods.
We propose a novel feature embedding-based method for deep learning with label noise, termed LabEl NoiseDilution (LEND)
arXiv Detail & Related papers (2022-06-27T02:45:09Z) - UNICON: Combating Label Noise Through Uniform Selection and Contrastive
Learning [89.56465237941013]
We propose UNICON, a simple yet effective sample selection method which is robust to high label noise.
We obtain an 11.4% improvement over the current state-of-the-art on CIFAR100 dataset with a 90% noise rate.
arXiv Detail & Related papers (2022-03-28T07:36:36Z) - Dash: Semi-Supervised Learning with Dynamic Thresholding [72.74339790209531]
We propose a semi-supervised learning (SSL) approach that uses unlabeled examples to train models.
Our proposed approach, Dash, enjoys its adaptivity in terms of unlabeled data selection.
arXiv Detail & Related papers (2021-09-01T23:52:29Z) - Learning from Noisy Labels for Entity-Centric Information Extraction [17.50856935207308]
We propose a simple co-regularization framework for entity-centric information extraction.
These models are jointly optimized with task-specific loss, and are regularized to generate similar predictions.
In the end, we can take any of the trained models for inference.
arXiv Detail & Related papers (2021-04-17T22:49:12Z) - Tackling Instance-Dependent Label Noise via a Universal Probabilistic
Model [80.91927573604438]
This paper proposes a simple yet universal probabilistic model, which explicitly relates noisy labels to their instances.
Experiments on datasets with both synthetic and real-world label noise verify that the proposed method yields significant improvements on robustness.
arXiv Detail & Related papers (2021-01-14T05:43:51Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.