Learning with Instance-Dependent Label Noise: A Sample Sieve Approach
- URL: http://arxiv.org/abs/2010.02347v2
- Date: Mon, 22 Mar 2021 22:01:05 GMT
- Title: Learning with Instance-Dependent Label Noise: A Sample Sieve Approach
- Authors: Hao Cheng, Zhaowei Zhu, Xingyu Li, Yifei Gong, Xing Sun, Yang Liu
- Abstract summary: Human-annotated labels are often prone to noise.
The presence of such noise will degrade the performance of the resulting deep neural network (DNN) models.
We propose CORES$^{2}$, which progressively sieves out corrupted examples.
- Score: 24.143469284851456
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Human-annotated labels are often prone to noise, and the presence of such
noise will degrade the performance of the resulting deep neural network (DNN)
models. Much of the literature (with several recent exceptions) on learning
with noisy labels focuses on the case where the label noise is independent of
features. In practice, annotation errors tend to be instance-dependent and
often depend on the difficulty of recognizing a certain task. Applying
existing results from instance-independent settings would require a significant
amount of estimation of noise rates. Therefore, providing theoretically
rigorous solutions for learning with instance-dependent label noise remains a
challenge. In this paper, we propose CORES$^{2}$ (COnfidence REgularized Sample
Sieve), which progressively sieves out corrupted examples. The implementation
of CORES$^{2}$ does not require specifying noise rates and yet we are able to
provide theoretical guarantees of CORES$^{2}$ in filtering out the corrupted
examples. This high-quality sample sieve allows us to treat clean examples and
the corrupted ones separately in training a DNN solution, and such a separation
is shown to be advantageous in the instance-dependent noise setting. We
demonstrate the performance of CORES$^{2}$ on CIFAR10 and CIFAR100 datasets
with synthetic instance-dependent label noise and Clothing1M with real-world
human noise. Of independent interest, our sample sieve provides a generic
machinery for anatomizing noisy datasets and a flexible interface for
various robust training techniques to further improve performance. Code is
available at https://github.com/UCSC-REAL/cores.
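For illustration only, below is a minimal PyTorch-style sketch of the two ideas the abstract describes: a confidence-regularized loss (cross-entropy minus a term that penalizes over-confident fitting of noisy labels) and a per-sample sieve that keeps examples whose regularized loss falls below a threshold. The function names, the batch-level noisy-label prior, the fixed beta, and the averaged-over-classes threshold are assumptions made for this sketch; the exact regularizer, the threshold alpha_n, and the schedule for increasing beta are defined in the paper and the linked repository.

```python
import torch
import torch.nn.functional as F

def confidence_regularized_loss(logits, noisy_labels, beta=2.0):
    """Cross-entropy minus beta times the expected cross-entropy under a
    label prior. The prior here is the empirical distribution of the noisy
    labels in the batch, and beta=2.0 is an illustrative default."""
    ce = F.cross_entropy(logits, noisy_labels, reduction="none")   # l(f(x_n), y~_n)
    log_probs = F.log_softmax(logits, dim=1)
    prior = torch.bincount(noisy_labels, minlength=logits.size(1)).float()
    prior = prior / prior.sum()
    expected_ce = -(log_probs * prior.unsqueeze(0)).sum(dim=1)     # E_{Y~prior}[l(f(x_n), Y)]
    return ce - beta * expected_ce

def sieve_batch(logits, noisy_labels, beta=2.0):
    """Flag an example as clean when its regularized loss falls below a
    per-sample threshold, taken here as the average regularized loss over
    all candidate classes (a simplification of the paper's alpha_n)."""
    loss = confidence_regularized_loss(logits, noisy_labels, beta)
    log_probs = F.log_softmax(logits, dim=1)
    prior = torch.bincount(noisy_labels, minlength=logits.size(1)).float()
    prior = prior / prior.sum()
    expected_ce = -(log_probs * prior.unsqueeze(0)).sum(dim=1)
    all_ce = -log_probs                                            # l(f(x_n), j) for every class j
    alpha = (all_ce - beta * expected_ce.unsqueeze(1)).mean(dim=1)
    return loss < alpha                                            # boolean clean mask
```

In the paper's training pipeline, examples flagged as clean keep the usual supervised loss while the sieved-out ones can be treated separately (for example with an unsupervised consistency term), and the regularization strength is increased gradually rather than held fixed as in this sketch.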
Related papers
- Binary Classification with Instance and Label Dependent Label Noise [4.061135251278187]
We show that learning solely with noisy samples is impossible without access to clean samples or strong assumptions on the distribution of the data.
arXiv Detail & Related papers (2023-06-06T04:47:44Z) - Learning from Noisy Labels with Coarse-to-Fine Sample Credibility Modeling [22.62790706276081]
Training deep neural network (DNN) with noisy labels is practically challenging.
Previous efforts tend to handle either part of or the full dataset in a unified denoising flow.
We propose a coarse-to-fine robust learning method called CREMA to handle noisy data in a divide-and-conquer manner.
arXiv Detail & Related papers (2022-08-23T02:06:38Z) - Label-Noise Learning with Intrinsically Long-Tailed Data [65.41318436799993]
We propose a learning framework for label-noise learning with intrinsically long-tailed data.
Specifically, we propose two-stage bi-dimensional sample selection (TABASCO) to better separate clean samples from noisy samples.
arXiv Detail & Related papers (2022-08-21T07:47:05Z) - Identifying Hard Noise in Long-Tailed Sample Distribution [76.16113794808001]
We introduce Noisy Long-Tailed Classification (NLT).
Most de-noising methods fail to identify the hard noises.
We design an iterative noisy learning framework called Hard-to-Easy (H2E).
arXiv Detail & Related papers (2022-07-27T09:03:03Z) - UNICON: Combating Label Noise Through Uniform Selection and Contrastive Learning [89.56465237941013]
We propose UNICON, a simple yet effective sample selection method which is robust to high label noise.
We obtain an 11.4% improvement over the current state-of-the-art on the CIFAR100 dataset with a 90% noise rate.
arXiv Detail & Related papers (2022-03-28T07:36:36Z) - Open-set Label Noise Can Improve Robustness Against Inherent Label Noise [27.885927200376386]
We show that open-set noisy labels can be non-toxic and even benefit the robustness against inherent noisy labels.
We propose a simple yet effective regularization by introducing Open-set samples with Dynamic Noisy Labels (ODNL) into training.
arXiv Detail & Related papers (2021-06-21T07:15:50Z) - Tackling Instance-Dependent Label Noise via a Universal Probabilistic Model [80.91927573604438]
This paper proposes a simple yet universal probabilistic model, which explicitly relates noisy labels to their instances.
Experiments on datasets with both synthetic and real-world label noise verify that the proposed method yields significant improvements on robustness.
arXiv Detail & Related papers (2021-01-14T05:43:51Z) - A Second-Order Approach to Learning with Instance-Dependent Label Noise [58.555527517928596]
The presence of label noise often misleads the training of deep neural networks.
We show that the errors in human-annotated labels are more likely to be dependent on the difficulty levels of tasks.
arXiv Detail & Related papers (2020-12-22T06:36:58Z) - Confidence Scores Make Instance-dependent Label-noise Learning Possible [129.84497190791103]
In learning with noisy labels, for every instance, its label can randomly walk to other classes following a transition distribution, which is called the noise model.
We introduce confidence-scored instance-dependent noise (CSIDN), where each instance-label pair is equipped with a confidence score.
We find with the help of confidence scores, the transition distribution of each instance can be approximately estimated.
arXiv Detail & Related papers (2020-01-11T16:15:41Z)
This list is automatically generated from the titles and abstracts of the papers in this site.