Tripartite: Tackle Noisy Labels by a More Precise Partition
- URL: http://arxiv.org/abs/2202.09579v1
- Date: Sat, 19 Feb 2022 11:15:02 GMT
- Title: Tripartite: Tackle Noisy Labels by a More Precise Partition
- Authors: Xuefeng Liang, Longshan Yao, Xingyu Liu, Ying Zhou
- Abstract summary: We propose a Tripartite solution to partition training data more precisely into three subsets: hard, noisy, and clean.
To minimize the harm of noisy labels but maximize the value of noisy label data, we apply a low-weight learning on hard data and a self-supervised learning on noisy label data without using the given labels.
- Score: 21.582850128741022
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Samples in large-scale datasets may be mislabeled due to various reasons, and
Deep Neural Networks can easily over-fit to the noisy label data. To tackle
this problem, the key point is to alleviate the harm of these noisy labels.
Many existing methods divide training data into clean and noisy subsets
according to loss values, and then process the noisy label data in different
ways. One reason hindering better performance is hard samples. As hard samples
always have relatively large losses whether their labels are clean or noisy,
these methods could not divide them precisely. Instead, we propose a Tripartite
solution to partition training data more precisely into three subsets: hard,
noisy, and clean. The partition criteria are based on the inconsistent
predictions of two networks, and the inconsistency between the prediction of a
network and the given label. To minimize the harm of noisy labels but maximize
the value of noisy label data, we apply a low-weight learning on hard data and
a self-supervised learning on noisy label data without using the given labels.
Extensive experiments demonstrate that Tripartite can filter out noisy label
data more precisely, and outperforms most state-of-the-art methods on five
benchmark datasets, especially on real-world datasets.
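The two partition criteria described in the abstract can be sketched in a few lines. The following is a minimal illustration only (the probability vectors, the decision order, and the per-sample formulation are assumptions for exposition, not the authors' implementation):

```python
def argmax(probs):
    """Index of the largest probability."""
    return max(range(len(probs)), key=probs.__getitem__)

def tripartite_partition(pred_a, pred_b, given_label):
    """Assign one sample to the hard / noisy / clean subset.

    pred_a, pred_b: class-probability vectors from the two networks.
    given_label: the (possibly noisy) annotated class index.
    Per the abstract: disagreement between the two networks marks a
    hard sample; agreement that contradicts the given label marks a
    noisy sample; otherwise the sample is clean.
    """
    ya, yb = argmax(pred_a), argmax(pred_b)
    if ya != yb:              # the two networks disagree -> hard
        return "hard"
    if ya != given_label:     # networks agree, but not with the label -> noisy
        return "noisy"
    return "clean"            # networks agree with the given label

# Toy usage with three classes:
print(tripartite_partition([0.7, 0.2, 0.1], [0.1, 0.8, 0.1], 0))  # hard
print(tripartite_partition([0.7, 0.2, 0.1], [0.6, 0.3, 0.1], 2))  # noisy
print(tripartite_partition([0.7, 0.2, 0.1], [0.6, 0.3, 0.1], 0))  # clean
```

The hard subset then receives low-weight learning, and the noisy subset is trained with self-supervision that ignores the given labels, as the abstract states.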
Related papers
- Extracting Clean and Balanced Subset for Noisy Long-tailed Classification [66.47809135771698]
We develop a novel pseudo labeling method using class prototypes from the perspective of distribution matching.
By setting a manually-specified probability measure, we can reduce the side-effects of noisy and long-tailed data simultaneously.
Our method can extract this class-balanced subset with clean labels, which brings effective performance gains for long-tailed classification with label noise.
arXiv Detail & Related papers (2024-04-10T07:34:37Z)
- FedNoisy: Federated Noisy Label Learning Benchmark [53.73816587601204]
Federated learning has gained popularity for distributed learning without aggregating sensitive data from clients.
The distributed and isolated nature of data in federated learning can be
complicated by data-quality issues, making it more vulnerable to noisy labels.
We present the first standardized benchmark to help researchers fully explore
potential federated noisy settings.
arXiv Detail & Related papers (2023-06-20T16:18:14Z)
- PARS: Pseudo-Label Aware Robust Sample Selection for Learning with Noisy Labels [5.758073912084364]
We propose PARS: Pseudo-Label Aware Robust Sample Selection.
PARS exploits all training samples using both the raw/noisy labels and estimated/refurbished pseudo-labels via self-training.
Results show that PARS significantly outperforms the state of the art in extensive studies on the noisy CIFAR-10 and CIFAR-100 datasets.
arXiv Detail & Related papers (2022-01-26T09:31:55Z)
- An Ensemble Noise-Robust K-fold Cross-Validation Selection Method for Noisy Labels [0.9699640804685629]
Large-scale datasets tend to contain mislabeled samples that can be memorized by deep neural networks (DNNs)
We present Ensemble Noise-robust K-fold Cross-Validation Selection (E-NKCVS) to effectively select clean samples from noisy data.
We evaluate our approach on various image and text classification tasks where the labels have been manually corrupted with different noise ratios.
arXiv Detail & Related papers (2021-07-06T02:14:52Z)
- INN: A Method Identifying Clean-annotated Samples via Consistency Effect in Deep Neural Networks [1.1470070927586016]
We introduce a new method called INN to refine clean labeled data from training data with noisy labels.
The INN method requires more computation but is much more stable and powerful than the small-loss strategy.
arXiv Detail & Related papers (2021-06-29T09:06:21Z)
- Instance Correction for Learning with Open-set Noisy Labels [145.06552420999986]
We use the sample selection approach to handle open-set noisy labels.
The discarded data are regarded as mislabeled and do not participate in training.
We modify the instances of the discarded data so that the predictions on them
become consistent with the given labels.
arXiv Detail & Related papers (2021-06-01T13:05:55Z)
- Noisy Labels Can Induce Good Representations [53.47668632785373]
We study how architecture affects learning with noisy labels.
We show that training with noisy labels can induce useful hidden representations, even when the model generalizes poorly.
This finding leads to a simple method to improve models trained on noisy labels.
arXiv Detail & Related papers (2020-12-23T18:58:05Z)
- Error-Bounded Correction of Noisy Labels [17.510654621245656]
We show that the prediction of a noisy classifier can indeed be a good indicator of whether the label of a training data is clean.
Based on the theoretical result, we propose a novel algorithm that corrects the labels based on the noisy classifier prediction.
We incorporate our label correction algorithm into the training of deep neural networks and train models that achieve superior testing performance on multiple public datasets.
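A correction rule of this flavor can be sketched as follows. This is a toy illustration of confidence-based relabeling, not the paper's actual error-bounded criterion; the threshold value is an assumed hyper-parameter:

```python
def correct_labels(probs, labels, threshold=0.9):
    """Replace a given label with the classifier's prediction when the
    classifier confidently disagrees with it.

    probs: list of class-probability vectors from the (noisy) classifier.
    labels: list of given (possibly noisy) class indices.
    threshold: assumed confidence cutoff, not taken from the paper.
    """
    corrected = []
    for p, y in zip(probs, labels):
        pred = max(range(len(p)), key=p.__getitem__)
        if pred != y and p[pred] >= threshold:
            corrected.append(pred)   # trust the confident prediction
        else:
            corrected.append(y)      # keep the original label
    return corrected

probs = [[0.95, 0.03, 0.02],   # confident disagreement -> relabel to 0
         [0.50, 0.30, 0.20],   # not confident -> keep the label
         [0.05, 0.90, 0.05]]   # agrees with the label -> keep it
labels = [2, 1, 1]
print(correct_labels(probs, labels))  # [0, 1, 1]
```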
arXiv Detail & Related papers (2020-11-19T19:23:23Z)
- Class2Simi: A Noise Reduction Perspective on Learning with Noisy Labels [98.13491369929798]
We propose a framework called Class2Simi, which transforms data points with noisy class labels to data pairs with noisy similarity labels.
Class2Simi is computationally efficient because the transformation is performed on-the-fly within mini-batches, and it only changes the loss on top of the model prediction into a pairwise form.
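The class-to-similarity transformation itself is simple and can be illustrated as follows (a toy sketch over a full label list; per the summary, the paper applies it on-the-fly within mini-batches):

```python
from itertools import combinations

def class2simi(labels):
    """Transform per-sample class labels into pairwise similarity labels:
    a pair gets label 1 ('similar') when the two class labels match,
    else 0. If the class labels are noisy, the similarity labels inherit
    the noise, but in a reduced, binary form."""
    return {(i, j): int(labels[i] == labels[j])
            for i, j in combinations(range(len(labels)), 2)}

# Three samples with (possibly noisy) class labels 0, 1, 0:
print(class2simi([0, 1, 0]))  # {(0, 1): 0, (0, 2): 1, (1, 2): 0}
```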
arXiv Detail & Related papers (2020-06-14T07:55:32Z)
- Label Noise Types and Their Effects on Deep Learning [0.0]
In this work, we provide a detailed analysis of the effects of different kinds of label noise on learning.
We propose a generic framework to generate feature-dependent label noise, which we show to be the most challenging case for learning.
To make it easy for other researchers to test their algorithms with noisy labels, we share corrupted labels for the most commonly used benchmark datasets.
arXiv Detail & Related papers (2020-03-23T18:03:39Z)
- Multi-Class Classification from Noisy-Similarity-Labeled Data [98.13491369929798]
We propose a method for learning from only noisy-similarity-labeled data.
We use a noise transition matrix to bridge the class-posterior probability between clean and noisy data.
We build a novel learning system which can assign noise-free class labels for instances.
arXiv Detail & Related papers (2020-02-16T05:10:21Z)
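How a noise transition matrix bridges the clean and noisy class posteriors can be sketched as follows (the matrix values below are made-up toy numbers; estimating T from data is the hard part and is not shown):

```python
def noisy_posterior(clean_post, T):
    """Map a clean class-posterior to the noisy one via a transition
    matrix T, where T[i][j] = p(noisy label = j | clean label = i):

        p(noisy = j | x) = sum_i p(clean = i | x) * T[i][j]
    """
    num_classes = len(T[0])
    return [sum(clean_post[i] * T[i][j] for i in range(len(T)))
            for j in range(num_classes)]

# Toy example: 20% symmetric noise over two classes (assumed numbers).
T = [[0.8, 0.2],
     [0.2, 0.8]]
print(noisy_posterior([1.0, 0.0], T))  # [0.8, 0.2]
```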
This list is automatically generated from the titles and abstracts of the papers in this site.