The Dynamic of Consensus in Deep Networks and the Identification of
Noisy Labels
- URL: http://arxiv.org/abs/2210.00583v1
- Date: Sun, 2 Oct 2022 17:47:23 GMT
- Title: The Dynamic of Consensus in Deep Networks and the Identification of
Noisy Labels
- Authors: Daniel Shwartz and Uri Stern and Daphna Weinshall
- Abstract summary: noisy examples cannot be distinguished from clean examples by the end of training.
Recent research has dealt with this challenge by utilizing the fact that deep networks seem to memorize clean examples much earlier than noisy examples.
We use this observation to develop a new method for noisy-label filtration.
- Score: 5.28539620288341
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Deep neural networks have incredible capacity and expressibility, and can
seemingly memorize any training set. This introduces a problem when training in
the presence of noisy labels, as the noisy examples cannot be distinguished
from clean examples by the end of training. Recent research has dealt with this
challenge by utilizing the fact that deep networks seem to memorize clean
examples much earlier than noisy examples. Here we report a new empirical
result: for each example, when looking at the time at which it is memorized by
each model in an ensemble of networks, the diversity seen among noisy examples is
much larger than among clean examples. We use this observation to develop a new
method for noisy-label filtration. The method is based on a statistic of the
data, which captures the differences in ensemble learning dynamics between
clean and noisy data. We test our method on three tasks: (i) noise amount
estimation; (ii) noise filtration; (iii) supervised classification. We show
that our method improves over existing baselines in all three tasks using a
variety of datasets, noise models, and noise levels. Aside from its improved
performance, our method has two other advantages. (i) Simplicity, which implies
that no additional hyperparameters are introduced. (ii) Modularity: our method
does not operate end-to-end, and can therefore be used to clean a dataset for
any future use.
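The core observation, that the memorization time of a noisy example varies much more across an ensemble than that of a clean example, can be sketched as follows. This is an illustrative reconstruction, not the paper's exact statistic: the array layout, the definition of memorization epoch, and the use of the standard deviation as a noisiness score are assumptions made for the sketch.

```python
import numpy as np

def memorization_epochs(correct):
    """correct: bool array (n_models, n_epochs, n_examples), where
    correct[m, e, i] is True if model m classifies example i correctly
    at epoch e. Returns an (n_models, n_examples) int array giving, for
    each model and example, the first epoch from which the prediction
    stays correct until the end of training (n_epochs if never stable)."""
    n_models, n_epochs, n_examples = correct.shape
    # stable[m, e, i] is True iff example i is correct for ALL epochs >= e:
    # a reversed cumulative product (AND) over the epoch axis.
    stable = np.flip(np.cumprod(np.flip(correct, axis=1), axis=1), axis=1).astype(bool)
    # First epoch of stable correctness; n_epochs if the example is
    # never stably memorized by that model.
    return np.where(stable.any(axis=1), stable.argmax(axis=1), n_epochs)

def noisy_score(correct):
    """Per-example diversity of memorization time across the ensemble.
    Higher scores suggest a noisy label."""
    return memorization_epochs(correct).std(axis=0)
```

A dataset could then be filtered by flagging examples whose score exceeds a chosen cutoff; the paper's actual statistic and decision rule may differ.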
Related papers
- Multiclass Learning from Noisy Labels for Non-decomposable Performance Measures [15.358504449550013]
We design algorithms to learn from noisy labels for two broad classes of non-decomposable performance measures.
In both cases, we develop noise-corrected versions of the algorithms under the widely studied class-conditional noise models.
Our experiments demonstrate the effectiveness of our algorithms in handling label noise.
arXiv Detail & Related papers (2024-02-01T23:03:53Z)
- Combating Label Noise With A General Surrogate Model For Sample Selection [84.61367781175984]
We propose to leverage the vision-language surrogate model CLIP to filter noisy samples automatically.
We validate the effectiveness of our proposed method on both real-world and synthetic noisy datasets.
arXiv Detail & Related papers (2023-10-16T14:43:27Z)
- Late Stopping: Avoiding Confidently Learning from Mislabeled Examples [61.00103151680946]
We propose a new framework, Late Stopping, which leverages the intrinsic robust learning ability of DNNs through a prolonged training process.
We empirically observe that mislabeled and clean examples differ in the number of epochs required for them to be consistently and correctly classified.
Experimental results on benchmark-simulated and real-world noisy datasets demonstrate that the proposed method outperforms state-of-the-art counterparts.
arXiv Detail & Related papers (2023-08-26T12:43:25Z)
- Co-Learning Meets Stitch-Up for Noisy Multi-label Visual Recognition [70.00984078351927]
This paper focuses on reducing noise based on some inherent properties of multi-label classification and long-tailed learning under noisy conditions.
We propose a Stitch-Up augmentation to synthesize cleaner samples, which directly reduces multi-label noise.
A Heterogeneous Co-Learning framework is further designed to leverage the inconsistency between long-tailed and balanced distributions.
arXiv Detail & Related papers (2023-07-03T09:20:28Z)
- MILD: Modeling the Instance Learning Dynamics for Learning with Noisy Labels [19.650299232829546]
We propose an iterative selection approach based on the Weibull mixture model to identify clean data.
In particular, we measure the memorization difficulty of each instance via the transition times between being misclassified and being memorized.
Our strategy outperforms existing noisy-label learning methods.
arXiv Detail & Related papers (2023-06-20T14:26:53Z)
- Identifying Hard Noise in Long-Tailed Sample Distribution [76.16113794808001]
We introduce Noisy Long-Tailed Classification (NLT).
Most de-noising methods fail to identify the hard noises.
We design an iterative noisy learning framework called Hard-to-Easy (H2E).
arXiv Detail & Related papers (2022-07-27T09:03:03Z)
- Robust Training under Label Noise by Over-parameterization [41.03008228953627]
We propose a principled approach for robust training of over-parameterized deep networks in classification tasks where a proportion of training labels are corrupted.
The main idea is simple: label noise is sparse and incoherent with the network learned from clean data, so we model the noise and learn to separate it from the data.
Remarkably, when trained with this simple method in practice, we demonstrate state-of-the-art test accuracy under label noise on a variety of real datasets.
arXiv Detail & Related papers (2022-02-28T18:50:10Z)
- Learning with Neighbor Consistency for Noisy Labels [69.83857578836769]
We present a method for learning from noisy labels that leverages similarities between training examples in feature space.
We evaluate our method on datasets with both synthetic (CIFAR-10, CIFAR-100) and realistic (mini-WebVision, Clothing1M, mini-ImageNet-Red) noise.
arXiv Detail & Related papers (2022-02-04T15:46:27Z)
- Exponentiated Gradient Reweighting for Robust Training Under Label Noise and Beyond [21.594200327544968]
We present a flexible approach to learning from noisy examples.
Specifically, we treat each training example as an expert and maintain a distribution over all examples.
Unlike other related methods, our approach handles a general class of loss functions and can be applied to a wide range of noise types and applications.
arXiv Detail & Related papers (2021-04-03T22:54:49Z)
- Noisy Labels Can Induce Good Representations [53.47668632785373]
We study how architecture affects learning with noisy labels.
We show that training with noisy labels can induce useful hidden representations, even when the model generalizes poorly.
This finding leads to a simple method for improving models trained on noisy labels.
arXiv Detail & Related papers (2020-12-23T18:58:05Z)
- EvidentialMix: Learning with Combined Open-set and Closed-set Noisy Labels [30.268962418683955]
We study a new variant of the noisy-label problem that combines open-set and closed-set noisy labels.
Our results show that our method produces superior classification results and better feature representations than previous state-of-the-art methods.
arXiv Detail & Related papers (2020-11-11T11:15:32Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information above and is not responsible for any consequences of its use.