How Does Heterogeneous Label Noise Impact Generalization in Neural Nets?
- URL: http://arxiv.org/abs/2106.15475v1
- Date: Tue, 29 Jun 2021 14:58:46 GMT
- Title: How Does Heterogeneous Label Noise Impact Generalization in Neural Nets?
- Authors: Bidur Khanal and Christopher Kanan
- Abstract summary: Incorrectly labeled examples, or label noise, is common in real-world computer vision datasets.
In the real-world, label noise is often heterogeneous, with some categories being affected to a greater extent than others.
- Score: 29.326472933292603
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Incorrectly labeled examples, or label noise, is common in real-world
computer vision datasets. While the impact of label noise on learning in deep
neural networks has been studied in prior work, these studies have exclusively
focused on homogeneous label noise, i.e., the degree of label noise is the same
across all categories. However, in the real-world, label noise is often
heterogeneous, with some categories being affected to a greater extent than
others. Here, we address this gap in the literature. We hypothesized that
heterogeneous label noise would only affect the classes that had label noise
unless there was transfer from those classes to the classes without label
noise. To test this hypothesis, we designed a series of computer vision studies
using MNIST, CIFAR-10, CIFAR-100, and MS-COCO where we imposed heterogeneous
label noise during the training of multi-class, multi-task, and multi-label
systems. Our results provide evidence in support of our hypothesis: label noise
only affects the class affected by it unless there is transfer.
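The heterogeneous-noise setup described in the abstract can be sketched as follows. This is a minimal illustration, not the authors' exact protocol: the function name and the uniform-flip choice are assumptions, but the key property (a per-class noise rate, so some categories are corrupted more than others) matches the paper's setting.

```python
import numpy as np

def inject_heterogeneous_noise(labels, num_classes, flip_rates, seed=None):
    """Corrupt labels with per-class noise rates: flip_rates[c] is the
    probability that a sample whose true class is c receives a wrong
    label, drawn uniformly from the other classes. Setting some rates
    to zero leaves those categories noise-free (heterogeneous noise)."""
    rng = np.random.default_rng(seed)
    noisy = np.asarray(labels).copy()
    for idx, y in enumerate(noisy):
        if rng.random() < flip_rates[y]:
            wrong = rng.integers(num_classes - 1)  # uniform over the other classes
            noisy[idx] = wrong + (wrong >= y)      # shift past the true class
    return noisy
```

Setting `flip_rates = [0.4, 0.0, 0.0, ...]` corrupts only class 0, which is the kind of asymmetric corruption the hypothesis above is about: any damage to the clean classes must then come from transfer.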
Related papers
- AlleNoise: large-scale text classification benchmark dataset with real-world label noise [40.11095094521714]
We present AlleNoise, a new curated text classification benchmark dataset with real-world instance-dependent label noise.
The noise distribution comes from actual users of a major e-commerce marketplace, so it realistically reflects the semantics of human mistakes.
We demonstrate that a representative selection of established methods for learning with noisy labels is inadequate to handle such real-world noise.
arXiv Detail & Related papers (2024-06-24T09:29:14Z)
- Handling Realistic Label Noise in BERT Text Classification [1.0515439489916731]
Real label noise is not random; rather, it is often correlated with input features or other annotator-specific factors.
We show that the presence of these types of noise significantly degrades BERT classification performance.
arXiv Detail & Related papers (2023-05-23T18:30:31Z)
- Robustness to Label Noise Depends on the Shape of the Noise Distribution in Feature Space [6.748225062396441]
We show that both the scale and the shape of the noise distribution influence the posterior likelihood.
We show that when the noise distribution targets decision boundaries, classification robustness can drop off even at a small scale of noise.
arXiv Detail & Related papers (2022-06-02T15:41:59Z)
- Learning with Noisy Labels Revisited: A Study Using Real-World Human Annotations [54.400167806154535]
Existing research on learning with noisy labels mainly focuses on synthetic label noise.
This work presents two new benchmark datasets (CIFAR-10N, CIFAR-100N).
We show that real-world noisy labels follow an instance-dependent pattern rather than the classically adopted class-dependent ones.
arXiv Detail & Related papers (2021-10-22T22:42:11Z)
- Instance-dependent Label-noise Learning under a Structural Causal Model [92.76400590283448]
Label noise degrades the performance of deep learning algorithms.
By leveraging a structural causal model, we propose a novel generative approach for instance-dependent label-noise learning.
arXiv Detail & Related papers (2021-09-07T10:42:54Z)
- Learning with Feature-Dependent Label Noise: A Progressive Approach [19.425199841491246]
We propose a new family of feature-dependent label noise, which is much more general than commonly used i.i.d. label noise.
We provide theoretical guarantees showing that for a wide variety of (unknown) noise patterns, a classifier trained with this strategy converges to be consistent with the Bayes classifier.
arXiv Detail & Related papers (2021-03-13T17:34:22Z)
- A Second-Order Approach to Learning with Instance-Dependent Label Noise [58.555527517928596]
The presence of label noise often misleads the training of deep neural networks.
We show that the errors in human-annotated labels are more likely to be dependent on the difficulty levels of tasks.
arXiv Detail & Related papers (2020-12-22T06:36:58Z)
- Part-dependent Label Noise: Towards Instance-dependent Label Noise [194.73829226122731]
Learning with instance-dependent label noise is challenging, because it is hard to model such real-world noise.
In this paper, we approximate instance-dependent label noise by exploiting part-dependent label noise.
Empirical evaluations on synthetic and real-world datasets demonstrate our method is superior to the state-of-the-art approaches.
arXiv Detail & Related papers (2020-06-14T08:12:10Z)
- Class2Simi: A Noise Reduction Perspective on Learning with Noisy Labels [98.13491369929798]
We propose a framework called Class2Simi, which transforms data points with noisy class labels to data pairs with noisy similarity labels.
Class2Simi is computationally efficient: the transformation is performed on the fly within mini-batches, and it only changes the loss on top of the model predictions into a pairwise form.
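The class-to-similarity transformation can be illustrated with a small sketch. This is a hedged reconstruction from the summary above: the function name is ours, and the pairing scheme (all unordered pairs within a mini-batch, similarity 1 when the noisy labels agree) is the straightforward within-batch choice.

```python
import numpy as np

def class_to_simi(labels):
    """Turn noisy class labels in a mini-batch into noisy similarity
    labels over all unordered pairs: 1 if the two labels agree, else 0."""
    labels = np.asarray(labels)
    n = len(labels)
    pairs, simi = [], []
    for i in range(n):
        for j in range(i + 1, n):
            pairs.append((i, j))
            simi.append(int(labels[i] == labels[j]))
    return pairs, np.array(simi)
```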
arXiv Detail & Related papers (2020-06-14T07:55:32Z)
- Multi-Class Classification from Noisy-Similarity-Labeled Data [98.13491369929798]
We propose a method for learning from only noisy-similarity-labeled data.
We use a noise transition matrix to bridge the class-posterior probability between clean and noisy data.
We build a novel learning system which can assign noise-free class labels for instances.
arXiv Detail & Related papers (2020-02-16T05:10:21Z)
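The noise-transition-matrix idea above, bridging the class-posterior probability between clean and noisy data, is commonly realized as a forward loss correction. A minimal sketch (an assumption-laden illustration, not this paper's exact method), taking T as known with `T[i, j] = P(noisy label = j | clean label = i)`:

```python
import numpy as np

def forward_corrected_nll(clean_probs, noisy_label, T):
    """Forward correction: map the model's clean-class posterior through
    the noise transition matrix T to get the predicted noisy-label
    distribution, then take the negative log-likelihood of the observed
    noisy label. With T = I this reduces to the standard NLL."""
    noisy_probs = clean_probs @ T  # P(noisy = j) = sum_i P(clean = i) * T[i, j]
    return -np.log(noisy_probs[noisy_label])
```

Minimizing this corrected loss on noisy data encourages `clean_probs` to approximate the clean-class posterior, which is the sense in which the matrix "bridges" the two distributions.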
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences arising from its use.