How Does Heterogeneous Label Noise Impact Generalization in Neural Nets?
- URL: http://arxiv.org/abs/2106.15475v1
- Date: Tue, 29 Jun 2021 14:58:46 GMT
- Title: How Does Heterogeneous Label Noise Impact Generalization in Neural Nets?
- Authors: Bidur Khanal and Christopher Kanan
- Abstract summary: Incorrectly labeled examples, or label noise, is common in real-world computer vision datasets.
In the real-world, label noise is often heterogeneous, with some categories being affected to a greater extent than others.
- Score: 29.326472933292603
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Incorrectly labeled examples, or label noise, is common in real-world
computer vision datasets. While the impact of label noise on learning in deep
neural networks has been studied in prior work, these studies have exclusively
focused on homogeneous label noise, i.e., the degree of label noise is the same
across all categories. However, in the real-world, label noise is often
heterogeneous, with some categories being affected to a greater extent than
others. Here, we address this gap in the literature. We hypothesized that
heterogeneous label noise would only affect the classes that had label noise
unless there was transfer from those classes to the classes without label
noise. To test this hypothesis, we designed a series of computer vision studies
using MNIST, CIFAR-10, CIFAR-100, and MS-COCO where we imposed heterogeneous
label noise during the training of multi-class, multi-task, and multi-label
systems. Our results provide evidence in support of our hypothesis: label noise
only affects the class affected by it unless there is transfer.
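The heterogeneous-noise setup described in the abstract can be sketched as follows. This is a minimal illustration, not the authors' exact protocol: the function name and the uniform-flip choice are assumptions, but the key property (a per-class noise rate, so some categories are corrupted more than others) matches the paper's setting.

```python
import numpy as np

def inject_heterogeneous_noise(labels, num_classes, flip_rates, seed=None):
    """Corrupt labels with per-class noise rates: flip_rates[c] is the
    probability that a sample whose true class is c receives a wrong
    label, drawn uniformly from the other classes. Setting some rates
    to zero leaves those categories noise-free (heterogeneous noise)."""
    rng = np.random.default_rng(seed)
    noisy = np.asarray(labels).copy()
    for idx, y in enumerate(noisy):
        if rng.random() < flip_rates[y]:
            wrong = rng.integers(num_classes - 1)  # uniform over the other classes
            noisy[idx] = wrong + (wrong >= y)      # shift past the true class
    return noisy
```

Setting `flip_rates = [0.4, 0.0, 0.0, ...]` corrupts only class 0, which is the kind of asymmetric corruption the hypothesis above is about: any damage to the clean classes must then come from transfer.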
Related papers
- AlleNoise: large-scale text classification benchmark dataset with real-world label noise [40.11095094521714]
We present AlleNoise, a new curated text classification benchmark dataset with real-world instance-dependent label noise.
The noise distribution comes from actual users of a major e-commerce marketplace, so it realistically reflects the semantics of human mistakes.
We demonstrate that a representative selection of established methods for learning with noisy labels is inadequate to handle such real-world noise.
arXiv Detail & Related papers (2024-06-24T09:29:14Z)
- Handling Realistic Label Noise in BERT Text Classification [1.0515439489916731]
Real label noise is not random; rather, it is often correlated with input features or other annotator-specific factors.
We show that the presence of these types of noise significantly degrades BERT classification performance.
arXiv Detail & Related papers (2023-05-23T18:30:31Z)
- Robustness to Label Noise Depends on the Shape of the Noise Distribution in Feature Space [6.748225062396441]
We show that both the scale and the shape of the noise distribution influence the posterior likelihood.
We show that when the noise distribution targets decision boundaries, classification robustness can drop off even at a small scale of noise.
arXiv Detail & Related papers (2022-06-02T15:41:59Z)
- Learning with Noisy Labels Revisited: A Study Using Real-World Human Annotations [54.400167806154535]
Existing research on learning with noisy labels mainly focuses on synthetic label noise.
This work presents two new benchmark datasets (CIFAR-10N, CIFAR-100N).
We show that real-world noisy labels follow an instance-dependent pattern rather than the classically adopted class-dependent ones.
arXiv Detail & Related papers (2021-10-22T22:42:11Z)
- Instance-dependent Label-noise Learning under a Structural Causal Model [92.76400590283448]
Label noise degrades the performance of deep learning algorithms.
By leveraging a structural causal model, we propose a novel generative approach for instance-dependent label-noise learning.
arXiv Detail & Related papers (2021-09-07T10:42:54Z)
- Learning with Feature-Dependent Label Noise: A Progressive Approach [19.425199841491246]
We propose a new family of feature-dependent label noise, which is much more general than commonly used i.i.d. label noise.
We provide theoretical guarantees showing that for a wide variety of (unknown) noise patterns, a classifier trained with this strategy converges to be consistent with the Bayes classifier.
arXiv Detail & Related papers (2021-03-13T17:34:22Z)
- A Second-Order Approach to Learning with Instance-Dependent Label Noise [58.555527517928596]
The presence of label noise often misleads the training of deep neural networks.
We show that the errors in human-annotated labels are more likely to be dependent on the difficulty levels of tasks.
arXiv Detail & Related papers (2020-12-22T06:36:58Z)
- Part-dependent Label Noise: Towards Instance-dependent Label Noise [194.73829226122731]
Learning with instance-dependent label noise is challenging, because it is hard to model such real-world noise.
In this paper, we approximate instance-dependent label noise by exploiting part-dependent label noise.
Empirical evaluations on synthetic and real-world datasets demonstrate our method is superior to the state-of-the-art approaches.
arXiv Detail & Related papers (2020-06-14T08:12:10Z)
- Class2Simi: A Noise Reduction Perspective on Learning with Noisy Labels [98.13491369929798]
We propose a framework called Class2Simi, which transforms data points with noisy class labels to data pairs with noisy similarity labels.
Class2Simi is computationally efficient: the transformation is performed on the fly within mini-batches, and it only changes the loss on top of the model predictions into a pairwise form.
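The class-to-similarity transformation can be illustrated with a small sketch. This is a hedged reconstruction from the summary above: the function name is ours, and the pairing scheme (all unordered pairs within a mini-batch, similarity 1 when the noisy labels agree) is the straightforward within-batch choice.

```python
import numpy as np

def class_to_simi(labels):
    """Turn noisy class labels in a mini-batch into noisy similarity
    labels over all unordered pairs: 1 if the two labels agree, else 0."""
    labels = np.asarray(labels)
    n = len(labels)
    pairs, simi = [], []
    for i in range(n):
        for j in range(i + 1, n):
            pairs.append((i, j))
            simi.append(int(labels[i] == labels[j]))
    return pairs, np.array(simi)
```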
arXiv Detail & Related papers (2020-06-14T07:55:32Z)
- Multi-Class Classification from Noisy-Similarity-Labeled Data [98.13491369929798]
We propose a method for learning from only noisy-similarity-labeled data.
We use a noise transition matrix to bridge the class-posterior probability between clean and noisy data.
We build a novel learning system which can assign noise-free class labels for instances.
arXiv Detail & Related papers (2020-02-16T05:10:21Z)
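The noise-transition-matrix idea above, bridging the class-posterior probability between clean and noisy data, is commonly realized as a forward loss correction. A minimal sketch (an assumption-laden illustration, not this paper's exact method), taking T as known with `T[i, j] = P(noisy label = j | clean label = i)`:

```python
import numpy as np

def forward_corrected_nll(clean_probs, noisy_label, T):
    """Forward correction: map the model's clean-class posterior through
    the noise transition matrix T to get the predicted noisy-label
    distribution, then take the negative log-likelihood of the observed
    noisy label. With T = I this reduces to the standard NLL."""
    noisy_probs = clean_probs @ T  # P(noisy = j) = sum_i P(clean = i) * T[i, j]
    return -np.log(noisy_probs[noisy_label])
```

Minimizing this corrected loss on noisy data encourages `clean_probs` to approximate the clean-class posterior, which is the sense in which the matrix "bridges" the two distributions.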
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences arising from its use.