The Effect of Label Noise on the Information Content of Neural Representations
- URL: http://arxiv.org/abs/2510.06401v1
- Date: Tue, 07 Oct 2025 19:27:26 GMT
- Title: The Effect of Label Noise on the Information Content of Neural Representations
- Authors: Ali Hussaini Umar, Franky Kevin Nando Tezoh, Jean Barbier, Santiago Acevedo, Alessandro Laio
- Abstract summary: In supervised classification tasks, models are trained to predict a label for each data point. In real-world datasets, these labels are often noisy due to annotation errors. While the impact of label noise on model performance has been widely studied, its effect on the networks' hidden representations remains poorly understood.
- Score: 40.16192460363009
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: In supervised classification tasks, models are trained to predict a label for each data point. In real-world datasets, these labels are often noisy due to annotation errors. While the impact of label noise on the performance of deep learning models has been widely studied, its effects on the networks' hidden representations remain poorly understood. We address this gap by systematically comparing hidden representations using the Information Imbalance, a computationally efficient proxy of conditional mutual information. Through this analysis, we observe that the information content of the hidden representations follows a double descent as a function of the number of network parameters, akin to the behavior of the test error. We further demonstrate that in the underparameterized regime, representations learned with noisy labels are more informative than those learned with clean labels, while in the overparameterized regime, these representations are equally informative. Our results indicate that the representations of overparameterized networks are robust to label noise. We also find that the information imbalance between the penultimate and pre-softmax layers decreases with cross-entropy loss in the overparameterized regime. This offers a new perspective on understanding generalization in classification tasks. Extending our analysis to representations learned from random labels, we show that these perform worse than random features. This indicates that training on random labels drives networks well beyond lazy learning, as the weights adapt to encode label information.
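The Information Imbalance used in the abstract compares two representations A and B by asking how well nearest neighbors in A predict small distances in B: it is the mean rank, in space B, of each point's nearest neighbor in space A, rescaled so that identical spaces give a value near 0 and unrelated spaces a value near 1. The sketch below is a minimal, assumed implementation of that definition (the paper's actual code and any subsampling or tie-handling details are not given here), using plain Euclidean distances:

```python
import numpy as np

def information_imbalance(A, B):
    """Delta(A -> B): mean rank in space B of each point's nearest
    neighbor in space A, scaled by 2/N. Near 0 when A predicts B's
    neighborhoods well; near 1 when A carries no information about B.

    A, B: arrays of shape (N, d_A) and (N, d_B) with matched rows.
    """
    N = A.shape[0]
    # Pairwise Euclidean distance matrices in each space.
    dA = np.linalg.norm(A[:, None, :] - A[None, :, :], axis=-1)
    dB = np.linalg.norm(B[:, None, :] - B[None, :, :], axis=-1)
    # Exclude self-distances from neighbor searches and ranks.
    np.fill_diagonal(dA, np.inf)
    np.fill_diagonal(dB, np.inf)
    # Nearest neighbor of each point in space A.
    nnA = dA.argmin(axis=1)
    # 1-based rank of every point j among distances from i in space B.
    ranksB = dB.argsort(axis=1).argsort(axis=1) + 1
    # Rank in B of the A-nearest neighbor, averaged over all points.
    r = ranksB[np.arange(N), nnA]
    return 2.0 * r.mean() / N
```

A quick sanity check of the scaling: comparing a representation with itself gives every nearest neighbor rank 1, so Delta = 2/N, while an independent random representation yields ranks spread uniformly, so Delta concentrates around 1. Note the measure is asymmetric: Delta(A -> B) and Delta(B -> A) generally differ, which is what lets it act as a proxy for conditional information flow between layers.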
Related papers
- ERASE: Error-Resilient Representation Learning on Graphs for Label Noise Tolerance [53.73316938815873]
We propose a method called ERASE (Error-Resilient representation learning on graphs for lAbel noiSe tolerancE) to learn representations with error tolerance.
ERASE combines prototype pseudo-labels with propagated denoised labels and updates representations with error resilience.
Our method can outperform multiple baselines with clear margins in broad noise levels and enjoy great scalability.
arXiv Detail & Related papers (2023-12-13T17:59:07Z) - Losses over Labels: Weakly Supervised Learning via Direct Loss Construction [71.11337906077483]
Programmable weak supervision is a growing paradigm within machine learning.
We propose Losses over Labels (LoL), which creates losses directly from weak-supervision heuristics without going through the intermediate step of a label.
We show that LoL improves upon existing weak supervision methods on several benchmark text and image classification tasks.
arXiv Detail & Related papers (2022-12-13T22:29:14Z) - Learning to Aggregate and Refine Noisy Labels for Visual Sentiment Analysis [69.48582264712854]
We propose a robust learning method to perform robust visual sentiment analysis.
Our method relies on an external memory to aggregate and filter noisy labels during training.
We establish a benchmark for visual sentiment analysis with label noise using publicly available datasets.
arXiv Detail & Related papers (2021-09-15T18:18:28Z) - Noisy Labels Can Induce Good Representations [53.47668632785373]
We study how architecture affects learning with noisy labels.
We show that training with noisy labels can induce useful hidden representations, even when the model generalizes poorly.
This finding leads to a simple method to improve models trained on noisy labels.
arXiv Detail & Related papers (2020-12-23T18:58:05Z) - Combining Self-Supervised and Supervised Learning with Noisy Labels [41.627404715407586]
Convolutional neural networks (CNNs) can easily overfit noisy labels, and training them robustly against such labels has been a great challenge.
arXiv Detail & Related papers (2020-11-16T18:13:41Z) - Exploiting Context for Robustness to Label Noise in Active Learning [47.341705184013804]
We address the problems of how a system can identify which of the queried labels are wrong and how a multi-class active learning system can be adapted to minimize the negative impact of label noise.
We construct a graphical representation of the unlabeled data to encode these relationships and obtain new beliefs on the graph when noisy labels are available.
This is demonstrated in three different applications: scene classification, activity classification, and document classification.
arXiv Detail & Related papers (2020-10-18T18:59:44Z) - Attention-Aware Noisy Label Learning for Image Classification [97.26664962498887]
Deep convolutional neural networks (CNNs) learned on large-scale labeled samples have achieved remarkable progress in computer vision.
The cheapest way to obtain a large body of labeled visual data is to crawl from websites with user-supplied labels, such as Flickr.
This paper proposes the attention-aware noisy label learning approach to improve the discriminative capability of the network trained on datasets with potential label noise.
arXiv Detail & Related papers (2020-09-30T15:45:36Z) - Improving Generalization by Controlling Label-Noise Information in Neural Network Weights [33.85101318266319]
In the presence of noisy or incorrect labels, neural networks have the undesirable tendency to memorize information about the noise.
Standard regularization techniques such as dropout, weight decay or data augmentation sometimes help, but do not prevent this behavior.
We show that for any training algorithm, low values of the mutual information between the network weights and the noisy labels correspond to reduced memorization of label noise and better generalization bounds.
arXiv Detail & Related papers (2020-02-19T00:08:30Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information it contains and is not responsible for any consequences of its use.