Improving Generalization by Controlling Label-Noise Information in
Neural Network Weights
- URL: http://arxiv.org/abs/2002.07933v2
- Date: Fri, 20 Nov 2020 09:41:41 GMT
- Title: Improving Generalization by Controlling Label-Noise Information in
Neural Network Weights
- Authors: Hrayr Harutyunyan, Kyle Reing, Greg Ver Steeg, Aram Galstyan
- Abstract summary: In the presence of noisy or incorrect labels, neural networks have the undesirable tendency to memorize information about the noise.
Standard regularization techniques such as dropout, weight decay or data augmentation sometimes help, but do not prevent this behavior.
We show that for any training algorithm, low values of the mutual information between weights and training labels correspond to less memorization of label noise and better generalization bounds.
- Score: 33.85101318266319
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In the presence of noisy or incorrect labels, neural networks have the
undesirable tendency to memorize information about the noise. Standard
regularization techniques such as dropout, weight decay or data augmentation
sometimes help, but do not prevent this behavior. If one considers neural
network weights as random variables that depend on the data and stochasticity
of training, the amount of memorized information can be quantified with the
Shannon mutual information between weights and the vector of all training
labels given inputs, $I(w ; \mathbf{y} \mid \mathbf{x})$. We show that for any
training algorithm, low values of this term correspond to reduction in
memorization of label-noise and better generalization bounds. To obtain these
low values, we propose training algorithms that employ an auxiliary network
that predicts gradients in the final layers of a classifier without accessing
labels. We illustrate the effectiveness of our approach on versions of MNIST,
CIFAR-10, and CIFAR-100 corrupted with various noise models, and on Clothing1M, a
large-scale dataset with noisy labels.
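The abstract's key quantity is the mutual information $I(w ; \mathbf{y} \mid \mathbf{x})$ between the learned weights and the training labels given the inputs, and the proposed remedy is an auxiliary network that predicts the classifier's final-layer gradients without accessing the labels. The PyTorch-style snippet below is only a minimal sketch of that second idea under our own assumptions (the tiny architecture, the MSE objective for the gradient predictor, and the omission of any information-limiting regularization are not taken from the paper): for cross-entropy, the gradient with respect to the logits is softmax(logits) minus the one-hot label, and the classifier is updated with a label-free prediction of that gradient.

```python
# Minimal sketch (our assumptions, not the paper's exact algorithm): the
# classifier is updated with a gradient predicted from label-free features,
# so labels never back-propagate into the classifier directly.
import torch
import torch.nn as nn
import torch.nn.functional as F

feature_dim, num_classes = 64, 10
body = nn.Sequential(nn.Flatten(), nn.Linear(28 * 28, feature_dim), nn.ReLU())
head = nn.Linear(feature_dim, num_classes)        # final classification layer
grad_pred = nn.Linear(feature_dim, num_classes)   # auxiliary net: predicts dL/dlogits from features only

opt_cls = torch.optim.SGD(list(body.parameters()) + list(head.parameters()), lr=0.1)
opt_aux = torch.optim.SGD(grad_pred.parameters(), lr=0.1)

def training_step(x, y):
    feats = body(x)
    logits = head(feats)

    # Cross-entropy gradient w.r.t. the logits: softmax(logits) - one_hot(y).
    with torch.no_grad():
        true_grad = F.softmax(logits, dim=1) - F.one_hot(y, num_classes).float()

    # The auxiliary network learns to predict that gradient without seeing y at
    # its input. NOTE: the actual method also controls how much label-noise
    # information these predicted gradients can carry; that part is omitted here.
    aux_loss = F.mse_loss(grad_pred(feats.detach()), true_grad)
    opt_aux.zero_grad()
    aux_loss.backward()
    opt_aux.step()

    # Update the classifier with the predicted gradient instead of the label gradient.
    predicted_grad = grad_pred(feats.detach()).detach()
    opt_cls.zero_grad()
    logits.backward(predicted_grad)
    opt_cls.step()
    return aux_loss.item()

# Usage with dummy MNIST-shaped data:
x = torch.randn(32, 1, 28, 28)
y = torch.randint(0, num_classes, (32,))
training_step(x, y)
```

In this toy form the labels still reach the weights indirectly through the trained predictor; the paper's point is to bound and control that pathway, so the sketch should be read only as an illustration of the gradient-prediction mechanism.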
Related papers
- Blind Knowledge Distillation for Robust Image Classification [19.668440671541546]
Blind Knowledge Distillation is a teacher-student approach for learning with noisy labels.
We use Otsu's algorithm to estimate the tipping point from generalizing to overfitting.
We show in our experiments that Blind Knowledge Distillation detects overfitting effectively during training.
arXiv Detail & Related papers (2022-11-21T11:17:07Z)
- Learning advisor networks for noisy image classification [22.77447144331876]
We introduce the novel concept of an advisor network to address the problem of noisy labels in image classification.
We trained it with a meta-learning strategy so that it can adapt throughout the training of the main model.
We tested our method on CIFAR-10 and CIFAR-100 with synthetic noise, and on Clothing1M, which contains real-world noise, reporting state-of-the-art results.
arXiv Detail & Related papers (2022-11-08T11:44:08Z)
- Context-based Virtual Adversarial Training for Text Classification with Noisy Labels [1.9508698179748525]
We propose context-based virtual adversarial training (ConVAT) to prevent a text classifier from overfitting to noisy labels.
Unlike previous works, the proposed method performs adversarial training at the context level rather than at the input level.
We conduct extensive experiments on four text classification datasets with two types of label noises.
arXiv Detail & Related papers (2022-05-29T14:19:49Z)
- Synergistic Network Learning and Label Correction for Noise-robust Image Classification [28.27739181560233]
Deep Neural Networks (DNNs) tend to overfit training label noise, resulting in poorer model performance in practice.
We propose a robust label correction framework combining the ideas of small loss selection and noise correction.
We demonstrate our method on both synthetic and real-world datasets with different noise types and rates.
arXiv Detail & Related papers (2022-02-27T23:06:31Z)
- Learning with Neighbor Consistency for Noisy Labels [69.83857578836769]
We present a method for learning from noisy labels that leverages similarities between training examples in feature space.
We evaluate our method on datasets with both synthetic (CIFAR-10, CIFAR-100) and realistic (mini-WebVision, Clothing1M, mini-ImageNet-Red) noise.
arXiv Detail & Related papers (2022-02-04T15:46:27Z)
- Prototypical Classifier for Robust Class-Imbalanced Learning [64.96088324684683]
We propose Prototypical, which does not require fitting additional parameters given the embedding network.
Prototypical produces balanced and comparable predictions for all classes even though the training set is class-imbalanced.
We test our method on the CIFAR-10LT, CIFAR-100LT and WebVision datasets, observing that Prototypical obtains substantial improvements compared with the state of the art.
arXiv Detail & Related papers (2021-10-22T01:55:01Z)
- Tackling Instance-Dependent Label Noise via a Universal Probabilistic Model [80.91927573604438]
This paper proposes a simple yet universal probabilistic model, which explicitly relates noisy labels to their instances.
Experiments on datasets with both synthetic and real-world label noise verify that the proposed method yields significant improvements on robustness.
arXiv Detail & Related papers (2021-01-14T05:43:51Z)
- Attention-Aware Noisy Label Learning for Image Classification [97.26664962498887]
Deep convolutional neural networks (CNNs) learned on large-scale labeled samples have achieved remarkable progress in computer vision.
The cheapest way to obtain a large body of labeled visual data is to crawl from websites with user-supplied labels, such as Flickr.
This paper proposes the attention-aware noisy label learning approach to improve the discriminative capability of the network trained on datasets with potential label noise.
arXiv Detail & Related papers (2020-09-30T15:45:36Z)
- Temporal Calibrated Regularization for Robust Noisy Label Learning [60.90967240168525]
Deep neural networks (DNNs) exhibit great success on many tasks with the help of large-scale well annotated datasets.
However, labeling large-scale data can be very costly and error-prone so that it is difficult to guarantee the annotation quality.
We propose Temporal Calibrated Regularization (TCR), which uses the original labels together with the predictions from the previous epoch (a minimal sketch of this idea appears after the related-papers list).
arXiv Detail & Related papers (2020-07-01T04:48:49Z)
- Learning from Noisy Labels with Noise Modeling Network [7.523041606515877]
The Noise Modeling Network (NMN) follows our convolutional neural network (CNN) and is integrated with it.
NMN learns the distribution of noise patterns directly from the noisy data.
We show that the integrated NMN/CNN learning system consistently improves the classification performance.
arXiv Detail & Related papers (2020-05-01T20:32:22Z)
- Learning with Out-of-Distribution Data for Audio Classification [60.48251022280506]
We show that detecting and relabelling certain OOD instances, rather than discarding them, can have a positive effect on learning.
The proposed method is shown to improve the performance of convolutional neural networks by a significant margin.
arXiv Detail & Related papers (2020-02-11T21:08:06Z)
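As referenced in the Temporal Calibrated Regularization entry above, one generic way to use the original labels together with the previous epoch's predictions is to mix them into a soft target. The snippet below is a sketch of that general idea only; the mixing weight `alpha` and the soft cross-entropy loss are our choices, not necessarily the TCR paper's exact formulation.

```python
# Generic sketch: mix (possibly noisy) one-hot labels with the model's
# previous-epoch predictions into a soft target (assumed formulation).
import torch
import torch.nn.functional as F

def calibrated_targets(labels, prev_epoch_probs, num_classes, alpha=0.7):
    """Convex combination of one-hot labels and predictions cached from the previous epoch."""
    one_hot = F.one_hot(labels, num_classes).float()
    return alpha * one_hot + (1.0 - alpha) * prev_epoch_probs

def soft_cross_entropy(logits, targets):
    # Cross-entropy against soft targets.
    return -(targets * F.log_softmax(logits, dim=1)).sum(dim=1).mean()

# Usage: logits from the current model, prev_epoch_probs cached per sample.
logits = torch.randn(8, 10)
labels = torch.randint(0, 10, (8,))
prev_epoch_probs = F.softmax(torch.randn(8, 10), dim=1)
loss = soft_cross_entropy(logits, calibrated_targets(labels, prev_epoch_probs, 10))
```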