A Theoretical Analysis of Learning with Noisily Labeled Data
- URL: http://arxiv.org/abs/2104.04114v1
- Date: Thu, 8 Apr 2021 23:40:02 GMT
- Title: A Theoretical Analysis of Learning with Noisily Labeled Data
- Authors: Yi Xu, Qi Qian, Hao Li, Rong Jin
- Abstract summary: We first show that, during the first epoch of training, the examples with clean labels are learned first.
We then show that, after the clean-data learning stage, continued training can further improve the testing error.
- Score: 62.946840431501855
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Noisy labels are very common in deep supervised learning. Although many
studies aim to improve the robustness of deep training to noisy labels, few
works focus on theoretically explaining the training behaviors of learning with
noisily labeled data, which is fundamental to understanding its generalization.
In this draft, we study two of its phenomena, clean data first and phase
transition, by explaining them from a theoretical viewpoint. Specifically, we
first show that, during the first epoch of training, the examples with clean
labels are learned first. We then show that, after the clean-data learning
stage, continued training can further improve the testing error when the rate
of corrupted class labels is smaller than a certain threshold; otherwise,
extended training leads to an increasing testing error.
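To make the two phenomena concrete, here is a minimal, self-contained simulation sketch in PyTorch (ours, not the paper's code; the dataset, model, and all hyperparameters are illustrative assumptions). It injects symmetric label noise at a chosen rate into a synthetic classification task and tracks the testing error over epochs; with a small noise rate the error keeps improving after the early clean-data stage, while a large rate makes prolonged training drive the testing error back up.

```python
# Illustrative simulation (not from the paper): symmetric label noise on a
# synthetic Gaussian-blob classification task, tracking test error per epoch.
import torch
import torch.nn as nn

torch.manual_seed(0)
n_train, n_test, d, k = 2000, 1000, 20, 4
centers = torch.randn(k, d) * 3.0  # one Gaussian blob per class

def make_split(n):
    y = torch.randint(0, k, (n,))
    x = centers[y] + torch.randn(n, d)
    return x, y

x_tr, y_tr = make_split(n_train)
x_te, y_te = make_split(n_test)

noise_rate = 0.4  # compare e.g. 0.1 vs. 0.6 to see the two regimes
flip = torch.rand(n_train) < noise_rate
y_noisy = y_tr.clone()
y_noisy[flip] = torch.randint(0, k, (int(flip.sum()),))  # random relabels

model = nn.Sequential(nn.Linear(d, 64), nn.ReLU(), nn.Linear(64, k))
opt = torch.optim.SGD(model.parameters(), lr=0.1)
loss_fn = nn.CrossEntropyLoss()

for epoch in range(201):
    opt.zero_grad()
    loss_fn(model(x_tr), y_noisy).backward()  # full-batch for simplicity
    opt.step()
    if epoch % 20 == 0:
        with torch.no_grad():
            err = (model(x_te).argmax(dim=1) != y_te).float().mean()
        print(f"epoch {epoch:3d}  test error {err:.3f}")
```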
Related papers
- Why Fine-grained Labels in Pretraining Benefit Generalization? [12.171634061370616]
Recent studies show that pretraining a deep neural network with fine-grained labeled data, followed by fine-tuning on coarse-labeled data, often yields better generalization than pretraining with coarse-labeled data.
A theoretical explanation for this benefit has been lacking; this paper addresses the gap by introducing a "hierarchical multi-view" structure to confine the input data distribution.
Under this framework, we prove that: 1) coarse-grained pretraining only allows a neural network to learn the common features well, while 2) fine-grained pretraining helps the network learn the rare features in addition to the common ones, leading to improved accuracy on hard downstream test samples.
arXiv Detail & Related papers (2024-10-30T15:41:30Z)
- Late Stopping: Avoiding Confidently Learning from Mislabeled Examples [61.00103151680946]
We propose a new framework, Late Stopping, which leverages the intrinsic robust learning ability of DNNs through a prolonged training process.
We empirically observe that mislabeled and clean examples differ in the number of epochs required for them to be consistently and correctly classified, as sketched below.
Experimental results on benchmark-simulated and real-world noisy datasets demonstrate that the proposed method outperforms state-of-the-art counterparts.
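One plausible way to implement that bookkeeping (our sketch, not the authors' code; `consistency_epochs` and its `window` parameter are hypothetical names):

```python
# Hedged sketch: per-example epoch counts until consistent correct prediction.
import torch

def consistency_epochs(correct_history: torch.Tensor, window: int = 3) -> torch.Tensor:
    """correct_history: (epochs, n) bool tensor; entry [e, i] is True when
    example i matched its (possibly noisy) label at epoch e. Returns, per
    example, the first epoch starting a run of `window` correct epochs
    (or `epochs` if no such run exists)."""
    epochs, n = correct_history.shape
    first = torch.full((n,), epochs, dtype=torch.long)
    for e in range(epochs - window + 1):
        run_ok = correct_history[e : e + window].all(dim=0)
        newly = run_ok & (first == epochs)
        first[newly] = e
    return first

# Tiny check: example 0 becomes consistently correct at epoch 1, example 1 at 0.
hist = torch.tensor([[0, 1], [1, 1], [1, 1], [1, 1]], dtype=torch.bool)
print(consistency_epochs(hist))  # tensor([1, 0])
```

Mislabeled examples tend to show much larger values of this statistic, which is the signal a prolonged-training criterion like Late Stopping can threshold.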
arXiv Detail & Related papers (2023-08-26T12:43:25Z)
- Robust Positive-Unlabeled Learning via Noise Negative Sample Self-correction [48.929877651182885]
Learning from positive and unlabeled data is known as positive-unlabeled (PU) learning in the literature.
We propose a new robust PU learning method with a training strategy motivated by the nature of human learning.
arXiv Detail & Related papers (2023-08-01T04:34:52Z)
- On Emergence of Clean-Priority Learning in Early Stopped Neural Networks [18.725557157004214]
When random label noise is added to a training dataset, the prediction error of a neural network on a label-noise-free test dataset deteriorates.
This behaviour is believed to be a result of neural networks learning the pattern of clean data first and fitting the noise later in the training.
We show, both theoretically and experimentally, that as clean-priority learning proceeds, the dominance of the gradients of clean samples over those of noisy samples diminishes; a measurement sketch follows below.
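A simple way to measure that dominance (an illustrative sketch under our own assumptions; the model, data, and the clean/noisy split are placeholders):

```python
# Hedged sketch: compare gradient norms contributed by clean vs. noisy subsets.
import torch
import torch.nn as nn

torch.manual_seed(0)
model = nn.Linear(10, 3)
loss_fn = nn.CrossEntropyLoss()
x = torch.randn(100, 10)
y = torch.randint(0, 3, (100,))
y_noisy_tail = torch.randint(0, 3, (20,))  # pretend the last 20 labels were flipped

def subset_grad_norm(xs, ys):
    """L2 norm of the gradient of the mean loss over the subset (xs, ys)."""
    model.zero_grad()
    loss_fn(model(xs), ys).backward()
    return torch.sqrt(sum((p.grad ** 2).sum() for p in model.parameters()))

# Logging these two numbers every epoch during training traces how the
# clean/noisy gradient ratio starts large and shrinks over time.
print(float(subset_grad_norm(x[:80], y[:80])))        # clean subset
print(float(subset_grad_norm(x[80:], y_noisy_tail)))  # noisy subset
```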
arXiv Detail & Related papers (2023-06-05T01:45:22Z)
- A Theoretical Understanding of Shallow Vision Transformers: Learning, Generalization, and Sample Complexity [71.11795737362459]
ViTs with self-attention modules have recently achieved great empirical success in many tasks.
However, the corresponding theoretical analysis of their learning and generalization remains elusive.
This paper provides the first theoretical analysis of a shallow ViT for a classification task.
arXiv Detail & Related papers (2023-02-12T22:12:35Z)
- Debiased Pseudo Labeling in Self-Training [77.83549261035277]
Deep neural networks achieve remarkable performance on a wide range of tasks with the aid of large-scale labeled datasets.
To mitigate the requirement for labeled data, self-training is widely used in both academia and industry by pseudo labeling on readily-available unlabeled data.
We propose Debiased, in which the generation and utilization of pseudo labels are decoupled by two independent heads; a minimal sketch of this decoupling follows below.
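One plausible reading of that decoupling, as a hedged sketch (not the released implementation; `backbone`, `head_pseudo`, and `head_target` are our own names):

```python
# Hedged sketch: pseudo-label generation and utilization on separate heads.
import torch
import torch.nn as nn
import torch.nn.functional as F

backbone = nn.Sequential(nn.Linear(32, 64), nn.ReLU())
head_pseudo = nn.Linear(64, 10)  # generates pseudo labels, never trained on them
head_target = nn.Linear(64, 10)  # consumes pseudo labels during training

def pseudo_label_loss(x_unlabeled, threshold=0.95):
    feats = backbone(x_unlabeled)
    with torch.no_grad():  # generation side: no gradient flows back
        probs = F.softmax(head_pseudo(feats.detach()), dim=1)
        conf, pseudo = probs.max(dim=1)
        mask = (conf >= threshold).float()  # keep only confident pseudo labels
    logits = head_target(feats)  # utilization side: trained on pseudo labels
    per_sample = F.cross_entropy(logits, pseudo, reduction="none")
    return (per_sample * mask).mean()

loss = pseudo_label_loss(torch.randn(16, 32))
```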
arXiv Detail & Related papers (2022-02-15T02:14:33Z)
- Robust Long-Tailed Learning under Label Noise [50.00837134041317]
This work investigates the label noise problem under long-tailed label distribution.
We propose a robust framework that realizes noise detection for long-tailed learning.
Our framework can naturally leverage semi-supervised learning algorithms to further improve the generalization.
arXiv Detail & Related papers (2021-08-26T03:45:00Z)
- Early-Learning Regularization Prevents Memorization of Noisy Labels [29.04549895470588]
We propose a novel framework to perform classification via deep learning in the presence of noisy annotations.
Deep neural networks have been observed to first fit the training data with clean labels during an "early learning" phase.
We design a regularization term that steers the model towards targets estimated from its early-phase predictions, implicitly preventing memorization of the false labels; a minimal sketch of this idea follows below.
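A minimal sketch of that idea, assuming an exponential-moving-average form of the targets (our paraphrase, not the released ELR code; class and parameter names are ours):

```python
# Hedged sketch: cross-entropy plus a penalty for drifting away from a
# running average of the model's own (early) predictions.
import torch
import torch.nn.functional as F

class EarlyLearningReg:
    def __init__(self, n_examples, n_classes, beta=0.7, lam=3.0):
        self.targets = torch.zeros(n_examples, n_classes)  # per-example EMA
        self.beta, self.lam = beta, lam

    def __call__(self, logits, labels, idx):
        probs = F.softmax(logits, dim=1).clamp(1e-4, 1 - 1e-4)
        with torch.no_grad():  # update the running targets for this batch
            self.targets[idx] = (self.beta * self.targets[idx]
                                 + (1 - self.beta) * probs)
        ce = F.cross_entropy(logits, labels)
        agree = (self.targets[idx] * probs).sum(dim=1)  # agreement with EMA
        return ce + self.lam * torch.log(1 - agree).mean()

# usage: crit = EarlyLearningReg(len(train_set), 10)
#        loss = crit(model(x_batch), y_batch, batch_indices)
```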
arXiv Detail & Related papers (2020-06-30T23:46:33Z)
This list is automatically generated from the titles and abstracts of the papers on this site.