Related papers: Understanding and Improving Early Stopping for Learning with Noisy Labels

Understanding and Improving Early Stopping for Learning with Noisy Labels

URL: http://arxiv.org/abs/2106.15853v1
Date: Wed, 30 Jun 2021 07:18:00 GMT
Title: Understanding and Improving Early Stopping for Learning with Noisy Labels
Authors: Yingbin Bai, Erkun Yang, Bo Han, Yanhua Yang, Jiatong Li, Yinian Mao, Gang Niu, Tongliang Liu
Abstract summary: The memorization effect of deep neural network (DNN) plays a pivotal role in many state-of-the-art label-noise learning methods. Current methods generally decide the early stopping point by considering a DNN as a whole. We propose to separate a DNN into different parts and progressively train them to address this problem.
Score: 63.0730063791198
License: http://creativecommons.org/publicdomain/zero/1.0/
Abstract: The memorization effect of deep neural network (DNN) plays a pivotal role in many state-of-the-art label-noise learning methods. To exploit this property, the early stopping trick, which stops the optimization at the early stage of training, is usually adopted. Current methods generally decide the early stopping point by considering a DNN as a whole. However, a DNN can be considered as a composition of a series of layers, and we find that the latter layers in a DNN are much more sensitive to label noise, while their former counterparts are quite robust. Therefore, selecting a stopping point for the whole network may make different DNN layers antagonistically affected each other, thus degrading the final performance. In this paper, we propose to separate a DNN into different parts and progressively train them to address this problem. Instead of the early stopping, which trains a whole DNN all at once, we initially train former DNN layers by optimizing the DNN with a relatively large number of epochs. During training, we progressively train the latter DNN layers by using a smaller number of epochs with the preceding layers fixed to counteract the impact of noisy labels. We term the proposed method as progressive early stopping (PES). Despite its simplicity, compared with the early stopping, PES can help to obtain more promising and stable results. Furthermore, by combining PES with existing approaches on noisy label training, we achieve state-of-the-art performance on image classification benchmarks.

Related papers

Early-Exit with Class Exclusion for Efficient Inference of Neural Networks [4.180653524441411]
We propose a class-based early-exit for dynamic inference in deep neural networks (DNNs) We take advantage of the learned features in these layers to exclude as many irrelevant classes as possible. Experimental results demonstrate the computational cost of DNNs in inference can be reduced significantly.
arXiv Detail & Related papers (2023-09-23T18:12:27Z)
Dynamics-Aware Loss for Learning with Label Noise [73.75129479936302]
Label noise poses a serious threat to deep neural networks (DNNs) We propose a dynamics-aware loss (DAL) to solve this problem. Both the detailed theoretical analyses and extensive experimental results demonstrate the superiority of our method.
arXiv Detail & Related papers (2023-03-21T03:05:21Z)
Rethinking Nearest Neighbors for Visual Classification [56.00783095670361]
k-NN is a lazy learning method that aggregates the distance between the test image and top-k neighbors in a training set. We adopt k-NN with pre-trained visual representations produced by either supervised or self-supervised methods in two steps. Via extensive experiments on a wide range of classification tasks, our study reveals the generality and flexibility of k-NN integration.
arXiv Detail & Related papers (2021-12-15T20:15:01Z)
Spatial-Temporal-Fusion BNN: Variational Bayesian Feature Layer [77.78479877473899]
We design a spatial-temporal-fusion BNN for efficiently scaling BNNs to large models. Compared to vanilla BNNs, our approach can greatly reduce the training time and the number of parameters, which contributes to scale BNNs efficiently.
arXiv Detail & Related papers (2021-12-12T17:13:14Z)
Fast Axiomatic Attribution for Neural Networks [44.527672563424545]
Recent approaches include priors on the feature attribution of a deep neural network (DNN) into the training process to reduce the dependence on unwanted features. We consider a special class of efficiently axiomatically attributable DNNs for which an axiomatic feature attribution can be computed with only a single forward/backward pass. Various experiments demonstrate the advantages of $mathcalX$-DNNs, beating state-of-the-art generic attribution methods on regular DNNs for training with attribution priors.
arXiv Detail & Related papers (2021-11-15T10:51:01Z)
Deep Time Delay Neural Network for Speech Enhancement with Full Data Learning [60.20150317299749]
This paper proposes a deep time delay neural network (TDNN) for speech enhancement with full data learning. To make full use of the training data, we propose a full data learning method for speech enhancement.
arXiv Detail & Related papers (2020-11-11T06:32:37Z)
Going Deeper With Directly-Trained Larger Spiking Neural Networks [20.40894876501739]
Spiking neural networks (SNNs) are promising in coding for bio-usible information and event-driven signal processing. However, the unique working mode of SNNs makes them more difficult to train than traditional networks. We propose a CIF-dependent batch normalization (tpladBN) method based on the emerging-temporal backproation threshold.
arXiv Detail & Related papers (2020-10-29T07:15:52Z)
Temporal Calibrated Regularization for Robust Noisy Label Learning [60.90967240168525]
Deep neural networks (DNNs) exhibit great success on many tasks with the help of large-scale well annotated datasets. However, labeling large-scale data can be very costly and error-prone so that it is difficult to guarantee the annotation quality. We propose a Temporal Calibrated Regularization (TCR) in which we utilize the original labels and the predictions in the previous epoch together.
arXiv Detail & Related papers (2020-07-01T04:48:49Z)

This list is automatically generated from the titles and abstracts of the papers in this site.

This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.