Identifying Training Stop Point with Noisy Labeled Data
- URL: http://arxiv.org/abs/2012.13435v1
- Date: Thu, 24 Dec 2020 20:07:30 GMT
- Title: Identifying Training Stop Point with Noisy Labeled Data
- Authors: Sree Ram Kamabattula, Venkat Devarajan, Babak Namazi, Ganesh
Sankaranarayanan
- Abstract summary: We develop an algorithm to find a training stop point (TSP) at or close to the maximum obtainable test accuracy (MOTA).
We validated the robustness of our algorithm (AutoTSP) through several experiments on CIFAR-10, CIFAR-100, and a real-world noisy dataset.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Training deep neural networks (DNNs) with noisy labels is a challenging
problem due to over-parameterization. DNNs tend to essentially fit on clean
samples at a higher rate in the initial stages, and later fit on the noisy
samples at a relatively lower rate. Thus, with a noisy dataset, the test
accuracy increases initially and drops in the later stages. To find an early
stopping point at the maximum obtainable test accuracy (MOTA), recent studies
assume either that i) a clean validation set is available or ii) the noise
ratio is known, or both. However, a clean validation set is often unavailable,
and the noise estimation can be inaccurate. To overcome these issues, we
provide a novel training solution, free of these conditions. We analyze the
rate of change of the training accuracy for different noise ratios under
different conditions to identify a training stop region. We further develop a
heuristic algorithm based on a small-learning assumption to find a training
stop point (TSP) at or close to MOTA. To the best of our knowledge, our method
is the first to rely solely on the training behavior, while utilizing
the entire training set, to automatically find a TSP. We validated the
robustness of our algorithm (AutoTSP) through several experiments on CIFAR-10,
CIFAR-100, and a real-world noisy dataset for different noise ratios, noise
types and architectures.
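
The idea of reading a stop point off the rate of change of the training accuracy can be illustrated with a small sketch. This is not the authors' AutoTSP algorithm, whose details the abstract does not give. It is a hypothetical heuristic written only for illustration: smooth the per-epoch change in training accuracy, then stop at the first epoch after the peak rate where the smoothed rate falls below a fixed fraction of that peak. The function names `moving_average` and `find_stop_epoch`, and the parameters `window` and `fraction`, are assumptions, not names from the paper.

```python
from typing import List, Optional


def moving_average(values: List[float], window: int) -> List[float]:
    """Trailing moving average; early entries average over what is available."""
    out = []
    for i in range(len(values)):
        start = max(0, i - window + 1)
        chunk = values[start:i + 1]
        out.append(sum(chunk) / len(chunk))
    return out


def find_stop_epoch(
    train_acc: List[float],
    window: int = 3,        # assumed smoothing width, not from the paper
    fraction: float = 0.2,  # assumed threshold relative to the peak rate
) -> Optional[int]:
    """Hypothetical stop-point heuristic based only on training accuracy.

    Returns the first epoch (0-indexed) after the peak smoothed rate of
    change at which the smoothed rate drops below `fraction` times the
    peak, or None if that never happens.
    """
    if len(train_acc) < 2:
        return None

    # Per-epoch rate of change of training accuracy.
    rates = [b - a for a, b in zip(train_acc, train_acc[1:])]
    smooth = moving_average(rates, window)

    # Locate the fastest-learning phase.
    peak_idx = max(range(len(smooth)), key=smooth.__getitem__)
    peak = smooth[peak_idx]
    if peak <= 0:
        return None

    # Look for the slowdown that follows it.
    for i in range(peak_idx + 1, len(smooth)):
        if smooth[i] < fraction * peak:
            return i + 1  # rates[i] is the change from epoch i to epoch i + 1
    return None


# Synthetic training-accuracy curve: fast early gains, then a slow creep.
curve = [0.10, 0.30, 0.50, 0.65, 0.70, 0.71, 0.72, 0.73, 0.74]
stop_epoch = find_stop_epoch(curve)
```

The sketch uses only the training accuracy curve, so it needs neither a clean validation set nor a noise-ratio estimate. Whether a threshold of this kind lands near MOTA on a real noisy dataset is not something the sketch establishes.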