Abstract: This paper studies the effect of an
over-parameterized model, e.g., a deep neural network, memorizing
instance-dependent noisy labels. We first quantify the harms caused by
memorizing noisy instances from different spectra of the sample distribution.
We then analyze how several popular solutions for learning with noisy labels
mitigate these harms at the instance level. Our analysis offers new
insight into when these approaches work and provides theoretical
justification for previously reported empirical observations. A key aspect of
the analysis is its focus on each training instance.