Abstract: Applying neural networks requires selecting a model suited to the
complexity of the problem and the scale of the dataset. Analyzing a network's
capacity, in turn, requires quantifying the information it has learned. This
paper proves that the distance between the network's weights at different
training stages can be used to directly estimate the information the network
accumulates during training. Experimental results verify the
utility of this method. An application of the method to label corruption is
shown at the end.