Abstract: We study the generalization of deep learning models in relation to the convex
hull of their training sets. A trained image classifier partitions
its domain via decision boundaries and assigns a class to each of those
partitions. The location of decision boundaries inside the convex hull of
the training set can be investigated in relation to the training samples. However,
our analysis shows that in standard image classification datasets, all testing
images are considerably outside that convex hull, in the pixel space, in the
wavelet space, and in the internal representations learned by deep networks.
Therefore, the performance of a trained model partially depends on how its
decision boundaries are extended outside the convex hull of its training data.
From this perspective, which has not been studied before, over-parameterization of
deep learning models may be considered a necessity for shaping the extension of
decision boundaries. At the same time, over-parameterization should be
accompanied by a specific training regime, in order to yield a model that not
only fits the training set but also has decision boundaries that extend desirably
outside the convex hull. To illustrate this, we investigate the decision
boundaries of a neural network, with varying numbers of parameters, inside and
outside the convex hull of its training set. Moreover, we use a polynomial
decision boundary to study the necessity of over-parameterization and the
influence of the training regime in shaping its extensions outside the convex hull
of the training set.
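
To make the claim that test samples lie "considerably outside" the convex hull concrete, the sketch below shows one way such a measurement could be made. It is an illustration under stated assumptions, not necessarily the procedure used in this work. It approximates the Euclidean distance from a query point to the convex hull of a point set by running the Frank-Wolfe method on the problem of minimizing ||sum_i a_i x_i - q||^2 over convex weights a. The names `distance_to_hull`, `points`, `query`, and `iters` are hypothetical, and the four-point "training set" is a toy example.

```python
def distance_to_hull(points, query, iters=2000):
    """Approximate Euclidean distance from `query` to the convex hull of `points`.

    Hypothetical helper: Frank-Wolfe with exact line search on the squared
    distance between `query` and a convex combination of `points`.
    """
    def sq_dist(p):
        return sum((a - b) ** 2 for a, b in zip(p, query))

    # Start from the point of the set that is nearest to the query.
    current = list(min(points, key=sq_dist))
    for _ in range(iters):
        # Gradient direction of the squared distance (up to a factor of 2).
        residual = [c - q for c, q in zip(current, query)]
        # Linear minimization step: the point with the smallest inner
        # product with the gradient.
        target = min(points, key=lambda p: sum(r * a for r, a in zip(residual, p)))
        direction = [t - c for t, c in zip(target, current)]
        denom = sum(d * d for d in direction)
        if denom == 0.0:
            break  # already at the best vertex
        # Exact line search for a quadratic objective, clipped to [0, 1].
        step = -sum(r * d for r, d in zip(residual, direction)) / denom
        step = max(0.0, min(1.0, step))
        if step == 0.0:
            break  # no descent direction inside the hull remains
        current = [c + step * d for c, d in zip(current, direction)]
    return sq_dist(current) ** 0.5


# Toy "training set": the corners of the unit square.
square = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
print(distance_to_hull(square, (0.5, 0.5)))  # inside the hull: 0.0
print(distance_to_hull(square, (2.0, 0.5)))  # outside the hull: 1.0
```

A distance of zero means the query is inside the hull, and a positive distance quantifies how far outside it lies. In high-dimensional spaces such as pixel space the same idea applies, although a dedicated quadratic-programming solver would likely be preferable to this plain-Python loop.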