Memorization with neural nets: going beyond the worst case
- URL: http://arxiv.org/abs/2310.00327v2
- Date: Thu, 12 Oct 2023 11:55:28 GMT
- Title: Memorization with neural nets: going beyond the worst case
- Authors: Sjoerd Dirksen and Patrick Finke and Martin Genzel
- Abstract summary: In practice, deep neural networks are often able to easily interpolate their training data.
For real-world data, however, one intuitively expects the presence of a benign structure so that interpolation already occurs at a smaller network size than suggested by memorization capacity.
We introduce a simple randomized algorithm that, given a fixed finite dataset with two classes, with high probability constructs an interpolating three-layer neural network in polynomial time.
- Score: 5.662924503089369
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In practice, deep neural networks are often able to easily interpolate their
training data. To understand this phenomenon, many works have aimed to quantify
the memorization capacity of a neural network architecture: the largest number
of points such that the architecture can interpolate any placement of these
points with any assignment of labels. For real-world data, however, one
intuitively expects the presence of a benign structure so that interpolation
already occurs at a smaller network size than suggested by memorization
capacity. In this paper, we investigate interpolation by adopting an
instance-specific viewpoint. We introduce a simple randomized algorithm that,
given a fixed finite dataset with two classes, with high probability constructs
an interpolating three-layer neural network in polynomial time. The required
number of parameters is linked to geometric properties of the two classes and
their mutual arrangement. As a result, we obtain guarantees that are
independent of the number of samples and hence move beyond worst-case
memorization capacity bounds. We illustrate the effectiveness of the algorithm
in non-pathological situations with extensive numerical experiments and link
the insights back to the theoretical results.
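As a rough illustration of the interpolation task studied here (and emphatically not the authors' algorithm, whose three-layer construction and geometric guarantees are developed in the paper), the following NumPy sketch interpolates a toy two-class dataset with a random-feature ReLU layer and a least-squares output layer. All names and parameter choices are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy two-class dataset: two well-separated Gaussian blobs.
n, d = 200, 10
X = np.vstack([rng.normal(-2.0, 1.0, (n // 2, d)),
               rng.normal(+2.0, 1.0, (n // 2, d))])
y = np.hstack([-np.ones(n // 2), np.ones(n // 2)])

# Hidden layer: random ReLU features (Gaussian weights, uniform biases).
width = 512
W1 = rng.normal(size=(d, width)) / np.sqrt(d)
b1 = rng.uniform(-1.0, 1.0, size=width)
H = np.maximum(X @ W1 + b1, 0.0)

# Output layer: minimum-norm least squares on the random features.
# With width well above n, H has full row rank with high probability,
# so the fit interpolates the labels exactly.
w2, *_ = np.linalg.lstsq(H, y, rcond=None)
print("interpolated:", bool(np.all(np.sign(H @ w2) == y)))
```

The contrast with the paper: here the width is simply taken larger than the sample count, whereas the paper's construction ties the number of parameters to the geometry of the two classes and their mutual arrangement, independently of the number of samples.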
Related papers
- On Characterizing the Evolution of Embedding Space of Neural Networks using Algebraic Topology [9.537910170141467]
We study, via Betti numbers, how the topology of the feature embedding space changes as data passes through the layers of a well-trained deep neural network (DNN).
We demonstrate that as depth increases, a topologically complicated dataset is transformed into a simple one, with the Betti numbers attaining their lowest possible values (a toy Betti-0 computation appears after this list).
arXiv Detail & Related papers (2023-11-08T10:45:12Z)
- Multilayer Multiset Neuronal Networks -- MMNNs [55.2480439325792]
The present work describes multilayer multiset neuronal networks incorporating two or more layers of coincidence similarity neurons.
The work also explores the utilization of counter-prototype points, which are assigned to the image regions to be avoided.
arXiv Detail & Related papers (2023-08-28T12:55:13Z)
- Quasi-orthogonality and intrinsic dimensions as measures of learning and generalisation [55.80128181112308]
We show that the dimensionality and quasi-orthogonality of a network's feature space may jointly serve as discriminants of its performance (a small quasi-orthogonality demo appears after this list).
Our findings suggest important relationships between a network's final performance and the properties of its randomly initialised feature space.
arXiv Detail & Related papers (2022-03-30T21:47:32Z)
- Dive into Layers: Neural Network Capacity Bounding using Algebraic Geometry [55.57953219617467]
We show that the learnability of a neural network is directly related to its size.
We use Betti numbers to measure the topological geometric complexity of input data and the neural network.
We perform experiments on the real-world dataset MNIST, and the results verify our analysis and conclusions.
arXiv Detail & Related papers (2021-09-03T11:45:51Z)
- The Separation Capacity of Random Neural Networks [78.25060223808936]
We show that a sufficiently large two-layer ReLU network with standard Gaussian weights and uniformly distributed biases can, with high probability, make two well-separated classes linearly separable (see the sketch after this list).
We quantify the relevant structure of the data in terms of a novel notion of mutual complexity.
arXiv Detail & Related papers (2021-07-31T10:25:26Z)
- Bayesian reconstruction of memories stored in neural networks from their connectivity [25.94639282590696]
We provide a practical algorithm for reconstructing stored patterns from synaptic connectivity.
We study its performance on three different models and explore the limits of such reconstruction.
arXiv Detail & Related papers (2021-05-16T12:05:10Z)
- A neural anisotropic view of underspecification in deep learning [60.119023683371736]
We show that the way neural networks handle the underspecification of problems is highly dependent on the data representation.
Our results highlight that understanding the architectural inductive bias in deep learning is fundamental to address the fairness, robustness, and generalization of these systems.
arXiv Detail & Related papers (2021-04-29T14:31:09Z)
- Learning Connectivity of Neural Networks from a Topological Perspective [80.35103711638548]
We propose a topological perspective that represents a network as a complete graph for analysis.
By assigning learnable parameters to the edges which reflect the magnitude of connections, the learning process can be performed in a differentiable manner.
This learning process is compatible with existing networks and adapts to larger search spaces and different tasks (a minimal sketch of this edge-gating idea follows below).
arXiv Detail & Related papers (2020-08-19T04:53:31Z)
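A minimal PyTorch sketch of the edge-gating idea from "Learning Connectivity of Neural Networks from a Topological Perspective": a complete DAG over computation nodes where each edge carries a learnable logit, so connectivity is trained by ordinary gradient descent. The class name, node count, and dimensions are assumptions for illustration, not the paper's architecture.

```python
import torch
import torch.nn as nn

class LearnableDAG(nn.Module):
    """Complete DAG over computation nodes; each edge carries a learnable
    logit whose sigmoid scales the connection, making the topology
    differentiable and trainable end to end."""
    def __init__(self, n_nodes=4, dim=16):
        super().__init__()
        self.ops = nn.ModuleList(nn.Linear(dim, dim) for _ in range(n_nodes))
        # One logit per ordered pair of predecessor -> node.
        self.edge_logits = nn.Parameter(torch.zeros(n_nodes, n_nodes))

    def forward(self, x):
        states = [x]  # states[0] is the input; states[j+1] is node j's output
        for j, op in enumerate(self.ops):
            # Gate each incoming connection by the sigmoid of its logit.
            gates = torch.sigmoid(self.edge_logits[: j + 1, j])
            agg = sum(g * s for g, s in zip(gates, states))
            states.append(torch.relu(op(agg)))
        return states[-1]

net = LearnableDAG()
out = net(torch.randn(8, 16))
print(out.shape)  # torch.Size([8, 16])
```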
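For the Betti-number papers above, here is a sketch of the simplest such invariant: Betti-0 (the number of connected components) of an epsilon-neighborhood graph, computed with NumPy and SciPy. The threshold eps and the toy data are illustrative assumptions; the papers use richer topological machinery.

```python
import numpy as np
from scipy.sparse import csr_matrix
from scipy.sparse.csgraph import connected_components

def betti0(points, eps):
    """Betti-0 of the eps-neighborhood graph: the number of connected
    components, the crudest proxy for topological complexity."""
    dists = np.linalg.norm(points[:, None, :] - points[None, :, :], axis=-1)
    adjacency = csr_matrix(dists <= eps)
    n_components, _ = connected_components(adjacency, directed=False)
    return n_components

rng = np.random.default_rng(2)
# Two tight clusters: two components at a small scale, one at a large scale.
pts = np.vstack([rng.normal(0.0, 0.1, (50, 2)),
                 rng.normal(5.0, 0.1, (50, 2))])
print(betti0(pts, eps=0.5))   # expected: 2
print(betti0(pts, eps=10.0))  # expected: 1
```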
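The quasi-orthogonality paper rests on the fact that independent random high-dimensional vectors are nearly orthogonal. A quick NumPy check of that phenomenon (the sample size and dimensions are arbitrary choices, and this is not the paper's exact measure):

```python
import numpy as np

rng = np.random.default_rng(3)

def max_pairwise_cosine(dim, n=100):
    """Largest |cosine similarity| among n random unit vectors;
    small values indicate a quasi-orthogonal set."""
    V = rng.normal(size=(n, dim))
    V /= np.linalg.norm(V, axis=1, keepdims=True)
    G = np.abs(V @ V.T)
    np.fill_diagonal(G, 0.0)
    return G.max()

# Quasi-orthogonality improves as the dimension grows.
for dim in (10, 100, 1000, 10000):
    print(f"dim={dim:>5}: max |cos| = {max_pairwise_cosine(dim):.3f}")
```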
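Finally, a sketch of the phenomenon behind "The Separation Capacity of Random Neural Networks": a random ReLU layer with standard Gaussian weights and uniform biases can render linearly inseparable data separable. The concentric-circles data, the width, and the perceptron-based separability test are all illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(1)

# Two concentric circles: not linearly separable in the input space.
n = 100
theta = rng.uniform(0.0, 2.0 * np.pi, n)
radius = np.where(np.arange(n) < n // 2, 1.0, 3.0)
X = np.c_[radius * np.cos(theta), radius * np.sin(theta)]
y = np.where(np.arange(n) < n // 2, -1.0, 1.0)

# Random ReLU layer: standard Gaussian weights, uniform biases.
width = 300
W = rng.normal(size=(2, width))
b = rng.uniform(-3.0, 3.0, size=width)
H = np.maximum(X @ W + b, 0.0)

def looks_separable(features, labels, max_epochs=500):
    """Perceptron test: converges iff the data is linearly separable
    (a False here is only heuristic, since the epochs are capped)."""
    F = np.c_[features, np.ones(len(features))]  # append a bias feature
    w = np.zeros(F.shape[1])
    for _ in range(max_epochs):
        mistakes = 0
        for f, t in zip(F, labels):
            if t * (f @ w) <= 0.0:
                w += t * f
                mistakes += 1
        if mistakes == 0:
            return True
    return False

print("separable in input space:   ", looks_separable(X, y))
print("separable after random ReLU:", looks_separable(H, y))
```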
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.