Perceptron Theory Can Predict the Accuracy of Neural Networks
- URL: http://arxiv.org/abs/2012.07881v2
- Date: Thu, 20 Jul 2023 16:38:57 GMT
- Title: Perceptron Theory Can Predict the Accuracy of Neural Networks
- Authors: Denis Kleyko, Antonello Rosato, E. Paxon Frady, Massimo Panella,
Friedrich T. Sommer
- Abstract summary: Multilayer neural networks set the current state of the art for many technical classification problems.
But these networks are still, essentially, black boxes when it comes to analyzing them and predicting their performance.
Here, we develop a statistical theory for the one-layer perceptron and show that it can predict the performance of a surprisingly large variety of neural networks.
- Score: 6.136302173351179
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Multilayer neural networks set the current state of the art for many
technical classification problems. But these networks are still, essentially,
black boxes when it comes to analyzing them and predicting their performance.
Here, we develop a statistical theory for the one-layer perceptron and show
that it can predict the performance of a surprisingly large variety of neural
networks with different architectures. A general theory of classification with
perceptrons is developed by generalizing an existing theory for analyzing
reservoir computing models and connectionist models for symbolic reasoning
known as vector symbolic architectures. Our statistical theory offers three
formulas leveraging the signal statistics with increasing detail. The formulas
are analytically intractable, but can be evaluated numerically. The description
level that captures maximum detail requires stochastic sampling methods.
Depending on the network model, the simpler formulas already yield high
prediction accuracy. The quality of the theory's predictions is assessed in
three experimental settings: a memorization task for echo state networks (ESNs)
from the reservoir computing literature, a collection of classification
datasets for shallow randomly connected networks, and the ImageNet dataset for
deep convolutional neural networks. We find that the second description level
of the perceptron theory can predict the performance of types of ESNs that
could not be described previously. The theory can predict deep multilayer
neural networks when applied to their output layer. While other methods for
predicting neural network performance commonly require training an estimator
model, the proposed theory requires only the first two moments of the
distribution of the postsynaptic sums in the output neurons. The perceptron
theory compares favorably to other methods that do not rely on training an
estimator model.
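The moment-based prediction described above lends itself to a small numerical illustration. Below is a minimal sketch (not the authors' code) of the simplest version of such a prediction: the postsynaptic sum of the correct ("hit") output neuron and the sums of the competing ("reject") neurons are modeled as independent Gaussians whose means and variances would be measured from the network, and the predicted accuracy is the probability that the hit neuron wins, estimated here by stochastic sampling. The function name, parameter names, and example values are illustrative assumptions, not taken from the paper.

```python
# Minimal sketch of a moment-based accuracy prediction for a classifier's
# output layer, assuming independent Gaussian postsynaptic sums.
import numpy as np

def predicted_accuracy(mu_hit, var_hit, mu_reject, var_reject, n_classes,
                       n_samples=100_000, seed=0):
    """Monte Carlo estimate of P(correct-class sum exceeds all other sums).

    mu_hit, var_hit       : mean/variance of the correct output neuron's postsynaptic sum
    mu_reject, var_reject : mean/variance of the remaining output neurons' sums
    n_classes             : number of output neurons
    """
    rng = np.random.default_rng(seed)
    # Sample the hit neuron and the (n_classes - 1) reject neurons per trial.
    hit = rng.normal(mu_hit, np.sqrt(var_hit), size=n_samples)
    reject = rng.normal(mu_reject, np.sqrt(var_reject),
                        size=(n_samples, n_classes - 1))
    # Accuracy = fraction of trials where the hit neuron has the largest sum.
    return float(np.mean(hit > reject.max(axis=1)))

# Hypothetical moments, e.g. measured from a trained network on held-out data.
print(predicted_accuracy(mu_hit=1.0, var_hit=0.25,
                         mu_reject=0.0, var_reject=0.25, n_classes=10))
```

In practice the moments would be estimated from the distribution of the output layer's postsynaptic sums on a validation set; the independence assumption corresponds to the simplest description level, while the paper's more detailed levels use richer signal statistics.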
Related papers
- A Spectral Theory of Neural Prediction and Alignment [8.65717258105897]
We use a recent theoretical framework that relates the generalization error from regression to the spectral properties of the model and the target.
We test a large number of deep neural networks that predict visual cortical activity and show that there are multiple types of geometries that result in low neural prediction error as measured via regression.
arXiv Detail & Related papers (2023-09-22T12:24:06Z)
- How neural networks learn to classify chaotic time series [77.34726150561087]
We study the inner workings of neural networks trained to classify regular-versus-chaotic time series.
We find that the relation between input periodicity and activation periodicity is key for the performance of LKCNN models.
arXiv Detail & Related papers (2023-06-04T08:53:27Z)
- Set-based Neural Network Encoding Without Weight Tying [91.37161634310819]
We propose a neural network weight encoding method for network property prediction.
Our approach is capable of encoding neural networks in a model zoo of mixed architecture.
We introduce two new tasks for neural network property prediction: cross-dataset and cross-architecture.
arXiv Detail & Related papers (2023-05-26T04:34:28Z)
- Theoretical analysis of deep neural networks for temporally dependent observations [1.6752182911522522]
We study theoretical properties of deep neural networks on modeling non-linear time series data.
Results are supported via various numerical simulation settings as well as an application to a macroeconomic data set.
arXiv Detail & Related papers (2022-10-20T18:56:37Z)
- Pyramidal Predictive Network: A Model for Visual-frame Prediction Based on Predictive Coding Theory [1.4610038284393165]
We propose a novel neural network model for the task of visual-frame prediction.
The model is composed of a series of recurrent and convolutional units forming the top-down and bottom-up streams.
It learns to predict future frames in a visual sequence, with ConvLSTMs on each layer of the network making local predictions from the top down.
arXiv Detail & Related papers (2022-08-15T06:28:34Z)
- Neural Capacitance: A New Perspective of Neural Network Selection via Edge Dynamics [85.31710759801705]
Current practice incurs expensive computational costs, since performance prediction requires model training.
We propose a novel framework for neural network selection by analyzing the governing dynamics over synaptic connections (edges) during training.
Our framework is built on the fact that back-propagation during neural network training is equivalent to the dynamical evolution of synaptic connections.
arXiv Detail & Related papers (2022-01-11T20:53:15Z)
- Why Lottery Ticket Wins? A Theoretical Perspective of Sample Complexity on Pruned Neural Networks [79.74580058178594]
We analyze the performance of training a pruned neural network by analyzing the geometric structure of the objective function.
We show that the convex region near a desirable model with guaranteed generalization enlarges as the neural network model is pruned.
arXiv Detail & Related papers (2021-10-12T01:11:07Z)
- Tensor-Train Networks for Learning Predictive Modeling of Multidimensional Data [0.0]
A promising strategy is based on tensor networks, which have been very successful in physical and chemical applications.
We show that the weights of a multidimensional regression model can be learned by means of tensor networks, with the aim of obtaining a powerful and compact representation.
An algorithm based on alternating least squares has been proposed for approximating the weights in TT-format with a reduction of computational power.
arXiv Detail & Related papers (2021-01-22T16:14:38Z)
- The Neural Coding Framework for Learning Generative Models [91.0357317238509]
We propose a novel neural generative model inspired by the theory of predictive processing in the brain.
Like neurons in the brain, the artificial neurons in our generative model predict what neighboring neurons will do, and adjust their parameters based on how well the predictions match reality.
arXiv Detail & Related papers (2020-12-07T01:20:38Z)
- How Neural Networks Extrapolate: From Feedforward to Graph Neural Networks [80.55378250013496]
We study how neural networks trained by gradient descent extrapolate what they learn outside the support of the training distribution.
Graph Neural Networks (GNNs) have shown some success in more complex tasks.
arXiv Detail & Related papers (2020-09-24T17:48:59Z)
- The Gaussian equivalence of generative models for learning with shallow neural networks [30.47878306277163]
We study the performance of neural networks trained on data drawn from pre-trained generative models.
We provide three strands of rigorous, analytical and numerical evidence corroborating this equivalence.
These results open a viable path to the theoretical study of machine learning models with realistic data.
arXiv Detail & Related papers (2020-06-25T21:20:09Z)