The training accuracy of two-layer neural networks: its estimation and
understanding using random datasets
- URL: http://arxiv.org/abs/2010.13380v2
- Date: Thu, 9 Nov 2023 05:16:03 GMT
- Title: The training accuracy of two-layer neural networks: its estimation and
understanding using random datasets
- Authors: Shuyue Guan, Murray Loew
- Abstract summary: We propose a novel theory based on space partitioning to estimate the approximate training accuracy for two-layer neural networks on random datasets without training.
Our method estimates the training accuracy for two-layer fully-connected neural networks on two-class random datasets using only three arguments.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Although the neural network (NN) technique plays an important role in machine
learning, understanding the mechanism of NN models and the transparency of deep
learning still require more basic research. In this study, we propose a novel
theory based on space partitioning to estimate the approximate training
accuracy for two-layer neural networks on random datasets without training.
There appear to be no other studies that have proposed a method to estimate
training accuracy without using input data and/or trained models. Our method
estimates the training accuracy for two-layer fully-connected neural networks
on two-class random datasets using only three arguments: the dimensionality of
inputs (d), the number of inputs (N), and the number of neurons in the hidden
layer (L). We have verified our method using real training accuracies in our
experiments. The results indicate that the method will work for any dimension,
and the proposed theory could also be extended to estimate deeper NN models. The
main purpose of this paper is to understand the mechanism of NN models by the
approach of estimating training accuracy but not to analyze their
generalization nor their performance in real-world applications. This study may
provide a starting point for a new way for researchers to make progress on the
difficult problem of understanding deep learning.
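As a rough illustration of the space-partitioning idea, the sketch below computes the classical upper bound on the number of regions into which L hyperplanes can divide d-dimensional space, sum_{i=0}^{d} C(L, i), and uses it in a simplified Monte Carlo estimate of training accuracy on a two-class random dataset with the three arguments d, N, and L. The simulation assumes the N points fall independently and uniformly into those regions and that the network predicts a single class per region; these are simplifying assumptions for illustration only and are not the exact estimator derived in the paper.
```python
import math
import random


def max_regions(L: int, d: int) -> int:
    """Classical upper bound on the number of regions into which L hyperplanes
    can partition d-dimensional space: sum_{i=0}^{d} C(L, i)."""
    return sum(math.comb(L, i) for i in range(min(L, d) + 1))


def toy_accuracy_estimate(d: int, N: int, L: int,
                          trials: int = 2000, seed: int = 0) -> float:
    """Simplified Monte Carlo illustration (NOT the paper's exact estimator).

    Assumptions: the N two-class points land independently and uniformly in the
    max_regions(L, d) cells, and the network outputs one class per cell, so the
    best achievable training accuracy is the per-cell majority fraction.
    """
    rng = random.Random(seed)
    R = max_regions(L, d)
    total = 0.0
    for _ in range(trials):
        cells = {}  # cell index -> [count of class 0, count of class 1]
        for _ in range(N):
            counts = cells.setdefault(rng.randrange(R), [0, 0])
            counts[rng.randrange(2)] += 1
        correct = sum(max(c) for c in cells.values())
        total += correct / N
    return total / trials


if __name__ == "__main__":
    d, N, L = 2, 100, 10
    print(f"max regions for L={L}, d={d}: {max_regions(L, d)}")  # 1 + 10 + 45 = 56
    print(f"toy accuracy estimate (d={d}, N={N}, L={L}): "
          f"{toy_accuracy_estimate(d, N, L):.3f}")
```
Under these toy assumptions the estimate depends only on (d, N, L), mirroring the three-argument form of the paper's method: more hidden neurons or lower-dimensional, smaller datasets push the estimate toward 1, while crowding many points into few regions pulls it toward chance.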
Related papers
- Fundamental limits of overparametrized shallow neural networks for supervised learning [11.136777922498355]
We study a two-layer neural network trained from input-output pairs generated by a teacher network with matching architecture.
Our results come in the form of bounds relating i) the mutual information between training data and network weights, or ii) the Bayes-optimal generalization error.
arXiv Detail & Related papers (2023-07-11T08:30:50Z)
- Reliable extrapolation of deep neural operators informed by physics or sparse observations [2.887258133992338]
Deep neural operators can learn nonlinear mappings between infinite-dimensional function spaces via deep neural networks.
DeepONets provide a new simulation paradigm in science and engineering.
We propose five reliable learning methods that guarantee a safe prediction under extrapolation.
arXiv Detail & Related papers (2022-12-13T03:02:46Z)
- Linear Leaky-Integrate-and-Fire Neuron Model Based Spiking Neural Networks and Its Mapping Relationship to Deep Neural Networks [7.840247953745616]
Spiking neural networks (SNNs) are brain-inspired machine learning algorithms with merits such as biological plausibility and unsupervised learning capability.
This paper establishes a precise mathematical mapping between the biological parameters of the Linear Leaky-Integrate-and-Fire (LIF) model/SNNs and the parameters of ReLU-AN/Deep Neural Networks (DNNs).
arXiv Detail & Related papers (2022-05-31T17:02:26Z)
- Network Gradient Descent Algorithm for Decentralized Federated Learning [0.2867517731896504]
We study a fully decentralized federated learning algorithm, which is a novel gradient descent algorithm executed over a communication-based network.
In the network gradient descent (NGD) method, only statistics (e.g., parameter estimates) need to be communicated, minimizing the risk of privacy leakage.
We find that both the learning rate and the network structure play significant roles in determining the NGD estimator's statistical efficiency.
arXiv Detail & Related papers (2022-05-06T02:53:31Z)
- Data-driven emergence of convolutional structure in neural networks [83.4920717252233]
We show how fully-connected neural networks solving a discrimination task can learn a convolutional structure directly from their inputs.
By carefully designing data models, we show that the emergence of this pattern is triggered by the non-Gaussian, higher-order local structure of the inputs.
arXiv Detail & Related papers (2022-02-01T17:11:13Z)
- Dynamic Neural Diversification: Path to Computationally Sustainable Neural Networks [68.8204255655161]
Small neural networks with a constrained number of trainable parameters can be suitable, resource-efficient candidates for many simple tasks.
We explore the diversity of the neurons within the hidden layer during the learning process.
We analyze how the diversity of the neurons affects predictions of the model.
arXiv Detail & Related papers (2021-09-20T15:12:16Z)
- FF-NSL: Feed-Forward Neural-Symbolic Learner [70.978007919101]
This paper introduces a neural-symbolic learning framework called Feed-Forward Neural-Symbolic Learner (FF-NSL).
FF-NSL integrates state-of-the-art ILP systems based on Answer Set semantics with neural networks to learn interpretable hypotheses from labelled unstructured data.
arXiv Detail & Related papers (2021-06-24T15:38:34Z)
- Learning Neural Network Subspaces [74.44457651546728]
Recent observations have advanced our understanding of the neural network optimization landscape.
With a similar computational cost as training one model, we learn lines, curves, and simplexes of high-accuracy neural networks.
arXiv Detail & Related papers (2021-02-20T23:26:58Z)
- How Neural Networks Extrapolate: From Feedforward to Graph Neural Networks [80.55378250013496]
We study how neural networks trained by gradient descent extrapolate what they learn outside the support of the training distribution.
Graph Neural Networks (GNNs) have shown some success in more complex tasks.
arXiv Detail & Related papers (2020-09-24T17:48:59Z)
- Rectified Linear Postsynaptic Potential Function for Backpropagation in Deep Spiking Neural Networks [55.0627904986664]
Spiking Neural Networks (SNNs) use temporal spike patterns to represent and transmit information, which is not only biologically realistic but also suitable for ultra-low-power, event-driven neuromorphic implementation.
This paper investigates the contribution of spike timing dynamics to information encoding, synaptic plasticity and decision making, providing a new perspective on the design of future deep SNNs and neuromorphic hardware systems.
arXiv Detail & Related papers (2020-03-26T11:13:07Z)