On neural network kernels and the storage capacity problem
- URL: http://arxiv.org/abs/2201.04669v1
- Date: Wed, 12 Jan 2022 19:47:30 GMT
- Title: On neural network kernels and the storage capacity problem
- Authors: Jacob A. Zavatone-Veth and Cengiz Pehlevan
- Abstract summary: We reify the connection between work on the storage capacity problem in wide two-layer treelike neural networks and the rapidly-growing body of literature on kernel limits of wide neural networks.
- Score: 16.244541005112747
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: In this short note, we reify the connection between work on the storage
capacity problem in wide two-layer treelike neural networks and the
rapidly-growing body of literature on kernel limits of wide neural networks.
Concretely, we observe that the "effective order parameter" studied in the
statistical mechanics literature is exactly equivalent to the infinite-width
Neural Network Gaussian Process Kernel. This correspondence connects the
expressivity and trainability of wide two-layer neural networks.
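To make the claimed equivalence concrete: for a two-layer network with i.i.d. Gaussian first-layer weights and activation sigma, the infinite-width NNGP kernel is K(x, x') = E_{w ~ N(0, I_d/d)}[sigma(w . x) sigma(w . x')], and the note identifies this object with the effective order parameter of the statistical-mechanics treatment. The Python sketch below is purely illustrative (the tanh activation, the 1/d weight variance, and the Monte Carlo estimator are assumptions of this sketch, not choices taken from the paper); it estimates the kernel by averaging over a large but finite hidden layer.

import numpy as np

def nngp_kernel_mc(x1, x2, sigma=np.tanh, n_hidden=100_000, seed=0):
    """Monte Carlo estimate of K(x1, x2) = E_w[sigma(w.x1) sigma(w.x2)]
    with w ~ N(0, I_d / d). Illustrative sketch only; conventions vary."""
    d = x1.shape[0]
    rng = np.random.default_rng(seed)
    W = rng.normal(0.0, 1.0 / np.sqrt(d), size=(n_hidden, d))  # i.i.d. first-layer weights
    return float(np.mean(sigma(W @ x1) * sigma(W @ x2)))       # average over hidden units

# Example usage: kernel between two unit-norm inputs.
rng = np.random.default_rng(1)
x1, x2 = rng.normal(size=10), rng.normal(size=10)
print(nngp_kernel_mc(x1 / np.linalg.norm(x1), x2 / np.linalg.norm(x2)))

As the hidden width grows, this empirical average concentrates on the deterministic kernel, which is the infinite-width limit the abstract refers to.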
Related papers
- Stochastic Gradient Descent for Two-layer Neural Networks [2.0349026069285423]
This paper presents a study on the convergence rates of the stochastic gradient descent (SGD) algorithm when applied to overparameterized two-layer neural networks.
Our approach combines the Neural Tangent Kernel (NTK) approximation with convergence analysis in the Reproducing Kernel Hilbert Space (RKHS) generated by the NTK (a minimal empirical-NTK sketch appears after this list).
Our research framework enables us to explore the intricate interplay between kernel methods and optimization processes, shedding light on the dynamics and convergence properties of neural networks.
arXiv Detail & Related papers (2024-07-10T13:58:57Z) - Novel Kernel Models and Exact Representor Theory for Neural Networks Beyond the Over-Parameterized Regime [52.00917519626559]
This paper presents two models of neural networks and their training applicable to neural networks of arbitrary width, depth and topology.
We also present an exact novel representor theory for layer-wise neural network training with unregularized gradient descent in terms of a local-extrinsic neural kernel (LeNK).
This representor theory gives insight into the role of higher-order statistics in neural network training and the effect of kernel evolution in neural-network kernel models.
arXiv Detail & Related papers (2024-05-24T06:30:36Z) - Graph Neural Networks for Learning Equivariant Representations of Neural Networks [55.04145324152541]
We propose to represent neural networks as computational graphs of parameters.
Our approach enables a single model to encode neural computational graphs with diverse architectures.
We showcase the effectiveness of our method on a wide range of tasks, including classification and editing of implicit neural representations.
arXiv Detail & Related papers (2024-03-18T18:01:01Z) - On the Equivalence between Implicit and Explicit Neural Networks: A High-dimensional Viewpoint [6.790383517643622]
Implicit neural networks have demonstrated remarkable success in various tasks.
There is a lack of theoretical analysis of the connections and differences between implicit and explicit networks.
arXiv Detail & Related papers (2023-08-31T03:28:43Z) - Addressing caveats of neural persistence with deep graph persistence [54.424983583720675]
We find that the variance of network weights and spatial concentration of large weights are the main factors that impact neural persistence.
We propose an extension of the filtration underlying neural persistence to the whole neural network instead of single layers.
This yields our deep graph persistence measure, which implicitly incorporates persistent paths through the network and alleviates variance-related issues.
arXiv Detail & Related papers (2023-07-20T13:34:11Z) - Gradient Descent in Neural Networks as Sequential Learning in RKBS [63.011641517977644]
We construct an exact power-series representation of the neural network in a finite neighborhood of the initial weights.
We prove that, regardless of width, the training sequence produced by gradient descent can be exactly replicated by regularized sequential learning.
arXiv Detail & Related papers (2023-02-01T03:18:07Z) - Exploring the Approximation Capabilities of Multiplicative Neural Networks for Smooth Functions [9.936974568429173]
We consider two classes of target functions: generalized bandlimited functions and Sobolev-Type balls.
Our results demonstrate that multiplicative neural networks can approximate these functions with significantly fewer layers and neurons.
These findings suggest that multiplicative gates can outperform standard feed-forward layers and have potential for improving neural network design.
arXiv Detail & Related papers (2023-01-11T17:57:33Z) - Zonotope Domains for Lagrangian Neural Network Verification [102.13346781220383]
We decompose the problem of verifying a deep neural network into the verification of many 2-layer neural networks.
Our technique yields bounds that improve upon both linear programming and Lagrangian-based verification techniques.
arXiv Detail & Related papers (2022-10-14T19:31:39Z) - Meta-Principled Family of Hyperparameter Scaling Strategies [9.89901717499058]
We calculate the scalings of dynamical observables -- network outputs, neural tangent kernels, and differentials of neural tangent kernels -- for wide and deep neural networks.
We observe that various infinite-width limits examined in the literature correspond to the distinct corners of the interconnected web.
arXiv Detail & Related papers (2022-10-10T18:00:01Z) - Spiking neural network for nonlinear regression [68.8204255655161]
Spiking neural networks carry the potential for a massive reduction in memory and energy consumption.
They introduce temporal and neuronal sparsity, which can be exploited by next-generation neuromorphic hardware.
A framework for regression using spiking neural networks is proposed.
arXiv Detail & Related papers (2022-10-06T13:04:45Z) - On the approximation of functions by tanh neural networks [0.0]
We derive bounds on the error, in high-order Sobolev norms, incurred in the approximation of Sobolev-regular functions.
We show that tanh neural networks with only two hidden layers suffice to approximate functions at comparable or better rates than much deeper ReLU neural networks.
arXiv Detail & Related papers (2021-04-18T19:30:45Z)
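Referring back to the "Stochastic Gradient Descent for Two-layer Neural Networks" entry above, the following is a minimal, hypothetical Python sketch of the empirical Neural Tangent Kernel Theta(x, x') = <grad_theta f(x), grad_theta f(x')> for a width-m two-layer tanh network with 1/sqrt(m) output scaling. It is not code from that paper; the architecture and scaling are assumptions chosen only to illustrate the NTK construction its analysis builds on.

import numpy as np

def two_layer(params, x):
    """f(x) = a . tanh(W x) / sqrt(m): a two-layer tanh network of width m."""
    W, a = params
    return a @ np.tanh(W @ x) / np.sqrt(W.shape[0])

def empirical_ntk(params, x1, x2):
    """Empirical NTK: inner product of parameter gradients at x1 and x2.
    Gradients of f are written out by hand:
      df/da = tanh(W x) / sqrt(m)
      df/dW = outer(a * (1 - tanh(W x)**2), x) / sqrt(m)
    """
    W, a = params
    m = W.shape[0]
    def grads(x):
        h = np.tanh(W @ x)
        return np.concatenate([np.outer(a * (1.0 - h**2), x).ravel() / np.sqrt(m),
                               h / np.sqrt(m)])
    return float(grads(x1) @ grads(x2))

# Example usage with random parameters and inputs.
rng = np.random.default_rng(0)
m, d = 512, 10
params = (rng.normal(size=(m, d)) / np.sqrt(d), rng.normal(size=m))
x1, x2 = rng.normal(size=d), rng.normal(size=d)
print(two_layer(params, x1), empirical_ntk(params, x1, x2))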
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences of its use.