On the Spectral Bias of Convolutional Neural Tangent and Gaussian
Process Kernels
- URL: http://arxiv.org/abs/2203.09255v1
- Date: Thu, 17 Mar 2022 11:23:18 GMT
- Title: On the Spectral Bias of Convolutional Neural Tangent and Gaussian
Process Kernels
- Authors: Amnon Geifman, Meirav Galun, David Jacobs, Ronen Basri
- Abstract summary: We study the properties of various over-parametrized convolutional neural architectures through their respective Gaussian process and neural tangent kernels.
We show that the eigenvalues decay polynomially, quantify the rate of decay, and derive measures that reflect the composition of hierarchical features in these networks.
- Score: 24.99551134153653
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We study the properties of various over-parametrized convolutional neural
architectures through their respective Gaussian process and neural tangent
kernels. We prove that, with normalized multi-channel input and ReLU
activation, the eigenfunctions of these kernels with the uniform measure are
formed by products of spherical harmonics, defined over the channels of the
different pixels. We next use hierarchical factorizable kernels to bound their
respective eigenvalues. We show that the eigenvalues decay polynomially,
quantify the rate of decay, and derive measures that reflect the composition of
hierarchical features in these networks. Our results provide concrete
quantitative characterization of over-parameterized convolutional network
architectures.
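For intuition about the polynomial eigenvalue decay stated above, here is a minimal numerical sketch, not the paper's hierarchical convolutional construction: it estimates the spectrum of the order-1 arc-cosine kernel (the GP covariance of a single ReLU layer, up to scaling) on inputs sampled uniformly from the unit sphere. The sample size, input dimension, and use of NumPy are illustrative assumptions.

```python
# Minimal sketch (assumed setup, not the paper's convolutional kernel):
# estimate the eigenvalue decay of the order-1 arc-cosine kernel, i.e. the
# GP covariance of one ReLU layer up to scaling, for inputs uniform on S^{d-1}.
import numpy as np

rng = np.random.default_rng(0)
n, d = 2000, 3                            # samples and input dimension (illustrative)

# Inputs uniform on the unit sphere S^{d-1}.
X = rng.standard_normal((n, d))
X /= np.linalg.norm(X, axis=1, keepdims=True)

# Arc-cosine kernel of degree 1 for unit-norm inputs:
# k(x, y) = (sin(theta) + (pi - theta) * cos(theta)) / pi, theta = angle(x, y).
cos_theta = np.clip(X @ X.T, -1.0, 1.0)
theta = np.arccos(cos_theta)
K = (np.sin(theta) + (np.pi - theta) * cos_theta) / np.pi

# Eigenvalues of K / n approximate the kernel's eigenvalues under the uniform
# measure; a log-log plot of them shows a roughly polynomial tail.
eigvals = np.sort(np.linalg.eigvalsh(K / n))[::-1]
print(eigvals[:20])
```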
Related papers
- Spectral complexity of deep neural networks [2.099922236065961]
We use the angular power spectrum of the limiting field to characterize the complexity of the network architecture.
On this basis, we classify neural networks as low-disorder, sparse, or high-disorder.
We show how this classification highlights a number of distinct features for standard activation functions, and in particular, sparsity properties of ReLU networks.
arXiv Detail & Related papers (2024-05-15T17:55:05Z)
- Improving Expressive Power of Spectral Graph Neural Networks with Eigenvalue Correction [55.57072563835959]
Spectral graph neural networks are characterized by filters.
We propose an eigenvalue correction strategy that can free filters from the constraints of repeated eigenvalue inputs.
arXiv Detail & Related papers (2024-01-28T08:12:00Z)
- A theory of data variability in Neural Network Bayesian inference [0.70224924046445]
We provide a field-theoretic formalism which covers the generalization properties of infinitely wide networks.
We derive the generalization properties from the statistical properties of the input.
We show that data variability leads to a non-Gaussian action reminiscent of a ($\varphi^3+\varphi^4$)-theory.
arXiv Detail & Related papers (2023-07-31T14:11:32Z)
- Deterministic equivalent of the Conjugate Kernel matrix associated to Artificial Neural Networks [0.0]
We show that the empirical spectral distribution of the Conjugate Kernel converges to a deterministic limit.
More precisely we obtain a deterministic equivalent for its Stieltjes transform and its resolvent, with quantitative bounds involving both the dimension and the spectral parameter.
arXiv Detail & Related papers (2023-06-09T12:31:59Z)
- Spectral Complexity-scaled Generalization Bound of Complex-valued Neural Networks [78.64167379726163]
This paper is the first work that proves a generalization bound for the complex-valued neural network.
We conduct experiments by training complex-valued convolutional neural networks on different datasets.
arXiv Detail & Related papers (2021-12-07T03:25:25Z)
- A deep learning driven pseudospectral PCE based FFT homogenization algorithm for complex microstructures [68.8204255655161]
It is shown that the proposed method is able to predict central moments of interest while being orders of magnitude faster to evaluate than traditional approaches.
arXiv Detail & Related papers (2021-10-26T07:02:14Z)
- Convolutional Filtering and Neural Networks with Non Commutative Algebras [153.20329791008095]
We study the generalization of non commutative convolutional neural networks.
We show that non commutative convolutional architectures can be stable to deformations on the space of operators.
arXiv Detail & Related papers (2021-08-23T04:22:58Z)
- Multipole Graph Neural Operator for Parametric Partial Differential Equations [57.90284928158383]
One of the main challenges in using deep learning-based methods for simulating physical systems is formulating physics-based data.
We propose a novel multi-level graph neural network framework that captures interaction at all ranges with only linear complexity.
Experiments confirm our multi-graph network learns discretization-invariant solution operators to PDEs and can be evaluated in linear time.
arXiv Detail & Related papers (2020-06-16T21:56:22Z)
- Mehler's Formula, Branching Process, and Compositional Kernels of Deep Neural Networks [3.167685495996986]
We utilize a connection between compositional kernels and branching processes via Mehler's formula to study deep neural networks.
We study the unscaled and rescaled limits of the compositional kernels and explore the different phases of the limiting behavior.
Explicit formulas on the eigenvalues of the compositional kernel are provided, which quantify the complexity of the corresponding kernel Hilbert space.
arXiv Detail & Related papers (2020-04-09T18:46:13Z)
- Spectrum Dependent Learning Curves in Kernel Regression and Wide Neural Networks [17.188280334580195]
We derive analytical expressions for the generalization performance of kernel regression as a function of the number of training samples.
Our expressions apply to wide neural networks due to an equivalence between training them and kernel regression with the Neural Tangent Kernel (NTK).
We verify our theory with simulations on synthetic data and the MNIST dataset (see the learning-curve sketch after this list).
arXiv Detail & Related papers (2020-02-07T00:03:40Z)
- Understanding Graph Neural Networks with Generalized Geometric Scattering Transforms [67.88675386638043]
The scattering transform is a multilayered wavelet-based deep learning architecture that acts as a model of convolutional neural networks.
We introduce windowed and non-windowed geometric scattering transforms for graphs based upon a very general class of asymmetric wavelets.
We show that these asymmetric graph scattering transforms have many of the same theoretical guarantees as their symmetric counterparts.
arXiv Detail & Related papers (2019-11-14T17:23:06Z)
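As a rough illustration of the "Spectrum Dependent Learning Curves in Kernel Regression and Wide Neural Networks" entry above, the sketch below traces an empirical learning curve for kernel ridge regression, i.e. test error as a function of the number of training samples. The RBF kernel, synthetic target, noise level, and ridge parameter are arbitrary choices made for illustration, not quantities taken from that paper.

```python
# Empirical learning curve for kernel ridge regression on synthetic data
# (illustrative setup only; the paper derives analytical expressions instead).
import numpy as np

rng = np.random.default_rng(1)

def rbf_kernel(A, B, gamma=2.0):
    """Gaussian RBF kernel matrix between the rows of A and the rows of B."""
    sq = np.sum(A**2, axis=1)[:, None] + np.sum(B**2, axis=1)[None, :] - 2 * A @ B.T
    return np.exp(-gamma * sq)

def target(x):
    """Smooth synthetic target function on [-1, 1]^2."""
    return np.sin(3 * x[:, 0]) + 0.5 * np.cos(5 * x[:, 1])

X_test = rng.uniform(-1, 1, size=(500, 2))
y_test = target(X_test)

for n in [10, 20, 40, 80, 160, 320]:
    X = rng.uniform(-1, 1, size=(n, 2))
    y = target(X) + 0.1 * rng.standard_normal(n)                      # noisy labels
    alpha = np.linalg.solve(rbf_kernel(X, X) + 1e-3 * np.eye(n), y)   # ridge solve
    y_hat = rbf_kernel(X_test, X) @ alpha
    print(f"n = {n:4d}   test MSE = {np.mean((y_hat - y_test) ** 2):.4f}")
```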
This list is automatically generated from the titles and abstracts of the papers on this site.