On the Similarity between the Laplace and Neural Tangent Kernels
- URL: http://arxiv.org/abs/2007.01580v2
- Date: Sat, 14 Nov 2020 10:45:23 GMT
- Title: On the Similarity between the Laplace and Neural Tangent Kernels
- Authors: Amnon Geifman, Abhay Yadav, Yoni Kasten, Meirav Galun, David Jacobs,
Ronen Basri
- Abstract summary: We show that NTK for fully connected networks is closely related to the standard Laplace kernel.
Our results suggest that much insight about neural networks can be obtained from analysis of the well-known Laplace kernel.
- Score: 26.371904197642145
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Recent theoretical work has shown that massively overparameterized neural
networks are equivalent to kernel regressors that use Neural Tangent
Kernels (NTK). Experiments show that these kernel methods perform similarly to
real neural networks. Here we show that NTK for fully connected networks is
closely related to the standard Laplace kernel. We show theoretically that for
normalized data on the hypersphere both kernels have the same eigenfunctions
and their eigenvalues decay polynomially at the same rate, implying that their
Reproducing Kernel Hilbert Spaces (RKHS) include the same sets of functions.
This means that both kernels give rise to classes of functions with the same
smoothness properties. The two kernels differ for data off the hypersphere, but
experiments indicate that when data is properly normalized these differences
are not significant. Finally, we provide experiments on real data comparing NTK
and the Laplace kernel, along with a larger class of $\gamma$-exponential
kernels. We show that these perform almost identically. Our results suggest
that much insight about neural networks can be obtained from analysis of the
well-known Laplace kernel, which has a simple closed form.
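As a rough numerical illustration of this comparison, the sketch below computes the fully connected ReLU NTK via the standard arc-cosine recursion and a $\gamma$-exponential kernel (Laplace for $\gamma = 1$) on unit-normalized inputs, then correlates the two Gram matrices. The depth, the bandwidth c, and the rescaling of the NTK diagonal are illustrative choices, not the paper's exact experimental setup.

```python
import numpy as np

def kappa0(t):
    # Arc-cosine kernel of degree 0 (dual of the ReLU derivative).
    t = np.clip(t, -1.0, 1.0)
    return (np.pi - np.arccos(t)) / np.pi

def kappa1(t):
    # Arc-cosine kernel of degree 1 (dual of ReLU), normalized so kappa1(1) = 1.
    t = np.clip(t, -1.0, 1.0)
    return (t * (np.pi - np.arccos(t)) + np.sqrt(1.0 - t ** 2)) / np.pi

def ntk_fc_relu(X, depth=3):
    # NTK of a bias-free fully connected ReLU network of the given depth, for
    # unit-norm rows of X, via the standard layer-wise recursion.
    S = X @ X.T          # Sigma^(0): cosines of the angles between inputs
    theta = S.copy()     # Theta^(0)
    for _ in range(depth):
        theta = theta * kappa0(S) + kappa1(S)
        S = kappa1(S)
    return theta / (depth + 1)   # rescale so the diagonal equals 1

def gamma_exponential(X, c=1.0, gamma=1.0):
    # gamma-exponential kernel exp(-c * ||x - y||^gamma); gamma = 1 is the Laplace kernel.
    sq = np.maximum(np.sum(X ** 2, 1)[:, None] + np.sum(X ** 2, 1)[None, :] - 2 * X @ X.T, 0.0)
    return np.exp(-c * np.sqrt(sq) ** gamma)

rng = np.random.default_rng(0)
X = rng.standard_normal((300, 16))
X /= np.linalg.norm(X, axis=1, keepdims=True)    # normalize onto the hypersphere

K_ntk = ntk_fc_relu(X, depth=3)
K_lap = gamma_exponential(X, c=1.0, gamma=1.0)   # c = 1 is an arbitrary bandwidth

# Correlation of the off-diagonal Gram entries gives a rough sense of how
# close the two kernels are on normalized data.
off = ~np.eye(len(X), dtype=bool)
print(np.corrcoef(K_ntk[off], K_lap[off])[0, 1])
```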
Related papers
- Novel Kernel Models and Exact Representor Theory for Neural Networks Beyond the Over-Parameterized Regime [52.00917519626559]
This paper presents two models of neural networks and their training, applicable to neural networks of arbitrary width, depth and topology.
We also present a novel, exact representor theory for layer-wise neural network training with unregularized gradient descent in terms of a local-extrinsic neural kernel (LeNK).
This representor theory gives insight into the role of higher-order statistics in neural network training and the effect of kernel evolution in neural-network kernel models.
arXiv Detail & Related papers (2024-05-24T06:30:36Z) - An Exact Kernel Equivalence for Finite Classification Models [1.4777718769290527]
We compare our exact representation to the well-known Neural Tangent Kernel (NTK) and discuss approximation error relative to the NTK.
We use this exact kernel to show that our theoretical contribution can provide useful insights into the predictions made by neural networks.
arXiv Detail & Related papers (2023-08-01T20:22:53Z) - On the Eigenvalue Decay Rates of a Class of Neural-Network Related
Kernel Functions Defined on General Domains [10.360517127652185]
We provide a strategy to determine the eigenvalue decay rate (EDR) of a large class of kernel functions defined on a general domain.
This class of kernel functions includes, but is not limited to, the neural tangent kernels associated with neural networks of different depths and various activation functions.
arXiv Detail & Related papers (2023-05-04T08:54:40Z) - On Kernel Regression with Data-Dependent Kernels [0.0]
We consider kernel regression in which the kernel may be updated after seeing the training data.
Connections to the view of deep neural networks as data-dependent kernel learners are discussed.
arXiv Detail & Related papers (2022-09-04T20:46:01Z) - An Empirical Analysis of the Laplace and Neural Tangent Kernels [0.0]
The neural tangent kernel is a kernel function defined over the parameter distribution of an infinite width neural network.
We show that the Laplace kernel and the neural tangent kernel share the same reproducing kernel Hilbert space on the sphere $\mathbb{S}^{d-1}$.
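For context, the standard Mercer-type argument behind such shared-RKHS statements runs as follows (a sketch; $\lambda_k$ and $\mu_k$ denote the eigenvalues of the two zonal kernels on $\mathbb{S}^{d-1}$ and $Y_{k,j}$ the spherical harmonics):

```latex
% Mercer expansions of two zonal (rotation-invariant) kernels on the sphere S^{d-1}:
\[
k_{\mathrm{NTK}}(x,x') = \sum_{k \ge 0} \lambda_k \sum_{j=1}^{N(d,k)} Y_{k,j}(x)\, Y_{k,j}(x'),
\qquad
k_{\mathrm{Lap}}(x,x') = \sum_{k \ge 0} \mu_k \sum_{j=1}^{N(d,k)} Y_{k,j}(x)\, Y_{k,j}(x').
\]
% The RKHS norms weight the harmonic coefficients by the inverse eigenvalues:
\[
\|f\|_{\mathcal{H}_{\mathrm{NTK}}}^{2} = \sum_{k,j} \frac{\langle f, Y_{k,j}\rangle^{2}}{\lambda_k},
\qquad
\|f\|_{\mathcal{H}_{\mathrm{Lap}}}^{2} = \sum_{k,j} \frac{\langle f, Y_{k,j}\rangle^{2}}{\mu_k}.
\]
% If c_1 * mu_k <= lambda_k <= c_2 * mu_k for all k (eigenvalues decaying at the same
% polynomial rate), the two norms are equivalent, so the two RKHSs contain exactly the
% same set of functions.
```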
arXiv Detail & Related papers (2022-08-07T16:18:02Z) - Neural Fields as Learnable Kernels for 3D Reconstruction [101.54431372685018]
We present a novel method for reconstructing implicit 3D shapes based on a learned kernel ridge regression.
Our technique achieves state-of-the-art results when reconstructing 3D objects and large scenes from sparse oriented points.
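That entry, like the kernel-regressor view in the abstract above, ultimately plugs a Gram matrix (NTK, Laplace, or a learned kernel) into kernel ridge regression; a minimal generic sketch, with the ridge parameter chosen arbitrarily:

```python
import numpy as np

def kernel_ridge_fit_predict(K_train, y_train, K_test_train, ridge=1e-3):
    # Kernel ridge regression: solve (K + ridge * I) alpha = y,
    # then predict on test points with K_test_train @ alpha.
    n = K_train.shape[0]
    alpha = np.linalg.solve(K_train + ridge * np.eye(n), y_train)
    return K_test_train @ alpha
```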
arXiv Detail & Related papers (2021-11-26T18:59:04Z) - Neural Networks as Kernel Learners: The Silent Alignment Effect [86.44610122423994]
Neural networks in the lazy training regime converge to kernel machines.
We show that this can indeed happen due to a phenomenon we term silent alignment.
We also demonstrate that non-whitened data can weaken the silent alignment effect.
arXiv Detail & Related papers (2021-10-29T18:22:46Z) - Random Features for the Neural Tangent Kernel [57.132634274795066]
We propose an efficient feature map construction of the Neural Tangent Kernel (NTK) of a fully connected ReLU network.
We show that the dimension of the resulting features is much smaller than that of other baseline feature map constructions achieving comparable error bounds, both in theory and in practice.
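The feature-map construction of that paper is not reproduced here. As a point of comparison only, the classical random Fourier features recipe (Rahimi & Recht) applies directly to the Laplace kernel discussed above, since its spectral measure is a multivariate Cauchy distribution; a hedged sketch with an arbitrary feature count and bandwidth c:

```python
import numpy as np

def laplace_random_features(X, num_features=4096, c=1.0, seed=0):
    # Random Fourier features for the Euclidean Laplace kernel exp(-c * ||x - y||).
    # Its spectral measure (Bochner) is a multivariate Cauchy with scale c, sampled
    # here as c * z / |g| with z ~ N(0, I_d) and scalar g ~ N(0, 1).
    rng = np.random.default_rng(seed)
    d = X.shape[1]
    W = c * rng.standard_normal((num_features, d)) / np.abs(rng.standard_normal((num_features, 1)))
    b = rng.uniform(0.0, 2.0 * np.pi, num_features)
    return np.sqrt(2.0 / num_features) * np.cos(X @ W.T + b)

# Usage: Phi = laplace_random_features(X); Phi @ Phi.T approximates the exact
# Laplace Gram matrix, with error shrinking as num_features grows.
```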
arXiv Detail & Related papers (2021-04-03T09:08:12Z) - Isolation Distributional Kernel: A New Tool for Point & Group Anomaly
Detection [76.1522587605852]
Isolation Distributional Kernel (IDK) is a new way to measure the similarity between two distributions.
We demonstrate IDK's efficacy and efficiency as a new tool for kernel-based anomaly detection for both point and group anomalies.
arXiv Detail & Related papers (2020-09-24T12:25:43Z) - Neural Kernels Without Tangents [34.527798084824575]
We present an algebra for creating "compositional" kernels from bags of features.
We show that these operations correspond to many of the building blocks of neural tangent kernels (NTK).
arXiv Detail & Related papers (2020-03-04T18:25:41Z) - Learning Deep Kernels for Non-Parametric Two-Sample Tests [50.92621794426821]
We propose a class of kernel-based two-sample tests, which aim to determine whether two sets of samples are drawn from the same distribution.
Our tests are constructed from kernels parameterized by deep neural nets, trained to maximize test power.
arXiv Detail & Related papers (2020-02-21T03:54:23Z)
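The deep-kernel parameterization and test-power objective of that paper are not reproduced here; for reference, the generic unbiased MMD^2 statistic on which such kernel two-sample tests are built is sketched below (any kernel function, e.g. NTK or Laplace, can be plugged in):

```python
import numpy as np

def mmd2_unbiased(X, Y, kernel):
    # Unbiased estimate of MMD^2 between samples X ~ P and Y ~ Q under `kernel`,
    # where kernel(A, B) returns the cross Gram matrix between the rows of A and B.
    Kxx, Kyy, Kxy = kernel(X, X), kernel(Y, Y), kernel(X, Y)
    n, m = len(X), len(Y)
    np.fill_diagonal(Kxx, 0.0)   # drop diagonal terms for the unbiased estimator
    np.fill_diagonal(Kyy, 0.0)
    return Kxx.sum() / (n * (n - 1)) + Kyy.sum() / (m * (m - 1)) - 2.0 * Kxy.mean()
```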
This list is automatically generated from the titles and abstracts of the papers on this site.