An Empirical Analysis of the Laplace and Neural Tangent Kernels
- URL: http://arxiv.org/abs/2208.03761v1
- Date: Sun, 7 Aug 2022 16:18:02 GMT
- Title: An Empirical Analysis of the Laplace and Neural Tangent Kernels
- Authors: Ronaldas Paulius Lencevicius
- Abstract summary: The neural tangent kernel is a kernel function defined over the parameter distribution of an infinite width neural network.
The Laplace kernel and neural tangent kernel have been shown to share the same reproducing kernel Hilbert space on $\mathbb{S}^{d-1}$; this work analyzes the practical equivalence of the two kernels.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The neural tangent kernel is a kernel function defined over the parameter
distribution of an infinite width neural network. Despite the impracticality of
this limit, the neural tangent kernel has allowed for a more direct study of
neural networks and a gaze through the veil of their black box. More recently,
it has been shown theoretically that the Laplace kernel and neural tangent
kernel share the same reproducing kernel Hilbert space in the space of
$\mathbb{S}^{d-1}$, alluding to their equivalence. In this work, we analyze the
practical equivalence of the two kernels. We first do so by matching the
kernels exactly and then by matching posteriors of a Gaussian process.
Moreover, we analyze the kernels in $\mathbb{R}^d$ and experiment with them in
the task of regression.
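As a concrete illustration of the comparison described in the abstract, the following minimal sketch (not the paper's code) evaluates the closed-form NTK of an infinite-width two-layer ReLU network against a Laplace kernel on points drawn from the unit sphere and reports how strongly their off-diagonal entries correlate. The two-layer architecture, unit bandwidth, and normalization are illustrative assumptions; the paper itself matches the kernels exactly and through Gaussian-process posteriors.

```python
# Minimal sketch (illustrative assumptions, not the paper's experiments):
# compare the closed-form NTK of an infinite-width two-layer ReLU network
# with a Laplace kernel on unit-sphere inputs.
import numpy as np

def ntk_two_layer_relu(X):
    """NTK of an infinite-width two-layer ReLU network for rows of X on the
    unit sphere (standard arc-cosine closed form, up to normalization)."""
    u = np.clip(X @ X.T, -1.0, 1.0)                         # pairwise cosines
    theta = np.arccos(u)
    kappa0 = (np.pi - theta) / np.pi                         # arc-cosine kernel, order 0
    kappa1 = (np.sin(theta) + (np.pi - theta) * u) / np.pi   # arc-cosine kernel, order 1
    return u * kappa0 + kappa1

def laplace_kernel(X, bandwidth=1.0):
    """Laplace kernel exp(-||x - x'|| / bandwidth); the bandwidth here is an
    assumed value, not one fitted as in the paper."""
    sq = np.sum(X**2, axis=1)
    d2 = np.maximum(sq[:, None] + sq[None, :] - 2.0 * (X @ X.T), 0.0)
    return np.exp(-np.sqrt(d2) / bandwidth)

rng = np.random.default_rng(0)
X = rng.standard_normal((200, 10))
X /= np.linalg.norm(X, axis=1, keepdims=True)  # project inputs onto S^{d-1}

K_ntk = ntk_two_layer_relu(X)
K_lap = laplace_kernel(X)

# Correlation of off-diagonal entries as a crude measure of agreement.
mask = ~np.eye(len(X), dtype=bool)
corr = np.corrcoef(K_ntk[mask], K_lap[mask])[0, 1]
print(f"off-diagonal correlation between NTK and Laplace kernel: {corr:.3f}")
```

For exact matching or posterior matching as in the abstract, one would instead tune the Laplace bandwidth and scaling so that the two Gram matrices, or the resulting Gaussian-process posteriors, agree as closely as possible; the fixed bandwidth above is only for illustration.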
Related papers
- Novel Kernel Models and Exact Representor Theory for Neural Networks Beyond the Over-Parameterized Regime [52.00917519626559]
This paper presents two models of neural networks and their training, applicable to neural networks of arbitrary width, depth, and topology.
We also present an exact novel representor theory for layer-wise neural network training with unregularized gradient descent in terms of a local-extrinsic neural kernel (LeNK).
This representor theory gives insight into the role of higher-order statistics in neural network training and the effect of kernel evolution in neural-network kernel models.
arXiv Detail & Related papers (2024-05-24T06:30:36Z) - A Unified Kernel for Neural Network Learning [4.0759204898334715]
We present the Unified Neural Kernel (UNK), which characterizes the learning dynamics of neural networks with gradient descent.
UNK maintains the limiting properties of both NNGP and NTK, exhibiting behaviors akin to NTK with a finite learning step.
We also theoretically characterize the uniform tightness and learning convergence of the UNK kernel.
arXiv Detail & Related papers (2024-03-26T07:55:45Z) - An Exact Kernel Equivalence for Finite Classification Models [1.4777718769290527]
We compare our exact representation to the well-known Neural Tangent Kernel (NTK) and discuss approximation error relative to the NTK.
We use this exact kernel to show that our theoretical contribution can provide useful insights into the predictions made by neural networks.
arXiv Detail & Related papers (2023-08-01T20:22:53Z) - On the Eigenvalue Decay Rates of a Class of Neural-Network Related
Kernel Functions Defined on General Domains [10.360517127652185]
We provide a strategy to determine the eigenvalue decay rate (EDR) of a large class of kernel functions defined on a general domain.
This class of kernel functions includes, but is not limited to, the neural tangent kernels associated with neural networks of different depths and various activation functions.
arXiv Detail & Related papers (2023-05-04T08:54:40Z) - Gradient Descent in Neural Networks as Sequential Learning in RKBS [63.011641517977644]
We construct an exact power-series representation of the neural network in a finite neighborhood of the initial weights.
We prove that, regardless of width, the training sequence produced by gradient descent can be exactly replicated by regularized sequential learning.
arXiv Detail & Related papers (2023-02-01T03:18:07Z) - On the Neural Tangent Kernel Analysis of Randomly Pruned Neural Networks [91.3755431537592]
We study how random pruning of the weights affects a neural network's neural tangent kernel (NTK).
In particular, this work establishes an equivalence between the NTKs of a fully-connected neural network and its randomly pruned version.
arXiv Detail & Related papers (2022-03-27T15:22:19Z) - Neural Networks as Kernel Learners: The Silent Alignment Effect [86.44610122423994]
Neural networks in the lazy training regime converge to kernel machines.
We show that this can indeed happen due to a phenomenon we term silent alignment.
We also demonstrate that non-whitened data can weaken the silent alignment effect.
arXiv Detail & Related papers (2021-10-29T18:22:46Z) - Nuclei with up to $\boldsymbol{A=6}$ nucleons with artificial neural
network wave functions [52.77024349608834]
We use artificial neural networks to compactly represent the wave functions of nuclei.
We benchmark their binding energies, point-nucleon densities, and radii with the highly accurate hyperspherical harmonics method.
arXiv Detail & Related papers (2021-08-15T23:02:39Z) - Deep Neural Tangent Kernel and Laplace Kernel Have the Same RKHS [10.578438886506076]
We prove that the exponential power kernel with a smaller power (making the kernel less smooth) leads to a larger RKHS.
We also prove that the reproducing kernel Hilbert spaces (RKHS) of a deep neural tangent kernel and the Laplace kernel include the same set of functions.
arXiv Detail & Related papers (2020-09-22T16:58:26Z) - On the Similarity between the Laplace and Neural Tangent Kernels [26.371904197642145]
We show that the NTK for fully connected networks is closely related to the standard Laplace kernel.
Our results suggest that much insight about neural networks can be obtained from analysis of the well-known Laplace kernel.
arXiv Detail & Related papers (2020-07-03T09:48:23Z) - A Generalized Neural Tangent Kernel Analysis for Two-layer Neural
Networks [87.23360438947114]
We show that noisy gradient descent with weight decay can still exhibit a "kernel-like" behavior.
This implies that the training loss converges linearly up to a certain accuracy.
We also establish a novel generalization error bound for two-layer neural networks trained by noisy gradient descent with weight decay.
arXiv Detail & Related papers (2020-02-10T18:56:15Z)
This list is automatically generated from the titles and abstracts of the papers in this site.