On the Eigenvalue Decay Rates of a Class of Neural-Network Related
Kernel Functions Defined on General Domains
- URL: http://arxiv.org/abs/2305.02657v4
- Date: Mon, 8 Jan 2024 11:23:36 GMT
- Title: On the Eigenvalue Decay Rates of a Class of Neural-Network Related
Kernel Functions Defined on General Domains
- Authors: Yicheng Li, Zixiong Yu, Guhan Chen, Qian Lin
- Abstract summary: We provide a strategy to determine the eigenvalue decay rate (EDR) of a large class of kernel functions defined on a general domain.
This class of kernel functions includes, but is not limited to, the neural tangent kernels associated with neural networks of different depths and various activation functions.
- Score: 10.360517127652185
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: In this paper, we provide a strategy to determine the eigenvalue decay rate
(EDR) of a large class of kernel functions defined on a general domain rather
than $\mathbb S^{d}$. This class of kernel functions includes, but is not
limited to, the neural tangent kernel associated with neural networks of
different depths and various activation functions. After proving that the
dynamics of training a wide neural network uniformly approximates that of
neural tangent kernel regression on general domains, we further
illustrate the minimax optimality of the wide neural network, provided that the
ground truth function $f\in [\mathcal H_{\mathrm{NTK}}]^{s}$, an
interpolation space associated with the RKHS $\mathcal{H}_{\mathrm{NTK}}$ of the
NTK. We also show that the overfitted neural network cannot generalize well.
We believe our approach for determining the EDR of kernels might also be of
independent interest.
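
The paper's strategy is analytic, but its conclusion lends itself to a quick empirical check: sample points from a domain, eigendecompose the scaled kernel Gram matrix, and read off the decay exponent from a log-log fit. The sketch below is a minimal illustration under assumptions of ours, not the paper's method: it uses the standard arc-cosine closed form of the one-hidden-layer ReLU NTK, a cube domain, and an arbitrary mid-spectrum index range for the fit.

```python
# Minimal sketch (our illustration, not the paper's method): estimate the
# EDR of the one-hidden-layer ReLU NTK empirically on a cube, a domain
# other than the sphere, by fitting a power law lambda_i ~ i^{-beta}.
import numpy as np

def relu_ntk(X, Y):
    """Arc-cosine closed form of the one-hidden-layer ReLU NTK (no bias)."""
    nx = np.linalg.norm(X, axis=1, keepdims=True)
    ny = np.linalg.norm(Y, axis=1, keepdims=True)
    u = np.clip((X @ Y.T) / (nx * ny.T), -1.0, 1.0)   # cosine of the angle
    theta = np.arccos(u)
    k0 = (np.pi - theta) / (2 * np.pi)                # E[relu'(w.x) relu'(w.y)]
    k1 = (nx * ny.T) * (np.sin(theta) + (np.pi - theta) * u) / (2 * np.pi)
    return (X @ Y.T) * k0 + k1                        # gradient part + NNGP part

rng = np.random.default_rng(0)
n, d = 2000, 3
X = rng.uniform(-1.0, 1.0, size=(n, d))               # a non-spherical domain

lam = np.linalg.eigvalsh(relu_ntk(X, X) / n)[::-1]    # operator eigenvalue estimates
idx = np.arange(20, 400)                              # mid-spectrum: avoid edge effects
beta = -np.polyfit(np.log(idx + 1.0), np.log(lam[idx]), 1)[0]
print(f"fitted EDR exponent: {beta:.2f} (spherical theory: (d+1)/d = {(d + 1) / d:.2f})")
```

On a cube in $\mathbb{R}^{3}$ the fitted exponent should sit near the spherical rate, which is the kind of domain-independence the paper establishes rigorously.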
Related papers
- Novel Kernel Models and Exact Representor Theory for Neural Networks Beyond the Over-Parameterized Regime [52.00917519626559]
This paper presents two models of neural networks and their training, applicable to networks of arbitrary width, depth, and topology.
We also present a novel, exact representor theory for layer-wise neural network training with unregularized gradient descent, in terms of a local-extrinsic neural kernel (LeNK).
This representor theory gives insight into the role of higher-order statistics in neural network training and the effect of kernel evolution in neural-network kernel models.
arXiv Detail & Related papers (2024-05-24T06:30:36Z)
- Gradient Descent in Neural Networks as Sequential Learning in RKBS [63.011641517977644]
We construct an exact power-series representation of the neural network in a finite neighborhood of the initial weights.
We prove that, regardless of width, the training sequence produced by gradient descent can be exactly replicated by regularized sequential learning.
arXiv Detail & Related papers (2023-02-01T03:18:07Z)
- Neural Networks with Sparse Activation Induced by Large Bias: Tighter Analysis with Bias-Generalized NTK [86.45209429863858]
We study training one-hidden-layer ReLU networks in the neural tangent kernel (NTK) regime.
We show that these neural networks possess a different limiting kernel, which we call the bias-generalized NTK.
We also study various properties of the neural networks with this new kernel.
arXiv Detail & Related papers (2023-01-01T02:11:39Z)
- Extrapolation and Spectral Bias of Neural Nets with Hadamard Product: a Polynomial Net Study [55.12108376616355]
Existing NTK studies have focused on typical neural network architectures, but are incomplete for neural networks with Hadamard products (NNs-Hp).
In this work, we derive the finite-width NTK formulation for a special class of NNs-Hp, i.e., polynomial neural networks.
We prove their equivalence to the kernel regression predictor with the associated NTK, which expands the application scope of NTK.
arXiv Detail & Related papers (2022-09-16T06:36:06Z)
- An Empirical Analysis of the Laplace and Neural Tangent Kernels [0.0]
The neural tangent kernel is a kernel function defined over the parameter distribution of an infinite-width neural network.
We show that the Laplace kernel and the neural tangent kernel share the same reproducing kernel Hilbert space on $\mathbb{S}^{d-1}$.
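
As a rough numerical companion to this result (not a proof), one can compare the empirical spectra of the two kernels on points sampled from $\mathbb{S}^{d-1}$; matching power-law decay is consistent with the two RKHSs coinciding. The sketch assumes the arc-cosine form of the one-hidden-layer ReLU NTK; the Laplace bandwidth and the fit range are arbitrary choices of ours.

```python
# Rough numerical companion (not a proof): compare empirical spectral decay
# of the Laplace kernel and the one-hidden-layer ReLU NTK on S^{d-1}.
import numpy as np

rng = np.random.default_rng(0)
n, d = 1500, 4
X = rng.standard_normal((n, d))
X /= np.linalg.norm(X, axis=1, keepdims=True)          # project onto S^{d-1}

u = np.clip(X @ X.T, -1.0, 1.0)
theta = np.arccos(u)
ntk = u * (np.pi - theta) / (2 * np.pi) + (np.sin(theta) + (np.pi - theta) * u) / (2 * np.pi)
lap = np.exp(-np.linalg.norm(X[:, None] - X[None], axis=-1))  # bandwidth 1.0

for name, K in [("NTK", ntk), ("Laplace", lap)]:
    lam = np.linalg.eigvalsh(K / n)[::-1]
    i = np.arange(20, 300)
    slope, _ = np.polyfit(np.log(i + 1.0), np.log(lam[i]), 1)
    print(f"{name:7s} fitted decay exponent: {-slope:.2f}")
```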
arXiv Detail & Related papers (2022-08-07T16:18:02Z)
- Uniform Generalization Bounds for Overparameterized Neural Networks [5.945320097465419]
We prove uniform generalization bounds for overparameterized neural networks in kernel regimes.
Our bounds capture the exact error rates depending on the differentiability of the activation functions.
We show the equivalence between the RKHS corresponding to the NT kernel and its counterpart corresponding to the Matérn family of kernels.
arXiv Detail & Related papers (2021-09-13T16:20:13Z)
- Random Features for the Neural Tangent Kernel [57.132634274795066]
We propose an efficient feature map construction for the Neural Tangent Kernel (NTK) of a fully-connected ReLU network.
We show that the dimension of the resulting features is much smaller than that of other baseline feature map constructions, while achieving comparable error bounds both in theory and in practice.
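
The cited paper's construction is based on sketching and is far more compact than naive sampling; the sketch below shows only the baseline Monte-Carlo random-feature map for the one-hidden-layer ReLU NTK against which such constructions are compared. The feature count m and the test setup are our own illustrative choices.

```python
# Baseline only: a naive Monte-Carlo random-feature map whose inner product
# approximates the one-hidden-layer ReLU NTK (not the paper's sketching map).
import numpy as np

def ntk_random_features(X, m, rng):
    """phi(x) stacks relu(w.x)/sqrt(m) and 1{w.x > 0} * x/sqrt(m) over m draws of w."""
    W = rng.standard_normal((m, X.shape[1]))
    Z = X @ W.T                                        # (n, m) pre-activations
    feat_fn = np.maximum(Z, 0.0) / np.sqrt(m)          # approximates E[relu relu]
    gate = (Z > 0).astype(X.dtype) / np.sqrt(m)        # relu'(w.x)
    feat_grad = (gate[:, :, None] * X[:, None, :]).reshape(len(X), -1)
    return np.concatenate([feat_fn, feat_grad], axis=1)

rng = np.random.default_rng(0)
X = rng.standard_normal((200, 5))
X /= np.linalg.norm(X, axis=1, keepdims=True)          # unit-norm inputs

u = np.clip(X @ X.T, -1.0, 1.0)
theta = np.arccos(u)
exact = u * (np.pi - theta) / (2 * np.pi) + (np.sin(theta) + (np.pi - theta) * u) / (2 * np.pi)

phi = ntk_random_features(X, m=4096, rng=rng)
print("max abs error of the feature-map approximation:", np.abs(phi @ phi.T - exact).max())
```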
arXiv Detail & Related papers (2021-04-03T09:08:12Z)
- Optimal Rates for Averaged Stochastic Gradient Descent under Neural Tangent Kernel Regime [50.510421854168065]
We show that averaged stochastic gradient descent can achieve the minimax optimal convergence rate.
We show that the target function specified by the NTK of a ReLU network can be learned at the optimal convergence rate.
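
Below is a minimal sketch of the estimator family being analyzed: single-pass SGD for kernel least squares with Polyak averaging of the iterates. We substitute a Gaussian kernel and a fixed step size for simplicity, whereas the paper's statement concerns the NTK of a ReLU network and tuned learning rates.

```python
# Minimal sketch: single-pass averaged SGD for kernel least squares.
# Gaussian kernel, fixed bandwidth h and step size gamma are our choices.
import numpy as np

rng = np.random.default_rng(0)
n, h, gamma = 1000, 0.1, 0.5
X = rng.uniform(-1.0, 1.0, size=(n, 1))
y = np.sin(3 * np.pi * X[:, 0]) + 0.1 * rng.standard_normal(n)
K = np.exp(-((X - X.T) ** 2) / (2 * h ** 2))           # Gaussian Gram matrix

alpha = np.zeros(n)                                    # current SGD iterate
alpha_bar = np.zeros(n)                                # running average of iterates
for t in range(n):                                     # single pass, one sample per step
    resid = K[t] @ alpha - y[t]                        # f_{t-1}(x_t) - y_t
    alpha[t] -= gamma * resid                          # f_t = f_{t-1} - gamma*resid*k(x_t, .)
    alpha_bar += (alpha - alpha_bar) / (t + 1)         # Polyak averaging

Xte = rng.uniform(-1.0, 1.0, size=(500, 1))
Kte = np.exp(-((Xte - X.T) ** 2) / (2 * h ** 2))
mse = np.mean((Kte @ alpha_bar - np.sin(3 * np.pi * Xte[:, 0])) ** 2)
print(f"test MSE of the averaged iterate: {mse:.4f}")
```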
arXiv Detail & Related papers (2020-06-22T14:31:37Z)
- Avoiding Kernel Fixed Points: Computing with ELU and GELU Infinite Networks [12.692279981822011]
We derive the covariance functions of multi-layer perceptrons with exponential linear units (ELU) and Gaussian error linear units (GELU).
We analyse the fixed-point dynamics of iterated kernels corresponding to a broad range of activation functions.
We find that unlike some previously studied neural network kernels, these new kernels exhibit non-trivial fixed-point dynamics.
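
These fixed-point dynamics can be probed numerically by pushing a correlation through the normalized layer map $\rho \mapsto \mathbb{E}[\phi(u)\phi(v)] / \mathbb{E}[\phi(u)^2]$, with $(u, v)$ bivariate standard normal at correlation $\rho$, and watching where the iterates settle. The sketch below uses Monte-Carlo estimation and the common tanh approximation of GELU; the depth and sample size are arbitrary simplifications of ours.

```python
# Numerical probe of the iterated-kernel fixed-point dynamics for GELU.
# Monte-Carlo estimate of the normalized correlation map, iterated in depth.
import numpy as np

def gelu(z):
    """Tanh approximation of GELU (Hendrycks & Gimpel)."""
    return 0.5 * z * (1.0 + np.tanh(np.sqrt(2.0 / np.pi) * (z + 0.044715 * z ** 3)))

def corr_map(rho, phi, n=1_000_000, seed=0):
    """One layer of the kernel recursion: input correlation -> output correlation."""
    rng = np.random.default_rng(seed)                  # fixed seed: a deterministic map
    u = rng.standard_normal(n)
    v = rho * u + np.sqrt(1.0 - rho ** 2) * rng.standard_normal(n)
    out = np.mean(phi(u) * phi(v)) / np.mean(phi(u) ** 2)
    return float(np.clip(out, -1.0, 1.0))              # guard against MC noise

rho = 0.3
for depth in range(15):                                # push through 15 layers
    rho = corr_map(rho, gelu)
print(f"correlation after 15 GELU layers: {rho:.4f}")
```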
arXiv Detail & Related papers (2020-02-20T01:25:39Z)