Gradient Kernel Regression
- URL: http://arxiv.org/abs/2104.05874v1
- Date: Tue, 13 Apr 2021 00:32:34 GMT
- Title: Gradient Kernel Regression
- Authors: Matt Calder
- Abstract summary: The neural tangent kernel is used to perform kernel regression.
The surprising thing is that the accuracy of that regression is independent of the accuracy of the underlying network.
- Score: 1.2250157368151429
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In this article a surprising result is demonstrated using the neural tangent kernel. This kernel is defined as the inner product of the gradients of an underlying model, taken with respect to its parameters and evaluated at pairs of training points. This kernel is used to perform kernel regression. The surprising thing is that the accuracy of that regression is independent of the accuracy of the underlying network.
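As a concrete illustration of the construction the abstract describes, here is a minimal sketch of gradient kernel regression, assuming a small tanh MLP, synthetic one-dimensional data, and a small ridge term for numerical stability; the architecture, the helper names (`mlp`, `flat_grad`), and the ridge parameter `lam` are illustrative choices, not details taken from the paper.
```python
# Minimal sketch: build the gradient (NTK-style) kernel from per-example
# parameter gradients of a small MLP, then do kernel ridge regression with it.
import jax
import jax.numpy as jnp

def init_params(key, sizes=(1, 32, 32, 1)):
    keys = jax.random.split(key, len(sizes) - 1)
    return [(jax.random.normal(k, (m, n)) / jnp.sqrt(m), jnp.zeros(n))
            for k, m, n in zip(keys, sizes[:-1], sizes[1:])]

def mlp(params, x):
    h = x
    for W, b in params[:-1]:
        h = jnp.tanh(h @ W + b)
    W, b = params[-1]
    return (h @ W + b).squeeze()          # scalar model output

def flat_grad(params, x):
    # Gradient of the scalar output w.r.t. all parameters, flattened to a vector.
    g = jax.grad(mlp)(params, x)
    return jnp.concatenate([jnp.ravel(t) for t in jax.tree_util.tree_leaves(g)])

param_key, noise_key = jax.random.split(jax.random.PRNGKey(0))
params = init_params(param_key)

X = jnp.linspace(-3.0, 3.0, 40)[:, None]
y = jnp.sin(X[:, 0]) + 0.1 * jax.random.normal(noise_key, (40,))

G = jax.vmap(lambda x: flat_grad(params, x))(X)   # (n, n_params) gradient vectors
K = G @ G.T                                       # gradient kernel Gram matrix

lam = 1e-3                                        # small ridge term, illustrative
alpha = jnp.linalg.solve(K + lam * jnp.eye(K.shape[0]), y)

def predict(x_new):
    k = G @ flat_grad(params, x_new)              # kernel between x_new and training points
    return k @ alpha

print(predict(jnp.array([0.5])))
```
The abstract's claim is that the accuracy of this regression does not depend on how accurate the underlying network itself is, so `params` could equally be taken from a trained or an untrained model.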
Related papers
- On Kernel Regression with Data-Dependent Kernels [0.0]
We consider kernel regression in which the kernel may be updated after seeing the training data.
Connections to the view of deep neural networks as data-dependent kernel learners are discussed.
arXiv Detail & Related papers (2022-09-04T20:46:01Z) - An Empirical Analysis of the Laplace and Neural Tangent Kernels [0.0]
The neural tangent kernel is a kernel function defined over the parameter distribution of an infinite width neural network.
We show that the Laplace kernel and neural tangent kernel share the same reproducing kernel Hilbert space on the sphere $\mathbb{S}^{d-1}$.
arXiv Detail & Related papers (2022-08-07T16:18:02Z) - Neural Fields as Learnable Kernels for 3D Reconstruction [101.54431372685018]
We present a novel method for reconstructing implicit 3D shapes based on a learned kernel ridge regression.
Our technique achieves state-of-the-art results when reconstructing 3D objects and large scenes from sparse oriented points.
arXiv Detail & Related papers (2021-11-26T18:59:04Z) - Neural Networks as Kernel Learners: The Silent Alignment Effect [86.44610122423994]
Neural networks in the lazy training regime converge to kernel machines.
We show that this can indeed happen due to a phenomenon we term silent alignment.
We also demonstrate that non-whitened data can weaken the silent alignment effect.
arXiv Detail & Related papers (2021-10-29T18:22:46Z) - Kernel Mean Estimation by Marginalized Corrupted Distributions [96.9272743070371]
Estimating the kernel mean in a kernel Hilbert space is a critical component in many kernel learning algorithms.
We present a new kernel mean estimator, called the marginalized kernel mean estimator, which estimates the kernel mean under the corrupted distribution (a rough sketch of the general idea appears after this list).
arXiv Detail & Related papers (2021-07-10T15:11:28Z) - Kernel Identification Through Transformers [54.3795894579111]
Kernel selection plays a central role in determining the performance of Gaussian Process (GP) models.
This work addresses the challenge of constructing custom kernel functions for high-dimensional GP regression models.
We introduce a novel approach named KITT: Kernel Identification Through Transformers.
arXiv Detail & Related papers (2021-06-15T14:32:38Z) - Quantifying the Benefit of Using Differentiable Learning over Tangent Kernels [51.18614735359657]
We show that, under certain conditions, gradient descent achieves small error only if a related tangent kernel method achieves a non-trivial advantage over random guessing.
Without these conditions, gradient descent can in fact learn with small error even when no kernel method, in particular the tangent kernel, achieves a non-trivial advantage over random guessing.
arXiv Detail & Related papers (2021-03-01T18:54:13Z) - On the Similarity between the Laplace and Neural Tangent Kernels [26.371904197642145]
We show that NTK for fully connected networks is closely related to the standard Laplace kernel.
Our results suggest that much insight about neural networks can be obtained from analysis of the well-known Laplace kernel.
arXiv Detail & Related papers (2020-07-03T09:48:23Z) - Optimal Rates of Distributed Regression with Imperfect Kernels [0.0]
We study distributed kernel regression via the divide-and-conquer approach (a generic sketch appears after this list).
We show that kernel ridge regression can achieve rates faster than $N^{-1}$ in the noise-free setting.
arXiv Detail & Related papers (2020-06-30T13:00:16Z) - A Generalized Neural Tangent Kernel Analysis for Two-layer Neural Networks [87.23360438947114]
We show that noisy gradient descent with weight decay can still exhibit a "kernel-like" behavior.
This implies that the training loss converges linearly up to a certain accuracy.
We also establish a novel generalization error bound for two-layer neural networks trained by noisy gradient descent with weight decay.
arXiv Detail & Related papers (2020-02-10T18:56:15Z)
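For the kernel mean estimation entry above, the following is a rough sketch of the general idea: the standard empirical kernel mean embedding alongside a variant that averages kernel evaluations over sampled corrupted copies of the data. The Gaussian kernel, the Gaussian corruption model, and the Monte Carlo marginalization are assumptions made for illustration; the paper's actual marginalized estimator is not reproduced here.
```python
# Sketch: empirical kernel mean embedding at query points, plus a
# corruption-averaged variant; kernel and corruption model are illustrative.
import jax
import jax.numpy as jnp

def gauss_kernel(A, B, gamma=1.0):
    d2 = jnp.sum((A[:, None, :] - B[None, :, :]) ** 2, axis=-1)
    return jnp.exp(-gamma * d2)

def empirical_kernel_mean(X, queries):
    # mu_hat(q) = (1/n) * sum_i k(x_i, q)
    return jnp.mean(gauss_kernel(X, queries), axis=0)

def corrupted_kernel_mean(X, queries, key, noise_std=0.3, n_copies=50):
    # Average the empirical kernel mean over independently corrupted copies of X.
    def one_copy(k):
        Xc = X + noise_std * jax.random.normal(k, X.shape)
        return empirical_kernel_mean(Xc, queries)
    keys = jax.random.split(key, n_copies)
    return jnp.mean(jax.vmap(one_copy)(keys), axis=0)

key = jax.random.PRNGKey(0)
X = jax.random.normal(key, (200, 2))
queries = jnp.array([[0.0, 0.0], [1.0, 1.0]])

print(empirical_kernel_mean(X, queries))
print(corrupted_kernel_mean(X, queries, jax.random.PRNGKey(1)))
```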
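For the distributed regression entry above, here is a generic divide-and-conquer kernel ridge regression sketch: split the data into chunks, fit an independent local estimator on each chunk, and average the local predictions. The Gaussian kernel, the chunk count, and the ridge parameter are illustrative assumptions; the paper's specific estimators and rate analysis are not reproduced here.
```python
# Sketch: divide-and-conquer kernel ridge regression with a Gaussian kernel.
import jax
import jax.numpy as jnp

def gauss_kernel(A, B, gamma=1.0):
    d2 = jnp.sum((A[:, None, :] - B[None, :, :]) ** 2, axis=-1)
    return jnp.exp(-gamma * d2)

def krr_fit(X, y, lam=1e-2):
    K = gauss_kernel(X, X)
    alpha = jnp.linalg.solve(K + lam * jnp.eye(X.shape[0]), y)
    return X, alpha

def krr_predict(model, X_new):
    X, alpha = model
    return gauss_kernel(X_new, X) @ alpha

key = jax.random.PRNGKey(1)
X = jax.random.uniform(key, (600, 2), minval=-1.0, maxval=1.0)
y = jnp.sin(3.0 * X[:, 0]) * jnp.cos(2.0 * X[:, 1])

# Divide: fit an independent local KRR estimator on each chunk of the data.
chunks = 6
models = [krr_fit(Xc, yc) for Xc, yc in zip(jnp.split(X, chunks), jnp.split(y, chunks))]

# Conquer: average the local predictions to form the global estimator.
X_test = jax.random.uniform(jax.random.PRNGKey(2), (5, 2), minval=-1.0, maxval=1.0)
pred = jnp.mean(jnp.stack([krr_predict(m, X_test) for m in models]), axis=0)
print(pred)
```
Averaging local predictors keeps each per-chunk kernel matrix small, which is the main computational appeal of the divide-and-conquer approach.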
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of the information it presents and is not responsible for any consequences of its use.