Kernel interpolation generalizes poorly
- URL: http://arxiv.org/abs/2303.15809v2
- Date: Tue, 1 Aug 2023 11:53:27 GMT
- Title: Kernel interpolation generalizes poorly
- Authors: Yicheng Li, Haobo Zhang and Qian Lin
- Abstract summary: We show that for any $\varepsilon>0$, the generalization error of kernel interpolation is lower bounded by $\Omega(n^{-\varepsilon})$.
As a direct corollary, we can show that overfitted wide neural networks defined on the sphere generalize poorly.
- Score: 14.569829985753346
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: One of the most interesting problems in the recent renaissance of the studies
in kernel regression might be whether the kernel interpolation can generalize
well, since it may help us understand the `benign overfitting phenomenon'
reported in the literature on deep networks. In this paper, under mild
conditions, we show that for any $\varepsilon>0$, the generalization error of
kernel interpolation is lower bounded by $\Omega(n^{-\varepsilon})$. In other
words, the kernel interpolation generalizes poorly for a large class of
kernels. As a direct corollary, we can show that overfitted wide neural
networks defined on the sphere generalize poorly.
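To make the object of study concrete, the following is a minimal numerical sketch of kernel interpolation (ridgeless kernel regression): the interpolant solves $K\alpha = y$ exactly on the noisy training labels and predicts with $k(x, X)\alpha$. The Gaussian kernel, the target function, the noise level, the dimension, and the sample sizes are illustrative assumptions, not the construction used in the paper.

```python
import numpy as np

rng = np.random.default_rng(0)
d, noise = 3, 0.5                        # input dimension and label-noise level (assumptions)
f_star = lambda x: np.sin(3 * x[:, 0])   # illustrative target function on the sphere

def sphere(n):
    """n points drawn uniformly from the unit sphere S^{d-1}."""
    z = rng.standard_normal((n, d))
    return z / np.linalg.norm(z, axis=1, keepdims=True)

def gauss_kernel(A, B, h=1.0):
    """Gaussian kernel matrix k(a, b) = exp(-||a - b||^2 / (2 h^2))."""
    sq = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-sq / (2 * h ** 2))

def interpolant_test_error(n, n_test=2000):
    """Fit the (near-)interpolating kernel estimator to noisy labels and return its test MSE."""
    X, Xt = sphere(n), sphere(n_test)
    y = f_star(X) + noise * rng.standard_normal(n)
    K = gauss_kernel(X, X)
    alpha = np.linalg.solve(K + 1e-8 * np.eye(n), y)  # tiny jitter only for numerical conditioning
    pred = gauss_kernel(Xt, X) @ alpha
    return np.mean((pred - f_star(Xt)) ** 2)

for n in (100, 400, 1600):
    print(n, interpolant_test_error(n))
```

The paper's lower bound, $\Omega(n^{-\varepsilon})$ for every $\varepsilon>0$, means that such a test error cannot decay at any fixed polynomial rate for a large class of kernels; a small simulation like this only illustrates the estimator, not the asymptotics.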
Related papers
- The phase diagram of kernel interpolation in large dimensions [8.707305374058794]
The generalization ability of kernel interpolation in large dimensions might be one of the most interesting problems in the recent renaissance of kernel regression.
We fully characterized the exact order of both the variance and bias of large-dimensional kernel interpolation under various source conditions $s \geq 0$.
We determined the regions in the $(s,\gamma)$-plane where kernel interpolation is minimax optimal, sub-optimal, or inconsistent.
arXiv Detail & Related papers (2024-04-19T03:04:06Z) - Generalization in Kernel Regression Under Realistic Assumptions [41.345620270267446]
We provide rigorous bounds for common kernels and for any amount of regularization, noise, any input dimension, and any number of samples.
Our results imply benign overfitting in high input dimensions, nearly tempered overfitting in fixed dimensions, and explicit convergence rates for regularized regression (a minimal sketch contrasting interpolation with ridge regularization appears after this list).
As a by-product, we obtain time-dependent bounds for neural networks trained in the kernel regime.
arXiv Detail & Related papers (2023-12-26T10:55:20Z) - Neural Networks with Sparse Activation Induced by Large Bias: Tighter Analysis with Bias-Generalized NTK [86.45209429863858]
We study training one-hidden-layer ReLU networks in the neural tangent kernel (NTK) regime.
We show that the neural networks possess a different limiting kernel, which we call the bias-generalized NTK.
We also study various properties of the neural networks with this new kernel.
arXiv Detail & Related papers (2023-01-01T02:11:39Z) - An Empirical Analysis of the Laplace and Neural Tangent Kernels [0.0]
The neural tangent kernel is a kernel function defined over the parameter distribution of an infinite width neural network.
We show that the Laplace kernel and the neural tangent kernel share the same reproducing kernel Hilbert space on the sphere $\mathbb{S}^{d-1}$.
arXiv Detail & Related papers (2022-08-07T16:18:02Z) - Neural Networks as Kernel Learners: The Silent Alignment Effect [86.44610122423994]
Neural networks in the lazy training regime converge to kernel machines.
We show that this can indeed happen due to a phenomenon we term silent alignment.
We also demonstrate that non-whitened data can weaken the silent alignment effect.
arXiv Detail & Related papers (2021-10-29T18:22:46Z) - The Separation Capacity of Random Neural Networks [78.25060223808936]
We show that a sufficiently large two-layer ReLU-network with standard Gaussian weights and uniformly distributed biases can solve this problem with high probability.
We quantify the relevant structure of the data in terms of a novel notion of mutual complexity.
arXiv Detail & Related papers (2021-07-31T10:25:26Z) - Kernel Mean Estimation by Marginalized Corrupted Distributions [96.9272743070371]
Estimating the kernel mean in a reproducing kernel Hilbert space is a critical component in many kernel learning algorithms.
We present a new kernel mean estimator, called the marginalized kernel mean estimator, which estimates kernel mean under the corrupted distribution.
arXiv Detail & Related papers (2021-07-10T15:11:28Z) - Redundant representations help generalization in wide neural networks [71.38860635025907]
We study the last hidden layer representations of various state-of-the-art convolutional neural networks.
We find that if the last hidden representation is wide enough, its neurons tend to split into groups that carry identical information, and differ from each other only by statistically independent noise.
arXiv Detail & Related papers (2021-06-07T10:18:54Z) - How rotational invariance of common kernels prevents generalization in
high dimensions [8.508198765617196]
Kernel ridge regression is well-known to achieve minimax optimal rates in low-dimensional settings.
Recent work establishes consistency for kernel regression under certain assumptions on the ground truth function and the distribution of the input data.
arXiv Detail & Related papers (2021-04-09T08:27:37Z) - Kernelized Classification in Deep Networks [49.47339560731506]
We propose a kernelized classification layer for deep networks.
We advocate a nonlinear classification layer by using the kernel trick on the softmax cross-entropy loss function during training.
We show the usefulness of the proposed nonlinear classification layer on several datasets and tasks.
arXiv Detail & Related papers (2020-12-08T21:43:19Z) - Spectral Bias and Task-Model Alignment Explain Generalization in Kernel Regression and Infinitely Wide Neural Networks [17.188280334580195]
Generalization beyond a training dataset is a main goal of machine learning.
Recent observations in deep neural networks contradict conventional wisdom from classical statistics.
We show that more data may impair generalization when noisy or not expressible by the kernel.
arXiv Detail & Related papers (2020-06-23T17:53:11Z)
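As noted in the entry on kernel regression under realistic assumptions above, the natural contrast to interpolation is explicit regularization. A minimal sketch follows, assuming a Gaussian-kernel setup of my own choosing (the data model, bandwidth, and $\lambda$ grid are illustrative, not taken from any of the papers): kernel ridge regression solves $(K + \lambda I)\alpha = y$, and $\lambda \to 0$ recovers the interpolant.

```python
import numpy as np

rng = np.random.default_rng(1)

def gauss_kernel(A, B, h=1.0):
    """Gaussian kernel matrix k(a, b) = exp(-||a - b||^2 / (2 h^2))."""
    sq = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-sq / (2 * h ** 2))

def krr_predict(X, y, Xt, lam):
    """Kernel ridge regression; lam = 0 gives the (near-)interpolating fit."""
    alpha = np.linalg.solve(gauss_kernel(X, X) + (lam + 1e-8) * np.eye(len(y)), y)
    return gauss_kernel(Xt, X) @ alpha

def circle(n):
    """n points drawn uniformly from the unit circle in R^2 (illustrative input distribution)."""
    t = rng.uniform(0, 2 * np.pi, n)
    return np.stack([np.cos(t), np.sin(t)], axis=1)

f_star = lambda x: x[:, 0] ** 2          # illustrative target function
X, Xt = circle(300), circle(2000)
y = f_star(X) + 0.5 * rng.standard_normal(300)

for lam in (0.0, 1e-3, 1e-1):
    err = np.mean((krr_predict(X, y, Xt, lam) - f_star(Xt)) ** 2)
    print(f"lambda={lam:g}  test MSE={err:.4f}")
```

With noisy labels, a moderate $\lambda$ typically yields a lower test error than the near-interpolating fit at $\lambda = 0$, which is the regime the regularized-regression rates above refer to.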
This list is automatically generated from the titles and abstracts of the papers in this site.