A Comprehensive Analysis on the Learning Curve in Kernel Ridge Regression
- URL: http://arxiv.org/abs/2410.17796v1
- Date: Wed, 23 Oct 2024 11:52:52 GMT
- Title: A Comprehensive Analysis on the Learning Curve in Kernel Ridge Regression
- Authors: Tin Sum Cheng, Aurelien Lucchi, Anastasis Kratsios, David Belius
- Abstract summary: This paper conducts a comprehensive study of the learning curves of kernel ridge regression (KRR) under minimal assumptions.
We analyze the role of key properties of the kernel, such as its spectral eigen-decay, the characteristics of the eigenfunctions, and the smoothness of the kernel.
- Score: 6.749750044497731
- Abstract: This paper conducts a comprehensive study of the learning curves of kernel ridge regression (KRR) under minimal assumptions. Our contributions are three-fold: 1) we analyze the role of key properties of the kernel, such as its spectral eigen-decay, the characteristics of the eigenfunctions, and the smoothness of the kernel; 2) we demonstrate the validity of the Gaussian Equivalent Property (GEP), which states that the generalization performance of KRR remains the same when the whitened features are replaced by standard Gaussian vectors, thereby shedding light on the success of previous analyses under the Gaussian Design Assumption; 3) we derive novel bounds that improve over existing bounds across a broad range of settings such as (in)dependent feature vectors and various combinations of eigen-decay rates in the over/underparameterized regimes.
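The GEP in contribution 2) invites a quick numerical sanity check. Below is a minimal sketch, not the authors' code: it fits feature-space ridge regression once with whitened non-Gaussian (Rademacher) features and once with standard Gaussian features sharing the same power-law spectrum, then compares the two test risks. The dimensions, eigen-decay rate, ridge parameter, and noise level are all illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
n, n_test, p, lam, sigma = 200, 2000, 400, 1e-3, 0.1
sqrt_eigs = np.arange(1, p + 1) ** -1.0      # eigen-decay lambda_k ~ k^{-2}
theta = rng.normal(size=p)                   # fixed target coefficients

def krr_test_risk(Z_train, Z_test):
    """Feature-space ridge regression, w = (Phi^T Phi + n*lam*I)^{-1} Phi^T y,
    scored by the excess test risk against the noiseless target."""
    Phi, Phi_test = Z_train * sqrt_eigs, Z_test * sqrt_eigs
    y = Phi @ theta + sigma * rng.normal(size=len(Phi))
    w = np.linalg.solve(Phi.T @ Phi + n * lam * np.eye(p), Phi.T @ y)
    return np.mean((Phi_test @ w - Phi_test @ theta) ** 2)

# Whitened non-Gaussian features (Rademacher signs: zero mean, unit variance)...
risk_rad = krr_test_risk(rng.choice([-1.0, 1.0], size=(n, p)),
                         rng.choice([-1.0, 1.0], size=(n_test, p)))
# ...versus standard Gaussian features with the same spectrum.
risk_gauss = krr_test_risk(rng.normal(size=(n, p)),
                           rng.normal(size=(n_test, p)))
print(f"risk, whitened non-Gaussian: {risk_rad:.4f}")
print(f"risk, standard Gaussian:     {risk_gauss:.4f}")
# Under the GEP, the two risks should be close.
```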
Related papers
- High-Dimensional Kernel Methods under Covariate Shift: Data-Dependent Implicit Regularization [83.06112052443233]
This paper studies kernel ridge regression in high dimensions under covariate shifts.
Via a bias-variance decomposition, we theoretically demonstrate that the re-weighting strategy reduces the variance.
For the bias, we analyze regularization at an arbitrary or a well-chosen scale, showing that the bias can behave very differently across regularization scales.
arXiv Detail & Related papers (2024-06-05T12:03:27Z) - Learning Analysis of Kernel Ridgeless Regression with Asymmetric Kernel Learning [33.34053480377887]
This paper enhances kernel ridgeless regression with Locally-Adaptive-Bandwidths (LAB) RBF kernels.
For the first time, we demonstrate that functions learned from LAB RBF kernels belong to an integral space of Reproducing Kernel Hilbert Spaces (RKHSs).
arXiv Detail & Related papers (2024-06-03T15:28:12Z) - A Duality Analysis of Kernel Ridge Regression in the Noiseless Regime [5.153104177051464]
We prove that KRR can attain the minimax optimal rate, which depends on both the eigenvalue decay of the associated kernel and the relative smoothness of target functions.
Our proof leverages a novel extension of the duality framework introduced by Chen et al. (2023), which could be useful in analyzing kernel-based methods beyond the scope of this work.
arXiv Detail & Related papers (2024-02-24T04:57:59Z) - Characterizing Overfitting in Kernel Ridgeless Regression Through the Eigenspectrum [6.749750044497731]
We prove the phenomena of tempered overfitting and catastrophic overfitting under the sub-Gaussian design assumption.
We also identify that the independence of the features plays an important role in guaranteeing tempered overfitting.
arXiv Detail & Related papers (2024-02-02T10:36:53Z) - Benign Overfitting in Deep Neural Networks under Lazy Training [72.28294823115502]
We show that when the data distribution is well-separated, DNNs can achieve Bayes-optimal test error for classification.
Our results indicate that interpolating with smoother functions leads to better generalization.
arXiv Detail & Related papers (2023-05-30T19:37:44Z) - Experimental Design for Linear Functionals in Reproducing Kernel Hilbert Spaces [102.08678737900541]
We provide algorithms for constructing bias-aware designs for linear functionals.
We derive non-asymptotic confidence sets for fixed and adaptive designs under sub-Gaussian noise.
arXiv Detail & Related papers (2022-05-26T20:56:25Z) - Ridgeless Regression with Random Features [23.41536146432726]
We investigate the statistical properties of ridgeless regression with random features and gradient descent.
We propose a tunable kernel algorithm that optimizes the spectral density of the kernel during training (a minimal ridgeless random-features sketch appears after this list).
arXiv Detail & Related papers (2022-05-01T14:25:08Z) - On the Benefits of Large Learning Rates for Kernel Methods [110.03020563291788]
We show that the benefit of large learning rates can be precisely characterized in the context of kernel methods.
We consider the minimization of a quadratic objective in a separable Hilbert space, and show that with early stopping, the choice of learning rate influences the spectral decomposition of the obtained solution.
arXiv Detail & Related papers (2022-02-28T13:01:04Z) - On the Double Descent of Random Features Models Trained with SGD [78.0918823643911]
We study the properties of random features (RF) regression in high dimensions optimized by stochastic gradient descent (SGD).
We derive precise non-asymptotic error bounds of RF regression under both constant and adaptive step-size SGD settings.
We observe the double descent phenomenon both theoretically and empirically.
arXiv Detail & Related papers (2021-10-13T17:47:39Z) - Kernel Methods for Causal Functions: Dose, Heterogeneous, and Incremental Response Curves [26.880628841819004]
We prove uniform consistency with improved finite sample rates via an original analysis of generalized kernel ridge regression.
We extend our main results to counterfactual distributions and to causal functions identified by front and back door criteria.
arXiv Detail & Related papers (2020-10-10T00:53:11Z) - On Learning Rates and Schrödinger Operators [105.32118775014015]
We present a general theoretical analysis of the effect of the learning rate.
We find that the learning rate tends to zero for a broad class of non-neural functions.
arXiv Detail & Related papers (2020-04-15T09:52:37Z)
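As referenced in the ridgeless random-features entry above, the minimum-norm (ridgeless) interpolant is easy to simulate, and the double descent phenomenon studied in the SGD-trained RF paper already appears in this simplest closed-form setting. Below is a minimal sketch under assumed toy data; the target function, dimensions, noise level, and random Fourier feature map are illustrative choices, not taken from the papers.

```python
import numpy as np

rng = np.random.default_rng(1)
n, n_test, d, sigma = 100, 2000, 5, 0.2
X, X_test = rng.normal(size=(n, d)), rng.normal(size=(n_test, d))
target = lambda X: np.sin(X @ np.ones(d))    # toy target function (an assumption)
y = target(X) + sigma * rng.normal(size=n)

def rf_features(X, W, b):
    """Random Fourier features: phi(x) = sqrt(2/p) * cos(W x + b)."""
    return np.sqrt(2.0 / len(b)) * np.cos(X @ W.T + b)

for p in (20, 50, 90, 100, 110, 200, 1000):
    W = rng.normal(size=(p, d))
    b = rng.uniform(0.0, 2.0 * np.pi, size=p)
    Phi, Phi_test = rf_features(X, W, b), rf_features(X_test, W, b)
    # Ridgeless fit: lstsq returns the minimum-norm least-squares solution,
    # which interpolates the training data once p >= n.
    w = np.linalg.lstsq(Phi, y, rcond=None)[0]
    mse = np.mean((Phi_test @ w - target(X_test)) ** 2)
    print(f"p = {p:4d}   test MSE = {mse:.3f}")
# The test error typically spikes near p = n (the interpolation threshold)
# and decreases again for large p: the double descent curve.
```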
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.