On the Approximation of Kernel functions
- URL: http://arxiv.org/abs/2403.06731v1
- Date: Mon, 11 Mar 2024 13:50:07 GMT
- Title: On the Approximation of Kernel functions
- Authors: Paul Dommel and Alois Pichler
- Abstract summary: The paper addresses approximations of the kernel itself.
For the Gauss kernel on the unit cube, the paper establishes an upper bound on the associated eigenfunctions.
This improvement confirms low-rank approximation methods such as the Nyström method.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Various methods in statistical learning build on kernels considered in reproducing kernel Hilbert spaces. In applications, the kernel is often selected based on characteristics of the problem and the data. This kernel is then employed to infer response variables at points where no explanatory data were observed. The data considered here are located in compact sets in higher dimensions, and the paper addresses approximations of the kernel itself. The new approach considers Taylor series approximations of radial kernel functions. For the Gauss kernel on the unit cube, the paper establishes an upper bound on the associated eigenfunctions, which grows only polynomially with respect to the index. The novel approach substantiates smaller regularization parameters than those considered in the literature, overall leading to better approximations. This improvement confirms low-rank approximation methods such as the Nyström method.
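The Gauss kernel factorizes in one dimension as k(x, y) = exp(-x^2/(2s^2)) exp(-y^2/(2s^2)) exp(xy/s^2), so truncating the Taylor series of the last factor gives an explicit finite-rank approximation. The sketch below illustrates only this basic mechanism; the bandwidth s = 1, the truncation order m = 10, and the restriction to one dimension are illustrative assumptions, not the paper's construction or its eigenfunction bounds.

```python
# A minimal sketch, not the paper's construction: truncated Taylor-series
# features for the 1-D Gauss kernel k(x, y) = exp(-(x - y)^2 / (2 s^2)).
# Writing k(x, y) = exp(-x^2/(2 s^2)) exp(-y^2/(2 s^2)) exp(x y / s^2) and
# truncating the series of the last factor yields a rank-m feature map.
import numpy as np
from math import factorial

def taylor_features(x, s=1.0, m=10):
    """Map 1-D points to the first m Taylor features of the Gauss kernel."""
    x = np.asarray(x, dtype=float)
    ks = np.arange(m)
    coeff = 1.0 / (s**ks * np.sqrt([factorial(int(k)) for k in ks]))
    return np.exp(-x[:, None]**2 / (2 * s**2)) * x[:, None]**ks * coeff

rng = np.random.default_rng(0)
x = rng.uniform(-1.0, 1.0, size=50)           # points in a compact interval
K_exact = np.exp(-(x[:, None] - x[None, :])**2 / 2.0)
Phi = taylor_features(x, s=1.0, m=10)
K_approx = Phi @ Phi.T                        # explicit rank-10 approximation
print("max abs error:", np.abs(K_exact - K_approx).max())
```

Because the Taylor remainder of exp(xy/s^2) decays factorially on bounded domains, a modest truncation order already yields an accurate low-rank surrogate on the unit interval.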
Related papers
- Optimal Kernel Choice for Score Function-based Causal Discovery [92.65034439889872]
We propose a kernel selection method within the generalized score function that automatically selects the optimal kernel that best fits the data.
We conduct experiments on both synthetic data and real-world benchmarks, and the results demonstrate that our proposed method outperforms existing kernel selection methods.
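The generalized score function in this work is tailored to causal discovery; as a generic, hypothetical stand-in for score-based kernel selection, the sketch below scores candidate RBF bandwidths with the Gaussian-process log marginal likelihood and keeps the best one. The bandwidth grid, the noise level, and the toy data are assumptions made purely for illustration.

```python
# Hypothetical illustration (not the paper's generalized score function):
# select an RBF bandwidth by maximizing the GP log marginal likelihood.
import numpy as np

def log_marginal_likelihood(X, y, bandwidth, noise=0.1):
    sq = ((X[:, None, :] - X[None, :, :])**2).sum(-1)
    K = np.exp(-sq / (2 * bandwidth**2)) + noise**2 * np.eye(len(X))
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y))   # K^{-1} y via Cholesky
    return (-0.5 * y @ alpha
            - np.log(np.diag(L)).sum()
            - 0.5 * len(X) * np.log(2 * np.pi))

rng = np.random.default_rng(1)
X = rng.uniform(0, 1, size=(40, 1))
y = np.sin(6 * X[:, 0]) + 0.1 * rng.standard_normal(40)
grid = [0.05, 0.1, 0.2, 0.5, 1.0]                          # candidate bandwidths
best = max(grid, key=lambda b: log_marginal_likelihood(X, y, b))
print("selected bandwidth:", best)
```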
arXiv Detail & Related papers (2024-07-14T09:32:20Z)
- MEP: Multiple Kernel Learning Enhancing Relative Positional Encoding Length Extrapolation [5.298814565953444]
Existing relative position encoding methods address the length extrapolation challenge exclusively through a single kernel function.
This study proposes a novel relative positional encoding method, called MEP, which employs a weighted average to combine distinct kernel functions.
We present two distinct versions of our method: a parameter-free variant that requires no new learnable parameters, and a parameterized variant capable of integrating state-of-the-art techniques.
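As a rough, purely illustrative sketch of averaging kernel functions over relative distances, and not the MEP method itself, the snippet below mixes an exponential-decay kernel and a Gaussian kernel into a positional-bias matrix; the kernel choices, the fixed weights of the parameter-free variant, and the scale are assumptions.

```python
# Illustrative sketch: a weighted average of two kernel functions on
# relative distances |i - j|, usable as a positional bias. The specific
# kernels, weights, and scale are assumptions, not the MEP construction.
import numpy as np

def relative_position_bias(seq_len, weights=(0.5, 0.5), scale=8.0):
    d = np.abs(np.arange(seq_len)[:, None] - np.arange(seq_len)[None, :])
    k_exp = np.exp(-d / scale)            # exponential-decay kernel
    k_gauss = np.exp(-(d / scale)**2)     # Gaussian kernel
    return weights[0] * k_exp + weights[1] * k_gauss

# Parameter-free variant: fixed weights; a parameterized variant would learn them.
print(np.round(relative_position_bias(6), 3))
```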
arXiv Detail & Related papers (2024-03-26T13:38:06Z)
- Decentralized Riemannian Conjugate Gradient Method on the Stiefel Manifold [59.73080197971106]
This paper presents a first-order conjugate gradient method that converges faster than the steepest descent method.
It aims to achieve global convergence over the Stiefel manifold.
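For orientation, the sketch below shows the basic geometric ingredients only, a Riemannian gradient step with tangent-space projection and a QR retraction on a single machine, applied to a dominant-subspace problem; it is not the decentralized conjugate gradient method of the paper, and the step size and iteration count are arbitrary choices.

```python
# Minimal single-machine sketch: Riemannian gradient descent on the Stiefel
# manifold St(n, p) = {X : X^T X = I} for maximizing trace(X^T A X).
# This shows only the projection/retraction machinery, not the paper's
# decentralized conjugate gradient method.
import numpy as np

rng = np.random.default_rng(2)
n, p = 20, 3
A = rng.standard_normal((n, n)); A = A + A.T          # symmetric objective matrix
X = np.linalg.qr(rng.standard_normal((n, p)))[0]      # random point on St(n, p)

for _ in range(500):
    G = -2 * A @ X                                    # Euclidean gradient of -trace(X^T A X)
    sym = (X.T @ G + G.T @ X) / 2
    rgrad = G - X @ sym                               # project onto the tangent space
    X, _ = np.linalg.qr(X - 0.02 * rgrad)             # QR retraction back to the manifold

optimum = np.sort(np.linalg.eigvalsh(A))[-p:].sum()   # sum of the p largest eigenvalues
print("objective:", np.trace(X.T @ A @ X), "optimum:", optimum)
```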
arXiv Detail & Related papers (2023-08-21T08:02:16Z)
- Learning "best" kernels from data in Gaussian process regression. With application to aerodynamics [0.4588028371034406]
We introduce algorithms to select/design kernels in Gaussian process regression/kriging surrogate modeling techniques.
A first class of algorithms is kernel flow, which was introduced in the context of classification in machine learning.
A second class of algorithms is called spectral kernel ridge regression, and aims at selecting a "best" kernel such that the norm of the function to be approximated is minimal.
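Read literally, the norm criterion can be illustrated as follows; this is a loose, hypothetical reading, not the kernel-flow algorithm nor the exact spectral procedure of the paper: fit kernel ridge regression for each candidate bandwidth and compare the RKHS norm of the fitted function, ||f||_H^2 = alpha^T K alpha.

```python
# Loose illustration of the norm criterion: report the RKHS norm of the
# kernel ridge regression fit for several candidate bandwidths.
# Bandwidth grid, regularization, and data are assumptions.
import numpy as np

def krr_rkhs_norm(X, y, s, lam=1e-3):
    sq = ((X[:, None, :] - X[None, :, :])**2).sum(-1)
    K = np.exp(-sq / (2 * s**2))
    alpha = np.linalg.solve(K + lam * np.eye(len(X)), y)
    return alpha @ K @ alpha                          # ||f||_H^2 of the fitted function

rng = np.random.default_rng(7)
X = rng.uniform(0, 1, size=(60, 1))
y = np.sin(4 * X[:, 0]) + 0.05 * rng.standard_normal(60)
for s in (0.05, 0.1, 0.3, 1.0):
    print(f"bandwidth {s}: RKHS norm {krr_rkhs_norm(X, y, s):.3f}")
```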
arXiv Detail & Related papers (2022-06-03T07:50:54Z)
- Local Random Feature Approximations of the Gaussian Kernel [14.230653042112834]
We focus on the popular Gaussian kernel and on techniques to linearize kernel-based models by means of random feature approximations.
We show that such approaches yield poor results when modelling high-frequency data, and we propose a novel localization scheme that improves kernel approximations and downstream performance significantly.
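The baseline that such localized schemes build on is the standard random Fourier feature construction for the Gaussian kernel; the sketch below shows only this baseline, with an arbitrary bandwidth and feature count.

```python
# Standard random Fourier features for the Gaussian kernel
# k(x, y) = exp(-||x - y||^2 / (2 s^2)); the paper's localization scheme
# is not shown, only the baseline it improves on.
import numpy as np

def random_fourier_features(X, num_features=500, s=0.5, seed=0):
    rng = np.random.default_rng(seed)
    W = rng.standard_normal((X.shape[1], num_features)) / s   # frequencies ~ N(0, 1/s^2)
    b = rng.uniform(0, 2 * np.pi, num_features)
    return np.sqrt(2.0 / num_features) * np.cos(X @ W + b)

rng = np.random.default_rng(3)
X = rng.uniform(0, 1, size=(100, 2))
Z = random_fourier_features(X)
K_exact = np.exp(-((X[:, None, :] - X[None, :, :])**2).sum(-1) / (2 * 0.5**2))
print("mean abs error:", np.abs(K_exact - Z @ Z.T).mean())
```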
arXiv Detail & Related papers (2022-04-12T09:52:36Z)
- Kernel Mean Estimation by Marginalized Corrupted Distributions [96.9272743070371]
Estimating the kernel mean in a reproducing kernel Hilbert space is a critical component in many kernel learning algorithms.
We present a new kernel mean estimator, called the marginalized kernel mean estimator, which estimates the kernel mean under a corrupted distribution.
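The usual starting point is the plain empirical estimator mu_hat(q) = (1/n) sum_i k(x_i, q), which the marginalized estimator modifies; only that baseline is sketched below, with a Gaussian kernel and an arbitrary bandwidth.

```python
# Baseline empirical kernel mean embedding (not the marginalized
# corrupted-distribution estimator of the paper).
import numpy as np

def empirical_kernel_mean(X, query, s=0.5):
    """Evaluate mu_hat(q) = (1/n) * sum_i k(x_i, q) at the query points."""
    sq = ((X[:, None, :] - query[None, :, :])**2).sum(-1)
    return np.exp(-sq / (2 * s**2)).mean(axis=0)

rng = np.random.default_rng(4)
X = rng.standard_normal((200, 2))        # samples from the underlying distribution
query = rng.standard_normal((5, 2))
print(empirical_kernel_mean(X, query))
```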
arXiv Detail & Related papers (2021-07-10T15:11:28Z)
- Kernel Identification Through Transformers [54.3795894579111]
Kernel selection plays a central role in determining the performance of Gaussian Process (GP) models.
This work addresses the challenge of constructing custom kernel functions for high-dimensional GP regression models.
We introduce a novel approach named KITT: Kernel Identification Through Transformers.
arXiv Detail & Related papers (2021-06-15T14:32:38Z)
- Scalable Variational Gaussian Processes via Harmonic Kernel Decomposition [54.07797071198249]
We introduce a new scalable variational Gaussian process approximation which provides a high fidelity approximation while retaining general applicability.
We demonstrate that, on a range of regression and classification problems, our approach can exploit input space symmetries such as translations and reflections.
Notably, our approach achieves state-of-the-art results on CIFAR-10 among pure GP models.
arXiv Detail & Related papers (2021-06-10T18:17:57Z)
- Towards Unbiased Random Features with Lower Variance For Stationary Indefinite Kernels [26.57122949130266]
Our algorithm achieves lower variance and approximation error compared with existing kernel approximation methods.
With a better approximation to the originally selected kernels, improved classification accuracy and regression ability are obtained.
arXiv Detail & Related papers (2021-04-13T13:56:50Z)
- Improved guarantees and a multiple-descent curve for Column Subset Selection and the Nyström method [76.73096213472897]
We develop techniques which exploit spectral properties of the data matrix to obtain improved approximation guarantees.
Our approach leads to significantly better bounds for datasets with known rates of singular value decay.
We show that both our improved bounds and the multiple-descent curve can be observed on real datasets simply by varying the RBF parameter.
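A minimal Nyström approximation built from a uniformly sampled column subset looks as follows; the improved guarantees of the paper concern how the subset is chosen and how the error behaves, none of which this sketch implements. The number of landmarks and the RBF parameter are illustrative choices.

```python
# Plain uniform-sampling Nystrom approximation of an RBF kernel matrix:
# K ~ C W^+ C^T, where C holds m sampled columns and W the matching block.
import numpy as np

def rbf_kernel(X, Y, s=0.5):
    sq = ((X[:, None, :] - Y[None, :, :])**2).sum(-1)
    return np.exp(-sq / (2 * s**2))

rng = np.random.default_rng(5)
X = rng.uniform(0, 1, size=(300, 3))
K = rbf_kernel(X, X)

m = 30                                       # number of sampled columns (landmarks)
idx = rng.choice(len(X), size=m, replace=False)
C = K[:, idx]                                # sampled columns
W = K[np.ix_(idx, idx)]                      # corresponding square block
K_nystrom = C @ np.linalg.pinv(W) @ C.T      # rank-m approximation
print("relative error:", np.linalg.norm(K - K_nystrom) / np.linalg.norm(K))
```

Whether m = 30 landmarks suffice depends on the spectral decay of the kernel matrix, which is exactly where the choice of the RBF parameter enters.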
arXiv Detail & Related papers (2020-02-21T00:43:06Z)
- RFN: A Random-Feature Based Newton Method for Empirical Risk Minimization in Reproducing Kernel Hilbert Spaces [14.924672048447334]
Large-scale finite-sum problems can be solved using efficient variants of Newton's method, where the Hessian is approximated via sub-samples of the data.
In this paper, we observe that for this class of problems, one can naturally use kernel approximation to speed up the Newton method.
We provide a novel second-order algorithm that enjoys local superlinear convergence and global linear convergence.
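As a rough single-machine sketch of the idea, and not the RFN algorithm itself, the snippet below replaces the kernel with random Fourier features and runs exact Newton steps on ridge-regularized logistic loss; the feature count, regularization, and synthetic data are assumptions for illustration.

```python
# Rough sketch: empirical risk minimization with random Fourier features
# and exact Newton steps on regularized logistic loss (not the RFN method,
# which additionally sub-samples and comes with convergence guarantees).
import numpy as np

rng = np.random.default_rng(6)
n, d, D, lam = 500, 5, 200, 1e-2
X = rng.standard_normal((n, d))
y = (X[:, 0] + 0.3 * rng.standard_normal(n) > 0).astype(float)   # binary labels

W = rng.standard_normal((d, D)); b = rng.uniform(0, 2 * np.pi, D)
Z = np.sqrt(2.0 / D) * np.cos(X @ W + b)       # random feature map (unit bandwidth)

w = np.zeros(D)
for _ in range(10):                            # Newton iterations
    p = 1.0 / (1.0 + np.exp(-Z @ w))           # predicted probabilities
    grad = Z.T @ (p - y) / n + lam * w
    H = (Z * (p * (1 - p))[:, None]).T @ Z / n + lam * np.eye(D)
    w -= np.linalg.solve(H, grad)

print("training accuracy:", ((Z @ w > 0) == (y > 0.5)).mean())
```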
arXiv Detail & Related papers (2020-02-12T01:14:44Z)
This list is automatically generated from the titles and abstracts of the papers in this site.