A mixture representation of the spectral distribution of isotropic kernels with application to random Fourier features
- URL: http://arxiv.org/abs/2411.02770v2
- Date: Thu, 23 Jan 2025 13:11:37 GMT
- Title: A mixture representation of the spectral distribution of isotropic kernels with application to random Fourier features
- Authors: Nicolas Langrené, Xavier Warin, Pierre Gruet
- Abstract summary: We show that the spectral distribution of every positive definite isotropic kernel can be decomposed as a scale mixture of $\alpha$-stable random vectors.
This constructive decomposition provides a simple and ready-to-use spectral sampling formula for every multivariate positive definite shift-invariant kernel.
This result has broad applications for support vector machines, kernel ridge regression, Gaussian processes, and other kernel-based machine learning techniques.
- Abstract: Rahimi and Recht (2007) introduced the idea of decomposing positive definite shift-invariant kernels by randomly sampling from their spectral distribution. This famous technique, known as Random Fourier Features (RFF), is in principle applicable to any such kernel whose spectral distribution can be identified and simulated. In practice, however, it is usually applied to the Gaussian kernel because of its simplicity, since its spectral distribution is also Gaussian. Clearly, simple spectral sampling formulas would be desirable for broader classes of kernels. In this paper, we prove that the spectral distribution of every positive definite isotropic kernel can be decomposed as a scale mixture of $\alpha$-stable random vectors, and we identify the scaling distribution as a function of the kernel. This constructive decomposition provides a simple and ready-to-use spectral sampling formula for every multivariate positive definite shift-invariant kernel, including exponential power kernels, generalized Matérn kernels, generalized Cauchy kernels, as well as newly introduced kernels such as the Beta, Kummer, and Tricomi kernels. In particular, we show that the spectral distributions of these kernels are scale mixtures of the multivariate Gaussian distribution. This provides a very simple way to adapt existing random Fourier features software based on Gaussian kernels to any positive definite shift-invariant kernel. This result has broad applications for support vector machines, kernel ridge regression, Gaussian processes, and other kernel-based machine learning techniques for which the random Fourier features technique is applicable.
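To make this concrete, below is a minimal sketch (not code from the paper) of random Fourier features in which the frequency matrix is drawn as a scale mixture of Gaussians. It uses the classical Matérn case, whose spectral distribution is a multivariate Student-t, i.e. a Gaussian draw whose scale is mixed over a chi-square variable; the function names and NumPy scaffolding are illustrative assumptions.

```python
import numpy as np

def rff_features(X, n_features, spectral_sampler, rng):
    """Random Fourier features: phi(x) = sqrt(2/D) * cos(W x + b)."""
    d = X.shape[1]
    W = spectral_sampler(n_features, d, rng)       # rows ~ spectral distribution
    b = rng.uniform(0.0, 2.0 * np.pi, size=n_features)
    return np.sqrt(2.0 / n_features) * np.cos(X @ W.T + b)

def matern_spectral_sampler(nu=1.5, lengthscale=1.0):
    """Gaussian scale mixture: the Matern spectral law is a multivariate
    Student-t with 2*nu degrees of freedom, obtained by rescaling a
    Gaussian vector with a chi-square mixing variable."""
    def sampler(n, d, rng):
        z = rng.standard_normal((n, d))            # Gaussian component
        u = rng.chisquare(2.0 * nu, size=(n, 1))   # scale-mixing variable
        return np.sqrt(2.0 * nu / u) * z / lengthscale
    return sampler

rng = np.random.default_rng(0)
X = rng.standard_normal((200, 3))
Phi = rff_features(X, 512, matern_spectral_sampler(nu=0.5), rng)
K_approx = Phi @ Phi.T   # approximates the Matern-1/2 (exponential) kernel
```

Swapping in a different kernel then amounts to changing only the scale-mixing law inside the sampler, which is precisely the kind of adaptation of Gaussian-based RFF software that the abstract describes.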
Related papers
- Orthogonal Random Features: Explicit Forms and Sharp Inequalities [5.8317379706611385]
We analyze the bias and the variance of the kernel approximation based on orthogonal random features.
We provide explicit expressions for these quantities using normalized Bessel functions.
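As a reference point, here is a minimal sketch of the standard orthogonal random features construction that this analysis concerns: a QR decomposition supplies orthogonal directions, and chi-distributed row norms restore the Gaussian marginals. This is the usual recipe, not code from the paper, and the function name is an assumption.

```python
import numpy as np

def orthogonal_gaussian_matrix(n_features, d, sigma, rng):
    """Frequency matrix for ORF with the Gaussian kernel
    exp(-||x - y||^2 / (2 sigma^2)): orthogonal rows, Gaussian row norms."""
    blocks = []
    for _ in range(-(-n_features // d)):           # ceil(n_features / d) blocks
        G = rng.standard_normal((d, d))
        Q, _ = np.linalg.qr(G)                     # square Q: orthonormal rows
        norms = np.linalg.norm(rng.standard_normal((d, d)), axis=1)
        blocks.append(Q * norms[:, None])          # chi-distributed rescaling
    return np.vstack(blocks)[:n_features] / sigma
```

The result can replace an i.i.d. Gaussian frequency matrix in the usual map phi(x) = sqrt(2/D) cos(Wx + b), typically reducing the variance of the kernel estimate.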
arXiv Detail & Related papers (2023-10-11T10:40:43Z)
- Kernel Learning by quantum annealer [0.966840768820136]
We propose an application of the Boltzmann machine to the kernel matrix used in various machine-learning techniques.
We show that it is possible to create a spectral distribution that could not be feasible with the Gaussian distribution.
arXiv Detail & Related papers (2023-04-20T08:08:03Z)
- Variational Autoencoder Kernel Interpretation and Selection for Classification [59.30734371401315]
This work proposes kernel selection approaches for probabilistic classifiers based on features produced by the convolutional encoder of a variational autoencoder.
In the proposed implementation, each latent variable is sampled from the distribution associated with a single kernel of the last encoder convolution layer, as an individual distribution is created for each kernel.
Choosing relevant features among the sampled latent variables makes it possible to perform kernel selection, filtering out uninformative features and kernels.
arXiv Detail & Related papers (2022-09-10T17:22:53Z)
- Unified Fourier-based Kernel and Nonlinearity Design for Equivariant Networks on Homogeneous Spaces [52.424621227687894]
We introduce a unified framework for group equivariant networks on homogeneous spaces.
We take advantage of the sparsity of Fourier coefficients of the lifted feature fields.
We show that other methods treating features as the Fourier coefficients in the stabilizer subgroup are special cases of our activation.
arXiv Detail & Related papers (2022-06-16T17:59:01Z)
- Transformer with Fourier Integral Attentions [18.031977028559282]
We propose FourierFormer, a new class of transformer in which the dot-product kernels are replaced by novel generalized Fourier integral kernels.
Compared to conventional transformers with dot-product attention, FourierFormers attain better accuracy and reduce the redundancy between attention heads.
We empirically corroborate the advantages of FourierFormers over the baseline transformers in a variety of practical applications including language modeling and image classification.
arXiv Detail & Related papers (2022-06-01T03:06:21Z)
- Hybrid Random Features [60.116392415715275]
We propose a new class of random feature methods for linearizing softmax and Gaussian kernels, called hybrid random features (HRFs).
HRFs automatically adapt the quality of kernel estimation to provide the most accurate approximation in defined regions of interest.
arXiv Detail & Related papers (2021-10-08T20:22:59Z)
- A Note on Optimizing Distributions using Kernel Mean Embeddings [94.96262888797257]
Kernel mean embeddings represent probability measures by their infinite-dimensional mean embeddings in a reproducing kernel Hilbert space.
We show that when the kernel is characteristic, distributions with a kernel sum-of-squares density are dense.
We provide algorithms to optimize such distributions in the finite-sample setting.
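The sum-of-squares densities and optimization algorithms are not reproduced here, but a minimal sketch of the underlying object, the empirical kernel mean embedding, may help: the squared MMD below is the RKHS distance between two such embeddings. The Gaussian kernel choice and the function names are illustrative assumptions.

```python
import numpy as np

def gaussian_gram(X, Y, sigma=1.0):
    """Gram matrix of the Gaussian kernel between two samples."""
    sq = ((X[:, None, :] - Y[None, :, :]) ** 2).sum(-1)
    return np.exp(-sq / (2.0 * sigma ** 2))

def mmd2(X, Y, sigma=1.0):
    """Squared MMD: the squared RKHS distance between the empirical
    kernel mean embeddings of the samples X and Y."""
    return (gaussian_gram(X, X, sigma).mean()
            - 2.0 * gaussian_gram(X, Y, sigma).mean()
            + gaussian_gram(Y, Y, sigma).mean())
```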
arXiv Detail & Related papers (2021-06-18T08:33:45Z)
- Scalable Variational Gaussian Processes via Harmonic Kernel Decomposition [54.07797071198249]
We introduce a new scalable variational Gaussian process approximation which provides a high-fidelity approximation while retaining general applicability.
We demonstrate that, on a range of regression and classification problems, our approach can exploit input space symmetries such as translations and reflections.
Notably, our approach achieves state-of-the-art results on CIFAR-10 among pure GP models.
arXiv Detail & Related papers (2021-06-10T18:17:57Z)
- Spectral clustering under degree heterogeneity: a case for the random walk Laplacian [83.79286663107845]
This paper shows that graph spectral embedding using the random walk Laplacian produces vector representations which are completely corrected for node degree.
In the special case of a degree-corrected block model, the embedding concentrates about K distinct points, representing communities.
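A minimal sketch of this embedding (a generic implementation, not the paper's code) is below; it assumes an undirected graph with strictly positive degrees and works through the symmetrically normalized adjacency matrix, which is similar to D^{-1}A.

```python
import numpy as np

def random_walk_spectral_embedding(A, k):
    """Top-k spectral embedding from the random walk Laplacian
    L_rw = I - D^{-1} A.  D^{-1} A is similar to the symmetric matrix
    S = D^{-1/2} A D^{-1/2}, so we diagonalize S and back-transform."""
    deg = A.sum(axis=1)
    S = A / np.sqrt(np.outer(deg, deg))
    vals, vecs = np.linalg.eigh(S)          # S is symmetric, eigh applies
    order = np.argsort(-vals)[:k]           # largest eigenvalues first
    return vecs[:, order] / np.sqrt(deg)[:, None]   # eigenvectors of D^{-1} A
```

The division by sqrt(deg) in the final back-transform is the degree correction the summary refers to.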
arXiv Detail & Related papers (2021-05-03T16:36:27Z)
- The Minecraft Kernel: Modelling correlated Gaussian Processes in the Fourier domain [3.6526103325150383]
We present a family of kernels that can approximate any stationary multi-output kernel to arbitrary precision.
The proposed family of kernels is the first multi-output generalisation of the spectral mixture kernel.
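For context, here is a sketch of the single-output spectral mixture kernel being generalized (Wilson & Adams, 2013), whose spectral density is a Gaussian mixture over frequencies; the multi-output construction of the paper is not reproduced here.

```python
import numpy as np

def spectral_mixture_kernel(tau, weights, means, variances):
    """Single-output spectral mixture kernel on 1-D lags tau: a weighted
    sum of cosines (mixture means = frequencies) under Gaussian envelopes
    (mixture variances control the decay in tau)."""
    k = np.zeros_like(tau, dtype=float)
    for w, mu, v in zip(weights, means, variances):
        k += w * np.exp(-2.0 * np.pi**2 * tau**2 * v) * np.cos(2.0 * np.pi * tau * mu)
    return k
```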
arXiv Detail & Related papers (2021-03-11T20:54:51Z)
- Scaling up Kernel Ridge Regression via Locality Sensitive Hashing [6.704115928005158]
We introduce a weighted version of random binning features and show that the corresponding kernel function generates smooth Gaussian processes.
We show that our weighted random binning features provide a spectral approximation to the corresponding kernel matrix, leading to efficient algorithms for kernel ridge regression.
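As background, below is a sketch of the classical unweighted random binning construction (Rahimi & Recht, 2007) that such weighted variants build on; the weighting scheme itself is not reproduced, and the function names are assumptions. Each feature set hashes points to a randomly pitched, randomly shifted grid, and collisions estimate the Laplacian kernel.

```python
import numpy as np

def random_binning_codes(X, n_sets, gamma, rng):
    """Bin indices for random binning features targeting the Laplacian
    kernel exp(-gamma * ||x - y||_1).  The pitch density
    p(delta) = gamma^2 * delta * exp(-gamma * delta), i.e. Gamma(2, 1/gamma),
    makes the expected collision rate equal the kernel value."""
    n, d = X.shape
    codes = []
    for _ in range(n_sets):
        delta = rng.gamma(shape=2.0, scale=1.0 / gamma, size=d)  # grid pitch
        u = rng.uniform(0.0, delta)                              # grid shift
        codes.append(np.floor((X - u) / delta).astype(int))
    return codes

def binned_kernel_estimate(codes, i, j):
    """Fraction of sets in which points i and j share a bin: an unbiased
    estimate of exp(-gamma * ||x_i - x_j||_1)."""
    return float(np.mean([np.array_equal(c[i], c[j]) for c in codes]))
```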
arXiv Detail & Related papers (2020-03-21T21:41:16Z)