An Exact Finite-dimensional Explicit Feature Map for Kernel Functions
- URL: http://arxiv.org/abs/2410.12635v1
- Date: Wed, 16 Oct 2024 14:55:11 GMT
- Title: An Exact Finite-dimensional Explicit Feature Map for Kernel Functions
- Authors: Kamaledin Ghiasi-Shirazi, Mohammadreza Qaraei,
- Abstract summary: Kernel methods in machine learning use a kernel function that takes two data points as input and returns their inner product after mapping them to a Hilbert space, implicitly and without actually computing the mapping.
In this paper, we introduce an explicit, finite-dimensional feature map for any arbitrary kernel function.
The existence of this explicit mapping allows kernelized algorithms to be formulated in their primal form, without the need for the kernel trick or the dual representation.
- Score: 0.1227734309612871
- License:
- Abstract: Kernel methods in machine learning use a kernel function that takes two data points as input and returns their inner product after mapping them to a Hilbert space, implicitly and without actually computing the mapping. For many kernel functions, such as Gaussian and Laplacian kernels, the feature space is known to be infinite-dimensional, making operations in this space possible only implicitly. This implicit nature necessitates algorithms to be expressed using dual representations and the kernel trick. In this paper, we introduce an explicit, finite-dimensional feature map for any arbitrary kernel function, which ensures that the inner product of data points in the feature space equals the kernel function value, during both training and testing. The existence of this explicit mapping allows kernelized algorithms to be formulated in their primal form, without the need for the kernel trick or the dual representation. As a first application, we demonstrate how to derive kernelized machine learning algorithms directly, without resorting to the dual representation, and apply this method specifically to PCA. As another application, we use the t-SNE algorithm, without any changes to it or its implementation, to visualize the feature space of kernel functions.
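As a rough illustration of the idea (not the paper's construction, which is not reproduced here), the sketch below builds the classical empirical kernel map over a fixed set of anchor points, phi(x) = K^{-1/2} [k(x_1, x), ..., k(x_n, x)]^T, where K is the Gram matrix of the anchors. On the anchor points themselves, inner products of these explicit finite-dimensional features reproduce the kernel values exactly; the paper's contribution is a map with this guarantee at test time as well. The kernel choice and function names below are assumptions made for the sake of the example.

```python
import numpy as np

def gaussian_kernel(X, Y, gamma=0.5):
    """Gaussian (RBF) kernel matrix k(x, y) = exp(-gamma * ||x - y||^2)."""
    d2 = ((X[:, None, :] - Y[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * d2)

def empirical_feature_map(X_anchor, kernel):
    """Return phi with phi(x_i) . phi(x_j) = k(x_i, x_j) on the anchor points.

    phi(x) = K^{-1/2} [k(x_1, x), ..., k(x_n, x)]^T, where K is the anchor
    Gram matrix.  (A standard construction; the paper's exact map may differ.)
    """
    K = kernel(X_anchor, X_anchor)
    # Symmetric inverse square root via eigendecomposition.
    w, V = np.linalg.eigh(K)
    w = np.clip(w, 1e-12, None)              # guard against numerical zeros
    K_inv_sqrt = (V / np.sqrt(w)) @ V.T
    return lambda X: kernel(X, X_anchor) @ K_inv_sqrt

rng = np.random.default_rng(0)
X = rng.normal(size=(50, 3))                  # anchor / "training" points
phi = empirical_feature_map(X, gaussian_kernel)

F = phi(X)                                    # explicit 50-dimensional features
print(np.allclose(F @ F.T, gaussian_kernel(X, X), atol=1e-8))   # expect True
```

With such an explicit map in hand, primal algorithms apply directly: running ordinary PCA or t-SNE on `phi(X)` plays the role of the kernel PCA derivation and the feature-space visualization described in the abstract.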
Related papers
- Spectral Truncation Kernels: Noncommutativity in $C^*$-algebraic Kernel Machines [12.11705128358537]
We propose a new class of positive definite kernels based on the spectral truncation.
We show that the resulting noncommutativity is a governing factor leading to performance enhancement.
We also propose a deep learning perspective to increase the representation capacity of spectral truncation kernels.
arXiv Detail & Related papers (2024-05-28T04:47:12Z) - Neural Operators with Localized Integral and Differential Kernels [77.76991758980003]
We present a principled approach to operator learning that can capture local features under two frameworks.
We prove that we obtain differential operators under an appropriate scaling of the kernel values of CNNs.
To obtain local integral operators, we utilize suitable basis representations for the kernels based on discrete-continuous convolutions.
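For intuition on how scaled convolution kernels can act as differential operators, here is a hedged toy sketch (not the operator-learning construction of the paper): a centered-difference stencil divided by the grid spacing recovers a first derivative as the grid is refined.

```python
import numpy as np

def conv_derivative(u, h):
    """Approximate du/dx with a centered-difference convolution stencil.

    The stencil values are scaled by the grid spacing h -- a toy example of a
    convolution kernel acting as a differential operator (illustrative only).
    """
    stencil = np.array([1.0, 0.0, -1.0]) / (2.0 * h)
    return np.convolve(u, stencil, mode="same")   # interior points are valid

x = np.linspace(0.0, 2.0 * np.pi, 1000)
h = x[1] - x[0]
du = conv_derivative(np.sin(x), h)
# Interior points approach cos(x) as the grid is refined (boundaries are off).
print(np.max(np.abs(du[2:-2] - np.cos(x[2:-2]))))   # small, O(h^2)
```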
arXiv Detail & Related papers (2024-02-26T18:59:31Z) - Fast Kernel Summation in High Dimensions via Slicing and Fourier Transforms [0.0]
Kernel-based methods are heavily used in machine learning.
They suffer from $O(N^2)$ complexity in the number $N$ of considered data points.
We propose an approximation procedure, which reduces this complexity to $O(N)$.
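To make the complexity claim concrete, the baseline being accelerated is the kernel summation s_i = sum_j k(x_i, x_j), which costs O(N^2) when computed directly as below; the paper's slicing-and-Fourier method approximates these sums in O(N) and is not reproduced here.

```python
import numpy as np

def kernel_sums_naive(X, gamma=0.5):
    """Compute s_i = sum_j exp(-gamma * ||x_i - x_j||^2) directly, in O(N^2).

    This is the baseline cost that slicing + Fourier methods reduce to O(N)
    (up to approximation error); the fast method itself is not shown here.
    """
    sq = (X ** 2).sum(axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * X @ X.T   # N x N squared distances
    return np.exp(-gamma * d2).sum(axis=1)

X = np.random.default_rng(1).normal(size=(2000, 10))
s = kernel_sums_naive(X)          # time and memory grow quadratically in N
print(s.shape)                    # (2000,)
```

The slicing idea replaces the d-dimensional summation by an average of one-dimensional kernel sums over random projection directions, each of which can be evaluated in near-linear time.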
arXiv Detail & Related papers (2024-01-16T10:31:27Z) - Joint Embedding Self-Supervised Learning in the Kernel Regime [21.80241600638596]
Self-supervised learning (SSL) produces useful representations of data without access to any labels for classifying the data.
We extend this framework to incorporate algorithms based on kernel methods where embeddings are constructed by linear maps acting on the feature space of a kernel.
We analyze our kernel model on small datasets to identify common features of self-supervised learning algorithms and gain theoretical insights into their performance on downstream tasks.
arXiv Detail & Related papers (2022-09-29T15:53:19Z) - Neural Networks can Learn Representations with Gradient Descent [68.95262816363288]
In specific regimes, neural networks trained by gradient descent behave like kernel methods.
In practice, it is known that neural networks strongly outperform their associated kernels.
arXiv Detail & Related papers (2022-06-30T09:24:02Z) - Learning "best" kernels from data in Gaussian process regression. With application to aerodynamics [0.4588028371034406]
We introduce algorithms to select/design kernels in Gaussian process regression/kriging surrogate modeling techniques.
A first class of algorithms is kernel flow, which was introduced in the context of classification in machine learning.
A second class of algorithms is called spectral kernel ridge regression, and aims at selecting a "best" kernel such that the norm of the function to be approximated is minimal.
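As a rough, hedged illustration of the "minimal norm" criterion (not necessarily the authors' exact objective or implementation): the RKHS norm of the regularized kernel fit can be scored for each candidate kernel, and the candidate with the smallest norm selected.

```python
import numpy as np

def rkhs_norm_sq(K, y, lam=1e-3):
    """Squared RKHS norm of the regularized kernel fit of y.

    For f = sum_i alpha_i k(x_i, .) with alpha = (K + lam I)^{-1} y,
    ||f||^2 = alpha^T K alpha.  A smaller norm means a "simpler" fit
    under that kernel (illustrative scoring only).
    """
    alpha = np.linalg.solve(K + lam * np.eye(len(y)), y)
    return float(alpha @ K @ alpha)

def gaussian_gram(X, gamma):
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * d2)

rng = np.random.default_rng(2)
X = rng.uniform(-1.0, 1.0, size=(80, 2))
y = np.sin(3.0 * X[:, 0]) + 0.1 * rng.normal(size=80)

# Score a small family of candidate kernels and keep the minimal-norm one.
candidates = [0.1, 0.5, 1.0, 5.0, 20.0]
scores = {g: rkhs_norm_sq(gaussian_gram(X, g), y) for g in candidates}
best_gamma = min(scores, key=scores.get)
print(best_gamma, scores[best_gamma])
```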
arXiv Detail & Related papers (2022-06-03T07:50:54Z) - Understanding of Kernels in CNN Models by Suppressing Irrelevant Visual Features in Images [55.60727570036073]
The lack of a precise interpretation of the kernels in convolutional neural networks (CNNs) is one main obstacle to the wide application of deep learning models in real-world scenarios.
A simple yet effective optimization method is proposed to interpret the activation of any kernel of interest in CNN models.
arXiv Detail & Related papers (2021-08-25T05:48:44Z) - Taming Nonconvexity in Kernel Feature Selection---Favorable Properties of the Laplace Kernel [77.73399781313893]
A challenge is to establish the objective function for kernel-based feature selection.
The gradient-based algorithms available for nonconvex optimization can only guarantee convergence to local minima.
arXiv Detail & Related papers (2021-06-17T11:05:48Z) - Reproducing Kernel Hilbert Space, Mercer's Theorem, Eigenfunctions, Nyström Method, and Use of Kernels in Machine Learning: Tutorial and Survey [5.967999555890417]
We start with reviewing the history of kernels in functional analysis and machine learning.
We introduce types of use of kernels in machine learning including kernel methods, kernel learning by semi-definite programming, Hilbert-Schmidt independence criterion, maximum mean discrepancy, kernel mean embedding, and kernel dimensionality reduction.
This paper can be useful for various fields of science including machine learning, dimensionality reduction, functional analysis in mathematics, and mathematical physics in quantum mechanics.
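Of the tools listed, maximum mean discrepancy (MMD) is compact enough to show in a few lines. The sketch below uses the standard biased estimator MMD^2 ≈ mean(K_XX) + mean(K_YY) - 2 mean(K_XY) with an RBF kernel; it is illustrative only, not code from the survey.

```python
import numpy as np

def mmd2_biased(X, Y, gamma=0.5):
    """Biased estimate of squared maximum mean discrepancy with an RBF kernel.

    MMD^2(P, Q) ~= mean(K_XX) + mean(K_YY) - 2 * mean(K_XY),
    where K_AB[i, j] = exp(-gamma * ||a_i - b_j||^2).
    """
    def gram(A, B):
        d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
        return np.exp(-gamma * d2)
    return gram(X, X).mean() + gram(Y, Y).mean() - 2.0 * gram(X, Y).mean()

rng = np.random.default_rng(3)
same = mmd2_biased(rng.normal(size=(200, 5)), rng.normal(size=(200, 5)))
diff = mmd2_biased(rng.normal(size=(200, 5)), rng.normal(1.0, 1.0, (200, 5)))
print(same < diff)   # True: distinct distributions give a larger MMD estimate
```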
arXiv Detail & Related papers (2021-06-15T21:29:12Z) - Random Features for the Neural Tangent Kernel [57.132634274795066]
We propose an efficient feature map construction for the Neural Tangent Kernel (NTK) of a fully-connected ReLU network.
We show that the dimension of the resulting features is much smaller than that of other baseline feature map constructions achieving comparable error bounds, both in theory and in practice.
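To make "a feature map for the NTK" concrete (a generic sketch, not the paper's compressed construction): for a finite-width network, the parameter gradients of the output already form a feature map whose Gram matrix is the empirical NTK; the paper's point is that a much lower-dimensional feature map can approximate the infinite-width NTK, which is not reproduced here.

```python
import numpy as np

def ntk_gradient_features(X, width=512, seed=0):
    """Gradient features of a random two-layer ReLU net f(x) = a^T relu(W x).

    The features are the parameter gradients [df/dW, df/da]; their inner
    products form the empirical NTK.  (Generic construction, not the paper's
    low-dimensional feature map.)
    """
    rng = np.random.default_rng(seed)
    d = X.shape[1]
    W = rng.normal(size=(width, d)) / np.sqrt(d)
    a = rng.normal(size=width) / np.sqrt(width)
    pre = X @ W.T                                    # pre-activations, (n, width)
    act = (pre > 0).astype(X.dtype)                  # ReLU derivative
    grad_W = (a * act)[:, :, None] * X[:, None, :]   # df/dW, shape (n, width, d)
    grad_a = np.maximum(pre, 0.0)                    # df/da, shape (n, width)
    return np.concatenate([grad_W.reshape(len(X), -1), grad_a], axis=1)

X = np.random.default_rng(4).normal(size=(10, 3))
F = ntk_gradient_features(X)
K_ntk = F @ F.T                       # empirical NTK Gram matrix
print(K_ntk.shape)                    # (10, 10)
```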
arXiv Detail & Related papers (2021-04-03T09:08:12Z) - Isolation Distributional Kernel: A New Tool for Point & Group Anomaly Detection [76.1522587605852]
Isolation Distributional Kernel (IDK) is a new way to measure the similarity between two distributions.
We demonstrate IDK's efficacy and efficiency as a new tool for kernel based anomaly detection for both point and group anomalies.
arXiv Detail & Related papers (2020-09-24T12:25:43Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.