On the Identifiability and Interpretability of Gaussian Process Models
- URL: http://arxiv.org/abs/2310.17023v1
- Date: Wed, 25 Oct 2023 22:00:29 GMT
- Title: On the Identifiability and Interpretability of Gaussian Process Models
- Authors: Jiawen Chen, Wancen Mu, Yun Li, Didong Li
- Abstract summary: We critically examine the prevalent practice of using additive mixtures of Matérn kernels in single-output Gaussian process (GP) models.
We show that the smoothness of a mixture of Matérn kernels is determined by its least smooth component and that a GP with such a kernel is effectively equivalent to a GP with the least smooth kernel component alone.
We also show that the covariance matrix $A$ in the multiplicative multi-output kernel $K(x,y) = AK_0(x,y)$ is identifiable up to a multiplicative constant, suggesting that multiplicative mixtures are well suited for multi-output tasks.
- Score: 8.417178903130244
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: In this paper, we critically examine the prevalent practice of using additive
mixtures of Matérn kernels in single-output Gaussian process (GP) models and
explore the properties of multiplicative mixtures of Matérn kernels for
multi-output GP models. For the single-output case, we derive a series of
theoretical results showing that the smoothness of a mixture of Matérn
kernels is determined by the least smooth component and that a GP with such a
kernel is effectively equivalent to the least smooth kernel component.
Furthermore, we demonstrate that none of the mixing weights or parameters
within individual kernel components are identifiable. We then turn our
attention to multi-output GP models and analyze the identifiability of the
covariance matrix $A$ in the multiplicative kernel $K(x,y) = AK_0(x,y)$, where
$K_0$ is a standard single-output kernel such as the Matérn. We show that $A$ is
identifiable up to a multiplicative constant, suggesting that multiplicative
mixtures are well suited for multi-output tasks. Our findings are supported by
extensive simulations and real applications for both single- and multi-output
settings. This work provides insight into kernel selection and interpretation
for GP models, emphasizing the importance of choosing appropriate kernel
structures for different tasks.
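The abstract's two main claims lend themselves to a small numerical illustration. The following is a hedged sketch, not the authors' code; it uses scikit-learn's Matern kernel for convenience. It builds an additive mixture of two Matérn kernels, whose smoothness the paper shows is governed by the rougher $\nu = 0.5$ component, and the multiplicative multi-output kernel $K(x,y) = AK_0(x,y)$, whose joint covariance over stacked outputs is the Kronecker product $A \otimes K_0$.

```python
# Illustrative sketch (not the authors' code): an additive mixture of
# Matérn kernels for a single-output GP, and the multiplicative
# multi-output kernel K(x, y) = A * K0(x, y) from the abstract.
import numpy as np
from sklearn.gaussian_process.kernels import Matern

x = np.linspace(0, 1, 50).reshape(-1, 1)

# Additive mixture: the nu=0.5 component is the least smooth, and the
# paper's results say it determines the mixture's smoothness.
k_rough = Matern(length_scale=0.2, nu=0.5)
k_smooth = Matern(length_scale=0.2, nu=2.5)
w1, w2 = 0.3, 0.7          # mixing weights (not identifiable, per the paper)
K_mix = w1 * k_rough(x) + w2 * k_smooth(x)

# Multiplicative multi-output kernel: with A a p x p covariance matrix,
# the joint covariance over all outputs is the Kronecker product A ⊗ K0.
A = np.array([[1.0, 0.6], [0.6, 2.0]])   # identifiable only up to a constant
K0 = k_rough(x)
K_multi = np.kron(A, K0)                 # shape (2*50, 2*50)
print(K_mix.shape, K_multi.shape)
```

Note that scaling $A$ by $c$ and $K_0$ by $1/c$ leaves $K_{\text{multi}}$ unchanged, which is exactly the multiplicative-constant ambiguity the paper proves is the only one.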
Related papers
- Posterior Contraction Rates for Matérn Gaussian Processes on
Riemannian Manifolds [51.68005047958965]
We show that intrinsic Gaussian processes can achieve better performance than their extrinsic (embedding-based) counterparts in practice.
Our work shows that finer-grained analyses are needed to distinguish between different levels of data-efficiency.
arXiv Detail & Related papers (2023-09-19T20:30:58Z) - Dimensionality Reduction for General KDE Mode Finding [12.779486428760373]
Finding the mode of a high dimensional probability distribution $D$ is a fundamental problem in statistics and data analysis.
We show that there is no polynomial-time algorithm for finding the mode of a kernel density estimate, unless $\mathit{P} = \mathit{NP}$.
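For a concrete picture of the problem this hardness result concerns, here is a minimal, hypothetical sketch (not the paper's algorithm): it fits a Gaussian KDE and searches for a mode by local optimization, which finds only a local mode; the result above says the exact global problem is intractable in general.

```python
# Hypothetical sketch of KDE mode finding via local search (not the
# paper's algorithm): exact mode finding is NP-hard per the result above,
# so this only recovers a local mode.
import numpy as np
from scipy.stats import gaussian_kde
from scipy.optimize import minimize

rng = np.random.default_rng(0)
samples = rng.normal(size=(5, 200))          # 5-dimensional data, 200 points
kde = gaussian_kde(samples)

# Start local search from the highest-density sample point.
start = samples[:, np.argmax(kde(samples))]
res = minimize(lambda z: -kde(z)[0], start)
print("approximate (local) mode:", res.x)
```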
arXiv Detail & Related papers (2023-05-30T05:39:59Z) - A mixed-categorical correlation kernel for Gaussian process [0.0]
We present a kernel-based approach that extends continuous exponential kernels to handle mixed-categorical variables.
The proposed kernel leads to a new GP surrogate that generalizes both the continuous relaxation and the Gower distance based GP models.
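As a rough illustration of the idea (a generic mixed-input kernel, not the paper's exact construction), one can multiply a continuous exponential kernel by a simple categorical correlation term that equals 1 when levels match and a correlation parameter `rho` otherwise; both factors are positive semi-definite for `rho` in [0, 1], so the product is a valid kernel.

```python
# Hedged sketch of a generic mixed continuous/categorical kernel
# (illustrative only; the paper's construction may differ).
import numpy as np

def mixed_kernel(x, y, theta=1.0, rho=0.5):
    """x, y: (continuous_vector, category_label) pairs."""
    (xc, xl), (yc, yl) = x, y
    cont = np.exp(-theta * np.sum((np.asarray(xc) - np.asarray(yc)) ** 2))
    cat = 1.0 if xl == yl else rho   # 1 on matching levels, rho otherwise
    return cont * cat

print(mixed_kernel(([0.1, 0.2], "red"), ([0.1, 0.3], "blue")))
```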
arXiv Detail & Related papers (2022-11-15T16:13:04Z) - Variational Autoencoder Kernel Interpretation and Selection for
Classification [59.30734371401315]
This work proposed kernel selection approaches for probabilistic classifiers based on features produced by the convolutional encoder of a variational autoencoder.
In the proposed implementation, each latent variable was sampled from the distribution associated with a single kernel of the last encoder's convolution layer, as an individual distribution was created for each kernel.
Choosing relevant features among the sampled latent variables makes it possible to perform kernel selection, filtering out uninformative features and kernels.
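A rough illustration of that selection step (hypothetical names and a stand-in relevance score; the paper's feature-relevance method may differ): score each sampled latent variable against the class labels and keep the kernels with the highest scores.

```python
# Hypothetical sketch: rank latent variables (one per encoder kernel) by
# a univariate relevance score and keep the most informative kernels.
import numpy as np
from sklearn.feature_selection import f_classif

rng = np.random.default_rng(2)
Z = rng.normal(size=(100, 16))    # sampled latent variables, one per kernel
y = rng.integers(0, 2, size=100)  # class labels
scores, _ = f_classif(Z, y)       # stand-in relevance score
keep = np.argsort(scores)[-8:]    # keep the 8 most informative kernels
print(keep)
```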
arXiv Detail & Related papers (2022-09-10T17:22:53Z) - An Equivalence Principle for the Spectrum of Random Inner-Product Kernel
Matrices with Polynomial Scalings [21.727073594338297]
This study is motivated by applications in machine learning and statistics.
We establish the weak limit of the empirical distribution of these random matrices in a scaling regime.
This weak limit can be characterized as the free additive convolution of a Marchenko-Pastur law and a semicircle law.
arXiv Detail & Related papers (2022-05-12T18:50:21Z) - Beyond Parallel Pancakes: Quasi-Polynomial Time Guarantees for
Non-Spherical Gaussian Mixtures [9.670578317106182]
We consider mixtures of $k \geq 2$ Gaussian components with unknown means and unknown covariance (identical for all components) that are well-separated.
We show that this kind of hardness can only appear if mixing weights are allowed to be exponentially small.
We develop an algorithm based on the sum-of-squares method with running time quasi-polynomial in the minimum mixing weight.
arXiv Detail & Related papers (2021-12-10T10:51:44Z) - Kernel Identification Through Transformers [54.3795894579111]
Kernel selection plays a central role in determining the performance of Gaussian Process (GP) models.
This work addresses the challenge of constructing custom kernel functions for high-dimensional GP regression models.
We introduce a novel approach named KITT: Kernel Identification Through Transformers.
arXiv Detail & Related papers (2021-06-15T14:32:38Z) - The Minecraft Kernel: Modelling correlated Gaussian Processes in the
Fourier domain [3.6526103325150383]
We present a family of kernels that can approximate any stationary multi-output kernel to arbitrary precision.
The proposed family of kernels represents the first multi-output generalisation of the spectral mixture kernel.
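For reference, here is a sketch of the standard single-output spectral mixture kernel that this family generalises, $k(\tau) = \sum_q w_q \exp(-2\pi^2 \tau^2 v_q)\cos(2\pi \tau \mu_q)$; the multi-output construction additionally models cross-spectra between outputs, which is not reproduced here.

```python
# Single-output spectral mixture kernel (Wilson & Adams style), shown as
# background; the Minecraft kernel's multi-output extension is omitted.
import numpy as np

def spectral_mixture(tau, weights, means, variances):
    """k(tau) = sum_q w_q * exp(-2 pi^2 tau^2 v_q) * cos(2 pi tau mu_q)."""
    tau = np.asarray(tau, dtype=float)
    k = np.zeros_like(tau)
    for w, mu, v in zip(weights, means, variances):
        k += w * np.exp(-2 * np.pi**2 * tau**2 * v) * np.cos(2 * np.pi * tau * mu)
    return k

print(spectral_mixture([0.0, 0.5], weights=[1.0], means=[0.3], variances=[0.05]))
```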
arXiv Detail & Related papers (2021-03-11T20:54:51Z) - Kernel learning approaches for summarising and combining posterior
similarity matrices [68.8204255655161]
We build upon the notion of the posterior similarity matrix (PSM) in order to suggest new approaches for summarising the output of MCMC algorithms for Bayesian clustering models.
A key contribution of our work is the observation that PSMs are positive semi-definite, and hence can be used to define probabilistically-motivated kernel matrices.
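The PSM itself is simple to compute from MCMC output, as the short sketch below shows: entry $(i,j)$ is the fraction of posterior draws that assign items $i$ and $j$ to the same cluster. Since the PSM is an average of matrices $Z_s Z_s^\top$ (with $Z_s$ the one-hot cluster assignment in draw $s$), it is positive semi-definite, which is the observation that lets it serve as a kernel matrix.

```python
# Sketch: computing a posterior similarity matrix (PSM) from MCMC cluster
# labels; toy draws are used for illustration.
import numpy as np

# labels[s, i] = cluster of item i in MCMC draw s
labels = np.array([[0, 0, 1, 1],
                   [0, 1, 1, 1],
                   [0, 0, 0, 1]])
psm = np.mean(labels[:, :, None] == labels[:, None, :], axis=0)
print(psm)   # symmetric, entries in [0, 1], PSD by construction
```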
arXiv Detail & Related papers (2020-09-27T14:16:14Z) - Robustly Learning any Clusterable Mixture of Gaussians [55.41573600814391]
We study the efficient learnability of high-dimensional Gaussian mixtures in the adversarial-robust setting.
We provide an algorithm that learns the components of an $\epsilon$-corrupted $k$-mixture to information-theoretically near-optimal error $\tilde{O}(\epsilon)$.
Our main technical contribution is a new robust identifiability proof for clusters of a Gaussian mixture, which can be captured by the constant-degree Sum-of-Squares proof system.
arXiv Detail & Related papers (2020-05-13T16:44:12Z) - SimpleMKKM: Simple Multiple Kernel K-means [49.500663154085586]
We propose a simple yet effective multiple kernel clustering algorithm, termed simple multiple kernel k-means (SimpleMKKM).
Our criterion is given by an intractable minimization-maximization problem in the kernel coefficients and clustering partition matrix.
We theoretically analyze the performance of SimpleMKKM in terms of its clustering generalization error.
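To make the min-max criterion concrete, here is a hedged sketch of its inner step (illustrative only, not the paper's optimizer): for fixed weights $\gamma$, the combined kernel is $\sum_p \gamma_p^2 K_p$, and the relaxed kernel k-means problem $\max_H \operatorname{tr}(H^\top K H)$ over orthonormal $H$ is solved by the sum of the top-$k$ eigenvalues of $K$; SimpleMKKM then minimizes this value over $\gamma$.

```python
# Hedged sketch of SimpleMKKM's inner maximization (not the paper's code):
# combine base kernels with squared weights and evaluate the relaxed
# kernel k-means objective via the top-k eigenvalues.
import numpy as np

def inner_objective(kernels, gamma, k):
    K = sum(g ** 2 * Kp for g, Kp in zip(gamma, kernels))
    eigvals = np.linalg.eigvalsh(K)      # ascending eigenvalues
    return eigvals[-k:].sum()            # value of max_H tr(H^T K H)

rng = np.random.default_rng(1)
X = rng.normal(size=(30, 2))
D2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
kernels = [np.exp(-D2 / s) for s in (0.5, 2.0)]   # two RBF base kernels
print(inner_objective(kernels, gamma=[0.4, 0.6], k=3))
```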
arXiv Detail & Related papers (2020-05-11T10:06:40Z)