On the Identifiability and Interpretability of Gaussian Process Models
- URL: http://arxiv.org/abs/2310.17023v1
- Date: Wed, 25 Oct 2023 22:00:29 GMT
- Title: On the Identifiability and Interpretability of Gaussian Process Models
- Authors: Jiawen Chen, Wancen Mu, Yun Li, Didong Li
- Abstract summary: We critically examine the prevalent practice of using additive mixtures of Matérn kernels in single-output Gaussian process (GP) models.
We show that the smoothness of a mixture of Matérn kernels is determined by its least smooth component and that a GP with such a kernel is effectively equivalent to a GP with the least smooth kernel component alone.
We also show that the covariance matrix $A$ in the multiplicative multi-output kernel $K(x,y) = AK_0(x,y)$ is identifiable up to a multiplicative constant, suggesting that multiplicative mixtures are well suited for multi-output tasks.
- Score: 8.417178903130244
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: In this paper, we critically examine the prevalent practice of using additive
mixtures of Matérn kernels in single-output Gaussian process (GP) models and
explore the properties of multiplicative mixtures of Matérn kernels for
multi-output GP models. For the single-output case, we derive a series of
theoretical results showing that the smoothness of a mixture of Matérn
kernels is determined by the least smooth component and that a GP with such a
kernel is effectively equivalent to the least smooth kernel component.
Furthermore, we demonstrate that none of the mixing weights or parameters
within individual kernel components are identifiable. We then turn our
attention to multi-output GP models and analyze the identifiability of the
covariance matrix $A$ in the multiplicative kernel $K(x,y) = AK_0(x,y)$, where
$K_0$ is a standard single-output kernel such as the Matérn. We show that $A$ is
identifiable up to a multiplicative constant, suggesting that multiplicative
mixtures are well suited for multi-output tasks. Our findings are supported by
extensive simulations and real applications for both single- and multi-output
settings. This work provides insight into kernel selection and interpretation
for GP models, emphasizing the importance of choosing appropriate kernel
structures for different tasks.
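The abstract's two main claims lend themselves to a small numerical illustration. The following is a hedged sketch, not the authors' code; it uses scikit-learn's Matern kernel for convenience. It builds an additive mixture of two Matérn kernels, whose smoothness the paper shows is governed by the rougher $\nu = 0.5$ component, and the multiplicative multi-output kernel $K(x,y) = AK_0(x,y)$, whose joint covariance over stacked outputs is the Kronecker product $A \otimes K_0$.

```python
# Illustrative sketch (not the authors' code): an additive mixture of
# Matérn kernels for a single-output GP, and the multiplicative
# multi-output kernel K(x, y) = A * K0(x, y) from the abstract.
import numpy as np
from sklearn.gaussian_process.kernels import Matern

x = np.linspace(0, 1, 50).reshape(-1, 1)

# Additive mixture: the nu=0.5 component is the least smooth, and the
# paper's results say it determines the mixture's smoothness.
k_rough = Matern(length_scale=0.2, nu=0.5)
k_smooth = Matern(length_scale=0.2, nu=2.5)
w1, w2 = 0.3, 0.7          # mixing weights (not identifiable, per the paper)
K_mix = w1 * k_rough(x) + w2 * k_smooth(x)

# Multiplicative multi-output kernel: with A a p x p covariance matrix,
# the joint covariance over all outputs is the Kronecker product A ⊗ K0.
A = np.array([[1.0, 0.6], [0.6, 2.0]])   # identifiable only up to a constant
K0 = k_rough(x)
K_multi = np.kron(A, K0)                 # shape (2*50, 2*50)
print(K_mix.shape, K_multi.shape)
```

Note that scaling $A$ by $c$ and $K_0$ by $1/c$ leaves $K_{\text{multi}}$ unchanged, which is exactly the multiplicative-constant ambiguity the paper proves is the only one.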
Related papers
- Posterior Contraction Rates for Matérn Gaussian Processes on
Riemannian Manifolds [51.68005047958965]
We show that intrinsic Gaussian processes can achieve better performance than their extrinsic (embedding-based) counterparts in practice.
Our work shows that finer-grained analyses are needed to distinguish between different levels of data-efficiency.
arXiv Detail & Related papers (2023-09-19T20:30:58Z) - Dimensionality Reduction for General KDE Mode Finding [12.779486428760373]
Finding the mode of a high dimensional probability distribution $D$ is a fundamental problem in statistics and data analysis.
We show that there is no polynomial-time algorithm for finding the mode of a kernel density estimate, unless $\mathit{P} = \mathit{NP}$.
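For a concrete picture of the problem this hardness result concerns, here is a minimal, hypothetical sketch (not the paper's algorithm): it fits a Gaussian KDE and searches for a mode by local optimization, which finds only a local mode; the result above says the exact global problem is intractable in general.

```python
# Hypothetical sketch of KDE mode finding via local search (not the
# paper's algorithm): exact mode finding is NP-hard per the result above,
# so this only recovers a local mode.
import numpy as np
from scipy.stats import gaussian_kde
from scipy.optimize import minimize

rng = np.random.default_rng(0)
samples = rng.normal(size=(5, 200))          # 5-dimensional data, 200 points
kde = gaussian_kde(samples)

# Start local search from the highest-density sample point.
start = samples[:, np.argmax(kde(samples))]
res = minimize(lambda z: -kde(z)[0], start)
print("approximate (local) mode:", res.x)
```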
arXiv Detail & Related papers (2023-05-30T05:39:59Z) - A mixed-categorical correlation kernel for Gaussian process [0.0]
We present a kernel-based approach that extends continuous exponential kernels to handle mixed-categorical variables.
The proposed kernel leads to a new GP surrogate that generalizes both the continuous relaxation and the Gower distance based GP models.
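As a rough illustration of the idea (a generic mixed-input kernel, not the paper's exact construction), one can multiply a continuous exponential kernel by a simple categorical correlation term that equals 1 when levels match and a correlation parameter `rho` otherwise; both factors are positive semi-definite for `rho` in [0, 1], so the product is a valid kernel.

```python
# Hedged sketch of a generic mixed continuous/categorical kernel
# (illustrative only; the paper's construction may differ).
import numpy as np

def mixed_kernel(x, y, theta=1.0, rho=0.5):
    """x, y: (continuous_vector, category_label) pairs."""
    (xc, xl), (yc, yl) = x, y
    cont = np.exp(-theta * np.sum((np.asarray(xc) - np.asarray(yc)) ** 2))
    cat = 1.0 if xl == yl else rho   # 1 on matching levels, rho otherwise
    return cont * cat

print(mixed_kernel(([0.1, 0.2], "red"), ([0.1, 0.3], "blue")))
```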
arXiv Detail & Related papers (2022-11-15T16:13:04Z) - Variational Autoencoder Kernel Interpretation and Selection for
Classification [59.30734371401315]
This work proposed kernel selection approaches for probabilistic classifiers based on features produced by the convolutional encoder of a variational autoencoder.
In the proposed implementation, each latent variable was sampled from the distribution associated with a single kernel of the last encoder's convolution layer, as an individual distribution was created for each kernel.
Choosing relevant features among the sampled latent variables makes it possible to perform kernel selection, filtering out uninformative features and kernels.
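A rough illustration of that selection step (hypothetical names and a stand-in relevance score; the paper's feature-relevance method may differ): score each sampled latent variable against the class labels and keep the kernels with the highest scores.

```python
# Hypothetical sketch: rank latent variables (one per encoder kernel) by
# a univariate relevance score and keep the most informative kernels.
import numpy as np
from sklearn.feature_selection import f_classif

rng = np.random.default_rng(2)
Z = rng.normal(size=(100, 16))    # sampled latent variables, one per kernel
y = rng.integers(0, 2, size=100)  # class labels
scores, _ = f_classif(Z, y)       # stand-in relevance score
keep = np.argsort(scores)[-8:]    # keep the 8 most informative kernels
print(keep)
```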
arXiv Detail & Related papers (2022-09-10T17:22:53Z) - An Equivalence Principle for the Spectrum of Random Inner-Product Kernel
Matrices with Polynomial Scalings [21.727073594338297]
This study is motivated by applications in machine learning and statistics.
We establish the weak limit of the empirical distribution of these random matrices in a scaling regime.
This weak limit can be characterized as the free additive convolution of a Marchenko-Pastur law and a semicircle law.
arXiv Detail & Related papers (2022-05-12T18:50:21Z) - Beyond Parallel Pancakes: Quasi-Polynomial Time Guarantees for
Non-Spherical Gaussian Mixtures [9.670578317106182]
We consider mixtures of $k \geq 2$ Gaussian components with unknown means and unknown covariance (identical for all components) that are well-separated.
We show that this kind of hardness can only appear if mixing weights are allowed to be exponentially small.
We develop an algorithm based on the sum-of-squares method with running time quasi-polynomial in the minimum mixing weight.
arXiv Detail & Related papers (2021-12-10T10:51:44Z) - Kernel Identification Through Transformers [54.3795894579111]
Kernel selection plays a central role in determining the performance of Gaussian Process (GP) models.
This work addresses the challenge of constructing custom kernel functions for high-dimensional GP regression models.
We introduce a novel approach named KITT: Kernel Identification Through Transformers.
arXiv Detail & Related papers (2021-06-15T14:32:38Z) - The Minecraft Kernel: Modelling correlated Gaussian Processes in the
Fourier domain [3.6526103325150383]
We present a family of kernels that can approximate any stationary multi-output kernel to arbitrary precision.
The proposed family of kernels represents the first multi-output generalisation of the spectral mixture kernel.
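For reference, here is a sketch of the standard single-output spectral mixture kernel that this family generalises, $k(\tau) = \sum_q w_q \exp(-2\pi^2 \tau^2 v_q)\cos(2\pi \tau \mu_q)$; the multi-output construction additionally models cross-spectra between outputs, which is not reproduced here.

```python
# Single-output spectral mixture kernel (Wilson & Adams style), shown as
# background; the Minecraft kernel's multi-output extension is omitted.
import numpy as np

def spectral_mixture(tau, weights, means, variances):
    """k(tau) = sum_q w_q * exp(-2 pi^2 tau^2 v_q) * cos(2 pi tau mu_q)."""
    tau = np.asarray(tau, dtype=float)
    k = np.zeros_like(tau)
    for w, mu, v in zip(weights, means, variances):
        k += w * np.exp(-2 * np.pi**2 * tau**2 * v) * np.cos(2 * np.pi * tau * mu)
    return k

print(spectral_mixture([0.0, 0.5], weights=[1.0], means=[0.3], variances=[0.05]))
```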
arXiv Detail & Related papers (2021-03-11T20:54:51Z) - Kernel learning approaches for summarising and combining posterior
similarity matrices [68.8204255655161]
We build upon the notion of the posterior similarity matrix (PSM) in order to suggest new approaches for summarising the output of MCMC algorithms for Bayesian clustering models.
A key contribution of our work is the observation that PSMs are positive semi-definite, and hence can be used to define probabilistically-motivated kernel matrices.
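The PSM itself is simple to compute from MCMC output, as the short sketch below shows: entry $(i,j)$ is the fraction of posterior draws that assign items $i$ and $j$ to the same cluster. Since the PSM is an average of matrices $Z_s Z_s^\top$ (with $Z_s$ the one-hot cluster assignment in draw $s$), it is positive semi-definite, which is the observation that lets it serve as a kernel matrix.

```python
# Sketch: computing a posterior similarity matrix (PSM) from MCMC cluster
# labels; toy draws are used for illustration.
import numpy as np

# labels[s, i] = cluster of item i in MCMC draw s
labels = np.array([[0, 0, 1, 1],
                   [0, 1, 1, 1],
                   [0, 0, 0, 1]])
psm = np.mean(labels[:, :, None] == labels[:, None, :], axis=0)
print(psm)   # symmetric, entries in [0, 1], PSD by construction
```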
arXiv Detail & Related papers (2020-09-27T14:16:14Z) - Robustly Learning any Clusterable Mixture of Gaussians [55.41573600814391]
We study the efficient learnability of high-dimensional Gaussian mixtures in the adversarial-robust setting.
We provide an algorithm that learns the components of an $\epsilon$-corrupted $k$-mixture to information-theoretically near-optimal error $\tilde{O}(\epsilon)$.
Our main technical contribution is a new robust identifiability proof for clusters of a Gaussian mixture, which can be captured by the constant-degree Sum-of-Squares proof system.
arXiv Detail & Related papers (2020-05-13T16:44:12Z) - SimpleMKKM: Simple Multiple Kernel K-means [49.500663154085586]
We propose a simple yet effective multiple kernel clustering algorithm, termed simple multiple kernel k-means (SimpleMKKM).
Our criterion is given by an intractable minimization-maximization problem in the kernel coefficients and clustering partition matrix.
We theoretically analyze the performance of SimpleMKKM in terms of its clustering generalization error.
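To make the min-max criterion concrete, here is a hedged sketch of its inner step (illustrative only, not the paper's optimizer): for fixed weights $\gamma$, the combined kernel is $\sum_p \gamma_p^2 K_p$, and the relaxed kernel k-means problem $\max_H \operatorname{tr}(H^\top K H)$ over orthonormal $H$ is solved by the sum of the top-$k$ eigenvalues of $K$; SimpleMKKM then minimizes this value over $\gamma$.

```python
# Hedged sketch of SimpleMKKM's inner maximization (not the paper's code):
# combine base kernels with squared weights and evaluate the relaxed
# kernel k-means objective via the top-k eigenvalues.
import numpy as np

def inner_objective(kernels, gamma, k):
    K = sum(g ** 2 * Kp for g, Kp in zip(gamma, kernels))
    eigvals = np.linalg.eigvalsh(K)      # ascending eigenvalues
    return eigvals[-k:].sum()            # value of max_H tr(H^T K H)

rng = np.random.default_rng(1)
X = rng.normal(size=(30, 2))
D2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
kernels = [np.exp(-D2 / s) for s in (0.5, 2.0)]   # two RBF base kernels
print(inner_objective(kernels, gamma=[0.4, 0.6], k=3))
```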
arXiv Detail & Related papers (2020-05-11T10:06:40Z)