Sketching the Heat Kernel: Using Gaussian Processes to Embed Data
- URL: http://arxiv.org/abs/2403.07929v1
- Date: Fri, 1 Mar 2024 22:56:19 GMT
- Title: Sketching the Heat Kernel: Using Gaussian Processes to Embed Data
- Authors: Anna C. Gilbert, Kevin O'Neill
- Abstract summary: We introduce a novel, non-deterministic method for embedding data in low-dimensional Euclidean space based on realizations of a Gaussian process depending on the geometry of the data.
Our method demonstrates further advantage in its robustness to outliers.
- Score: 4.220336689294244
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: This paper introduces a novel, non-deterministic method for embedding data in low-dimensional Euclidean space based on computing realizations of a Gaussian process depending on the geometry of the data. This type of embedding first appeared in (Adler et al., 2018) as a theoretical model for a generic manifold in high dimensions. In particular, we take the covariance function of the Gaussian process to be the heat kernel, and computing the embedding amounts to sketching a matrix representing the heat kernel. The Karhunen-Loève expansion reveals that the straight-line distances in the embedding approximate the diffusion distance in a probabilistic sense, avoiding the need for sharp cutoffs and maintaining some of the smaller-scale structure. Our method demonstrates further advantage in its robustness to outliers. We justify the approach with both theory and experiments.
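The idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation, and it rests on assumptions that do not come from the paper: the heat kernel is approximated by exp(-tL) for the unnormalized Laplacian L of a Gaussian-affinity graph, and the function name `heat_kernel_sketch` and its parameters `t`, `k`, and `bandwidth` are hypothetical. Each embedding coordinate is one realization of a Gaussian process with covariance H_t, so the expected squared straight-line distance between two embedded points is H_t(i,i) + H_t(j,j) - 2 H_t(i,j), a squared diffusion distance.

```python
import numpy as np

def heat_kernel_sketch(X, t=1.0, k=64, bandwidth=1.0, seed=0):
    """Embed the rows of X in R^k with k draws of a GP whose covariance
    is a graph heat kernel (illustrative construction, see assumptions)."""
    rng = np.random.default_rng(seed)
    n = X.shape[0]

    # Assumed graph construction: Gaussian affinities between all pairs.
    sq_dists = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)
    W = np.exp(-sq_dists / (2.0 * bandwidth ** 2))
    np.fill_diagonal(W, 0.0)

    # Unnormalized graph Laplacian: symmetric and positive semidefinite.
    L = np.diag(W.sum(axis=1)) - W
    lam, U = np.linalg.eigh(L)
    lam = np.clip(lam, 0.0, None)  # guard against tiny negative round-off

    # Symmetric square root of H_t = exp(-t L), i.e. H_{t/2}.
    H_half = (U * np.exp(-0.5 * t * lam)) @ U.T

    # Each column of H_half @ G is one GP realization with covariance H_t.
    # Scaling by 1/sqrt(k) makes squared distances an average over draws.
    G = rng.standard_normal((n, k))
    Y = H_half @ G / np.sqrt(k)

    H = H_half @ H_half  # the heat kernel matrix itself, for reference
    return Y, H
```

Calling `Y, H = heat_kernel_sketch(X, t=0.5, k=32)` on an `(n, d)` array returns an `(n, 32)` embedding. Multiplying by a random Gaussian matrix is what makes this a sketch of the heat kernel: larger `k` concentrates the embedded distances around the diffusion distances, while smaller `k` gives a cheaper, noisier embedding.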
Related papers
- von Mises Quasi-Processes for Bayesian Circular Regression [57.88921637944379]
We explore a family of expressive and interpretable distributions over circle-valued random functions.
The resulting probability model has connections with continuous spin models in statistical physics.
For posterior inference, we introduce a new Stratonovich-like augmentation that lends itself to fast Markov Chain Monte Carlo sampling.
arXiv Detail & Related papers (2024-06-19T01:57:21Z) - Learning on manifolds without manifold learning [0.0]
Function approximation based on data drawn randomly from an unknown distribution is an important problem in machine learning.
In this paper, we project the unknown manifold as a submanifold of an ambient hypersphere and study the question of constructing a one-shot approximation using specially designed kernels on the hypersphere.
arXiv Detail & Related papers (2024-02-20T03:27:53Z) - Scaling Riemannian Diffusion Models [68.52820280448991]
We show that our method enables us to scale to high-dimensional tasks on nontrivial manifolds.
We model QCD densities on $SU(n)$ lattices and contrastively learned embeddings on high-dimensional hyperspheres.
arXiv Detail & Related papers (2023-10-30T21:27:53Z) - Implicit Manifold Gaussian Process Regression [49.0787777751317]
Gaussian process regression is widely used to provide well-calibrated uncertainty estimates.
It struggles with high-dimensional data because of the implicit low-dimensional manifold upon which the data actually lies.
In this paper we propose a technique capable of inferring implicit structure directly from data (labeled and unlabeled) in a fully differentiable way.
arXiv Detail & Related papers (2023-10-30T09:52:48Z) - Generative Modeling on Manifolds Through Mixture of Riemannian Diffusion Processes [57.396578974401734]
We introduce a principled framework for building a generative diffusion process on general manifolds.
Instead of following the denoising approach of previous diffusion models, we construct a diffusion process using a mixture of bridge processes.
We develop a geometric understanding of the mixture process, deriving the drift as a weighted mean of tangent directions to the data points.
arXiv Detail & Related papers (2023-10-11T06:04:40Z) - A Heat Diffusion Perspective on Geodesic Preserving Dimensionality Reduction [66.21060114843202]
We propose a more general heat kernel based manifold embedding method that we call heat geodesic embeddings.
Results show that our method outperforms existing state of the art in preserving ground truth manifold distances.
We also showcase our method on single cell RNA-sequencing datasets with both continuum and cluster structure.
arXiv Detail & Related papers (2023-05-30T13:58:50Z) - Kernelized Diffusion maps [2.817412580574242]
In this article, we build a different estimator of the Laplacian, via a reproducing kernel Hilbert space method.
We provide non-asymptotic statistical rates proving that the kernel estimator we build can circumvent the curse of dimensionality.
arXiv Detail & Related papers (2023-02-13T23:54:36Z) - Isotropic Gaussian Processes on Finite Spaces of Graphs [71.26737403006778]
We propose a principled way to define Gaussian process priors on various sets of unweighted graphs.
We go further to consider sets of equivalence classes of unweighted graphs and define the appropriate versions of priors thereon.
Inspired by applications in chemistry, we illustrate the proposed techniques on a real molecular property prediction task in the small data regime.
arXiv Detail & Related papers (2022-11-03T10:18:17Z) - Local Random Feature Approximations of the Gaussian Kernel [14.230653042112834]
We focus on the popular Gaussian kernel and on techniques to linearize kernel-based models by means of random feature approximations.
We show that such approaches yield poor results when modelling high-frequency data, and we propose a novel localization scheme that improves kernel approximations and downstream performance significantly.
arXiv Detail & Related papers (2022-04-12T09:52:36Z)
This list is automatically generated from the titles and abstracts of the papers in this site.