Inferring Manifolds From Noisy Data Using Gaussian Processes
- URL: http://arxiv.org/abs/2110.07478v3
- Date: Fri, 24 May 2024 22:35:29 GMT
- Title: Inferring Manifolds From Noisy Data Using Gaussian Processes
- Authors: David B Dunson, Nan Wu,
- Abstract summary: Most existing manifold learning algorithms replace the original data with lower dimensional coordinates.
This article proposes a new methodology for addressing these problems, allowing the estimated manifold between fitted data points.
- Score: 17.166283428199634
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In analyzing complex datasets, it is often of interest to infer lower dimensional structure underlying the higher dimensional observations. As a flexible class of nonlinear structures, it is common to focus on Riemannian manifolds. Most existing manifold learning algorithms replace the original data with lower dimensional coordinates without providing an estimate of the manifold in the observation space or using the manifold to denoise the original data. This article proposes a new methodology for addressing these problems, allowing interpolation of the estimated manifold between fitted data points. The proposed approach is motivated by novel theoretical properties of local covariance matrices constructed from noisy samples on a manifold. Our results enable us to turn a global manifold reconstruction problem into a local regression problem, allowing application of Gaussian processes for probabilistic manifold reconstruction. In addition to theory justifying the algorithm, we provide simulated and real data examples to illustrate the performance.
Related papers
- Persistent de Rham-Hodge Laplacians in Eulerian representation for manifold topological learning [7.0103981121698355]
We introduce persistent de Rham-Hodge Laplacian, or persistent Hodge Laplacian, for manifold topological learning.
Our PHLs are constructed in the Eulerian representation via structure-persevering Cartesian grids.
As a proof-of-principle application, we consider the prediction of protein-ligand binding affinities with two benchmark datasets.
arXiv Detail & Related papers (2024-08-01T01:15:52Z) - Distributional Reduction: Unifying Dimensionality Reduction and Clustering with Gromov-Wasserstein [56.62376364594194]
Unsupervised learning aims to capture the underlying structure of potentially large and high-dimensional datasets.
In this work, we revisit these approaches under the lens of optimal transport and exhibit relationships with the Gromov-Wasserstein problem.
This unveils a new general framework, called distributional reduction, that recovers DR and clustering as special cases and allows addressing them jointly within a single optimization problem.
arXiv Detail & Related papers (2024-02-03T19:00:19Z) - Implicit Manifold Gaussian Process Regression [49.0787777751317]
Gaussian process regression is widely used to provide well-calibrated uncertainty estimates.
It struggles with high-dimensional data because of the implicit low-dimensional manifold upon which the data actually lies.
In this paper we propose a technique capable of inferring implicit structure directly from data (labeled and unlabeled) in a fully differentiable way.
arXiv Detail & Related papers (2023-10-30T09:52:48Z) - A Heat Diffusion Perspective on Geodesic Preserving Dimensionality
Reduction [66.21060114843202]
We propose a more general heat kernel based manifold embedding method that we call heat geodesic embeddings.
Results show that our method outperforms existing state of the art in preserving ground truth manifold distances.
We also showcase our method on single cell RNA-sequencing datasets with both continuum and cluster structure.
arXiv Detail & Related papers (2023-05-30T13:58:50Z) - VTAE: Variational Transformer Autoencoder with Manifolds Learning [144.0546653941249]
Deep generative models have demonstrated successful applications in learning non-linear data distributions through a number of latent variables.
The nonlinearity of the generator implies that the latent space shows an unsatisfactory projection of the data space, which results in poor representation learning.
We show that geodesics and accurate computation can substantially improve the performance of deep generative models.
arXiv Detail & Related papers (2023-04-03T13:13:19Z) - Convolutional Filtering on Sampled Manifolds [122.06927400759021]
We show that convolutional filtering on a sampled manifold converges to continuous manifold filtering.
Our findings are further demonstrated empirically on a problem of navigation control.
arXiv Detail & Related papers (2022-11-20T19:09:50Z) - Normal-bundle Bootstrap [2.741266294612776]
We present a method that generates new data which preserve the geometric structure of a given data set.
Inspired by algorithms for manifold learning and concepts in differential geometry, our method decomposes the underlying probability measure into a marginalized measure.
We apply our method to the inference of density ridge and related statistics, and data augmentation to reduce overfitting.
arXiv Detail & Related papers (2020-07-27T21:14:19Z) - Manifold Learning via Manifold Deflation [105.7418091051558]
dimensionality reduction methods provide a valuable means to visualize and interpret high-dimensional data.
Many popular methods can fail dramatically, even on simple two-dimensional Manifolds.
This paper presents an embedding method for a novel, incremental tangent space estimator that incorporates global structure as coordinates.
Empirically, we show our algorithm recovers novel and interesting embeddings on real-world and synthetic datasets.
arXiv Detail & Related papers (2020-07-07T10:04:28Z) - Sample complexity and effective dimension for regression on manifolds [13.774258153124205]
We consider the theory of regression on a manifold using kernel reproducing Hilbert space methods.
We show that certain spaces of smooth functions on a manifold are effectively finite-dimensional, with a complexity that scales according to the manifold dimension.
arXiv Detail & Related papers (2020-06-13T14:09:55Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.