A Heat Diffusion Perspective on Geodesic Preserving Dimensionality
Reduction
- URL: http://arxiv.org/abs/2305.19043v1
- Date: Tue, 30 May 2023 13:58:50 GMT
- Title: A Heat Diffusion Perspective on Geodesic Preserving Dimensionality
Reduction
- Authors: Guillaume Huguet, Alexander Tong, Edward De Brouwer, Yanlei Zhang, Guy
Wolf, Ian Adelstein, Smita Krishnaswamy
- Abstract summary: We propose a more general heat kernel based manifold embedding method that we call heat geodesic embeddings.
Results show that our method outperforms existing state of the art in preserving ground truth manifold distances.
We also showcase our method on single cell RNA-sequencing datasets with both continuum and cluster structure.
- Score: 66.21060114843202
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Diffusion-based manifold learning methods have proven useful in
representation learning and dimensionality reduction of modern high
dimensional, high throughput, noisy datasets. Such datasets are especially
present in fields like biology and physics. While it is thought that these
methods preserve underlying manifold structure of data by learning a proxy for
geodesic distances, no specific theoretical links have been established. Here,
we establish such a link via results in Riemannian geometry explicitly
connecting heat diffusion to manifold distances. In this process, we also
formulate a more general heat kernel based manifold embedding method that we
call heat geodesic embeddings. This novel perspective makes clearer the choices
available in manifold learning and denoising. Results show that our method
outperforms existing state of the art in preserving ground truth manifold
distances, and preserving cluster structure in toy datasets. We also showcase
our method on single cell RNA-sequencing datasets with both continuum and
cluster structure, where our method enables interpolation of withheld
timepoints of data. Finally, we show that parameters of our more general method
can be configured to give results similar to PHATE (a state-of-the-art
diffusion based manifold learning method) as well as SNE (an
attraction/repulsion neighborhood based method that forms the basis of t-SNE).
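The link the abstract refers to is a Varadhan-type relation between the heat kernel and geodesic distance, roughly $d_{\mathcal{M}}(x,y)^2 \approx -4t \log h_t(x,y)$ for small $t$. Below is a minimal sketch of how such a heat-geodesic embedding could be assembled on a data graph (kNN affinities, unnormalized graph Laplacian, matrix-exponential heat kernel, then metric MDS on the resulting distances). It is an illustration under those assumptions, not the authors' implementation; the function name heat_geodesic_embedding and the parameters t and knn are hypothetical.

```python
# Sketch only: heat-kernel distances via a Varadhan-style log transform,
# followed by metric MDS. Not the paper's reference implementation.
import numpy as np
from scipy.linalg import expm
from sklearn.neighbors import kneighbors_graph
from sklearn.manifold import MDS

def heat_geodesic_embedding(X, t=1.0, knn=10, n_components=2, eps=1e-12):
    # Symmetric k-nearest-neighbor affinity graph (assumed construction).
    A = kneighbors_graph(X, knn, mode="connectivity", include_self=False).toarray()
    A = np.maximum(A, A.T)
    # Unnormalized graph Laplacian L = D - A.
    L = np.diag(A.sum(axis=1)) - A
    # Graph heat kernel H_t = exp(-t L), a proxy for manifold heat diffusion.
    H = expm(-t * L)
    # Varadhan-style distance estimate d^2 ~ -4t log h_t(x, y).
    D = np.sqrt(np.maximum(-4.0 * t * np.log(np.maximum(H, eps)), 0.0))
    D = 0.5 * (D + D.T)  # enforce symmetry against numerical noise
    # Embed the distance matrix with metric MDS.
    mds = MDS(n_components=n_components, dissimilarity="precomputed")
    return mds.fit_transform(D)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.normal(size=(200, 5))
    Y = heat_geodesic_embedding(X)
    print(Y.shape)  # (200, 2)
```

Varying the diffusion time t and the transform applied to the heat kernel is, per the abstract, what lets the general method interpolate between PHATE-like and SNE-like behavior.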
Related papers
- Pullback Flow Matching on Data Manifolds [10.187244125099479]
Pullback Flow Matching (PFM) is a framework for generative modeling on data manifolds.
We demonstrate PFM's effectiveness through applications to synthetic data, data dynamics, and protein sequence data, generating novel proteins with specific properties.
This method shows strong potential for drug discovery and materials science, where generating novel samples with specific properties is of great interest.
arXiv Detail & Related papers (2024-10-06T16:41:26Z)
- Scaling Riemannian Diffusion Models [68.52820280448991]
We show that our method enables us to scale to high dimensional tasks on nontrivial manifolds.
We model QCD densities on $SU(n)$ lattices and contrastively learned embeddings on high dimensional hyperspheres.
arXiv Detail & Related papers (2023-10-30T21:27:53Z)
- Generative Modeling on Manifolds Through Mixture of Riemannian Diffusion Processes [57.396578974401734]
We introduce a principled framework for building a generative diffusion process on general manifolds.
Instead of following the denoising approach of previous diffusion models, we construct a diffusion process using a mixture of bridge processes.
We develop a geometric understanding of the mixture process, deriving the drift as a weighted mean of tangent directions to the data points.
arXiv Detail & Related papers (2023-10-11T06:04:40Z)
- Manifold-augmented Eikonal Equations: Geodesic Distances and Flows on Differentiable Manifolds [5.0401589279256065]
We show how the geometry of a manifold impacts the distance field, and exploit the geodesic flow to obtain globally length-minimising curves directly.
This work opens opportunities for statistics and reduced-order modelling on differentiable manifolds.
arXiv Detail & Related papers (2023-10-09T21:11:13Z)
- Short and Straight: Geodesics on Differentiable Manifolds [6.85316573653194]
In this work, we first analyse existing methods for computing length-minimising geodesics.
Second, we propose a model-based parameterisation for distance fields and geodesic flows on continuous manifolds.
Third, we develop a curvature-based training mechanism, sampling and scaling points in regions of the manifold exhibiting larger values of the Ricci scalar.
arXiv Detail & Related papers (2023-05-24T15:09:41Z)
- Time-inhomogeneous diffusion geometry and topology [69.55228523791897]
Diffusion condensation is a time-inhomogeneous process where each step first computes and then applies a diffusion operator to the data.
We theoretically analyze the convergence and evolution of this process from geometric, spectral, and topological perspectives.
Our work gives theoretical insights into the convergence of diffusion condensation, and shows that it provides a link between topological and geometric data analysis.
arXiv Detail & Related papers (2022-03-28T16:06:17Z)
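As a rough illustration of the condensation step summarized above (at each step, recompute a diffusion operator from the current coordinates and then apply it to them), here is a minimal sketch. The Gaussian bandwidth epsilon, the fixed step count, and the function name diffusion_condensation are assumptions for illustration, not the paper's exact construction.

```python
# Sketch only: a time-inhomogeneous diffusion-condensation loop.
import numpy as np
from scipy.spatial.distance import cdist

def diffusion_condensation(X, epsilon=1.0, n_steps=20):
    X = X.copy().astype(float)
    trajectory = [X.copy()]
    for _ in range(n_steps):
        # Gaussian affinities from the *current* coordinates (time-inhomogeneous).
        K = np.exp(-cdist(X, X, "sqeuclidean") / epsilon)
        # Row-normalize to obtain a Markov diffusion operator P.
        P = K / K.sum(axis=1, keepdims=True)
        # Apply the operator to the coordinates; points drift toward local means.
        X = P @ X
        trajectory.append(X.copy())
    return trajectory

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = np.vstack([rng.normal(-2, 0.3, (50, 2)), rng.normal(2, 0.3, (50, 2))])
    traj = diffusion_condensation(X, epsilon=0.5)
    print(len(traj), traj[-1].shape)  # 21 (100, 2)
```

Because the operator is rebuilt from the evolving point cloud at every iteration, points progressively collapse toward cluster centers, which is the behavior the cited analysis studies from geometric, spectral, and topological perspectives.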
- Inferring Manifolds From Noisy Data Using Gaussian Processes [17.166283428199634]
Most existing manifold learning algorithms replace the original data with lower dimensional coordinates.
This article proposes a new methodology for addressing these problems, allowing the estimated manifold to be interpolated between fitted data points.
arXiv Detail & Related papers (2021-10-14T15:50:38Z)
- Learning Manifold Implicitly via Explicit Heat-Kernel Learning [63.354671267760516]
We propose the concept of implicit manifold learning, where manifold information is implicitly obtained by learning the associated heat kernel.
The learned heat kernel can be applied to various kernel-based machine learning models, including deep generative models (DGM) for data generation and Stein Variational Gradient Descent for Bayesian inference.
arXiv Detail & Related papers (2020-10-05T03:39:58Z)
- Manifold Learning via Manifold Deflation [105.7418091051558]
Dimensionality reduction methods provide a valuable means to visualize and interpret high-dimensional data.
However, many popular methods can fail dramatically, even on simple two-dimensional manifolds.
This paper presents an embedding method based on a novel, incremental tangent space estimator that incorporates global structure as coordinates.
Empirically, we show our algorithm recovers novel and interesting embeddings on real-world and synthetic datasets.
arXiv Detail & Related papers (2020-07-07T10:04:28Z)
This list is automatically generated from the titles and abstracts of the papers in this site.