Diffusion Earth Mover's Distance and Distribution Embeddings
- URL: http://arxiv.org/abs/2102.12833v1
- Date: Thu, 25 Feb 2021 13:18:32 GMT
- Title: Diffusion Earth Mover's Distance and Distribution Embeddings
- Authors: Alexander Tong, Guillaume Huguet, Amine Natik, Kincaid MacDonald,
Manik Kuchroo, Ronald Coifman, Guy Wolf, Smita Krishnaswamy
- Abstract summary: Diffusion EMD can be computed in $\tilde{O}(n)$ time and is more accurate than similarly fast algorithms such as tree-based EMDs.
We show Diffusion EMD is fully differentiable, making it amenable to future uses in gradient-descent frameworks such as deep neural networks.
- Score: 61.49248071384122
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We propose a new fast method of measuring distances between large numbers of
related high dimensional datasets called the Diffusion Earth Mover's Distance
(EMD). We model the datasets as distributions supported on a common data graph
that is derived from the affinity matrix computed on the combined data. In such
cases where the graph is a discretization of an underlying Riemannian closed
manifold, we prove that Diffusion EMD is topologically equivalent to the
standard EMD with a geodesic ground distance. Diffusion EMD can be computed in
$\tilde{O}(n)$ time and is more accurate than similarly fast algorithms such as
tree-based EMDs. We also show Diffusion EMD is fully differentiable, making it
amenable to future uses in gradient-descent frameworks such as deep neural
networks. Finally, we demonstrate an application of Diffusion EMD to single
cell data collected from 210 COVID-19 patient samples at Yale New Haven
Hospital. Here, Diffusion EMD can derive distances between patients on the
manifold of cells at least two orders of magnitude faster than equally accurate
methods. This distance matrix between patients can be embedded into a higher
level patient manifold which uncovers structure and heterogeneity in patients.
More generally, Diffusion EMD is applicable to all datasets that are massively
collected in parallel in many medical and biological systems.
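As a rough illustration of the idea in the abstract (datasets treated as distributions on one graph built from the combined data, then compared across diffusion scales), here is a minimal sketch. It is not the authors' implementation: the function name `diffusion_emd_sketch`, the Gaussian kernel, the dyadic scales, and the `alpha` weighting are all assumptions made for illustration, and this dense NumPy version does not achieve the $\tilde{O}(n)$ running time stated above.

```python
import numpy as np

def diffusion_emd_sketch(points, labels, n_scales=4, alpha=0.5, sigma=1.0):
    """Illustrative multiscale diffusion distance between labelled datasets.

    points: (n, d) array holding the combined data of all datasets.
    labels: length-n array saying which dataset each point belongs to.
    All parameter names and defaults are hypothetical choices.
    """
    points = np.asarray(points, dtype=float)
    labels = np.asarray(labels)

    # Gaussian affinity matrix computed on the combined data.
    sq_dists = ((points[:, None, :] - points[None, :, :]) ** 2).sum(axis=-1)
    affinity = np.exp(-sq_dists / (2.0 * sigma ** 2))

    # Column-stochastic diffusion operator: P @ mu keeps total mass equal to 1.
    P = affinity / affinity.sum(axis=0, keepdims=True)

    # One uniform distribution per dataset, supported on that dataset's points.
    groups = np.unique(labels)
    mu = np.stack([(labels == g) / (labels == g).sum() for g in groups], axis=1)

    feats = []
    P_pow = P
    prev = P_pow @ mu                      # diffusion at scale 2^0
    for k in range(n_scales):
        P_pow = P_pow @ P_pow              # now P^(2^(k+1))
        nxt = P_pow @ mu
        weight = 2.0 ** (-(n_scales - k - 1) * alpha)
        feats.append(weight * (nxt - prev))  # weighted difference of scales
        prev = nxt
    feats.append(prev)                     # coarsest diffusion itself

    # One embedding vector per dataset; the distance is L1 between embeddings.
    emb = np.concatenate(feats, axis=0).T
    dist = np.abs(emb[:, None, :] - emb[None, :, :]).sum(axis=-1)
    return groups, dist
```

Because every step is built from matrix products, differences, and an L1 norm, a sketch like this could be ported to an automatic-differentiation framework, which is in the spirit of the differentiability claim in the abstract.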
Related papers
- Robust Fiber ODF Estimation Using Deep Constrained Spherical Deconvolution for Diffusion MRI [7.9283612449524155]
A common practice to model the measured DW-MRI signal is via fiber orientation distribution function (fODF).
Measurement variabilities (e.g., inter- and intra-site variability, hardware performance, and sequence design) are inevitable during the acquisition of DW-MRI.
Most existing model-based methods (e.g., constrained spherical deconvolution (CSD)) and learning based methods (e.g., deep learning (DL)) do not explicitly consider such variabilities in fODF modeling.
We propose a novel data-driven deep constrained spherical deconvolution method to
arXiv Detail & Related papers (2023-06-05T14:06:40Z)
- A Heat Diffusion Perspective on Geodesic Preserving Dimensionality Reduction [66.21060114843202]
We propose a more general heat kernel based manifold embedding method that we call heat geodesic embeddings.
Results show that our method outperforms existing state of the art in preserving ground truth manifold distances.
We also showcase our method on single cell RNA-sequencing datasets with both continuum and cluster structure.
arXiv Detail & Related papers (2023-05-30T13:58:50Z)
- Higher Order Gauge Equivariant CNNs on Riemannian Manifolds and Applications [7.322121417864824]
We introduce a higher order generalization of the gauge equivariant convolution, dubbed a gauge equivariant Volterra network (GEVNet).
This allows us to model spatially extended nonlinear interactions within a given field while still maintaining equivariance to global isometries.
In the neuroimaging data experiments, the resulting two-part architecture is used to automatically discriminate between patients with Lewy Body Disease (DLB), Alzheimer's Disease (AD) and Parkinson's Disease (PD) from diffusion magnetic resonance images (dMRI).
arXiv Detail & Related papers (2023-05-26T06:02:31Z)
- Score Approximation, Estimation and Distribution Recovery of Diffusion Models on Low-Dimensional Data [68.62134204367668]
This paper studies score approximation, estimation, and distribution recovery of diffusion models, when data are supported on an unknown low-dimensional linear subspace.
We show that with a properly chosen neural network architecture, the score function can be both accurately approximated and efficiently estimated.
The generated distribution based on the estimated score function captures the data geometric structures and converges to a close vicinity of the data distribution.
arXiv Detail & Related papers (2023-02-14T17:02:35Z)
- Bayesian Hyperbolic Multidimensional Scaling [2.5944208050492183]
We propose a Bayesian approach to multidimensional scaling when the low-dimensional manifold is hyperbolic.
A case-control likelihood approximation allows for efficient sampling from the posterior distribution in larger data settings.
We evaluate the proposed method against state-of-the-art alternatives using simulations, canonical reference datasets, Indian village network data, and human gene expression data.
arXiv Detail & Related papers (2022-10-26T23:34:30Z)
- Embedding Signals on Knowledge Graphs with Unbalanced Diffusion Earth Mover's Distance [63.203951161394265]
In modern machine learning it is common to encounter large graphs that arise via interactions or similarities between observations in many domains.
We propose to compare and organize such datasets of graph signals by using an earth mover's distance (EMD) with a geodesic cost over the underlying graph.
In each case, we show that UDEMD-based embeddings find accurate distances that are highly efficient compared to other methods.
arXiv Detail & Related papers (2021-07-26T17:19:02Z)
- Federated Deep AUC Maximization for Heterogeneous Data with a Constant Communication Complexity [77.78624443410216]
We propose improved FDAM algorithms for detecting heterogeneous chest data.
A result of this paper is that the communication of the proposed algorithm is strongly independent of the number of machines and also independent of the accuracy level.
Experiments have demonstrated the effectiveness of our FDAM algorithm on benchmark datasets and on medical chest Xray images from different organizations.
arXiv Detail & Related papers (2021-02-09T04:05:19Z)
- ManifoldNorm: Extending normalizations on Riemannian Manifolds [18.073864874996534]
We propose a general normalization technique for manifold-valued data.
We show that our proposed manifold normalization technique has special cases including the popular batch norm and group norm techniques.
arXiv Detail & Related papers (2020-03-30T23:45:43Z)
- Multifold Acceleration of Diffusion MRI via Slice-Interleaved Diffusion Encoding (SIDE) [50.65891535040752]
We propose a diffusion encoding scheme, called Slice-Interleaved Diffusion Encoding (SIDE), that interleaves each diffusion-weighted (DW) image volume with slices encoded with different diffusion gradients.
We also present a method based on deep learning for effective reconstruction of DW images from the highly slice-undersampled data.
arXiv Detail & Related papers (2020-02-25T14:48:17Z)
This list is automatically generated from the titles and abstracts of the papers in this site.