Embedding Signals on Knowledge Graphs with Unbalanced Diffusion Earth
Mover's Distance
- URL: http://arxiv.org/abs/2107.12334v1
- Date: Mon, 26 Jul 2021 17:19:02 GMT
- Title: Embedding Signals on Knowledge Graphs with Unbalanced Diffusion Earth
Mover's Distance
- Authors: Alexander Tong and Guillaume Huguet and Dennis Shung and Amine Natik
and Manik Kuchroo and Guillaume Lajoie and Guy Wolf and Smita Krishnaswamy
- Abstract summary: In modern machine learning it is common to encounter large graphs that arise via interactions or similarities between observations in many domains.
We propose to compare and organize such datasets of graph signals by using an earth mover's distance (EMD) with a geodesic cost over the underlying graph.
In each case, we show that UDEMD-based embeddings find accurate distances that are highly efficient compared to other methods.
- Score: 63.203951161394265
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In modern relational machine learning it is common to encounter large graphs
that arise via interactions or similarities between observations in many
domains. Further, in many cases the target entities for analysis are actually
signals on such graphs. We propose to compare and organize such datasets of
graph signals by using an earth mover's distance (EMD) with a geodesic cost
over the underlying graph. Typically, EMD is computed by optimizing over the
cost of transporting one probability distribution to another over an underlying
metric space. However, this is inefficient when computing the EMD between many
signals. Here, we propose an unbalanced graph earth mover's distance that
efficiently embeds the unbalanced EMD on an underlying graph into an $L^1$
space, whose metric we call unbalanced diffusion earth mover's distance
(UDEMD). This leads us to an efficient nearest neighbors kernel over many
signals defined on a large graph. Next, we show how this gives distances
between graph signals that are robust to noise. Finally, we apply this to
organizing patients based on clinical notes who are modelled as signals on the
SNOMED-CT medical knowledge graph, embedding lymphoblast cells modeled as
signals on a gene graph, and organizing genes modeled as signals over a large
peripheral blood mononuclear (PBMC) cell graph. In each case, we show that
UDEMD-based embeddings find accurate distances that are highly efficient
compared to other methods.
Related papers
- Graph Fourier MMD for Signals on Graphs [67.68356461123219]
We propose a novel distance between distributions and signals on graphs.
GFMMD is defined via an optimal witness function that is both smooth on the graph and maximizes difference in expectation.
We showcase it on graph benchmark datasets as well as on single cell RNA-sequencing data analysis.
arXiv Detail & Related papers (2023-06-05T00:01:17Z) - Stable and Transferable Hyper-Graph Neural Networks [95.07035704188984]
We introduce an architecture for processing signals supported on hypergraphs via graph neural networks (GNNs)
We provide a framework for bounding the stability and transferability error of GNNs across arbitrary graphs via spectral similarity.
arXiv Detail & Related papers (2022-11-11T23:44:20Z) - Tree Mover's Distance: Bridging Graph Metrics and Stability of Graph
Neural Networks [54.225220638606814]
We propose a pseudometric for attributed graphs, the Tree Mover's Distance (TMD), and study its relation to generalization.
First, we show that TMD captures properties relevant to graph classification; a simple TMD-SVM performs competitively with standard GNNs.
Second, we relate TMD to generalization of GNNs under distribution shifts, and show that it correlates well with performance drop under such shifts.
arXiv Detail & Related papers (2022-10-04T21:03:52Z) - AMA-GCN: Adaptive Multi-layer Aggregation Graph Convolutional Network
for Disease Prediction [20.19380805655623]
We propose an encoder that automatically selects the appropriate phenotypic measures according to their spatial distribution.
We also propose a novel graph convolution network architecture using multi-layer aggregation mechanism.
arXiv Detail & Related papers (2021-06-16T12:13:23Z) - Diffusion Earth Mover's Distance and Distribution Embeddings [61.49248071384122]
Diffusion can be computed in $tildeO(n)$ time and is more accurate than similarly fast algorithms such as tree-baseds.
We show Diffusion is fully differentiable, making it amenable to future uses in gradient-descent frameworks such as deep neural networks.
arXiv Detail & Related papers (2021-02-25T13:18:32Z) - A Hierarchical Graph Signal Processing Approach to Inference from
Spatiotemporal Signals [14.416786768268233]
Motivated by the emerging area of graph signal processing (GSP), we introduce a novel method to draw inference from signals.
In this paper we leverage techniques to develop a hierarchical feature extraction approach.
We test our approach on the intracranial EEG (iEEG) data set of the K aggle seizure detection contest.
arXiv Detail & Related papers (2020-10-25T17:08:13Z) - Block-Approximated Exponential Random Graphs [77.4792558024487]
An important challenge in the field of exponential random graphs (ERGs) is the fitting of non-trivial ERGs on large graphs.
We propose an approximative framework to such non-trivial ERGs that result in dyadic independence (i.e., edge independent) distributions.
Our methods are scalable to sparse graphs consisting of millions of nodes.
arXiv Detail & Related papers (2020-02-14T11:42:16Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.