Accurately Modeling Biased Random Walks on Weighted Graphs Using
$\textit{Node2vec+}$
- URL: http://arxiv.org/abs/2109.08031v1
- Date: Wed, 15 Sep 2021 17:59:25 GMT
- Title: Accurately Modeling Biased Random Walks on Weighted Graphs Using
$\textit{Node2vec+}$
- Authors: Renming Liu, Matthew Hirn, Arjun Krishnan
- Abstract summary: We extend $textitnode2vec$ to $textitnode2vec+$ in a way that accounts for edge weights when calculating walk biases.
We show that $textitnode2vec+$ is more robust to additive noise than $textitnode2vec$ in weighted graphs.
We also demonstrate that $textitnode2vec+$ significantly outperforms $textitnode2vec$ on a commonly benchmarked multi-label dataset.
- Score: 0.0
- License: http://creativecommons.org/licenses/by-sa/4.0/
- Abstract: Node embedding is a powerful approach for representing the structural role of
each node in a graph. $\textit{Node2vec}$ is a widely used method for node
embedding that works by exploring the local neighborhoods via biased random
walks on the graph. However, $\textit{node2vec}$ does not consider edge weights
when computing walk biases. This intrinsic limitation prevents
$\textit{node2vec}$ from leveraging all the information in weighted graphs and,
in turn, limits its application to many real-world networks that are weighted
and dense. Here, we naturally extend $\textit{node2vec}$ to
$\textit{node2vec+}$ in a way that accounts for edge weights when calculating
walk biases, but which reduces to $\textit{node2vec}$ in the cases of
unweighted graphs or unbiased walks. We empirically show that
$\textit{node2vec+}$ is more robust to additive noise than $\textit{node2vec}$
in weighted graphs using two synthetic datasets. We also demonstrate that
$\textit{node2vec+}$ significantly outperforms $\textit{node2vec}$ on a
commonly benchmarked multi-label dataset (Wikipedia). Furthermore, we test
$\textit{node2vec+}$ against GCN and GraphSAGE using various challenging gene
classification tasks on two protein-protein interaction networks. Despite some
clear advantages of GCN and GraphSAGE, they show comparable performance with
$\textit{node2vec+}$. Finally, $\textit{node2vec+}$ can be used as a general
approach for generating biased random walks, benefiting all existing methods
built on top of $\textit{node2vec}$. $\textit{Node2vec+}$ is implemented as
part of $\texttt{PecanPy}$, which is available at
https://github.com/krishnanlab/PecanPy .
Related papers
- Graph Sparsification via Mixture of Graphs [70.78196524207466]
We introduce Mixture-of-Graphs (MoG) to dynamically select tailored pruning solutions for each node.
MoG incorporates multiple sparsifier experts, each characterized by unique sparsity levels and pruning criteria, and selects the appropriate experts for each node.
Experiments on four large-scale OGB datasets and two superpixel datasets, equipped with five GNNs, demonstrate that MoG identifies subgraphs at higher sparsity levels.
arXiv Detail & Related papers (2024-05-23T07:40:21Z) - Node Embedding for Homophilous Graphs with ARGEW: Augmentation of Random
walks by Graph Edge Weights [2.2935273605606494]
ARGEW is a novel augmentation method for random walks that expands the corpus in such a way that nodes with larger edge weights end up with closer embeddings.
With several real-world networks, we demonstrate that with ARGEW, compared to not using it, the desired pattern that node pairs with larger edge weights have closer embeddings is much clearer.
arXiv Detail & Related papers (2023-08-11T06:19:23Z) - Fast Online Node Labeling for Very Large Graphs [11.700626862639131]
Current methods either invert a graph kernel runtime matrix with $mathcalO(n3)$ or $mathcalO(n2)$ space complexity or sample a large volume of random spanning trees.
We propose an improvement based on the textitonline relaxation technique introduced by a series of works.
arXiv Detail & Related papers (2023-05-25T17:13:08Z) - Detection of Dense Subhypergraphs by Low-Degree Polynomials [72.4451045270967]
Detection of a planted dense subgraph in a random graph is a fundamental statistical and computational problem.
We consider detecting the presence of a planted $Gr(ngamma, n-alpha)$ subhypergraph in a $Gr(n, n-beta) hypergraph.
Our results are already new in the graph case $r=2$, as we consider the subtle log-density regime where hardness based on average-case reductions is not known.
arXiv Detail & Related papers (2023-04-17T10:38:08Z) - Dynamic Ranking and Translation Synchronization [3.946250592943285]
We study an extension of the emphtranslation synchronization problem, to the dynamic setting.
We propose two estimators -- one based on a smoothness-penalized least squares approach and the other based on projection onto the low frequency eigenspace of a suitable smoothness operator.
arXiv Detail & Related papers (2022-07-04T14:45:12Z) - Supervised Training of Conditional Monge Maps [107.78770597815242]
Optimal transport (OT) theory describes general principles to define and select, among many possible choices, the most efficient way to map a probability measure onto another.
We introduce CondOT, a multi-task approach to estimate a family of OT maps conditioned on a context variable.
We demonstrate the ability of CondOT to infer the effect of an arbitrary combination of genetic or therapeutic perturbations on single cells.
arXiv Detail & Related papers (2022-06-28T19:34:44Z) - Random Graph Matching in Geometric Models: the Case of Complete Graphs [21.689343447798677]
This paper studies the problem of matching two complete graphs with edge weights correlated through latent geometries.
We derive an approximate maximum likelihood estimator, which provably achieves, with high probability, perfect recovery of $pi*$.
As a side discovery, we show that the celebrated spectral algorithm of [Ume88] emerges as a further approximation to the maximum likelihood in the geometric model.
arXiv Detail & Related papers (2022-02-22T04:14:45Z) - AnchorGAE: General Data Clustering via $O(n)$ Bipartite Graph
Convolution [79.44066256794187]
We show how to convert a non-graph dataset into a graph by introducing the generative graph model, which is used to build graph convolution networks (GCNs)
A bipartite graph constructed by anchors is updated dynamically to exploit the high-level information behind data.
We theoretically prove that the simple update will lead to degeneration and a specific strategy is accordingly designed.
arXiv Detail & Related papers (2021-11-12T07:08:13Z) - Testing network correlation efficiently via counting trees [19.165942326142538]
We propose a new procedure for testing whether two networks are edge-correlated through some latent correspondence.
The test statistic is based on counting the co-occurrences of signed trees for a family of non-isomorphic trees.
arXiv Detail & Related papers (2021-10-22T14:47:20Z) - Nearly Horizon-Free Offline Reinforcement Learning [97.36751930393245]
We revisit offline reinforcement learning on episodic time-homogeneous Markov Decision Processes with $S$ states, $A$ actions and planning horizon $H$.
We obtain the first set of nearly $H$-free sample complexity bounds for evaluation and planning using the empirical MDPs.
arXiv Detail & Related papers (2021-03-25T18:52:17Z) - Adversarial Linear Contextual Bandits with Graph-Structured Side
Observations [80.95090605985042]
A learning agent repeatedly chooses from a set of $K$ actions after being presented with a $d$-dimensional context vector.
The agent incurs and observes the loss of the chosen action, but also observes the losses of its neighboring actions in the observation structures.
Two efficient algorithms are developed based on textttEXP3.
arXiv Detail & Related papers (2020-12-10T15:40:07Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.