Accurately Modeling Biased Random Walks on Weighted Graphs Using
$\textit{Node2vec+}$
- URL: http://arxiv.org/abs/2109.08031v1
- Date: Wed, 15 Sep 2021 17:59:25 GMT
- Title: Accurately Modeling Biased Random Walks on Weighted Graphs Using
$\textit{Node2vec+}$
- Authors: Renming Liu, Matthew Hirn, Arjun Krishnan
- Abstract summary: We extend $textitnode2vec$ to $textitnode2vec+$ in a way that accounts for edge weights when calculating walk biases.
We show that $textitnode2vec+$ is more robust to additive noise than $textitnode2vec$ in weighted graphs.
We also demonstrate that $textitnode2vec+$ significantly outperforms $textitnode2vec$ on a commonly benchmarked multi-label dataset.
- Score: 0.0
- License: http://creativecommons.org/licenses/by-sa/4.0/
- Abstract: Node embedding is a powerful approach for representing the structural role of
each node in a graph. $\textit{Node2vec}$ is a widely used method for node
embedding that works by exploring the local neighborhoods via biased random
walks on the graph. However, $\textit{node2vec}$ does not consider edge weights
when computing walk biases. This intrinsic limitation prevents
$\textit{node2vec}$ from leveraging all the information in weighted graphs and,
in turn, limits its application to many real-world networks that are weighted
and dense. Here, we naturally extend $\textit{node2vec}$ to
$\textit{node2vec+}$ in a way that accounts for edge weights when calculating
walk biases, but which reduces to $\textit{node2vec}$ in the cases of
unweighted graphs or unbiased walks. We empirically show that
$\textit{node2vec+}$ is more robust to additive noise than $\textit{node2vec}$
in weighted graphs using two synthetic datasets. We also demonstrate that
$\textit{node2vec+}$ significantly outperforms $\textit{node2vec}$ on a
commonly benchmarked multi-label dataset (Wikipedia). Furthermore, we test
$\textit{node2vec+}$ against GCN and GraphSAGE using various challenging gene
classification tasks on two protein-protein interaction networks. Despite some
clear advantages of GCN and GraphSAGE, they show comparable performance with
$\textit{node2vec+}$. Finally, $\textit{node2vec+}$ can be used as a general
approach for generating biased random walks, benefiting all existing methods
built on top of $\textit{node2vec}$. $\textit{Node2vec+}$ is implemented as
part of $\texttt{PecanPy}$, which is available at
https://github.com/krishnanlab/PecanPy .
Related papers
- Improved convergence rate of kNN graph Laplacians [11.93971616098517]
General class of $k$NN graph where the graph affinity is $W_ij = epsilon-d/2 .
We prove the point-wise convergence of the $k$NN graph Laplacian to the limiting manifold operator.
arXiv Detail & Related papers (2024-10-30T17:01:00Z) - Graph Unfolding and Sampling for Transitory Video Summarization via Gershgorin Disc Alignment [48.137527345353625]
User-generated videos (UGVs) uploaded from mobile phones to social media sites like YouTube and TikTok are short and non-repetitive.
We summarize a transitory UGV into several discs in linear time via fast graph sampling based on Gershgorin disc alignment (GDA)
We show that our algorithm achieves comparable or better video summarization performance compared to state-of-the-art methods, at a substantially reduced complexity.
arXiv Detail & Related papers (2024-08-03T20:08:02Z) - Neural network learns low-dimensional polynomials with SGD near the information-theoretic limit [75.4661041626338]
We study the problem of gradient descent learning of a single-index target function $f_*(boldsymbolx) = textstylesigma_*left(langleboldsymbolx,boldsymbolthetarangleright)$<n>We prove that a two-layer neural network optimized by an SGD-based algorithm learns $f_*$ with a complexity that is not governed by information exponents.
arXiv Detail & Related papers (2024-06-03T17:56:58Z) - Graph Sparsification via Mixture of Graphs [67.40204130771967]
We introduce Mixture-of-Graphs (MoG) to dynamically select tailored pruning solutions for each node.
MoG incorporates multiple sparsifier experts, each characterized by unique sparsity levels and pruning criteria, and selects the appropriate experts for each node.
Experiments on four large-scale OGB datasets and two superpixel datasets, equipped with five GNNs, demonstrate that MoG identifies subgraphs at higher sparsity levels.
arXiv Detail & Related papers (2024-05-23T07:40:21Z) - Minimizing Polarization in Noisy Leader-Follower Opinion Dynamics [17.413930396663833]
We consider a problem of polarization optimization for the leader-follower opinion dynamics in a noisy social network.
We adopt the popular leader-follower DeGroot model, where the opinion of every leader is identical and remains unchanged, while the opinion of every follower is subject to white noise.
We show that the objective function is monotone and supermodular. We then propose a simple greedy algorithm with an approximation factor $1-1/e$ that approximately solves the problem in $O((n-q)3)$ time.
arXiv Detail & Related papers (2023-08-14T08:52:24Z) - Node Embedding for Homophilous Graphs with ARGEW: Augmentation of Random
walks by Graph Edge Weights [2.2935273605606494]
ARGEW is a novel augmentation method for random walks that expands the corpus in such a way that nodes with larger edge weights end up with closer embeddings.
With several real-world networks, we demonstrate that with ARGEW, compared to not using it, the desired pattern that node pairs with larger edge weights have closer embeddings is much clearer.
arXiv Detail & Related papers (2023-08-11T06:19:23Z) - $\ell_p$-Regression in the Arbitrary Partition Model of Communication [59.89387020011663]
We consider the randomized communication complexity of the distributed $ell_p$-regression problem in the coordinator model.
For $p = 2$, i.e., least squares regression, we give the first optimal bound of $tildeTheta(sd2 + sd/epsilon)$ bits.
For $p in (1,2)$,we obtain an $tildeO(sd2/epsilon + sd/mathrmpoly(epsilon)$ upper bound.
arXiv Detail & Related papers (2023-07-11T08:51:53Z) - Fast Online Node Labeling for Very Large Graphs [11.700626862639131]
Current methods either invert a graph kernel runtime matrix with $mathcalO(n3)$ or $mathcalO(n2)$ space complexity or sample a large volume of random spanning trees.
We propose an improvement based on the textitonline relaxation technique introduced by a series of works.
arXiv Detail & Related papers (2023-05-25T17:13:08Z) - Detection of Dense Subhypergraphs by Low-Degree Polynomials [72.4451045270967]
Detection of a planted dense subgraph in a random graph is a fundamental statistical and computational problem.
We consider detecting the presence of a planted $Gr(ngamma, n-alpha)$ subhypergraph in a $Gr(n, n-beta) hypergraph.
Our results are already new in the graph case $r=2$, as we consider the subtle log-density regime where hardness based on average-case reductions is not known.
arXiv Detail & Related papers (2023-04-17T10:38:08Z) - Dynamic Ranking and Translation Synchronization [3.946250592943285]
We study an extension of the emphtranslation synchronization problem, to the dynamic setting.
We propose two estimators -- one based on a smoothness-penalized least squares approach and the other based on projection onto the low frequency eigenspace of a suitable smoothness operator.
arXiv Detail & Related papers (2022-07-04T14:45:12Z) - Supervised Training of Conditional Monge Maps [107.78770597815242]
Optimal transport (OT) theory describes general principles to define and select, among many possible choices, the most efficient way to map a probability measure onto another.
We introduce CondOT, a multi-task approach to estimate a family of OT maps conditioned on a context variable.
We demonstrate the ability of CondOT to infer the effect of an arbitrary combination of genetic or therapeutic perturbations on single cells.
arXiv Detail & Related papers (2022-06-28T19:34:44Z) - Random Graph Matching in Geometric Models: the Case of Complete Graphs [21.689343447798677]
This paper studies the problem of matching two complete graphs with edge weights correlated through latent geometries.
We derive an approximate maximum likelihood estimator, which provably achieves, with high probability, perfect recovery of $pi*$.
As a side discovery, we show that the celebrated spectral algorithm of [Ume88] emerges as a further approximation to the maximum likelihood in the geometric model.
arXiv Detail & Related papers (2022-02-22T04:14:45Z) - AnchorGAE: General Data Clustering via $O(n)$ Bipartite Graph
Convolution [79.44066256794187]
We show how to convert a non-graph dataset into a graph by introducing the generative graph model, which is used to build graph convolution networks (GCNs)
A bipartite graph constructed by anchors is updated dynamically to exploit the high-level information behind data.
We theoretically prove that the simple update will lead to degeneration and a specific strategy is accordingly designed.
arXiv Detail & Related papers (2021-11-12T07:08:13Z) - Testing network correlation efficiently via counting trees [19.165942326142538]
We propose a new procedure for testing whether two networks are edge-correlated through some latent correspondence.
The test statistic is based on counting the co-occurrences of signed trees for a family of non-isomorphic trees.
arXiv Detail & Related papers (2021-10-22T14:47:20Z) - Adversarial Linear Contextual Bandits with Graph-Structured Side
Observations [80.95090605985042]
A learning agent repeatedly chooses from a set of $K$ actions after being presented with a $d$-dimensional context vector.
The agent incurs and observes the loss of the chosen action, but also observes the losses of its neighboring actions in the observation structures.
Two efficient algorithms are developed based on textttEXP3.
arXiv Detail & Related papers (2020-12-10T15:40:07Z) - Few-Shot Learning via Learning the Representation, Provably [115.7367053639605]
This paper studies few-shot learning via representation learning.
One uses $T$ source tasks with $n_1$ data per task to learn a representation in order to reduce the sample complexity of a target task.
arXiv Detail & Related papers (2020-02-21T17:30:00Z) - Curse of Dimensionality on Randomized Smoothing for Certifiable
Robustness [151.67113334248464]
We show that extending the smoothing technique to defend against other attack models can be challenging.
We present experimental results on CIFAR to validate our theory.
arXiv Detail & Related papers (2020-02-08T22:02:14Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.