Graphlets correct for the topological information missed by random walks
- URL: http://arxiv.org/abs/2405.14194v1
- Date: Thu, 23 May 2024 05:42:38 GMT
- Title: Graphlets correct for the topological information missed by random walks
- Authors: Sam F. L. Windels, Noel Malod-Dognin, Natasa Przulj
- Abstract summary: We introduce orbit adjacencies that quantify the adjacencies of two nodes as co-occurring on a given pair of graphlet orbits.
We prove that random walks on up to k nodes capture only a subset of all the possible orbit adjacencies for up to k-node graphlets.
We find that orbit adjacencies, which include those unseen by random walks, outperform random walk-based adjacencies.
- Score: 0.0
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Random walks are widely used for mining networks due to the computational efficiency of computing them. For instance, graph representation learning learns a d-dimensional embedding space, so that the nodes that tend to co-occur on random walks (a proxy of being in the same network neighborhood) are close in the embedding space. Specific local network topology (i.e., structure) influences the co-occurrence of nodes on random walks, so random walks of limited length capture only partial topological information, hence diminishing the performance of downstream methods. We explicitly capture all topological neighborhood information and improve performance by introducing orbit adjacencies that quantify the adjacencies of two nodes as co-occurring on a given pair of graphlet orbits, which are symmetric positions on graphlets (small, connected, non-isomorphic, induced subgraphs of a large network). Importantly, we mathematically prove that random walks on up to k nodes capture only a subset of all the possible orbit adjacencies for up to k-node graphlets. Furthermore, we enable orbit adjacency-based analysis of networks by developing an efficient GRaphlet-orbit ADjacency COunter (GRADCO), which exhaustively computes all 28 orbit adjacency matrices for up to four-node graphlets. Note that four-node graphlets suffice, because real networks are usually small-world. In large networks on around 20,000 nodes, GRADCO computes the 28 matrices in minutes. On six real networks from various domains, we compare the performance of node-label predictors obtained by using the network embeddings based on our orbit adjacencies to those based on random walks. We find that orbit adjacencies, which include those unseen by random walks, outperform random walk-based adjacencies, demonstrating the importance of the inclusion of the topological neighborhood information that is unseen by random walks.
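To make the idea of an orbit adjacency concrete, here is a minimal Python sketch restricted to three-node graphlets. It is not GRADCO and does not reproduce the paper's 28 matrices; the function name `three_node_orbit_adjacencies`, the matrix names, and the toy graph are all assumptions made for illustration, based only on the definition given in the abstract. It counts, for every node pair, how often both nodes are end points of an induced three-node path and how often both lie on a triangle, and compares this with the number of length-2 walks between them.

```python
import numpy as np

def three_node_orbit_adjacencies(A):
    """Toy orbit adjacencies for 3-node graphlets (illustrative, not GRADCO).

    A is a symmetric 0/1 adjacency matrix with a zero diagonal.
    """
    A = np.asarray(A, dtype=np.int64)
    # walks2[u, v] = number of length-2 walks u -> w -> v, i.e. shared neighbours.
    walks2 = A @ A
    np.fill_diagonal(walks2, 0)          # ignore walks that return to the start
    not_adjacent = 1 - A
    np.fill_diagonal(not_adjacent, 0)
    # u and v are both end points of an induced path u - w - v (no edge u - v).
    path_end_end = walks2 * not_adjacent
    # u and v both sit on a triangle u - w - v - u (edge u - v present).
    triangle = walks2 * A
    return path_end_end, triangle, walks2

# Hypothetical toy graph: triangle 0-1-2 with a pendant node 3 attached to node 2.
A = np.zeros((4, 4), dtype=np.int64)
for u, v in [(0, 1), (1, 2), (0, 2), (2, 3)]:
    A[u, v] = A[v, u] = 1

path_end_end, triangle, walks2 = three_node_orbit_adjacencies(A)
print(path_end_end)
print(triangle)
print(walks2)
```

In this toy setting the length-2 walk counts equal the sum of the two matrices, so the walk count alone does not tell a pair sharing a triangle apart from a pair sitting at the two ends of an open path. This is only meant as intuition for why splitting co-occurrence by orbit pair can carry extra topological information; the paper's formal result concerns random walks and graphlets on up to k nodes.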
Related papers
- Deep Manifold Graph Auto-Encoder for Attributed Graph Embedding [51.75091298017941]
This paper proposes a novel Deep Manifold (Variational) Graph Auto-Encoder (DMVGAE/DMGAE) for attributed graph data.
The proposed method surpasses state-of-the-art baseline algorithms by a significant margin on different downstream tasks across popular datasets.
arXiv Detail & Related papers (2024-01-12T17:57:07Z) - Node Embedding for Homophilous Graphs with ARGEW: Augmentation of Random walks by Graph Edge Weights [2.2935273605606494]
ARGEW is a novel augmentation method for random walks that expands the corpus in such a way that nodes with larger edge weights end up with closer embeddings.
With several real-world networks, we demonstrate that with ARGEW, compared to not using it, the desired pattern that node pairs with larger edge weights have closer embeddings is much clearer.
arXiv Detail & Related papers (2023-08-11T06:19:23Z) - NodeFormer: A Scalable Graph Structure Learning Transformer for Node Classification [70.51126383984555]
We introduce a novel all-pair message passing scheme for efficiently propagating node signals between arbitrary nodes.
The efficient computation is enabled by a kernelized Gumbel-Softmax operator.
Experiments demonstrate the promising efficacy of the method in various tasks including node classification on graphs.
arXiv Detail & Related papers (2023-06-14T09:21:15Z) - Graph Neural Networks with Feature and Structure Aware Random Walk [7.143879014059894]
We show that in typical heterophilous graphs, the edges may be directed, and whether to treat the edges as is or simply make them undirected greatly affects the performance of the GNN models.
We develop a model that adaptively learns the directionality of the graph, and exploits the underlying long-distance correlations between nodes.
arXiv Detail & Related papers (2021-11-19T08:54:21Z) - Inferential SIR-GN: Scalable Graph Representation Learning [0.4699313647907615]
Graph representation learning methods generate numerical vector representations for the nodes in a network.
In this work, we propose Inferential SIR-GN, a model which is pre-trained on random graphs, then computes node representations rapidly.
We demonstrate that the model is able to capture a node's structural role information, and show excellent performance at node and graph classification tasks on unseen networks.
arXiv Detail & Related papers (2021-11-08T20:56:37Z) - Community detection using low-dimensional network embedding algorithms [1.052782170493037]
We rigorously understand the performance of two major algorithms, DeepWalk and node2vec, in recovering communities for canonical network models.
We prove that, given some fixed co-occurrence window, node2vec using random walks with a low non-backtracking probability can succeed for much sparser networks.
arXiv Detail & Related papers (2021-11-04T14:57:43Z) - Degree-Based Random Walk Approach for Graph Embedding [0.0]
A computationally less intensive and node connectivity aware uniform sampling method is proposed.
The advantages of the proposed algorithm become more enhanced when the algorithm is applied to large graphs.
arXiv Detail & Related papers (2021-10-21T19:16:16Z) - Node Similarity Preserving Graph Convolutional Networks [51.520749924844054]
Graph Neural Networks (GNNs) explore the graph structure and node features by aggregating and transforming information within node neighborhoods.
We propose SimP-GCN that can effectively and efficiently preserve node similarity while exploiting graph structure.
We validate the effectiveness of SimP-GCN on seven benchmark datasets including three assortative and four disassortative graphs.
arXiv Detail & Related papers (2020-11-19T04:18:01Z) - Spectral Embedding of Graph Networks [76.27138343125985]
We introduce an unsupervised graph embedding that trades off local node similarity and connectivity, and global structure.
The embedding is based on a generalized graph Laplacian, whose eigenvectors compactly capture both network structure and neighborhood proximity in a single representation.
arXiv Detail & Related papers (2020-09-30T04:59:10Z) - node2coords: Graph Representation Learning with Wasserstein Barycenters [59.07120857271367]
We introduce node2coords, a representation learning algorithm for graphs.
It learns simultaneously a low-dimensional space and coordinates for the nodes in that space.
Experimental results demonstrate that the representations learned with node2coords are interpretable.
arXiv Detail & Related papers (2020-07-31T13:14:25Z) - Block-Approximated Exponential Random Graphs [77.4792558024487]
An important challenge in the field of exponential random graphs (ERGs) is the fitting of non-trivial ERGs on large graphs.
We propose an approximative framework to such non-trivial ERGs that result in dyadic independence (i.e., edge independent) distributions.
Our methods are scalable to sparse graphs consisting of millions of nodes.
arXiv Detail & Related papers (2020-02-14T11:42:16Z)
This list is automatically generated from the titles and abstracts of the papers in this site.