Community detection using low-dimensional network embedding algorithms
- URL: http://arxiv.org/abs/2111.05267v1
- Date: Thu, 4 Nov 2021 14:57:43 GMT
- Title: Community detection using low-dimensional network embedding algorithms
- Authors: Aman Barot and Shankar Bhamidi and Souvik Dhara
- Abstract summary: We rigorously understand the performance of two major algorithms, DeepWalk and node2vec, in recovering communities for canonical network models.
We prove that, given some fixed co-occurrence window, node2vec using random walks with a low non-backtracking probability can succeed for much sparser networks.
- Score: 1.052782170493037
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: With the increasing relevance of large networks in important areas such as
the study of contact networks for spread of disease, or social networks for
their impact on geopolitics, it has become necessary to study machine learning
tools that are scalable to very large networks, often containing millions of
nodes. One major class of such scalable algorithms is known as network
representation learning or network embedding. These algorithms try to learn
representations of network functionals (e.g.~nodes) by first running multiple
random walks and then using the number of co-occurrences of each pair of nodes
in observed random walk segments to obtain a low-dimensional representation of
nodes on some Euclidean space. The aim of this paper is to rigorously
understand the performance of two major algorithms, DeepWalk and node2vec, in
recovering communities for canonical network models with ground truth
communities. Depending on the sparsity of the graph, we find the length of the
random walk segments required such that the corresponding observed
co-occurrence window is able to perform almost exact recovery of the underlying
community assignments. We prove that, given some fixed co-occurrence window,
node2vec using random walks with a low non-backtracking probability can succeed
for much sparser networks compared to DeepWalk using simple random walks.
Moreover, if the sparsity parameter is low, we provide evidence that these
algorithms might not succeed in almost exact recovery. The analysis requires
developing general tools for path counting on random networks having an
underlying low-rank structure, which are of independent interest.
Related papers
- Strong and Weak Random Walks on Signed Networks [4.739812980667592]
We propose a signed network random walk which can capture the structure of a network with more than two communities.
The walk results in a similarity matrix which can be used to cluster the nodes into antagonistic communities.
We show through a series of experiments on synthetic and empirical networks that the similarity matrix based on weak walks can be used for both unsupervised and semi-trivial clustering.
arXiv Detail & Related papers (2024-06-12T09:36:20Z) - Sifting out communities in large sparse networks [2.666294200266662]
We introduce an intuitive objective function for quantifying the quality of clustering results in large sparse networks.
We utilize a two-step method for identifying communities which is especially well-suited for this domain.
We identify complex genetic interactions in large-scale networks comprised of tens of thousands of nodes.
arXiv Detail & Related papers (2024-05-01T18:57:41Z) - Learning to Detect Critical Nodes in Sparse Graphs via Feature Importance Awareness [53.351863569314794]
The critical node problem (CNP) aims to find a set of critical nodes from a network whose deletion maximally degrades the pairwise connectivity of the residual network.
This work proposes a feature importance-aware graph attention network for node representation.
It combines it with dueling double deep Q-network to create an end-to-end algorithm to solve CNP for the first time.
arXiv Detail & Related papers (2021-12-03T14:23:05Z) - Unsupervised Domain-adaptive Hash for Networks [81.49184987430333]
Domain-adaptive hash learning has enjoyed considerable success in the computer vision community.
We develop an unsupervised domain-adaptive hash learning method for networks, dubbed UDAH.
arXiv Detail & Related papers (2021-08-20T12:09:38Z) - Artificial Neural Networks generated by Low Discrepancy Sequences [59.51653996175648]
We generate artificial neural networks as random walks on a dense network graph.
Such networks can be trained sparse from scratch, avoiding the expensive procedure of training a dense network and compressing it afterwards.
We demonstrate that the artificial neural networks generated by low discrepancy sequences can achieve an accuracy within reach of their dense counterparts at a much lower computational complexity.
arXiv Detail & Related papers (2021-03-05T08:45:43Z) - Consistency of random-walk based network embedding algorithms [13.214230533788932]
We study the node2vec and DeepWalk algorithms through the perspective of matrix factorization.
Our results indicate a subtle interplay between the sparsity of the observed networks, the window sizes of the random walks, and the convergence rates of the node2vec/DeepWalk embedding.
arXiv Detail & Related papers (2021-01-18T22:49:22Z) - Inductive Graph Embeddings through Locality Encodings [0.42970700836450487]
We look at the problem of finding inductive network embeddings in large networks without domain-dependent node/edge attributes.
We propose to use a set of basic predefined local encodings as the basis of a learning algorithm.
This method achieves state-of-the-art performance in tasks such as role detection, link prediction and node classification.
arXiv Detail & Related papers (2020-09-26T13:09:11Z) - Sketch-based community detection in evolving networks [15.512086254435788]
We consider an approach for community detection in time-varying networks.
At its core, this approach maintains a small sketch graph to capture the essential community structure found in each snapshot of the full network.
We formulate a community detection algorithm which can process a network concurrently exhibiting all processes.
arXiv Detail & Related papers (2020-09-24T17:32:57Z) - ESPN: Extremely Sparse Pruned Networks [50.436905934791035]
We show that a simple iterative mask discovery method can achieve state-of-the-art compression of very deep networks.
Our algorithm represents a hybrid approach between single shot network pruning methods and Lottery-Ticket type approaches.
arXiv Detail & Related papers (2020-06-28T23:09:27Z) - Graph Prototypical Networks for Few-shot Learning on Attributed Networks [72.31180045017835]
We propose a graph meta-learning framework -- Graph Prototypical Networks (GPN)
GPN is able to perform textitmeta-learning on an attributed network and derive a highly generalizable model for handling the target classification task.
arXiv Detail & Related papers (2020-06-23T04:13:23Z) - Detecting Communities in Heterogeneous Multi-Relational Networks:A
Message Passing based Approach [89.19237792558687]
Community is a common characteristic of networks including social networks, biological networks, computer and information networks.
We propose an efficient message passing based algorithm to simultaneously detect communities for all homogeneous networks.
arXiv Detail & Related papers (2020-04-06T17:36:24Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.