Related papers: Scalable Deep Graph Clustering with Random-walk based Self-supervised Learning

Scalable Deep Graph Clustering with Random-walk based Self-supervised Learning

URL: http://arxiv.org/abs/2112.15530v1
Date: Fri, 31 Dec 2021 16:12:23 GMT
Title: Scalable Deep Graph Clustering with Random-walk based Self-supervised Learning
Authors: Xiang Li (1), Dong Li (2), Ruoming Jin (2), Gagan Agrawal (3), Rajiv Ramnath (4) ((1) Ohio State University, (2) Kent State University, (3) Augusta University)
Abstract summary: We show that our scalable deep clustering algorithm, RwSL, can continue to scale beyond graphs with more than one million nodes. We also demonstrate how RwSL could perform node clustering on a graph with 1.8 billion edges using only a single GPU.
Score: 0.0
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Web-based interactions can be frequently represented by an attributed graph, and node clustering in such graphs has received much attention lately. Multiple efforts have successfully applied Graph Convolutional Networks (GCN), though with some limits on accuracy as GCNs have been shown to suffer from over-smoothing issues. Though other methods (particularly those based on Laplacian Smoothing) have reported better accuracy, a fundamental limitation of all the work is a lack of scalability. This paper addresses this open problem by relating the Laplacian smoothing to the Generalized PageRank and applying a random-walk based algorithm as a scalable graph filter. This forms the basis for our scalable deep clustering algorithm, RwSL, where through a self-supervised mini-batch training mechanism, we simultaneously optimize a deep neural network for sample-cluster assignment distribution and an autoencoder for a clustering-oriented embedding. Using 6 real-world datasets and 6 clustering metrics, we show that RwSL achieved improved results over several recent baselines. Most notably, we show that RwSL, unlike all other deep clustering frameworks, can continue to scale beyond graphs with more than one million nodes, i.e., handle web-scale. We also demonstrate how RwSL could perform node clustering on a graph with 1.8 billion edges using only a single GPU.

Related papers

Deep Cut-informed Graph Embedding and Clustering [36.17182061654739]
We propose an innovative and non-GNN-based Deep Cut-informed Graph embedding and Clustering framework, namely DCGC. For the encoding module, we derive a cut-informed graph embedding objective to fuse graph structure and attributes by minimizing their joint normalized cut. For the clustering module, we utilize the optimal transport theory to obtain the clustering assignments.
arXiv Detail & Related papers (2025-03-09T14:24:09Z)
FedGT: Federated Node Classification with Scalable Graph Transformer [27.50698154862779]
We propose a scalable textbfFederated textbfGraph textbfTransformer (textbfFedGT) in the paper. FedGT computes clients' similarity based on the aligned global nodes with optimal transport.
arXiv Detail & Related papers (2024-01-26T21:02:36Z)
Reinforcement Graph Clustering with Unknown Cluster Number [91.4861135742095]
We propose a new deep graph clustering method termed Reinforcement Graph Clustering. In our proposed method, cluster number determination and unsupervised representation learning are unified into a uniform framework. In order to conduct feedback actions, the clustering-oriented reward function is proposed to enhance the cohesion of the same clusters and separate the different clusters.
arXiv Detail & Related papers (2023-08-13T18:12:28Z)
Dink-Net: Neural Clustering on Large Graphs [59.10189693120368]
A deep graph clustering method (Dink-Net) is proposed with the idea of dilation and shrink. By discriminating nodes, whether being corrupted by augmentations, representations are learned in a self-supervised manner. The clustering distribution is optimized by minimizing the proposed cluster dilation loss and cluster shrink loss. Compared to the runner-up, Dink-Net 9.62% achieves NMI improvement on the ogbn-papers100M dataset with 111 million nodes and 1.6 billion edges.
arXiv Detail & Related papers (2023-05-28T15:33:24Z)
EGRC-Net: Embedding-induced Graph Refinement Clustering Network [66.44293190793294]
We propose a novel graph clustering network called Embedding-Induced Graph Refinement Clustering Network (EGRC-Net) EGRC-Net effectively utilizes the learned embedding to adaptively refine the initial graph and enhance the clustering performance. Our proposed methods consistently outperform several state-of-the-art approaches.
arXiv Detail & Related papers (2022-11-19T09:08:43Z)
DOTIN: Dropping Task-Irrelevant Nodes for GNNs [119.17997089267124]
Recent graph learning approaches have introduced the pooling strategy to reduce the size of graphs for learning. We design a new approach called DOTIN (underlineDrunderlineopping underlineTask-underlineIrrelevant underlineNodes) to reduce the size of graphs. Our method speeds up GAT by about 50% on graph-level tasks including graph classification and graph edit distance.
arXiv Detail & Related papers (2022-04-28T12:00:39Z)
SCGC : Self-Supervised Contrastive Graph Clustering [1.1470070927586016]
Graph clustering discovers groups or communities within networks. Deep learning methods such as autoencoders cannot incorporate rich structural information. We propose Self-Supervised Contrastive Graph Clustering (SCGC)
arXiv Detail & Related papers (2022-04-27T01:38:46Z)
ClusterGNN: Cluster-based Coarse-to-Fine Graph Neural Network for Efficient Feature Matching [15.620335576962475]
ClusterGNN is an attentional GNN architecture which operates on clusters for learning the feature matching task. Our approach yields a 59.7% reduction in runtime and 58.4% reduction in memory consumption for dense detection.
arXiv Detail & Related papers (2022-04-25T14:43:15Z)
Git: Clustering Based on Graph of Intensity Topology [25.679620842010422]
We propose a novel clustering algorithm, namely GIT (Clustering Based on textbfGraph of textbfIntensity textbfTopology) With fast local cluster detection, robust topo-graph construction and accurate edge-cutting, GIT shows attractive ARISE performance.
arXiv Detail & Related papers (2021-10-04T09:29:43Z)
Learning Hierarchical Graph Neural Networks for Image Clustering [81.5841862489509]
We propose a hierarchical graph neural network (GNN) model that learns how to cluster a set of images into an unknown number of identities. Our hierarchical GNN uses a novel approach to merge connected components predicted at each level of the hierarchy to form a new graph at the next level.
arXiv Detail & Related papers (2021-07-03T01:28:42Z)
Scaling Graph Neural Networks with Approximate PageRank [64.92311737049054]
We present the PPRGo model which utilizes an efficient approximation of information diffusion in GNNs. In addition to being faster, PPRGo is inherently scalable, and can be trivially parallelized for large datasets like those found in industry settings. We show that training PPRGo and predicting labels for all nodes in this graph takes under 2 minutes on a single machine, far outpacing other baselines on the same graph.
arXiv Detail & Related papers (2020-07-03T09:30:07Z)

This list is automatically generated from the titles and abstracts of the papers in this site.