Graph-Skeleton: ~1% Nodes are Sufficient to Represent Billion-Scale
Graph
- URL: http://arxiv.org/abs/2402.09565v2
- Date: Wed, 6 Mar 2024 22:22:33 GMT
- Title: Graph-Skeleton: ~1% Nodes are Sufficient to Represent Billion-Scale
Graph
- Authors: Linfeng Cao, Haoran Deng, Yang Yang, Chunping Wang, Lei Chen
- Abstract summary: We propose a novel Graph-Skeleton1 model, which properly fetches the background nodes, and further condenses the semantic and topological information of background nodes within similar target-background local structures.
In particular, for MAG240M dataset with 0.24 billion nodes, our generated skeleton graph achieves highly comparable performance while only containing 1.8% nodes of the original graph.
- Score: 11.661431088477329
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Due to the ubiquity of graph data on the web, web graph mining has become a
hot research spot. Nonetheless, the prevalence of large-scale web graphs in
real applications poses significant challenges to storage, computational
capacity and graph model design. Despite numerous studies to enhance the
scalability of graph models, a noticeable gap remains between academic research
and practical web graph mining applications. One major cause is that in most
industrial scenarios, only a small part of nodes in a web graph are actually
required to be analyzed, where we term these nodes as target nodes, while
others as background nodes. In this paper, we argue that properly fetching and
condensing the background nodes from massive web graph data might be a more
economical shortcut to tackle the obstacles fundamentally. To this end, we make
the first attempt to study the problem of massive background nodes compression
for target nodes classification. Through extensive experiments, we reveal two
critical roles played by the background nodes in target node classification:
enhancing structural connectivity between target nodes, and feature correlation
with target nodes. Followingthis, we propose a novel Graph-Skeleton1 model,
which properly fetches the background nodes, and further condenses the semantic
and topological information of background nodes within similar
target-background local structures. Extensive experiments on various web graph
datasets demonstrate the effectiveness and efficiency of the proposed method.
In particular, for MAG240M dataset with 0.24 billion nodes, our generated
skeleton graph achieves highly comparable performance while only containing
1.8% nodes of the original graph.
Related papers
- Spectral Greedy Coresets for Graph Neural Networks [61.24300262316091]
The ubiquity of large-scale graphs in node-classification tasks hinders the real-world applications of Graph Neural Networks (GNNs)
This paper studies graph coresets for GNNs and avoids the interdependence issue by selecting ego-graphs based on their spectral embeddings.
Our spectral greedy graph coreset (SGGC) scales to graphs with millions of nodes, obviates the need for model pre-training, and applies to low-homophily graphs.
arXiv Detail & Related papers (2024-05-27T17:52:12Z) - Saliency-Aware Regularized Graph Neural Network [39.82009838086267]
We propose the Saliency-Aware Regularized Graph Neural Network (SAR-GNN) for graph classification.
We first estimate the global node saliency by measuring the semantic similarity between the compact graph representation and node features.
Then the learned saliency distribution is leveraged to regularize the neighborhood aggregation of the backbone.
arXiv Detail & Related papers (2024-01-01T13:44:16Z) - GraphRARE: Reinforcement Learning Enhanced Graph Neural Network with Relative Entropy [21.553180564868306]
GraphRARE is a framework built upon node relative entropy and deep reinforcement learning.
An innovative node relative entropy is used to measure mutual information between node pairs.
A deep reinforcement learning-based algorithm is developed to optimize the graph topology.
arXiv Detail & Related papers (2023-12-15T11:30:18Z) - Finding MNEMON: Reviving Memories of Node Embeddings [39.206574462957136]
We show that an adversary can recover edges with decent accuracy by only gaining access to the node embedding matrix of the original graph.
We demonstrate the effectiveness and applicability of our graph recovery attack through extensive experiments.
arXiv Detail & Related papers (2022-04-14T13:44:26Z) - Edge but not Least: Cross-View Graph Pooling [76.71497833616024]
This paper presents a cross-view graph pooling (Co-Pooling) method to better exploit crucial graph structure information.
Through cross-view interaction, edge-view pooling and node-view pooling seamlessly reinforce each other to learn more informative graph-level representations.
arXiv Detail & Related papers (2021-09-24T08:01:23Z) - Reasoning Graph Networks for Kinship Verification: from Star-shaped to
Hierarchical [85.0376670244522]
We investigate the problem of facial kinship verification by learning hierarchical reasoning graph networks.
We develop a Star-shaped Reasoning Graph Network (S-RGN) to exploit more powerful and flexible capacity.
We also develop a Hierarchical Reasoning Graph Network (H-RGN) to exploit more powerful and flexible capacity.
arXiv Detail & Related papers (2021-09-06T03:16:56Z) - GraphMI: Extracting Private Graph Data from Graph Neural Networks [59.05178231559796]
We present textbfGraph textbfModel textbfInversion attack (GraphMI), which aims to extract private graph data of the training graph by inverting GNN.
Specifically, we propose a projected gradient module to tackle the discreteness of graph edges while preserving the sparsity and smoothness of graph features.
We design a graph auto-encoder module to efficiently exploit graph topology, node attributes, and target model parameters for edge inference.
arXiv Detail & Related papers (2021-06-05T07:07:52Z) - Graph Entropy Guided Node Embedding Dimension Selection for Graph Neural
Networks [74.26734952400925]
We propose a novel Minimum Graph Entropy (MinGE) algorithm for Node Embedding Dimension Selection (NEDS)
MinGE considers both feature entropy and structure entropy on graphs, which are carefully designed according to the characteristics of the rich information in them.
Experiments with popular Graph Neural Networks (GNNs) on benchmark datasets demonstrate the effectiveness and generalizability of our proposed MinGE.
arXiv Detail & Related papers (2021-05-07T11:40:29Z) - Co-embedding of Nodes and Edges with Graph Neural Networks [13.020745622327894]
Graph embedding is a way to transform and encode the data structure in high dimensional and non-Euclidean feature space.
CensNet is a general graph embedding framework, which embeds both nodes and edges to a latent feature space.
Our approach achieves or matches the state-of-the-art performance in four graph learning tasks.
arXiv Detail & Related papers (2020-10-25T22:39:31Z) - Multilevel Graph Matching Networks for Deep Graph Similarity Learning [79.3213351477689]
We propose a multi-level graph matching network (MGMN) framework for computing the graph similarity between any pair of graph-structured objects.
To compensate for the lack of standard benchmark datasets, we have created and collected a set of datasets for both the graph-graph classification and graph-graph regression tasks.
Comprehensive experiments demonstrate that MGMN consistently outperforms state-of-the-art baseline models on both the graph-graph classification and graph-graph regression tasks.
arXiv Detail & Related papers (2020-07-08T19:48:19Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.