Disentangled Condensation for Large-scale Graphs
- URL: http://arxiv.org/abs/2401.12231v1
- Date: Thu, 18 Jan 2024 09:59:00 GMT
- Title: Disentangled Condensation for Large-scale Graphs
- Authors: Zhenbang Xiao, Shunyu Liu, Yu Wang, Tongya Zheng, Mingli Song
- Abstract summary: This paper presents Disentangled Condensation for large-scale graphs, abbreviated as DisCo, to provide scalable graph condensation for graphs of varying sizes.
DisCo can successfully scale up to the ogbn-papers100M graph with over 100 million nodes and 1 billion edges with flexible reduction rates.
- Score: 34.0968917267551
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Graph condensation has emerged as an intriguing technique to provide Graph
Neural Networks for large-scale graphs with a more compact yet informative
small graph to save the expensive costs of large-scale graph learning. Despite
the promising results achieved, previous graph condensation methods often
employ an entangled condensation strategy that involves condensing nodes and
edges simultaneously, leading to substantial GPU memory demands. This entangled
strategy has considerably impeded the scalability of graph condensation,
impairing its capability to condense extremely large-scale graphs and produce
condensed graphs with high fidelity. Therefore, this paper presents
Disentangled Condensation for large-scale graphs, abbreviated as DisCo, to
provide scalable graph condensation for graphs of varying sizes. At the heart
of DisCo are two complementary components, namely node and edge condensation
modules, that realize the condensation of nodes and edges in a disentangled
manner. In the node condensation module, we focus on synthesizing condensed
nodes that exhibit a similar node feature distribution to original nodes using
a pre-trained node classification model while incorporating class centroid
alignment and anchor attachment regularizers. After node condensation, in the
edge condensation module, we preserve the topology structure by transferring
the link prediction model of the original graph to the condensed nodes,
generating the corresponding condensed edges. Based on the disentangled
strategy, the proposed DisCo can successfully scale up to the ogbn-papers100M
graph with over 100 million nodes and 1 billion edges with flexible reduction
rates. Extensive experiments on five common datasets further demonstrate that
the proposed DisCo yields results superior to state-of-the-art counterparts by
a significant margin. The source code is available at
https://github.com/BangHonor/DisCo.
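The disentangled pipeline described above can be illustrated with a minimal sketch. This is not the authors' implementation (see the linked repository for that); it is a toy stand-in in which node condensation initializes synthetic nodes near class centroids (mimicking the centroid-alignment regularizer), and edge condensation applies a simple cosine-similarity link "predictor" to the condensed node features (standing in for the transferred link prediction model). The function names and threshold are illustrative assumptions.

```python
import numpy as np

def condense_nodes(X, y, per_class=2, seed=0):
    """Node condensation sketch: synthesize `per_class` condensed nodes
    per class, initialized near each class centroid (a toy stand-in for
    the paper's class-centroid-alignment regularizer)."""
    rng = np.random.default_rng(seed)
    feats, labels = [], []
    for c in np.unique(y):
        centroid = X[y == c].mean(axis=0)
        # small perturbations around the centroid play the role of
        # optimized condensed node features
        synth = centroid + 0.01 * rng.standard_normal((per_class, X.shape[1]))
        feats.append(synth)
        labels.extend([c] * per_class)
    return np.vstack(feats), np.array(labels)

def condense_edges(Xc, threshold=0.9):
    """Edge condensation sketch: a cosine-similarity link 'predictor'
    (standing in for the link prediction model transferred from the
    original graph) generates edges between condensed nodes."""
    normed = Xc / np.linalg.norm(Xc, axis=1, keepdims=True)
    sim = normed @ normed.T
    np.fill_diagonal(sim, 0.0)  # no self-loops
    return np.argwhere(sim > threshold)  # directed (i, j) edge list

# Usage: two well-separated classes condense to 4 nodes whose
# predicted edges stay within each class.
X = np.vstack([np.ones((5, 3)), -np.ones((5, 3))])
y = np.array([0] * 5 + [1] * 5)
Xc, yc = condense_nodes(X, y, per_class=2)
edges = condense_edges(Xc, threshold=0.9)
```

Because nodes and edges are produced in two separate, sequential steps, neither step needs the full original adjacency in GPU memory at once, which is the property that lets the disentangled strategy scale.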
Related papers
- GC-Bench: An Open and Unified Benchmark for Graph Condensation [54.70801435138878]
We develop a comprehensive Graph Condensation Benchmark (GC-Bench) to analyze the performance of graph condensation.
GC-Bench systematically investigates the characteristics of graph condensation in terms of the following dimensions: effectiveness, transferability, and complexity.
We have developed an easy-to-use library for training and evaluating different GC methods to facilitate reproducible research.
arXiv Detail & Related papers (2024-06-30T07:47:34Z)
- Rethinking and Accelerating Graph Condensation: A Training-Free Approach with Class Partition [56.26113670151363]
Graph condensation is a data-centric solution to replace the large graph with a small yet informative condensed graph.
Existing GC methods suffer from intricate optimization processes, necessitating excessive computing resources.
We propose a training-free GC framework termed Class-partitioned Graph Condensation (CGC)
CGC achieves state-of-the-art performance with a more efficient condensation process.
arXiv Detail & Related papers (2024-05-22T14:57:09Z)
- Simple Graph Condensation [30.85754566420301]
Graph condensation involves tuning Graph Neural Networks (GNNs) on a small condensed graph for use on a large-scale original graph.
We introduce the Simple Graph Condensation (SimGC) framework, which aligns the condensed graph with the original graph from the input layer to the prediction layer.
SimGC achieves a significant speedup of up to 10 times compared to existing graph condensation methods.
arXiv Detail & Related papers (2024-03-22T05:04:48Z)
- Navigating Complexity: Toward Lossless Graph Condensation via Expanding Window Matching [26.303436980548174]
Graph condensation aims to reduce the size of a large-scale graph dataset by synthesizing a compact counterpart.
Existing methods often fall short of accurately replicating the original graph for certain datasets.
In this paper, we make the first attempt toward lossless graph condensation by bridging the previously neglected supervision signals.
arXiv Detail & Related papers (2024-02-07T16:32:02Z)
- Two Trades is not Baffled: Condensing Graph via Crafting Rational Gradient Matching [50.30124426442228]
Training on large-scale graphs has achieved remarkable results in graph representation learning, but its cost and storage have raised growing concerns.
We propose a novel graph condensation method named CrafTing RationaL (CTRL), which offers an optimized starting point closer to the original dataset's feature distribution.
arXiv Detail & Related papers (2024-02-07T14:49:10Z)
- Graph Condensation for Inductive Node Representation Learning [59.76374128436873]
We propose mapping-aware graph condensation (MCond)
MCond integrates new nodes into the synthetic graph for inductive representation learning.
On the Reddit dataset, MCond achieves up to 121.5x inference speedup and 55.9x reduction in storage requirements.
arXiv Detail & Related papers (2023-07-29T12:11:14Z)
- Structure-free Graph Condensation: From Large-scale Graphs to Condensed Graph-free Data [91.27527985415007]
Existing graph condensation methods rely on the joint optimization of nodes and structures in the condensed graph.
We advocate a new Structure-Free Graph Condensation paradigm, named SFGC, to distill a large-scale graph into a small-scale graph node set.
arXiv Detail & Related papers (2023-06-05T07:53:52Z)
- Edge but not Least: Cross-View Graph Pooling [76.71497833616024]
This paper presents a cross-view graph pooling (Co-Pooling) method to better exploit crucial graph structure information.
Through cross-view interaction, edge-view pooling and node-view pooling seamlessly reinforce each other to learn more informative graph-level representations.
arXiv Detail & Related papers (2021-09-24T08:01:23Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.