Disentangled Condensation for Large-scale Graphs
- URL: http://arxiv.org/abs/2401.12231v1
- Date: Thu, 18 Jan 2024 09:59:00 GMT
- Title: Disentangled Condensation for Large-scale Graphs
- Authors: Zhenbang Xiao, Shunyu Liu, Yu Wang, Tongya Zheng, Mingli Song
- Abstract summary: This paper presents Disentangled Condensation for large-scale graphs, abbreviated as DisCo, to provide scalable graph condensation for graphs of varying sizes.
DisCo can successfully scale up to the ogbn-papers100M graph with over 100 million nodes and 1 billion edges with flexible reduction rates.
- Score: 34.0968917267551
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Graph condensation has emerged as an intriguing technique to provide Graph
Neural Networks for large-scale graphs with a more compact yet informative
small graph to save the expensive costs of large-scale graph learning. Despite
the promising results achieved, previous graph condensation methods often
employ an entangled condensation strategy that involves condensing nodes and
edges simultaneously, leading to substantial GPU memory demands. This entangled
strategy has considerably impeded the scalability of graph condensation,
impairing its capability to condense extremely large-scale graphs and produce
condensed graphs with high fidelity. Therefore, this paper presents
Disentangled Condensation for large-scale graphs, abbreviated as DisCo, to
provide scalable graph condensation for graphs of varying sizes. At the heart
of DisCo are two complementary components, namely node and edge condensation
modules, that realize the condensation of nodes and edges in a disentangled
manner. In the node condensation module, we focus on synthesizing condensed
nodes that exhibit a similar node feature distribution to original nodes using
a pre-trained node classification model while incorporating class centroid
alignment and anchor attachment regularizers. After node condensation, in the
edge condensation module, we preserve the topology structure by transferring
the link prediction model of the original graph to the condensed nodes,
generating the corresponding condensed edges. Based on the disentangled
strategy, the proposed DisCo can successfully scale up to the ogbn-papers100M
graph with over 100 million nodes and 1 billion edges with flexible reduction
rates. Extensive experiments on five common datasets further demonstrate that
the proposed DisCo yields results superior to state-of-the-art counterparts by
a significant margin. The source code is available at
https://github.com/BangHonor/DisCo.
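The disentangled pipeline described above can be illustrated with a minimal sketch. This is not the authors' implementation (see the linked repository for that); it is a toy stand-in in which node condensation initializes synthetic nodes near class centroids (mimicking the centroid-alignment regularizer), and edge condensation applies a simple cosine-similarity link "predictor" to the condensed node features (standing in for the transferred link prediction model). The function names and threshold are illustrative assumptions.

```python
import numpy as np

def condense_nodes(X, y, per_class=2, seed=0):
    """Node condensation sketch: synthesize `per_class` condensed nodes
    per class, initialized near each class centroid (a toy stand-in for
    the paper's class-centroid-alignment regularizer)."""
    rng = np.random.default_rng(seed)
    feats, labels = [], []
    for c in np.unique(y):
        centroid = X[y == c].mean(axis=0)
        # small perturbations around the centroid play the role of
        # optimized condensed node features
        synth = centroid + 0.01 * rng.standard_normal((per_class, X.shape[1]))
        feats.append(synth)
        labels.extend([c] * per_class)
    return np.vstack(feats), np.array(labels)

def condense_edges(Xc, threshold=0.9):
    """Edge condensation sketch: a cosine-similarity link 'predictor'
    (standing in for the link prediction model transferred from the
    original graph) generates edges between condensed nodes."""
    normed = Xc / np.linalg.norm(Xc, axis=1, keepdims=True)
    sim = normed @ normed.T
    np.fill_diagonal(sim, 0.0)  # no self-loops
    return np.argwhere(sim > threshold)  # directed (i, j) edge list

# Usage: two well-separated classes condense to 4 nodes whose
# predicted edges stay within each class.
X = np.vstack([np.ones((5, 3)), -np.ones((5, 3))])
y = np.array([0] * 5 + [1] * 5)
Xc, yc = condense_nodes(X, y, per_class=2)
edges = condense_edges(Xc, threshold=0.9)
```

Because nodes and edges are produced in two separate, sequential steps, neither step needs the full original adjacency in GPU memory at once, which is the property that lets the disentangled strategy scale.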
Related papers
- GC-Bench: An Open and Unified Benchmark for Graph Condensation [54.70801435138878]
We develop a comprehensive Graph Condensation Benchmark (GC-Bench) to analyze the performance of graph condensation.
GC-Bench systematically investigates the characteristics of graph condensation in terms of the following dimensions: effectiveness, transferability, and complexity.
We have developed an easy-to-use library for training and evaluating different GC methods to facilitate reproducible research.
arXiv Detail & Related papers (2024-06-30T07:47:34Z)
- Rethinking and Accelerating Graph Condensation: A Training-Free Approach with Class Partition [56.26113670151363]
Graph condensation is a data-centric solution to replace the large graph with a small yet informative condensed graph.
Existing GC methods suffer from intricate optimization processes, necessitating excessive computing resources.
We propose a training-free GC framework termed Class-partitioned Graph Condensation (CGC)
CGC achieves state-of-the-art performance with a more efficient condensation process.
arXiv Detail & Related papers (2024-05-22T14:57:09Z)
- Simple Graph Condensation [30.85754566420301]
Graph condensation involves tuning Graph Neural Networks (GNNs) on a small condensed graph for use on a large-scale original graph.
We introduce the Simple Graph Condensation (SimGC) framework, which aligns the condensed graph with the original graph from the input layer to the prediction layer.
SimGC achieves a significant speedup of up to 10 times compared to existing graph condensation methods.
arXiv Detail & Related papers (2024-03-22T05:04:48Z)
- Navigating Complexity: Toward Lossless Graph Condensation via Expanding Window Matching [26.303436980548174]
Graph condensation aims to reduce the size of a large-scale graph dataset by synthesizing a compact counterpart.
Existing methods often fall short of accurately replicating the original graph for certain datasets.
In this paper, we make the first attempt toward lossless graph condensation by bridging the previously neglected supervision signals.
arXiv Detail & Related papers (2024-02-07T16:32:02Z)
- Two Trades is not Baffled: Condensing Graph via Crafting Rational Gradient Matching [50.30124426442228]
Training on large-scale graphs has achieved remarkable results in graph representation learning, but its cost and storage have raised growing concerns.
We propose a novel graph condensation method named CrafTing RationaL (CTRL), which offers an optimized starting point closer to the original dataset's feature distribution.
arXiv Detail & Related papers (2024-02-07T14:49:10Z)
- Graph Condensation for Inductive Node Representation Learning [59.76374128436873]
We propose mapping-aware graph condensation (MCond)
MCond integrates new nodes into the synthetic graph for inductive representation learning.
On the Reddit dataset, MCond achieves up to 121.5x inference speedup and 55.9x reduction in storage requirements.
arXiv Detail & Related papers (2023-07-29T12:11:14Z)
- Structure-free Graph Condensation: From Large-scale Graphs to Condensed Graph-free Data [91.27527985415007]
Existing graph condensation methods rely on the joint optimization of nodes and structures in the condensed graph.
We advocate a new Structure-Free Graph Condensation paradigm, named SFGC, to distill a large-scale graph into a small-scale graph node set.
arXiv Detail & Related papers (2023-06-05T07:53:52Z)
- Edge but not Least: Cross-View Graph Pooling [76.71497833616024]
This paper presents a cross-view graph pooling (Co-Pooling) method to better exploit crucial graph structure information.
Through cross-view interaction, edge-view pooling and node-view pooling seamlessly reinforce each other to learn more informative graph-level representations.
arXiv Detail & Related papers (2021-09-24T08:01:23Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.