Related papers: Rethinking and Accelerating Graph Condensation: A Training-Free Approach with Class Partition

Rethinking and Accelerating Graph Condensation: A Training-Free Approach with Class Partition

URL: http://arxiv.org/abs/2405.13707v2
Date: Thu, 23 Jan 2025 08:49:04 GMT
Title: Rethinking and Accelerating Graph Condensation: A Training-Free Approach with Class Partition
Authors: Xinyi Gao, Guanhua Ye, Tong Chen, Wentao Zhang, Junliang Yu, Hongzhi Yin,
Abstract summary: Graph condensation is a data-centric solution to replace the large graph with a small yet informative condensed graph.<n>Existing GC methods suffer from intricate optimization processes, necessitating excessive computing resources and training time.<n>We propose a training-free GC framework termed Class-partitioned Graph Condensation (CGC)<n>CGC condenses the Ogbn-products graph within 30 seconds, achieving a speedup ranging from $102$X to $104$X and increasing accuracy by up to 4.2%.
Score: 49.41718583061147
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: The increasing prevalence of large-scale graphs poses a significant challenge for graph neural network training, attributed to their substantial computational requirements. In response, graph condensation (GC) emerges as a promising data-centric solution aiming to substitute the large graph with a small yet informative condensed graph to facilitate data-efficient GNN training. However, existing GC methods suffer from intricate optimization processes, necessitating excessive computing resources and training time. In this paper, we revisit existing GC optimization strategies and identify two pervasive issues therein: (1) various GC optimization strategies converge to coarse-grained class-level node feature matching between the original and condensed graphs; (2) existing GC methods rely on a Siamese graph network architecture that requires time-consuming bi-level optimization with iterative gradient computations. To overcome these issues, we propose a training-free GC framework termed Class-partitioned Graph Condensation (CGC), which refines the node distribution matching from the class-to-class paradigm into a novel class-to-node paradigm, transforming the GC optimization into a class partition problem which can be efficiently solved by any clustering methods. Moreover, CGC incorporates a pre-defined graph structure to enable a closed-form solution for condensed node features, eliminating the need for back-and-forth gradient descent in existing GC approaches. Extensive experiments demonstrate that CGC achieves an exceedingly efficient condensation process with advanced accuracy. Compared with the state-of-the-art GC methods, CGC condenses the Ogbn-products graph within 30 seconds, achieving a speedup ranging from $10^2$X to $10^4$X and increasing accuracy by up to 4.2%.

Related papers

Efficient Graph Condensation via Gaussian Process [11.304327316816561]
Graph condensation reduces the size of large graphs while preserving performance. Existing methods often rely on bi-level optimization, requiring extensive GNN training and limiting their scalability. This paper proposes Graph Condensation via Gaussian Process (GCGP), a novel and computationally efficient approach to graph condensation.
arXiv Detail & Related papers (2025-01-05T14:43:07Z)
A Greedy Strategy for Graph Cut [95.2841574410968]
We propose a greedy strategy to solve the problem of Graph Cut, called GGC. It starts from the state where each data sample is regarded as a cluster and dynamically merges the two clusters. GGC has a nearly linear computational complexity with respect to the number of samples.
arXiv Detail & Related papers (2024-12-28T05:49:42Z)
Training-free Heterogeneous Graph Condensation via Data Selection [74.06562124781104]
We present the first Training underlineFree Heterogeneous Graph Condensation method, termed FreeHGC, facilitating both efficient and high-quality generation of heterogeneous condensed graphs. Specifically, we reformulate the heterogeneous graph condensation problem as a data selection issue, offering a new perspective for assessing and condensing representative nodes and edges in the heterogeneous graphs.
arXiv Detail & Related papers (2024-12-20T02:49:32Z)
Contrastive Graph Condensation: Advancing Data Versatility through Self-Supervised Learning [47.74244053386216]
Graph condensation is a promising solution to synthesize a compact, substitute graph of the large-scale original graph. We introduce Contrastive Graph Condensation (CTGC), which adopts a self-supervised surrogate task to extract critical, causal information from the original graph. CTGC excels in handling various downstream tasks with a limited number of labels, consistently outperforming state-of-the-art GC methods.
arXiv Detail & Related papers (2024-11-26T03:01:22Z)
GC-Bench: An Open and Unified Benchmark for Graph Condensation [54.70801435138878]
We develop a comprehensive Graph Condensation Benchmark (GC-Bench) to analyze the performance of graph condensation. GC-Bench systematically investigates the characteristics of graph condensation in terms of the following dimensions: effectiveness, transferability, and complexity. We have developed an easy-to-use library for training and evaluating different GC methods to facilitate reproducible research.
arXiv Detail & Related papers (2024-06-30T07:47:34Z)
GC4NC: A Benchmark Framework for Graph Condensation on Node Classification with New Insights [30.796414860754837]
Graph condensation (GC) is an emerging technique designed to learn a significantly smaller graph that retains the essential information of the original graph. This paper introduces textbfGC4NC, a comprehensive framework for evaluating diverse GC methods on node classification. Our systematic evaluation offers novel insights into how condensed graphs behave and the critical design choices that drive their success.
arXiv Detail & Related papers (2024-06-24T15:17:49Z)
RobGC: Towards Robust Graph Condensation [61.259453496191696]
Graph neural networks (GNNs) have attracted widespread attention for their impressive capability of graph representation learning. However, the increasing prevalence of large-scale graphs presents a significant challenge for GNN training due to their computational demands. We propose graph condensation (GC) to generate an informative compact graph that enables efficient training of GNNs while retaining performance.
arXiv Detail & Related papers (2024-06-19T04:14:57Z)
GCondenser: Benchmarking Graph Condensation [26.458605619132385]
This paper proposes the first large-scale graph condensation benchmark, GCondenser, to holistically evaluate and compare mainstream GC methods. GCondenser includes a standardised GC paradigm, consisting of condensation, validation, and evaluation procedures, as well as enabling extensions to new GC methods and datasets.
arXiv Detail & Related papers (2024-05-23T07:25:31Z)
Graph Condensation: A Survey [49.41718583061147]
The rapid growth of graph data poses significant challenges in storage, transmission, and particularly the training of graph neural networks (GNNs) To address these challenges, graph condensation (GC) has emerged as an innovative solution. GC focuses on a compact yet highly representative graph, enabling GNNs trained on it to achieve performance comparable to those trained on the original large graph.
arXiv Detail & Related papers (2024-01-22T06:47:00Z)
Disentangled Condensation for Large-scale Graphs [31.781721873508978]
Graph condensation has emerged as an intriguing technique to save the expensive training costs of Graph Neural Networks (GNNs) We propose to disentangle the condensation process into a two-stage GNN-free paradigm, independently condensing nodes and generating edges. This simple yet effective approach achieves at least 10 times faster than state-of-the-art methods with comparable accuracy on medium-scale graphs.
arXiv Detail & Related papers (2024-01-18T09:59:00Z)
Comprehensive Graph Gradual Pruning for Sparse Training in Graph Neural Networks [52.566735716983956]
We propose a graph gradual pruning framework termed CGP to dynamically prune GNNs. Unlike LTH-based methods, the proposed CGP approach requires no re-training, which significantly reduces the computation costs. Our proposed strategy greatly improves both training and inference efficiency while matching or even exceeding the accuracy of existing methods.
arXiv Detail & Related papers (2022-07-18T14:23:31Z)

This list is automatically generated from the titles and abstracts of the papers in this site.

This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.