Training-free Heterogeneous Graph Condensation via Data Selection
- URL: http://arxiv.org/abs/2412.16250v1
- Date: Fri, 20 Dec 2024 02:49:32 GMT
- Title: Training-free Heterogeneous Graph Condensation via Data Selection
- Authors: Yuxuan Liang, Wentao Zhang, Xinyi Gao, Ling Yang, Chong Chen, Hongzhi Yin, Yunhai Tong, Bin Cui,
- Abstract summary: We present the first Training underlineFree Heterogeneous Graph Condensation method, termed FreeHGC, facilitating both efficient and high-quality generation of heterogeneous condensed graphs.
Specifically, we reformulate the heterogeneous graph condensation problem as a data selection issue, offering a new perspective for assessing and condensing representative nodes and edges in the heterogeneous graphs.
- Score: 74.06562124781104
- License:
- Abstract: Efficient training of large-scale heterogeneous graphs is of paramount importance in real-world applications. However, existing approaches typically explore simplified models to mitigate resource and time overhead, neglecting the crucial aspect of simplifying large-scale heterogeneous graphs from the data-centric perspective. Addressing this gap, HGCond introduces graph condensation (GC) in heterogeneous graphs and generates a small condensed graph for efficient model training. Despite its efficacy in graph generation, HGCond encounters two significant limitations. The first is low effectiveness, HGCond excessively relies on the simplest relay model for the condensation procedure, which restricts the ability to exert powerful Heterogeneous Graph Neural Networks (HGNNs) with flexible condensation ratio and limits the generalization ability. The second is low efficiency, HGCond follows the existing GC methods designed for homogeneous graphs and leverages the sophisticated optimization paradigm, resulting in a time-consuming condensing procedure. In light of these challenges, we present the first Training \underline{Free} Heterogeneous Graph Condensation method, termed FreeHGC, facilitating both efficient and high-quality generation of heterogeneous condensed graphs. Specifically, we reformulate the heterogeneous graph condensation problem as a data selection issue, offering a new perspective for assessing and condensing representative nodes and edges in the heterogeneous graphs. By leveraging rich meta-paths, we introduce a new, high-quality heterogeneous data selection criterion to select target-type nodes. Furthermore, two training-free condensation strategies for heterogeneous graphs are designed to condense and synthesize other-types nodes effectively.
Related papers
- Rethinking and Accelerating Graph Condensation: A Training-Free Approach with Class Partition [49.41718583061147]
Graph condensation is a data-centric solution to replace the large graph with a small yet informative condensed graph.
Existing GC methods suffer from intricate optimization processes, necessitating excessive computing resources and training time.
We propose a training-free GC framework termed Class-partitioned Graph Condensation (CGC)
CGC condenses the Ogbn-products graph within 30 seconds, achieving a speedup ranging from $102$X to $104$X and increasing accuracy by up to 4.2%.
arXiv Detail & Related papers (2024-05-22T14:57:09Z) - Simple Graph Condensation [30.85754566420301]
Graph condensation involves tuning Graph Neural Networks (GNNs) on a small condensed graph for use on a large-scale original graph.
We introduce the Simple Graph Condensation (SimGC) framework, which aligns the condensed graph with the original graph from the input layer to the prediction layer.
SimGC achieves a significant speedup of up to 10 times compared to existing graph condensation methods.
arXiv Detail & Related papers (2024-03-22T05:04:48Z) - Disentangled Condensation for Large-scale Graphs [29.384060761810172]
Graph condensation has emerged as an intriguing technique to save the expensive training costs of Graph Neural Networks (GNNs)
We propose to disentangle the condensation process into a two-stage GNN-free paradigm, independently condensing nodes and generating edges.
This simple yet effective approach achieves at least 10 times faster than state-of-the-art methods with comparable accuracy on medium-scale graphs.
arXiv Detail & Related papers (2024-01-18T09:59:00Z) - Structure-free Graph Condensation: From Large-scale Graphs to Condensed
Graph-free Data [91.27527985415007]
Existing graph condensation methods rely on the joint optimization of nodes and structures in the condensed graph.
We advocate a new Structure-Free Graph Condensation paradigm, named SFGC, to distill a large-scale graph into a small-scale graph node set.
arXiv Detail & Related papers (2023-06-05T07:53:52Z) - Single-Pass Contrastive Learning Can Work for Both Homophilic and
Heterophilic Graph [60.28340453547902]
Graph contrastive learning (GCL) techniques typically require two forward passes for a single instance to construct the contrastive loss.
Existing GCL approaches fail to provide strong performance guarantees.
We implement the Single-Pass Graph Contrastive Learning method (SP-GCL)
Empirically, the features learned by the SP-GCL can match or outperform existing strong baselines with significantly less computational overhead.
arXiv Detail & Related papers (2022-11-20T07:18:56Z) - Heterogeneous Graph Neural Networks using Self-supervised Reciprocally
Contrastive Learning [102.9138736545956]
Heterogeneous graph neural network (HGNN) is a very popular technique for the modeling and analysis of heterogeneous graphs.
We develop for the first time a novel and robust heterogeneous graph contrastive learning approach, namely HGCL, which introduces two views on respective guidance of node attributes and graph topologies.
In this new approach, we adopt distinct but most suitable attribute and topology fusion mechanisms in the two views, which are conducive to mining relevant information in attributes and topologies separately.
arXiv Detail & Related papers (2022-04-30T12:57:02Z) - Multilayer Graph Clustering with Optimized Node Embedding [70.1053472751897]
multilayer graph clustering aims at dividing the graph nodes into categories or communities.
We propose a clustering-friendly embedding of the layers of a given multilayer graph.
Experiments show that our method leads to a significant improvement.
arXiv Detail & Related papers (2021-03-30T17:36:40Z) - Hybrid Micro/Macro Level Convolution for Heterogeneous Graph Learning [45.14314180743549]
Heterogeneous graphs are pervasive in practical scenarios, where each graph consists of multiple types of nodes and edges.
Most of the existing graph convolution approaches were designed for homogeneous graphs, and therefore cannot handle heterogeneous graphs.
We propose HGConv, a novel Heterogeneous Graph Convolution approach, to learn comprehensive node representations on heterogeneous graphs.
arXiv Detail & Related papers (2020-12-29T12:12:37Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.