Bonsai: Gradient-free Graph Distillation for Node Classification
- URL: http://arxiv.org/abs/2410.17579v2
- Date: Thu, 24 Oct 2024 05:24:53 GMT
- Title: Bonsai: Gradient-free Graph Distillation for Node Classification
- Authors: Mridul Gupta, Samyak Jain, Vansh Ramani, Hariprasad Kodamana, Sayan Ranu
- Abstract summary: Graph distillation has emerged as a promising avenue to enable scalable training of GNNs.
Our study uncovers significant shortcomings in current graph distillation techniques.
We present Bonsai, a novel graph distillation method empowered by the observation that computation trees form the fundamental processing units of message-passing GNNs.
- Score: 16.96628744692792
- Abstract: Graph distillation has emerged as a promising avenue to enable scalable training of GNNs by compressing the training dataset while preserving essential graph characteristics. Our study uncovers significant shortcomings in current graph distillation techniques. First, the majority of the algorithms paradoxically require training on the full dataset to perform distillation. Second, due to their gradient-emulating approach, these methods require fresh distillation for any change in hyperparameters or GNN architecture, limiting their flexibility and reusability. Finally, they fail to achieve substantial size reduction due to synthesizing fully-connected, edge-weighted graphs. To address these challenges, we present Bonsai, a novel graph distillation method empowered by the observation that computation trees form the fundamental processing units of message-passing GNNs. Bonsai distills datasets by encoding a careful selection of exemplar trees that maximize the representation of all computation trees in the training set. This approach makes Bonsai the first linear-time, model-agnostic graph distillation algorithm for node classification; it outperforms existing baselines across 6 real-world datasets on accuracy while being 22 times faster on average. Bonsai is grounded in rigorous mathematical guarantees on the adopted approximation strategies, making it robust to GNN architectures, datasets, and parameters.
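To make the exemplar idea concrete, below is a minimal, hypothetical sketch, not Bonsai's actual algorithm: each node's computation tree is summarized by a WL-style embedding, and exemplars are chosen greedily to maximize a facility-location coverage of all trees. The function names, the mean aggregation, and the RBF similarity are illustrative assumptions.

```python
# Toy sketch of exemplar computation-tree selection (assumed details, not
# the paper's method): embed trees, then greedy facility-location coverage.
import numpy as np

def tree_embeddings(adj, feats, hops=2):
    """Summarize each node's hops-deep computation tree by mean aggregation."""
    deg = np.maximum(adj.sum(1, keepdims=True), 1.0)
    h = feats
    for _ in range(hops):
        h = (adj @ h) / deg              # one message-passing round per hop
    return h

def greedy_exemplars(emb, budget):
    """Greedily maximize total similarity of every tree to its nearest exemplar."""
    d2 = ((emb[:, None, :] - emb[None, :, :]) ** 2).sum(-1)
    sim = np.exp(-d2)                    # nonnegative tree-to-tree similarity
    best = np.zeros(len(emb))            # coverage achieved by chosen set so far
    chosen = []
    for _ in range(budget):
        gains = np.maximum(sim, best[None, :]).sum(1) - best.sum()
        i = int(np.argmax(gains))        # tree with largest marginal coverage
        chosen.append(i)
        best = np.maximum(best, sim[i])
    return chosen

rng = np.random.default_rng(0)
adj = (rng.random((50, 50)) < 0.1).astype(float)
adj = np.maximum(adj, adj.T)             # symmetrize a toy random graph
print(greedy_exemplars(tree_embeddings(adj, rng.normal(size=(50, 8))), budget=5))
```

Because selection needs only tree embeddings and similarities, no GNN is trained during distillation, which is consistent with the abstract's gradient-free, model-agnostic claim.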
Related papers
- Virtual Node Generation for Node Classification in Sparsely-Labeled Graphs [2.0060301665996016]
This paper presents a novel node generation method that infuses a small set of high-quality synthesized nodes into the graph as additional labeled nodes.
It is compatible with most popular graph pre-training (self-supervised learning), semi-supervised learning, and meta-learning methods.
Our experiments demonstrate statistically significant performance improvements over 14 baselines on 10 publicly available datasets.
arXiv Detail & Related papers (2024-09-12T02:36:44Z)
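A hypothetical sketch of the node-generation idea above: synthesize one labeled node per class as the mean of that class's labeled features, then wire it to its k most similar existing nodes. The prototype construction and the k-NN wiring rule are our assumptions, not the paper's procedure.

```python
# Assumed construction: one synthetic labeled node per class, connected to
# its k nearest real nodes; illustrative only.
import numpy as np

def add_virtual_nodes(adj, feats, labels, labeled_mask, k=3):
    """Return an expanded graph with one synthetic labeled node per class."""
    n = feats.shape[0]
    classes = np.unique(labels[labeled_mask])
    protos = np.stack([feats[labeled_mask & (labels == c)].mean(0)
                       for c in classes])             # synthetic node features
    m = n + len(classes)
    big = np.zeros((m, m))
    big[:n, :n] = adj
    for i, proto in enumerate(protos):
        nbrs = np.argsort(-(feats @ proto))[:k]       # most similar real nodes
        big[n + i, nbrs] = big[nbrs, n + i] = 1.0     # undirected new edges
    return big, np.vstack([feats, protos]), np.concatenate([labels, classes])

rng = np.random.default_rng(1)
a = (rng.random((10, 10)) < 0.3).astype(float)
labels = rng.integers(0, 2, 10)
mask = np.arange(10) < 4                              # only 4 nodes are labeled
big, f2, l2 = add_virtual_nodes(np.maximum(a, a.T),
                                rng.normal(size=(10, 4)), labels, mask)
print(big.shape, l2)
```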
- GSTAM: Efficient Graph Distillation with Structural Attention-Matching [13.673737442696154]
We introduce Graph Distillation with Structural Attention Matching (GSTAM), a novel method for condensing graph classification datasets.
GSTAM leverages the attention maps of GNNs to distill structural information from the original dataset into synthetic graphs.
Comprehensive experiments demonstrate GSTAM's superiority over existing methods, achieving 0.45% to 6.5% better performance at extreme condensation ratios.
arXiv Detail & Related papers (2024-08-29T19:40:04Z)
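A toy rendering of the attention-matching loss, with flagged assumptions: GSTAM matches attention maps taken from trained GNN layers, while this sketch fabricates a stand-in attention map from one message-passing round and compares size-independent summary statistics between real and synthetic graphs.

```python
# Illustrative attention matching; the attention map construction is an
# assumption, not GSTAM's actual attention extraction.
import torch

def structural_attention(adj, x):
    h = adj @ x                              # one message-passing round
    return torch.softmax(h @ h.T, dim=-1)    # pairwise "attention" stand-in

def attention_matching_loss(att_real, att_syn):
    # Summary statistics keep graphs of different sizes comparable.
    stats = lambda a: torch.stack([a.mean(), a.std(), a.diagonal().mean()])
    return torch.nn.functional.mse_loss(stats(att_real), stats(att_syn))

real_adj, real_x = torch.rand(30, 30).round(), torch.randn(30, 8)
syn_adj = torch.rand(6, 6, requires_grad=True)   # learnable synthetic graph
syn_x = torch.randn(6, 8, requires_grad=True)
loss = attention_matching_loss(structural_attention(real_adj, real_x),
                               structural_attention(syn_adj, syn_x))
loss.backward()                                  # gradients reach syn_adj, syn_x
```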
- Deep Manifold Graph Auto-Encoder for Attributed Graph Embedding [51.75091298017941]
This paper proposes a novel Deep Manifold (Variational) Graph Auto-Encoder (DMVGAE/DMGAE) for attributed graph data.
The proposed method surpasses state-of-the-art baseline algorithms by a significant margin on different downstream tasks across popular datasets.
arXiv Detail & Related papers (2024-01-12T17:57:07Z)
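A bare-bones graph auto-encoder in the spirit of the non-variational DMGAE, under simplifying assumptions: a one-layer GCN-style encoder and an inner-product decoder that reconstructs the adjacency matrix. The paper's manifold-preserving machinery is omitted; the class and variable names are ours.

```python
# Minimal graph auto-encoder sketch (assumed architecture, not the paper's).
import torch

class TinyGAE(torch.nn.Module):
    def __init__(self, in_dim, hid_dim):
        super().__init__()
        self.lin = torch.nn.Linear(in_dim, hid_dim)

    def forward(self, adj_norm, x):
        z = torch.relu(self.lin(adj_norm @ x))   # GCN-style node embeddings
        return torch.sigmoid(z @ z.T), z         # inner-product edge decoder

adj = (torch.rand(20, 20) < 0.2).float()
adj = ((adj + adj.T) > 0).float()
adj.fill_diagonal_(1.0)                          # add self-loops
adj_norm = adj / adj.sum(1, keepdim=True)        # row-normalize
model = TinyGAE(16, 8)
recon, z = model(adj_norm, torch.randn(20, 16))
loss = torch.nn.functional.binary_cross_entropy(recon, adj)
loss.backward()                                  # z feeds downstream tasks
```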
- Efficient Heterogeneous Graph Learning via Random Projection [58.4138636866903]
Heterogeneous Graph Neural Networks (HGNNs) are powerful tools for deep learning on heterogeneous graphs.
Recent pre-computation-based HGNNs use one-time message passing to transform a heterogeneous graph into regular-shaped tensors.
We propose a hybrid pre-computation-based HGNN, named Random Projection Heterogeneous Graph Neural Network (RpHGNN).
arXiv Detail & Related papers (2023-10-23T01:25:44Z)
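A sketch of the pre-computation idea, under stated assumptions: on a homogeneous stand-in for a heterogeneous graph, features are propagated for several hops once, each hop's view is compressed with a random projection, and the views are concatenated into a regular-shaped tensor, so no message passing runs at training time. The per-relation bookkeeping of the real method is omitted.

```python
# One-time propagation + random-projection compression (simplified sketch).
import numpy as np

def rp_precompute(adj, feats, hops=3, out_dim=16, seed=0):
    rng = np.random.default_rng(seed)
    deg = np.maximum(adj.sum(1, keepdims=True), 1.0)
    h, views = feats, []
    for _ in range(hops):
        h = (adj @ h) / deg                        # one-time message passing
        proj = rng.normal(size=(h.shape[1], out_dim)) / np.sqrt(out_dim)
        views.append(h @ proj)                     # compress this hop's view
    return np.concatenate(views, axis=1)           # regular tensor for an MLP

rng = np.random.default_rng(2)
a = (rng.random((100, 100)) < 0.05).astype(float)
x = rp_precompute(np.maximum(a, a.T), rng.normal(size=(100, 32)))
print(x.shape)   # (100, hops * out_dim); any plain MLP can train on this
```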
- Mirage: Model-Agnostic Graph Distillation for Graph Classification [16.764894668661952]
Graph distillation is an effort to construct a smaller synthetic training set from the original training data.
Mirage is built on the insight that a message-passing GNN decomposes the input graph into a multiset of computation trees.
arXiv Detail & Related papers (2023-10-14T04:21:52Z)
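An illustrative sketch of the computation-tree view Mirage builds on: a message-passing GNN effectively sees a multiset of computation trees, so frequently occurring trees can summarize the data. Trees are fingerprinted here by iterative WL-style hashing; selecting by raw frequency is our simplification, not Mirage's exact selection rule.

```python
# WL-style fingerprinting of computation trees (simplified, assumed details).
from collections import Counter

def wl_tree_hashes(adj_list, labels, hops=2):
    """Fingerprint each node's hops-deep computation tree."""
    h = [str(l) for l in labels]
    for _ in range(hops):
        # New hash = own hash combined with the sorted multiset of neighbor hashes.
        h = [str(hash((h[v], tuple(sorted(h[u] for u in adj_list[v])))))
             for v in range(len(h))]
    return h

adj_list = {0: [1, 2], 1: [0], 2: [0, 3], 3: [2], 4: [5], 5: [4]}
labels = [0, 1, 0, 1, 0, 1]
trees = Counter(wl_tree_hashes(adj_list, labels))
print(trees.most_common(2))   # the most frequent trees act as the distillate
```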
- Comprehensive Graph Gradual Pruning for Sparse Training in Graph Neural Networks [52.566735716983956]
We propose a graph gradual pruning framework termed CGP to dynamically prune GNNs.
Unlike LTH-based methods, the proposed CGP approach requires no re-training, which significantly reduces the computation costs.
Our proposed strategy greatly improves both training and inference efficiency while matching or even exceeding the accuracy of existing methods.
arXiv Detail & Related papers (2022-07-18T14:23:31Z)
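A hedged sketch of gradual magnitude pruning as the entry describes it at a high level: sparsity follows a schedule during training and the smallest-magnitude entries are masked in place, so no separate re-training round is needed. The cubic schedule below is a common choice assumed for illustration, not necessarily CGP's schedule.

```python
# Gradual magnitude pruning with a cubic sparsity ramp (assumed schedule).
import torch

def sparsity_at(step, total_steps, final_sparsity=0.9):
    """Cubic ramp from 0 to final_sparsity over training."""
    t = min(step / total_steps, 1.0)
    return final_sparsity * (1.0 - (1.0 - t) ** 3)

@torch.no_grad()
def prune_in_place(weight, sparsity):
    """Zero out the smallest-magnitude entries; no re-training round needed."""
    k = int(weight.numel() * sparsity)
    if k == 0:
        return
    thresh = weight.abs().flatten().kthvalue(k).values
    weight[weight.abs() <= thresh] = 0.0

w = torch.nn.Linear(64, 64).weight
for step in range(0, 1001, 250):
    prune_in_place(w, sparsity_at(step, 1000))
    print(step, float((w == 0).float().mean()))   # sparsity grows gradually
```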
- Condensing Graphs via One-Step Gradient Matching [50.07587238142548]
We propose a one-step gradient matching scheme, which performs gradient matching for only a single step without training the network weights.
Our theoretical analysis shows this strategy can generate synthetic graphs that lead to lower classification loss on real graphs.
In particular, we are able to reduce the dataset size by 90% while approximating up to 98% of the original performance.
arXiv Detail & Related papers (2022-06-15T18:20:01Z)
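A minimal sketch of one-step gradient matching: compare the gradient a freshly initialized model produces on real data with its gradient on the learnable synthetic data, with no weight training in between. The cosine-distance form and the linear stand-in for a GNN are simplifying assumptions.

```python
# One-step gradient matching between real and synthetic batches (sketch).
import torch

def one_step_gradient_match(model, loss_fn, real_batch, syn_batch):
    def flat_grad(batch):
        x, y = batch
        g = torch.autograd.grad(loss_fn(model(x), y),
                                model.parameters(), create_graph=True)
        return torch.cat([gi.flatten() for gi in g])
    g_real, g_syn = flat_grad(real_batch), flat_grad(syn_batch)
    return 1.0 - torch.nn.functional.cosine_similarity(g_real, g_syn, dim=0)

model = torch.nn.Linear(8, 3)                   # stand-in for a GNN
loss_fn = torch.nn.CrossEntropyLoss()
x_syn = torch.randn(5, 8, requires_grad=True)   # learnable synthetic features
y_syn = torch.randint(0, 3, (5,))
real = (torch.randn(32, 8), torch.randint(0, 3, (32,)))
dist = one_step_gradient_match(model, loss_fn, real, (x_syn, y_syn))
dist.backward()                                 # updates flow to x_syn only
print(x_syn.grad.shape)
```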
- Training Robust Graph Neural Networks with Topology Adaptive Edge Dropping [116.26579152942162]
Graph neural networks (GNNs) are processing architectures that exploit graph structural information to model representations from network data.
Despite their success, GNNs suffer from sub-optimal generalization performance given limited training data.
This paper proposes Topology Adaptive Edge Dropping to improve generalization performance and learn robust GNN models.
arXiv Detail & Related papers (2021-06-05T13:20:36Z)
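A sketch of topology adaptive edge dropping under a stated assumption: the paper adapts drop probabilities to graph topology, and here the probability grows with the degrees of an edge's endpoints, which is one plausible instantiation rather than the paper's exact rule.

```python
# Degree-adaptive edge dropout (assumed topology rule, illustrative only).
import torch

def drop_edges(edge_index, deg, base_p=0.2):
    """Drop each edge with probability adapted to its endpoints' degrees."""
    src, dst = edge_index
    score = (deg[src] * deg[dst]).float().sqrt()   # topology signal per edge
    p = base_p * score / score.max()               # high-degree edges drop more
    keep = torch.rand(edge_index.shape[1]) >= p
    return edge_index[:, keep]

edge_index = torch.tensor([[0, 1, 2, 2, 3], [1, 2, 0, 3, 0]])
deg = torch.bincount(edge_index.flatten(), minlength=4)
print(drop_edges(edge_index, deg))   # resampled every epoch during training
```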
- Heuristic Semi-Supervised Learning for Graph Generation Inspired by Electoral College [80.67842220664231]
We propose a novel pre-processing technique, namely ELectoral COllege (ELCO), which automatically expands new nodes and edges to refine the label similarity within a dense subgraph.
In all setups tested, our method boosts the average score of base models by a large margin of 4.7 points and consistently outperforms the state-of-the-art.
arXiv Detail & Related papers (2020-06-10T14:48:48Z)
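A loose sketch of an ELCO-style pre-processing step: find a dense subgraph, then add a new node and edges that pull same-label members together before any model is trained. The density test and the majority-label wiring rule are illustrative assumptions, not the paper's exact procedure.

```python
# Expand a dense subgraph with a fresh node wired to same-label members
# (assumed heuristic, illustrative only).
import networkx as nx

def expand_dense_subgraphs(g, labels, min_density=0.5):
    for comp in list(nx.connected_components(g)):   # snapshot before mutating
        sub = g.subgraph(comp)
        if len(sub) < 3 or nx.density(sub) < min_density:
            continue
        members = list(sub)
        majority = max(set(labels[v] for v in members),
                       key=[labels[v] for v in members].count)
        new = max(g.nodes) + 1                      # fresh virtual node id
        g.add_edges_from((new, v) for v in members if labels[v] == majority)
        labels[new] = majority
    return g

g = nx.erdos_renyi_graph(8, 0.6, seed=1)
labels = {v: v % 2 for v in g}
expand_dense_subgraphs(g, labels)
print(g.number_of_nodes(), g.number_of_edges())
```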