$\rm A^2Q$: Aggregation-Aware Quantization for Graph Neural Networks
- URL: http://arxiv.org/abs/2302.00193v1
- Date: Wed, 1 Feb 2023 02:54:35 GMT
- Title: $\rm A^2Q$: Aggregation-Aware Quantization for Graph Neural Networks
- Authors: Zeyu Zhu, Fanrong Li, Zitao Mo, Qinghao Hu, Gang Li, Zejian Liu,
Xiaoyao Liang, Jian Cheng
- Abstract summary: We propose the Aggregation-Aware mixed-precision Quantization ($\rm A^2Q$) for Graph Neural Networks (GNNs).
Our method can achieve up to 11.4% and 9.5% accuracy improvements on the node-level and graph-level tasks, respectively, and up to 2x speedup on a dedicated hardware accelerator.
- Score: 18.772128348519566
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: As graph data size increases, the vast latency and memory consumption during
inference pose a significant challenge to the real-world deployment of Graph
Neural Networks (GNNs). While quantization is a powerful approach to reducing
GNN complexity, most previous works on GNN quantization fail to exploit the
unique characteristics of GNNs, suffering from severe accuracy degradation.
Through an in-depth analysis of the topology of GNNs, we observe that the
topology of the graph leads to significant differences between nodes, and most
of the nodes in a graph appear to have a small aggregation value. Motivated by
this, in this paper, we propose the Aggregation-Aware mixed-precision
Quantization ($\rm A^2Q$) for GNNs, where an appropriate bitwidth is
automatically learned and assigned to each node in the graph. To mitigate the
vanishing gradient problem caused by sparse connections between nodes, we
propose a Local Gradient method that uses the quantization error of the node
features as supervision during training. We also develop a Nearest Neighbor
Strategy to handle generalization to unseen graphs. Extensive
experiments on eight public node-level and graph-level datasets demonstrate the
generality and robustness of our proposed method. Compared to the FP32 models,
our method can achieve up to an 18.6x (i.e., 1.70-bit) compression ratio with
negligible accuracy degradation. Moreover, compared to the state-of-the-art
quantization method, our method can achieve up to 11.4\% and 9.5\% accuracy
improvements on the node-level and graph-level tasks, respectively, and up to
2x speedup on a dedicated hardware accelerator.
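
The per-node bitwidth idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the symmetric signed uniform quantizer, the function names (`quantize_node`, `quantize_graph`, `average_bits`, `nearest_neighbor_params`), and the rule of matching a node's largest absolute feature to the closest learned quantization range are all assumptions made for illustration. In the paper the step sizes and bitwidths are learned during training; here they are simply given as inputs.

```python
# Illustrative sketch only. The quantizer form and all names are assumptions,
# not the formulation or API used by the A^2Q authors.

def quantize_node(features, step, bits):
    """Quantize one node's features with a symmetric signed uniform quantizer."""
    qmax = 2 ** (bits - 1) - 1                    # largest integer level (bits >= 2)
    out = []
    for x in features:
        level = round(x / step)                   # nearest integer level
        level = max(-qmax, min(qmax, level))      # clip to the representable range
        out.append(level * step)                  # map back to a real value
    return out


def quantize_graph(node_features, steps, bitwidths):
    """Apply a node-specific (step, bitwidth) pair to every node."""
    return [quantize_node(f, s, b)
            for f, s, b in zip(node_features, steps, bitwidths)]


def average_bits(bitwidths):
    """Mean bitwidth over nodes, the quantity a compression ratio is based on."""
    return sum(bitwidths) / len(bitwidths)


def nearest_neighbor_params(max_abs, learned_steps, learned_bits):
    """Assumed selection rule for an unseen node: pick the learned
    (step, bits) pair whose clipping range is closest to max_abs."""
    def clip_range(i):
        return learned_steps[i] * (2 ** (learned_bits[i] - 1) - 1)
    best = min(range(len(learned_steps)),
               key=lambda i: abs(clip_range(i) - max_abs))
    return learned_steps[best], learned_bits[best]


if __name__ == "__main__":
    feats = [[0.2, -1.3], [5.0, 0.6]]             # two nodes, two features each
    print(quantize_graph(feats, [0.5, 0.5], [2, 4]))
    print(average_bits([2, 4]))
    print(nearest_neighbor_params(3.0, [0.5, 0.5], [2, 4]))
```

In this sketch a node with small feature values can be served by a low bitwidth, while a node with large values needs a wider range, which mirrors the motivation stated in the abstract.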
Related papers
- Sparse Decomposition of Graph Neural Networks [20.768412002413843]
We propose an approach to reduce the number of nodes that are included during aggregation.
We achieve this through a sparse decomposition, learning to approximate node representations using a weighted sum of linearly transformed features.
We demonstrate via extensive experiments that our method outperforms other baselines designed for inference speedup.
arXiv Detail & Related papers (2024-10-25T17:52:16Z)
- Spectral Greedy Coresets for Graph Neural Networks [61.24300262316091]
The ubiquity of large-scale graphs in node-classification tasks hinders the real-world applications of Graph Neural Networks (GNNs).
This paper studies graph coresets for GNNs and avoids the interdependence issue by selecting ego-graphs based on their spectral embeddings.
Our spectral greedy graph coreset (SGGC) scales to graphs with millions of nodes, obviates the need for model pre-training, and applies to low-homophily graphs.
arXiv Detail & Related papers (2024-05-27T17:52:12Z)
- Robust Graph Neural Network based on Graph Denoising [10.564653734218755]
Graph Neural Networks (GNNs) have emerged as a notorious alternative to address learning problems dealing with non-Euclidean datasets.
This work proposes a robust implementation of GNNs that explicitly accounts for the presence of perturbations in the observed topology.
arXiv Detail & Related papers (2023-12-11T17:43:57Z)
- NodeFormer: A Scalable Graph Structure Learning Transformer for Node Classification [70.51126383984555]
We introduce a novel all-pair message passing scheme for efficiently propagating node signals between arbitrary nodes.
The efficient computation is enabled by a kernelized Gumbel-Softmax operator.
Experiments demonstrate the promising efficacy of the method in various tasks including node classification on graphs.
arXiv Detail & Related papers (2023-06-14T09:21:15Z)
- Training Graph Neural Networks on Growing Stochastic Graphs [114.75710379125412]
Graph Neural Networks (GNNs) rely on graph convolutions to exploit meaningful patterns in networked data.
We propose to learn GNNs on very large graphs by leveraging the limit object of a sequence of growing graphs, the graphon.
arXiv Detail & Related papers (2022-10-27T16:00:45Z)
- Comprehensive Graph Gradual Pruning for Sparse Training in Graph Neural Networks [52.566735716983956]
We propose a graph gradual pruning framework termed CGP to dynamically prune GNNs.
Unlike LTH-based methods, the proposed CGP approach requires no re-training, which significantly reduces the computation costs.
Our proposed strategy greatly improves both training and inference efficiency while matching or even exceeding the accuracy of existing methods.
arXiv Detail & Related papers (2022-07-18T14:23:31Z)
- Adaptive Kernel Graph Neural Network [21.863238974404474]
Graph neural networks (GNNs) have demonstrated great success in representation learning for graph-structured data.
In this paper, we propose a novel framework, namely the Adaptive Kernel Graph Neural Network (AKGNN).
AKGNN learns to adapt to the optimal graph kernel in a unified manner at the first attempt.
Experiments are conducted on acknowledged benchmark datasets and promising results demonstrate the outstanding performance of our proposed AKGNN.
arXiv Detail & Related papers (2021-12-08T20:23:58Z)
- VQ-GNN: A Universal Framework to Scale up Graph Neural Networks using Vector Quantization [70.8567058758375]
VQ-GNN is a universal framework to scale up any convolution-based GNNs using Vector Quantization (VQ) without compromising the performance.
Our framework avoids the "neighbor explosion" problem of GNNs using quantized representations combined with a low-rank version of the graph convolution matrix.
arXiv Detail & Related papers (2021-10-27T11:48:50Z)
- Position-based Hash Embeddings For Scaling Graph Neural Networks [8.87527266373087]
Graph Neural Networks (GNNs) compute node representations by taking into account the topology of the node's ego-network and the features of the ego-network's nodes.
When the nodes do not have high-quality features, GNNs learn an embedding layer to compute node embeddings and use them as input features.
To reduce the memory associated with this embedding layer, hashing-based approaches, commonly used in applications like NLP and recommender systems, can potentially be used.
We present approaches that take advantage of the nodes' position in the graph to dramatically reduce the memory required.
arXiv Detail & Related papers (2021-08-31T22:42:25Z)
- Increase and Conquer: Training Graph Neural Networks on Growing Graphs [116.03137405192356]
We consider the problem of learning a graphon neural network (WNN) by training GNNs on graphs sampled Bernoulli from the graphon.
Inspired by these results, we propose an algorithm to learn GNNs on large-scale graphs that, starting from a moderate number of nodes, successively increases the size of the graph during training.
arXiv Detail & Related papers (2021-06-07T15:05:59Z)