SGQuant: Squeezing the Last Bit on Graph Neural Networks with
Specialized Quantization
- URL: http://arxiv.org/abs/2007.05100v2
- Date: Wed, 16 Sep 2020 07:13:58 GMT
- Title: SGQuant: Squeezing the Last Bit on Graph Neural Networks with
Specialized Quantization
- Authors: Boyuan Feng, Yuke Wang, Xu Li, Shu Yang, Xueqiao Peng, and Yufei Ding
- Abstract summary: We propose a specialized GNN quantization scheme, SGQuant, to systematically reduce the GNN memory consumption.
We show that SGQuant can effectively reduce the memory footprint by 4.25x to 31.9x compared with the original full-precision GNNs.
- Score: 16.09107108787834
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: With the increasing popularity of graph-based learning, Graph Neural
Networks (GNNs) have attracted considerable attention from both research and
industry because of their high accuracy. However, existing GNNs suffer from high
memory footprints (e.g., node embedding features). This high memory footprint
hinders their potential deployment on memory-constrained devices, such as widely
deployed IoT devices. To this end, we propose a specialized GNN quantization
scheme, SGQuant, to systematically reduce GNN memory consumption. Specifically,
we first propose a GNN-tailored quantization algorithm design and a GNN
quantization fine-tuning scheme to reduce memory consumption while maintaining
accuracy. Then, we investigate a multi-granularity quantization strategy that
operates at different levels (components, graph topology, and layers) of GNN
computation. Moreover, we offer an automatic bit-selecting (ABS) mechanism to
pinpoint the most appropriate quantization bits for the above multi-granularity
quantizations. Extensive experiments show that SGQuant can effectively reduce
the memory footprint by 4.25x to 31.9x compared with the original full-precision
GNNs while limiting the accuracy drop to 0.4% on average.
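As a rough illustration of the ideas above (not the authors' implementation), the
following minimal PyTorch sketch fake-quantizes node embedding features with a
uniform quantizer and picks a per-node bitwidth from graph topology (node degree),
standing in for SGQuant's multi-granularity quantization and automatic
bit-selecting (ABS). All function names, the degree-based rule, and the 4-/8-bit
choices are illustrative assumptions.

    import torch

    def fake_quantize(x: torch.Tensor, bits: int) -> torch.Tensor:
        """Symmetric uniform fake-quantization of a tensor to `bits` bits."""
        qmax = 2 ** (bits - 1) - 1
        scale = x.abs().max().clamp(min=1e-8) / qmax
        return torch.round(x / scale).clamp(-qmax - 1, qmax) * scale

    def assign_bits_by_degree(degrees: torch.Tensor,
                              low: int = 4, high: int = 8) -> torch.Tensor:
        """Toy topology-level rule: give high-degree nodes more bits."""
        threshold = degrees.float().median()
        return torch.where(degrees.float() >= threshold,
                           torch.tensor(high), torch.tensor(low))

    def quantize_node_features(feats: torch.Tensor,
                               degrees: torch.Tensor) -> torch.Tensor:
        """Quantize each node's embedding with its assigned bitwidth."""
        bits = assign_bits_by_degree(degrees)
        out = torch.empty_like(feats)
        for b in bits.unique():
            mask = bits == b
            out[mask] = fake_quantize(feats[mask], int(b))
        return out

    # Usage: 6 nodes with 16-dimensional embeddings and synthetic degrees.
    feats = torch.randn(6, 16)
    degrees = torch.tensor([1, 2, 8, 3, 10, 1])
    q_feats = quantize_node_features(feats, degrees)
    print((feats - q_feats).abs().max())  # worst-case quantization error

In the actual SGQuant design, quantization is also specialized per GNN component
and layer, and the bitwidths are chosen automatically rather than by a fixed
degree threshold; the sketch only shows the overall shape of such a scheme.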
Related papers
- Spatio-Spectral Graph Neural Networks [50.277959544420455]
We propose Spatio-Spectral Graph Neural Networks (S$^2$GNNs).
S$^2$GNNs combine spatially and spectrally parametrized graph filters.
We show that S$^2$GNNs vanquish over-squashing and yield strictly tighter approximation-theoretic error bounds than MPGNNs.
arXiv Detail & Related papers (2024-05-29T14:28:08Z)
- Low-bit Quantization for Deep Graph Neural Networks with Smoothness-aware Message Propagation [3.9177379733188715]
We present an end-to-end solution that aims to address these challenges for efficient GNNs in resource-constrained environments.
We introduce a quantization-based approach for all stages of GNNs, from message passing in training to node classification.
The proposed quantizer learns quantization ranges and reduces the model size with comparable accuracy even under low-bit quantization (a minimal sketch of such a learned-range quantizer appears after this list).
arXiv Detail & Related papers (2023-08-29T00:25:02Z)
- $\rm A^2Q$: Aggregation-Aware Quantization for Graph Neural Networks [18.772128348519566]
We propose the Aggregation-Aware mixed-precision Quantization ($\rm A^2Q$) for Graph Neural Networks (GNNs).
Our method can achieve up to 11.4% and 9.5% accuracy improvements on the node-level and graph-level tasks, respectively, and up to 2x speedup on a dedicated hardware accelerator.
arXiv Detail & Related papers (2023-02-01T02:54:35Z)
- A Comprehensive Study on Large-Scale Graph Training: Benchmarking and Rethinking [124.21408098724551]
Large-scale graph training is a notoriously challenging problem for graph neural networks (GNNs).
We present a new ensemble-based training approach, named EnGCN, to address the existing issues.
Our proposed method has achieved new state-of-the-art (SOTA) performance on large-scale datasets.
arXiv Detail & Related papers (2022-10-14T03:43:05Z)
- Edge Inference with Fully Differentiable Quantized Mixed Precision Neural Networks [1.131071436917293]
Quantizing parameters and operations to lower bit-precision offers substantial memory and energy savings for neural network inference.
This paper proposes a new quantization approach for mixed-precision convolutional neural networks (CNNs) targeting edge computing.
arXiv Detail & Related papers (2022-06-15T18:11:37Z)
- Q-SpiNN: A Framework for Quantizing Spiking Neural Networks [14.727296040550392]
Quantization is a prominent technique for reducing the memory footprint of Spiking Neural Networks (SNNs) without significantly decreasing accuracy.
We propose Q-SpiNN, a novel quantization framework for memory-efficient SNNs.
For the unsupervised network, Q-SpiNN reduces the memory footprint by ca. 4x while maintaining accuracy within 1% of the baseline on the MNIST dataset.
For the supervised network, Q-SpiNN reduces the memory by ca. 2x while keeping accuracy within 2% of the baseline on the DVS-Gesture dataset.
arXiv Detail & Related papers (2021-07-05T06:01:15Z)
- Quantized Neural Networks via {-1, +1} Encoding Decomposition and Acceleration [83.84684675841167]
We propose a novel encoding scheme using {-1, +1} to decompose quantized neural networks (QNNs) into multi-branch binary networks.
We validate the effectiveness of our method on large-scale image classification, object detection, and semantic segmentation tasks.
arXiv Detail & Related papers (2021-06-18T03:11:15Z)
- A Unified Lottery Ticket Hypothesis for Graph Neural Networks [82.31087406264437]
We present a unified GNN sparsification (UGS) framework that simultaneously prunes the graph adjacency matrix and the model weights.
We further generalize the popular lottery ticket hypothesis to GNNs for the first time, by defining a graph lottery ticket (GLT) as a pair of core sub-dataset and sparse sub-network.
arXiv Detail & Related papers (2021-02-12T21:52:43Z)
- Binary Graph Neural Networks [69.51765073772226]
Graph Neural Networks (GNNs) have emerged as a powerful and flexible framework for representation learning on irregular data.
In this paper, we present and evaluate different strategies for the binarization of graph neural networks.
We show that through careful design of the models, and control of the training process, binary graph neural networks can be trained at only a moderate cost in accuracy on challenging benchmarks.
arXiv Detail & Related papers (2020-12-31T18:48:58Z)
- Learned Low Precision Graph Neural Networks [10.269500440688306]
We show how to systematically quantise Deep Graph Neural Networks (GNNs) with minimal or no loss in performance using Network Architecture Search (NAS).
The proposed novel NAS mechanism, named Low Precision Graph NAS (LPGNAS), constrains both architecture and quantisation choices to be differentiable.
On eight different datasets, solving the task of classifying unseen nodes in a graph, LPGNAS generates quantised models with significant reductions in both model and buffer sizes.
arXiv Detail & Related papers (2020-09-19T13:51:09Z)
- Widening and Squeezing: Towards Accurate and Efficient QNNs [125.172220129257]
Quantized neural networks (QNNs) are very attractive to industry because of their extremely cheap computation and storage overhead, but their performance is still worse than that of full-precision networks.
Most existing methods aim to enhance the performance of QNNs, especially binary neural networks, by exploiting more effective training techniques.
We address this problem by projecting features in original full-precision networks to high-dimensional quantization features.
arXiv Detail & Related papers (2020-02-03T04:11:13Z)
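Several entries above describe quantizers whose ranges are learned during
training (e.g., the low-bit quantization work with smoothness-aware message
propagation). The sketch below is a hedged illustration of that general idea
only, not code from any of the listed papers: a uniform quantizer with a
trainable clipping threshold and a straight-through estimator for the rounding
step. The class name LearnedRangeQuantizer, the 4-bit default, and the initial
range are assumptions.

    import torch
    import torch.nn as nn

    class LearnedRangeQuantizer(nn.Module):
        """Uniform quantizer with a learnable clipping range (PACT-style sketch)."""

        def __init__(self, bits: int = 4, init_range: float = 1.0):
            super().__init__()
            self.bits = bits
            self.alpha = nn.Parameter(torch.tensor(init_range))  # trainable range

        def forward(self, x: torch.Tensor) -> torch.Tensor:
            qmax = 2 ** self.bits - 1
            # Clip to [0, alpha]; assumes non-negative inputs (e.g., post-ReLU).
            x = torch.minimum(torch.clamp(x, min=0.0), self.alpha)
            scale = self.alpha / qmax
            x_q = torch.round(x / scale) * scale
            # Straight-through estimator: round in the forward pass,
            # pass gradients through unchanged in the backward pass.
            return x + (x_q - x).detach()

    # Usage: fake-quantize message-passing activations to 4 bits.
    quant = LearnedRangeQuantizer(bits=4)
    h = torch.relu(torch.randn(8, 16, requires_grad=True))
    out = quant(h)
    out.sum().backward()
    print(quant.alpha.grad)  # the clipping threshold receives a gradient

Because the rounding is bypassed in the backward pass, both the network weights
and the clipping range can be trained end-to-end, which is what makes learning
quantization ranges possible in such approaches.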