Influence-Based Mini-Batching for Graph Neural Networks
- URL: http://arxiv.org/abs/2212.09083v1
- Date: Sun, 18 Dec 2022 13:27:01 GMT
- Title: Influence-Based Mini-Batching for Graph Neural Networks
- Authors: Johannes Gasteiger, Chendi Qian, Stephan Günnemann
- Abstract summary: We propose influence-based mini-batching for graph neural networks.
IBMB accelerates inference by up to 130x compared to previous methods.
This results in up to 18x faster training per epoch and up to 17x faster convergence per runtime compared to previous methods.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Using graph neural networks for large graphs is challenging since there is no
clear way of constructing mini-batches. To solve this, previous methods have
relied on sampling or graph clustering. While these approaches often lead to
good training convergence, they introduce significant overhead due to expensive
random data accesses and perform poorly during inference. In this work we
instead focus on model behavior during inference. We theoretically model batch
construction via maximizing the influence score of nodes on the outputs. This
formulation leads to optimal approximation of the output when we do not have
knowledge of the trained model. We call the resulting method influence-based
mini-batching (IBMB). IBMB accelerates inference by up to 130x compared to
previous methods that reach similar accuracy. Remarkably, with adaptive
optimization and the right training schedule IBMB can also substantially
accelerate training, thanks to precomputed batches and consecutive memory
accesses. This results in up to 18x faster training per epoch and up to 17x
faster convergence per runtime compared to previous methods.
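To make the idea of influence-based batch construction concrete, here is a minimal sketch. The abstract does not say how influence scores are computed, so the sketch assumes personalized PageRank (PPR) as a stand-in for influence. The function names `ppr_scores` and `build_batch`, the `alpha` teleport value, and the fixed node `budget` are all hypothetical choices for illustration and not the authors' implementation.

```python
import numpy as np


def ppr_scores(adj, root, alpha=0.25, iters=50):
    """Approximate personalized PageRank from `root` by power iteration.

    Assumption: PPR is used here as a proxy for a node's influence on `root`.
    """
    n = adj.shape[0]
    deg = adj.sum(axis=1).astype(float)
    deg[deg == 0] = 1.0                 # avoid division by zero for isolated nodes
    trans = adj / deg[:, None]          # row-stochastic transition matrix
    teleport = np.zeros(n)
    teleport[root] = 1.0
    pi = teleport.copy()
    for _ in range(iters):
        pi = alpha * teleport + (1 - alpha) * (trans.T @ pi)
    return pi


def build_batch(adj, output_nodes, budget, alpha=0.25):
    """Pick the `budget` nodes with the highest summed score for the outputs."""
    total = sum(ppr_scores(adj, r, alpha) for r in output_nodes)
    total[list(output_nodes)] = np.inf  # output nodes are always kept
    chosen = np.argsort(-total)[:budget]
    return np.sort(chosen)


# Toy path graph 0-1-2-3-4-5 with node 0 as the only output node.
path = np.zeros((6, 6))
for i in range(5):
    path[i, i + 1] = path[i + 1, i] = 1.0

batch = build_batch(path, output_nodes=[0], budget=3)
print(batch)
```

Because the batches depend only on the graph and the chosen output nodes, they can be computed once and stored contiguously, which is in line with the precomputed batches and consecutive memory accesses mentioned in the abstract.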
Related papers
- Truncated Consistency Models [57.50243901368328]
Training consistency models requires learning to map all intermediate points along PF ODE trajectories to their corresponding endpoints.
We empirically find that this training paradigm limits the one-step generation performance of consistency models.
We propose a new parameterization of the consistency function and a two-stage training procedure that prevents the truncated-time training from collapsing to a trivial solution.
arXiv Detail & Related papers (2024-10-18T22:38:08Z)
- CDFGNN: a Systematic Design of Cache-based Distributed Full-Batch Graph Neural Network Training with Communication Reduction [7.048300785744331]
Graph neural network training is mainly categorized into mini-batch and full-batch training methods.
In the distributed cluster, frequent remote accesses of features and gradients lead to huge communication overhead.
We introduce the cache-based distributed full-batch graph neural network training framework (CDFGNN).
Our results indicate that CDFGNN has great potential in accelerating distributed full-batch GNN training tasks.
arXiv Detail & Related papers (2024-08-01T01:57:09Z)
- Boosting Low-Data Instance Segmentation by Unsupervised Pre-training with Saliency Prompt [103.58323875748427]
This work offers a novel unsupervised pre-training solution for low-data regimes.
Inspired by the recent success of the Prompting technique, we introduce a new pre-training method that boosts QEIS models.
Experimental results show that our method significantly boosts several QEIS models on three datasets.
arXiv Detail & Related papers (2023-02-02T15:49:03Z)
- Prior-mean-assisted Bayesian optimization application on FRIB Front-End tunning [61.78406085010957]
We exploit a neural network model trained over historical data as a prior mean of BO for FRIB Front-End tuning.
arXiv Detail & Related papers (2022-11-11T18:34:15Z)
- Towards Sparsification of Graph Neural Networks [9.568566305616656]
We use two state-of-the-art model compression methods, (1) train and prune and (2) sparse training, for the sparsification of weight layers in GNNs.
We evaluate and compare the efficiency of both methods in terms of accuracy, training sparsity, and training FLOPs on real-world graphs.
arXiv Detail & Related papers (2022-09-11T01:39:29Z)
- Simpler is Better: off-the-shelf Continual Learning Through Pretrained Backbones [0.0]
We propose a baseline (off-the-shelf) for Continual Learning of Computer Vision problems.
We exploit the power of pretrained models to compute a class prototype and fill a memory bank.
We compare our pipeline with common CNN models and show the superiority of Vision Transformers.
arXiv Detail & Related papers (2022-05-03T16:03:46Z)
- Scaling Knowledge Graph Embedding Models [12.757685697180946]
We propose a new method for scaling training of knowledge graph embedding models for link prediction.
Our scaling solution for GNN-based knowledge graph embedding models achieves a 16x speed up on benchmark datasets.
arXiv Detail & Related papers (2022-01-08T08:34:52Z)
- Combining Label Propagation and Simple Models Out-performs Graph Neural Networks [52.121819834353865]
We show that for many standard transductive node classification benchmarks, we can exceed or match the performance of state-of-the-art GNNs.
We call this overall procedure Correct and Smooth (C&S).
Our approach exceeds or nearly matches the performance of state-of-the-art GNNs on a wide variety of benchmarks.
arXiv Detail & Related papers (2020-10-27T02:10:52Z)
- Accurate, Efficient and Scalable Training of Graph Neural Networks [9.569918335816963]
Graph Neural Networks (GNNs) are powerful deep learning models to generate node embeddings on graphs.
It is still challenging to perform training in an efficient and scalable way.
We propose a novel parallel training framework that reduces training workload by orders of magnitude compared with state-of-the-art minibatch methods.
arXiv Detail & Related papers (2020-10-05T22:06:23Z)
- Extrapolation for Large-batch Training in Deep Learning [72.61259487233214]
We show that a host of variations can be covered in a unified framework that we propose.
We prove the convergence of this novel scheme and rigorously evaluate its empirical performance on ResNet, LSTM, and Transformer.
arXiv Detail & Related papers (2020-06-10T08:22:41Z)
- Improving Semantic Segmentation via Self-Training [75.07114899941095]
We show that we can obtain state-of-the-art results using a semi-supervised approach, specifically a self-training paradigm.
We first train a teacher model on labeled data, and then generate pseudo labels on a large set of unlabeled data.
Our robust training framework can digest human-annotated and pseudo labels jointly and achieve top performances on Cityscapes, CamVid and KITTI datasets.
arXiv Detail & Related papers (2020-04-30T17:09:17Z)
This list is automatically generated from the titles and abstracts of the papers in this site.