PGT-I: Scaling Spatiotemporal GNNs with Memory-Efficient Distributed Training
- URL: http://arxiv.org/abs/2507.11683v2
- Date: Sun, 20 Jul 2025 18:40:27 GMT
- Title: PGT-I: Scaling Spatiotemporal GNNs with Memory-Efficient Distributed Training
- Authors: Seth Ockerman, Amal Gueroudji, Tanwi Mallick, Yixuan He, Line Pouchard, Robert Ross, Shivaram Venkataraman
- Abstract summary: We present PyTorch Geometric Temporal Index (PGT-I), an extension to PyTorch Geometric Temporal for spatiotemporal graph neural networks (ST-GNNs). PGT-I integrates distributed data parallel training and two strategies: index-batching and distributed-index-batching. Our techniques enable the first-ever training of an ST-GNN on the entire PeMS dataset without graph partitioning.
- Score: 5.495404608974733
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Spatiotemporal graph neural networks (ST-GNNs) are powerful tools for modeling spatial and temporal data dependencies. However, their applications have been limited primarily to small-scale datasets because of memory constraints. While distributed training offers a solution, current frameworks lack support for spatiotemporal models and overlook the properties of spatiotemporal data. Informed by a scaling study on a large-scale workload, we present PyTorch Geometric Temporal Index (PGT-I), an extension to PyTorch Geometric Temporal that integrates distributed data parallel training and two novel strategies: index-batching and distributed-index-batching. Our index techniques exploit spatiotemporal structure to construct snapshots dynamically at runtime, significantly reducing memory overhead, while distributed-index-batching extends this approach by enabling scalable processing across multiple GPUs. Our techniques enable the first-ever training of an ST-GNN on the entire PeMS dataset without graph partitioning, reducing peak memory usage by up to 89% and achieving up to an 11.78x speedup over standard DDP with 128 GPUs.
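At the heart of index-batching is the observation that sliding-window snapshots of a spatiotemporal tensor overlap almost entirely, so they can be materialized lazily from one shared [T, N, F] array instead of being precomputed. A minimal PyTorch sketch of the idea; the class and parameter names are illustrative and do not reflect PGT-I's actual API:

```python
import torch
from torch.utils.data import Dataset

class IndexBatchedWindows(Dataset):
    """Build sliding-window snapshots lazily by indexing into a single
    shared [T, N, F] tensor, instead of materializing every window."""

    def __init__(self, data: torch.Tensor, window: int, horizon: int):
        self.data = data          # [T, N, F], stored exactly once
        self.window = window      # input sequence length
        self.horizon = horizon    # prediction length

    def __len__(self):
        return self.data.shape[0] - self.window - self.horizon + 1

    def __getitem__(self, i):
        # Slices are views of the shared tensor: no per-window copies.
        x = self.data[i : i + self.window]
        y = self.data[i + self.window : i + self.window + self.horizon]
        return x, y
```

Distributed-index-batching would then amount to sharding the window indices across DDP ranks, e.g. by wrapping this dataset in a torch.utils.data.DistributedSampler, so each GPU constructs only its own snapshots at runtime.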
Related papers
- Enhanced Soups for Graph Neural Networks [5.242305867893238]
"souping" individually trained Graph Neural Networks (GNNs) can improve performance without increasing compute and memory costs during inference.<n>We introduce Learned Souping for GNNs, a gradient-descent-based souping strategy that substantially reduces time and memory overhead.<n>We also propose Partition Learned Souping, a novel partition-based variant of learned souping that significantly reduces memory usage.
arXiv Detail & Related papers (2025-03-14T17:29:27Z) - ST-FiT: Inductive Spatial-Temporal Forecasting with Limited Training Data [59.78770412981611]
In real-world applications, most nodes may not possess any available temporal data during training. We propose a principled framework named ST-FiT to handle this problem.
arXiv Detail & Related papers (2024-12-14T17:51:29Z) - Pre-Training Identification of Graph Winning Tickets in Adaptive Spatial-Temporal Graph Neural Networks [5.514795777097036]
We introduce the concept of the Graph Winning Ticket (GWT), derived from the Lottery Ticket Hypothesis (LTH).
By adopting a pre-determined star topology as a GWT prior to training, we balance edge reduction with efficient information propagation.
Our approach enables training ASTGNNs on the largest-scale spatial-temporal dataset using a single A6000 GPU equipped with 48 GB of memory.
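The memory saving comes from the topology itself: a star over N nodes has O(N) edges where a dense adaptive adjacency has O(N^2). A small sketch of such a pre-determined star adjacency; the hub choice and symmetric normalization are common-practice assumptions, not necessarily the paper's:

```python
import torch

def star_adjacency(num_nodes: int, hub: int = 0) -> torch.Tensor:
    """Star topology as a fixed graph prior: every node connects only
    to one hub node, plus self-loops, symmetrically normalized."""
    A = torch.zeros(num_nodes, num_nodes)
    A[hub, :] = 1.0
    A[:, hub] = 1.0
    A.fill_diagonal_(1.0)
    d_inv_sqrt = A.sum(dim=1).pow(-0.5)
    return d_inv_sqrt[:, None] * A * d_inv_sqrt[None, :]
```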
arXiv Detail & Related papers (2024-06-12T14:53:23Z) - Mending of Spatio-Temporal Dependencies in Block Adjacency Matrix [3.529869282529924]
We propose a novel end-to-end learning architecture designed to mend the temporal dependencies, resulting in a well-connected graph.
Our methodology demonstrates superior performance on benchmark datasets, such as SurgVisDom and C2D2.
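The underlying representation stacks per-frame graphs into one block-diagonal adjacency, which by itself leaves consecutive frames disconnected; "mending" adds the missing inter-frame edges. A toy sketch that wires each node to itself in the next frame with a fixed weight, whereas the paper learns these connections end-to-end:

```python
import torch

def mended_block_adjacency(frames, inter_weight=1.0):
    """Stack T per-frame [N, N] adjacencies into a [T*N, T*N]
    block-diagonal matrix, then connect node i at time t to
    node i at time t+1 to restore temporal dependencies."""
    T, N = len(frames), frames[0].shape[0]
    A = torch.block_diag(*frames)
    idx = torch.arange((T - 1) * N)
    A[idx, idx + N] = inter_weight   # forward temporal edge
    A[idx + N, idx] = inter_weight   # backward temporal edge
    return A
```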
arXiv Detail & Related papers (2023-10-04T06:42:33Z) - Communication-Free Distributed GNN Training with Vertex Cut [63.22674903170953]
CoFree-GNN is a novel distributed GNN training framework that significantly speeds up the training process by implementing communication-free training.
We demonstrate that CoFree-GNN speeds up the GNN training process by up to 10 times over the existing state-of-the-art GNN training approaches.
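Vertex cut is what makes the communication-free setup possible: edges are assigned to partitions and shared endpoint vertices are duplicated, so every partition is a self-contained subgraph. A simplified sketch with hash-based edge assignment; CoFree-GNN's actual partitioner and its reweighting of duplicated nodes are more involved:

```python
def vertex_cut_partition(edges, num_parts):
    """Assign each edge to one partition and replicate endpoint
    vertices wherever their edges land, so each partition can be
    trained on without cross-partition message passing."""
    parts = [[] for _ in range(num_parts)]
    for u, v in edges:
        parts[(u * 1_000_003 + v) % num_parts].append((u, v))
    vertex_sets = [sorted({x for e in p for x in e}) for p in parts]
    return parts, vertex_sets
```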
arXiv Detail & Related papers (2023-08-06T21:04:58Z) - Temporal Aggregation and Propagation Graph Neural Networks for Dynamic
Representation [67.26422477327179]
Temporal graphs exhibit dynamic interactions between nodes over continuous time.
We propose a novel method of temporal graph convolution over the whole neighborhood.
Our proposed TAP-GNN outperforms existing temporal graph methods by a large margin in terms of both predictive performance and online inference latency.
arXiv Detail & Related papers (2023-04-15T08:17:18Z) - Fast Temporal Wavelet Graph Neural Networks [7.477634824955323]
We propose Fast Temporal Wavelet Graph Neural Networks (FTWGNN) for learning tasks on time-series data.
We employ Multiresolution Matrix Factorization (MMF) to factorize the highly dense graph structure and compute the corresponding sparse wavelet basis.
Experimental results on the real-world PEMS-BAY and METR-LA traffic datasets and the AJILE12 ECoG dataset show that FTWGNN is competitive with the state of the art.
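Once MMF has produced a sparse wavelet basis W, filtering reduces to an analysis/synthesis pair y = W diag(g) W^T x with a learnable gain per wavelet. A sketch of that step, taking the sparse basis as given (computing the MMF itself is the involved part):

```python
import torch

def wavelet_filter(x, W, g):
    """Spectral filtering in a sparse wavelet basis:
    y = W diag(g) W^T x, where W is a sparse [N, N] basis (e.g. from
    MMF), x is [N, F] node features, g is a learnable [N] gain."""
    coeffs = torch.sparse.mm(W.transpose(0, 1), x)  # analysis transform
    coeffs = g.unsqueeze(-1) * coeffs               # per-wavelet scaling
    return torch.sparse.mm(W, coeffs)               # synthesis transform
```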
arXiv Detail & Related papers (2023-02-17T01:21:45Z) - NumS: Scalable Array Programming for the Cloud [82.827921577004]
We present NumS, an array programming library which optimizes NumPy-like expressions on task-based distributed systems.
This is achieved through a novel scheduler called Load Simulated Hierarchical Scheduling (LSHS).
We show that LSHS enhances performance on Ray by decreasing network load by a factor of 2x, requiring 4x less memory, and reducing execution time by 10x on the logistic regression problem.
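The name describes the mechanism: the scheduler simulates the memory and network load each placement would create and greedily picks the device with the lowest projected cost. A toy sketch of load-simulated placement; the op fields and additive cost model are illustrative stand-ins for NumS's real cost model:

```python
def load_simulated_place(ops, num_devices):
    """Greedy placement driven by simulated per-device load: each op
    goes to the device whose projected memory + network cost is lowest,
    and the simulated state is updated after every decision."""
    mem = [0.0] * num_devices
    net = [0.0] * num_devices
    placement = {}
    for op in ops:  # op: {"id": ..., "bytes_out": ..., "bytes_moved": ...}
        costs = [mem[d] + net[d] + op["bytes_out"] + op["bytes_moved"]
                 for d in range(num_devices)]
        d = min(range(num_devices), key=costs.__getitem__)
        mem[d] += op["bytes_out"]
        net[d] += op["bytes_moved"]
        placement[op["id"]] = d
    return placement
```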
arXiv Detail & Related papers (2022-06-28T20:13:40Z) - Model-Architecture Co-Design for High Performance Temporal GNN Inference
on FPGA [5.575293536755127]
Real-world applications require high performance inference on real-time streaming dynamic graphs.
We present a novel model-architecture co-design for inference in memory-based TGNNs on FPGAs.
We train our simplified models using knowledge distillation to ensure similar accuracy vis-à-vis the original model.
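Knowledge distillation here follows the standard recipe: the simplified student matches the original teacher's softened outputs while still fitting the hard labels. A generic sketch of that loss; the temperature and mixing weight are illustrative defaults, not the paper's settings:

```python
import torch.nn.functional as F

def distill_loss(student_logits, teacher_logits, labels, T=2.0, alpha=0.5):
    """Blend of soft-target KL (teacher at temperature T) and the
    usual hard-label cross-entropy; T*T rescales soft gradients."""
    soft = F.kl_div(F.log_softmax(student_logits / T, dim=-1),
                    F.softmax(teacher_logits / T, dim=-1),
                    reduction="batchmean") * (T * T)
    hard = F.cross_entropy(student_logits, labels)
    return alpha * soft + (1.0 - alpha) * hard
```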
arXiv Detail & Related papers (2022-03-10T00:24:47Z) - Accelerating Training and Inference of Graph Neural Networks with Fast
Sampling and Pipelining [58.10436813430554]
Mini-batch training of graph neural networks (GNNs) requires a lot of computation and data movement.
We argue in favor of performing mini-batch training with neighborhood sampling in a distributed multi-GPU environment.
We present a sequence of improvements to mitigate these bottlenecks, including a performance-engineered neighborhood sampler.
We also conduct an empirical analysis that supports the use of sampling for inference, showing that test accuracies are not materially compromised.
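Neighborhood sampling bounds the work per mini-batch: starting from the seed nodes, each GNN layer keeps only a fixed fanout of sampled neighbors rather than the full k-hop neighborhood. A plain-Python sketch of layer-wise sampling; the paper's contribution is a performance-engineered version of this idea:

```python
import random

def sample_neighborhood(adj, seeds, fanouts):
    """Layer-wise neighbor sampling: adj maps node -> neighbor list,
    seeds are the mini-batch targets, fanouts gives one sample budget
    per GNN layer. Returns the sampled edges per layer."""
    layers, frontier = [], set(seeds)
    for fanout in fanouts:
        sampled = []
        for u in frontier:
            nbrs = adj[u]
            picks = nbrs if len(nbrs) <= fanout else random.sample(nbrs, fanout)
            sampled.extend((u, v) for v in picks)
        layers.append(sampled)
        frontier = {v for _, v in sampled}
    return layers
```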
arXiv Detail & Related papers (2021-10-16T02:41:35Z) - Spatio-Temporal Graph Scattering Transform [54.52797775999124]
Graph neural networks may be impractical in some real-world scenarios due to a lack of sufficient high-quality training data.
We put forth a novel mathematically designed framework to analyze spatio-temporal data.
arXiv Detail & Related papers (2020-12-06T19:49:55Z)