Accelerate Model Parallel Training by Using Efficient Graph Traversal Order in Device Placement
- URL: http://arxiv.org/abs/2201.09676v1
- Date: Fri, 21 Jan 2022 09:27:48 GMT
- Title: Accelerate Model Parallel Training by Using Efficient Graph Traversal Order in Device Placement
- Authors: Tianze Wang, Amir H. Payberah, Desta Haileselassie Hagos, Vladimir Vlassov
- Abstract summary: Modern neural networks require long training to reach decent performance on massive datasets.
One common approach to speed up training is model parallelization, where large neural networks are split across multiple devices.
Most of the existing device placement solutions treat the problem as sequential decision-making.
- Score: 1.577134752543077
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Modern neural networks require long training to reach decent performance on
massive datasets. One common approach to speeding up training is model
parallelization, where large neural networks are split across multiple devices.
However, different device placements of the same neural network lead to
different training times. Most existing device placement solutions treat
the problem as sequential decision-making: they traverse neural network graphs
and assign their neurons to different devices. This work studies the impact
of graph traversal order on device placement. In particular, we empirically
study how different graph traversal orders lead to different device placements,
which in turn affect the training execution time. Our experimental results show
that the best graph traversal order depends on the type of neural network and
the features of its computation graph. We also provide recommendations on
choosing the graph traversal order in device placement for various neural
network families to improve the training time under model parallelization.
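To make the sequential decision-making concrete, here is a minimal sketch (the toy graph, costs, and greedy least-loaded assignment are illustrative assumptions, not the paper's algorithm) showing how BFS and DFS traversal orders over the same computation graph can produce different device placements:

```python
# Hedged sketch: traverse a toy computation graph in two orders and assign
# each operation to the currently least-loaded of two devices. Different
# traversal orders can yield different placements, hence different runtimes.
from collections import deque

# Toy computation graph: op -> successor ops, with per-op compute costs.
graph = {"in": ["a", "b"], "a": ["c"], "b": ["c"], "c": ["out"], "out": []}
cost = {"in": 1, "a": 4, "b": 2, "c": 3, "out": 1}

def bfs_order(g, root):
    seen, order, queue = {root}, [], deque([root])
    while queue:
        node = queue.popleft()
        order.append(node)
        for nxt in g[node]:
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return order

def dfs_order(g, root):
    seen, order, stack = set(), [], [root]
    while stack:
        node = stack.pop()
        if node in seen:
            continue
        seen.add(node)
        order.append(node)
        stack.extend(reversed(g[node]))
    return order

def greedy_place(order, n_devices=2):
    load = [0.0] * n_devices
    placement = {}
    for op in order:  # sequential decision-making over the traversal order
        dev = min(range(n_devices), key=load.__getitem__)
        placement[op] = dev
        load[dev] += cost[op]
    return placement

for name, order in [("BFS", bfs_order(graph, "in")),
                    ("DFS", dfs_order(graph, "in"))]:
    print(name, order, greedy_place(order))
```

Running this prints two different operation-to-device mappings, which is exactly the effect the paper measures on real networks.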
Related papers
- Communication-Free Distributed GNN Training with Vertex Cut [63.22674903170953]
CoFree-GNN is a novel distributed GNN training framework that significantly speeds up the training process by implementing communication-free training.
We demonstrate that CoFree-GNN speeds up the GNN training process by up to 10 times over the existing state-of-the-art GNN training approaches.
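A tiny sketch of the vertex-cut idea behind this design (the round-robin edge split and full replication below are illustrative assumptions, not CoFree-GNN's actual partitioner): edges are split across workers and any vertex touched by edges in a partition is replicated there, so each worker can aggregate locally without communication.

```python
# Hedged sketch of vertex-cut partitioning with vertex replication.
edges = [(0, 1), (0, 2), (1, 2), (2, 3), (3, 4), (4, 0)]

def vertex_cut(edges, n_parts):
    parts = [[] for _ in range(n_parts)]
    for i, e in enumerate(edges):      # naive round-robin edge assignment
        parts[i % n_parts].append(e)
    # Each partition replicates every vertex incident to its edges, so
    # local aggregation needs no cross-worker messages.
    replicas = [sorted({v for e in p for v in e}) for p in parts]
    return parts, replicas

parts, replicas = vertex_cut(edges, n_parts=2)
for p, r in zip(parts, replicas):
    print("edges:", p, "replicated vertices:", r)
```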
arXiv Detail & Related papers (2023-08-06T21:04:58Z)
- Intelligence Processing Units Accelerate Neuromorphic Learning [52.952192990802345]
Spiking neural networks (SNNs) have achieved orders-of-magnitude improvements in energy consumption and latency.
We present an IPU-optimized release of our custom SNN Python package, snnTorch.
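For reference, a minimal snnTorch usage sketch on CPU, simulating a leaky integrate-and-fire neuron over a few time steps (this shows the standard package interface; the paper's IPU-specific optimizations are not reflected here):

```python
import torch
import snntorch as snn

lif = snn.Leaky(beta=0.9)           # leaky integrate-and-fire neuron
mem = lif.init_leaky()              # initial membrane potential state
inputs = torch.rand(10, 1) * 0.5    # 10 time steps of input current

for step in range(10):
    # Each call returns the output spike and the updated membrane state.
    spk, mem = lif(inputs[step], mem)
    print(int(spk.item()), round(float(mem.item()), 3))
```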
arXiv Detail & Related papers (2022-11-19T15:44:08Z)
- Multirate Training of Neural Networks [0.0]
We show that for various transfer learning applications in vision and NLP we can fine-tune deep neural networks in almost half the time.
We propose an additional multirate technique which can learn different features present in the data by training the full network on different time scales simultaneously.
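A hedged sketch of the multirate idea (the two-group split, toy quadratic loss, and step-size scaling below are illustrative, not the paper's algorithm): "fast" parameters take a gradient step every iteration, while "slow" parameters update only every k-th iteration, i.e. on a coarser time scale.

```python
# Hedged sketch: two parameter groups trained at different rates.
import numpy as np

rng = np.random.default_rng(0)
w_fast, w_slow = rng.normal(size=4), rng.normal(size=4)
x = rng.normal(size=4)              # toy target

def loss_grads(wf, ws):
    # Gradients of a toy quadratic loss for each parameter group.
    return 2 * (wf - x), 2 * (ws - x)

lr, k = 0.1, 4
for step in range(100):
    g_fast, g_slow = loss_grads(w_fast, w_slow)
    w_fast -= lr * g_fast           # fine time scale: every step
    if step % k == 0:
        w_slow -= lr * k * g_slow   # coarse time scale: every k-th step
print(np.abs(w_fast - x).max(), np.abs(w_slow - x).max())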
arXiv Detail & Related papers (2021-06-20T22:44:55Z)
- Variational models for signal processing with Graph Neural Networks [3.5939555573102853]
This paper is devoted to signal processing on point-clouds by means of neural networks.
In this work, we investigate the use of variational models for such Graph Neural Networks to process signals on graphs for unsupervised learning.
arXiv Detail & Related papers (2021-03-30T13:31:11Z)
- ItNet: iterative neural networks with small graphs for accurate and efficient anytime prediction [1.52292571922932]
In this study, we introduce a class of network models that have a small memory footprint in terms of their computational graphs.
We show state-of-the-art results for semantic segmentation on the CamVid and Cityscapes datasets.
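A minimal sketch of anytime prediction with a small, weight-shared iterative graph (the toy architecture below is an illustrative stand-in, not ItNet itself): one small block is applied repeatedly, and a prediction head can be read out after any iteration, so computation can stop early with a usable prediction.

```python
# Hedged sketch: a weight-shared block iterated T times with anytime readouts.
import numpy as np

rng = np.random.default_rng(0)
W = rng.normal(scale=0.3, size=(8, 8))   # the single shared block
H = rng.normal(scale=0.3, size=(8, 3))   # shared prediction head
x = rng.normal(size=8)

h = x
for t in range(1, 5):                    # iterate the same small graph
    h = np.tanh(W @ h + x)               # residual-style reuse of the input
    logits = h @ H                       # anytime readout at iteration t
    print(f"iteration {t}: prediction = {logits.argmax()}")
```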
arXiv Detail & Related papers (2021-01-21T15:56:29Z)
- Binary Graph Neural Networks [69.51765073772226]
Graph Neural Networks (GNNs) have emerged as a powerful and flexible framework for representation learning on irregular data.
In this paper, we present and evaluate different strategies for the binarization of graph neural networks.
We show that through careful design of the models, and control of the training process, binary graph neural networks can be trained at only a moderate cost in accuracy on challenging benchmarks.
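One common binarization strategy is sign quantization with a straight-through estimator (STE); the sketch below is an illustrative stand-in for the family of strategies the paper evaluates, not its code, and the dense toy adjacency is an assumption.

```python
# Hedged sketch: binarize weights to +/-1 in the forward pass, and pass
# gradients straight through (clipped near zero) in the backward pass.
import torch

class BinarizeSTE(torch.autograd.Function):
    @staticmethod
    def forward(ctx, w):
        ctx.save_for_backward(w)
        return torch.sign(w)                       # +/-1 weights

    @staticmethod
    def backward(ctx, grad_out):
        (w,) = ctx.saved_tensors
        return grad_out * (w.abs() <= 1).float()   # straight-through estimate

w = torch.randn(4, 4, requires_grad=True)
adj = (torch.rand(4, 4) > 0.5).float()             # toy dense adjacency
x = torch.randn(4, 4)                              # toy node features

out = adj @ x @ BinarizeSTE.apply(w)               # binarized GNN-style layer
out.sum().backward()
print(w.grad)                                      # gradients reach real w
```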
arXiv Detail & Related papers (2020-12-31T18:48:58Z)
- Graph-Based Neural Network Models with Multiple Self-Supervised Auxiliary Tasks [79.28094304325116]
Graph Convolutional Networks are among the most promising approaches for capturing relationships among structured data points.
We propose three novel self-supervised auxiliary tasks to train graph-based neural network models in a multi-task fashion.
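The multi-task setup amounts to optimizing a weighted sum of the main loss and the auxiliary losses; a minimal sketch (the task values and weights below are placeholders, not the paper's specific tasks):

```python
# Hedged sketch: combine a main loss with weighted self-supervised terms.
def total_loss(main_loss, aux_losses, weights):
    return main_loss + sum(w * l for w, l in zip(weights, aux_losses))

# e.g. main node-classification loss plus two auxiliary objectives
print(total_loss(0.9, aux_losses=[0.4, 0.7], weights=[0.5, 0.25]))
```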
arXiv Detail & Related papers (2020-11-14T11:09:51Z)
- Graph Structure of Neural Networks [104.33754950606298]
We show how the graph structure of neural networks affects their predictive performance.
A "sweet spot" of relational graphs leads to neural networks with significantly improved predictive performance.
Top-performing neural networks have graph structure surprisingly similar to those of real biological neural networks.
arXiv Detail & Related papers (2020-07-13T17:59:31Z)
- Graph Neural Networks and Reinforcement Learning for Behavior Generation in Semantic Environments [3.1410342959104725]
We propose combining graph neural networks with actor-critic reinforcement learning.
As graph neural networks apply the same network to every vehicle, they are invariant to the number and order of vehicles.
We show that graph neural networks are capable of handling scenarios with a varying number and order of vehicles during training and application.
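A minimal sketch of why a shared per-vehicle network plus an order-independent aggregation gives this invariance (illustrative only; not the paper's actor-critic architecture):

```python
# Hedged sketch: same weights applied to every vehicle, then mean pooling.
import numpy as np

rng = np.random.default_rng(0)
W = rng.normal(size=(3, 5))              # one network shared by all vehicles

def encode(vehicles):
    feats = np.tanh(vehicles @ W)        # identical transform per vehicle
    return feats.mean(axis=0)            # permutation-invariant pooling

cars = rng.normal(size=(4, 3))           # 4 vehicles, 3 features each
shuffled = cars[rng.permutation(4)]
print(np.allclose(encode(cars), encode(shuffled)))  # True: order-invariant
```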
arXiv Detail & Related papers (2020-06-22T19:24:52Z)
- Analyzing Neural Networks Based on Random Graphs [77.34726150561087]
We perform a massive evaluation of neural networks with architectures corresponding to random graphs of various types.
We find that none of the classical numerical graph invariants by itself allows us to single out the best networks.
We also find that networks with primarily short-range connections perform better than networks which allow for many long-range connections.
arXiv Detail & Related papers (2020-02-19T11:04:49Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed content (including all information) and is not responsible for any consequences.