A structure-aware framework for learning device placements on computation graphs
- URL: http://arxiv.org/abs/2405.14185v1
- Date: Thu, 23 May 2024 05:29:29 GMT
- Title: A structure-aware framework for learning device placements on computation graphs
- Authors: Shukai Duan, Heng Ping, Nikos Kanakaris, Xiongye Xiao, Peiyu Zhang, Panagiotis Kyriakis, Nesreen K. Ahmed, Guixiang Ma, Mihai Capota, Shahin Nazarian, Theodore L. Willke, Paul Bogdan,
- Abstract summary: We propose a novel framework for the task of device placement, relying on smaller graphs extracted from the OpenVINO toolkit.
The framework consists of five steps, including graph coarsening, node representation learning and policy optimization.
We demonstrate the flexibility and effectiveness of our approach through multiple experiments with three benchmark models.
- Score: 15.282882425920064
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Existing approaches for device placement ignore the topological features of computation graphs and rely mostly on heuristic methods for graph partitioning. At the same time, they either follow a grouper-placer or an encoder-placer architecture, which requires understanding the interaction structure between code operations. To bridge the gap between encoder-placer and grouper-placer techniques, we propose a novel framework for the task of device placement, relying on smaller computation graphs extracted from the OpenVINO toolkit using reinforcement learning. The framework consists of five steps, including graph coarsening, node representation learning and policy optimization. It facilitates end-to-end training and takes into consideration the directed and acyclic nature of the computation graphs. We also propose a model variant, inspired by graph parsing networks and complex network analysis, enabling graph representation learning and personalized graph partitioning jointly, using an unspecified number of groups. To train the entire framework, we utilize reinforcement learning techniques by employing the execution time of the suggested device placements to formulate the reward. We demonstrate the flexibility and effectiveness of our approach through multiple experiments with three benchmark models, namely Inception-V3, ResNet, and BERT. The robustness of the proposed framework is also highlighted through an ablation study. The suggested placements improve the inference speed for the benchmark models by up to $58.2\%$ over CPU execution and by up to $60.24\%$ compared to other commonly used baselines.
Related papers
- FedGraph: A Research Library and Benchmark for Federated Graph Learning [40.257355007504074]
We introduce FedGraph, a research library built for practical distributed deployment and benchmarking in federated graph learning.
FedGraph supports a range of state-of-the-art graph learning methods and includes built-in profiling tools to evaluate system performance.
We demonstrate the first privacy-preserving federated learning system to run on graphs with 100 million nodes.
arXiv Detail & Related papers (2024-10-08T20:18:18Z) - Bures-Wasserstein Means of Graphs [60.42414991820453]
We propose a novel framework for defining a graph mean via embeddings in the space of smooth graph signal distributions.
By finding a mean in this embedding space, we can recover a mean graph that preserves structural information.
We establish the existence and uniqueness of the novel graph mean, and provide an iterative algorithm for computing it.
arXiv Detail & Related papers (2023-05-31T11:04:53Z) - End-to-end Mapping in Heterogeneous Systems Using Graph Representation
Learning [13.810753108848582]
We propose a unified, end-to-end, programmable graph representation learning framework.
It is capable of mining the complexity of high-level programs down to the universal intermediate representation, extracting the specific computational patterns and predicting which code segments would run best on a specific core.
In the evaluation, we demonstrate a maximum speedup of 6.42x compared to the thread-based execution, and 2.02x compared to the state-of-the-art technique.
arXiv Detail & Related papers (2022-04-25T22:13:13Z) - Joint Graph Learning and Matching for Semantic Feature Correspondence [69.71998282148762]
We propose a joint emphgraph learning and matching network, named GLAM, to explore reliable graph structures for boosting graph matching.
The proposed method is evaluated on three popular visual matching benchmarks (Pascal VOC, Willow Object and SPair-71k)
It outperforms previous state-of-the-art graph matching methods by significant margins on all benchmarks.
arXiv Detail & Related papers (2021-09-01T08:24:02Z) - Effective and Efficient Graph Learning for Multi-view Clustering [173.8313827799077]
We propose an effective and efficient graph learning model for multi-view clustering.
Our method exploits the view-similar between graphs of different views by the minimization of tensor Schatten p-norm.
Our proposed algorithm is time-economical and obtains the stable results and scales well with the data size.
arXiv Detail & Related papers (2021-08-15T13:14:28Z) - Group Contrastive Self-Supervised Learning on Graphs [101.45974132613293]
We study self-supervised learning on graphs using contrastive methods.
We argue that contrasting graphs in multiple subspaces enables graph encoders to capture more abundant characteristics.
arXiv Detail & Related papers (2021-07-20T22:09:21Z) - A Robust and Generalized Framework for Adversarial Graph Embedding [73.37228022428663]
We propose a robust framework for adversarial graph embedding, named AGE.
AGE generates the fake neighbor nodes as the enhanced negative samples from the implicit distribution.
Based on this framework, we propose three models to handle three types of graph data.
arXiv Detail & Related papers (2021-05-22T07:05:48Z) - GraphOpt: Learning Optimization Models of Graph Formation [72.75384705298303]
We propose an end-to-end framework that learns an implicit model of graph structure formation and discovers an underlying optimization mechanism.
The learned objective can serve as an explanation for the observed graph properties, thereby lending itself to transfer across different graphs within a domain.
GraphOpt poses link formation in graphs as a sequential decision-making process and solves it using maximum entropy inverse reinforcement learning algorithm.
arXiv Detail & Related papers (2020-07-07T16:51:39Z) - Ramanujan Bipartite Graph Products for Efficient Block Sparse Neural
Networks [2.4235475271758076]
We propose framework for generating structured multi level block sparse neural networks by using the theory of Graph products.
We also propose to use products of Ramanujan graphs which gives the best connectivity for a given level of sparsity.
We benchmark our approach by experimenting on image classification task over CIFAR dataset using VGG19 and WideResnet-40-4 networks.
arXiv Detail & Related papers (2020-06-24T05:08:17Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.