Related papers: A structure-aware framework for learning device placements on computation graphs

URL: http://arxiv.org/abs/2405.14185v1
Date: Thu, 23 May 2024 05:29:29 GMT
Title: A structure-aware framework for learning device placements on computation graphs
Authors: Shukai Duan, Heng Ping, Nikos Kanakaris, Xiongye Xiao, Peiyu Zhang, Panagiotis Kyriakis, Nesreen K. Ahmed, Guixiang Ma, Mihai Capota, Shahin Nazarian, Theodore L. Willke, Paul Bogdan
Abstract summary: We propose a novel framework for the task of device placement, relying on smaller graphs extracted from the OpenVINO toolkit. The framework consists of five steps, including graph coarsening, node representation learning and policy optimization. We demonstrate the flexibility and effectiveness of our approach through multiple experiments with three benchmark models.
Score: 15.282882425920064
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Existing approaches for device placement ignore the topological features of computation graphs and rely mostly on heuristic methods for graph partitioning. At the same time, they either follow a grouper-placer or an encoder-placer architecture, which requires understanding the interaction structure between code operations. To bridge the gap between encoder-placer and grouper-placer techniques, we propose a novel framework for the task of device placement, relying on smaller computation graphs extracted from the OpenVINO toolkit using reinforcement learning. The framework consists of five steps, including graph coarsening, node representation learning and policy optimization. It facilitates end-to-end training and takes into consideration the directed and acyclic nature of the computation graphs. We also propose a model variant, inspired by graph parsing networks and complex network analysis, enabling graph representation learning and personalized graph partitioning jointly, using an unspecified number of groups. To train the entire framework, we utilize reinforcement learning techniques by employing the execution time of the suggested device placements to formulate the reward. We demonstrate the flexibility and effectiveness of our approach through multiple experiments with three benchmark models, namely Inception-V3, ResNet, and BERT. The robustness of the proposed framework is also highlighted through an ablation study. The suggested placements improve the inference speed for the benchmark models by up to $58.2\%$ over CPU execution and by up to $60.24\%$ compared to other commonly used baselines.

Related papers

FedGraph: A Research Library and Benchmark for Federated Graph Learning [40.257355007504074]
We introduce FedGraph, a research library built for practical distributed deployment and benchmarking in federated graph learning. FedGraph supports a range of state-of-the-art graph learning methods and includes built-in profiling tools to evaluate system performance. We demonstrate the first privacy-preserving federated learning system to run on graphs with 100 million nodes.
arXiv Detail & Related papers (2024-10-08T20:18:18Z)
Amplify Graph Learning for Recommendation via Sparsity Completion [16.32861024767423]
Graph learning models have been widely deployed in collaborative filtering (CF) based recommendation systems. Due to the issue of data sparsity, the graph structure of the original input lacks potential positive preference edges. We propose an Amplify Graph Learning framework based on Sparsity Completion (called AGL-SC).
arXiv Detail & Related papers (2024-06-27T08:26:20Z)
Graph Transformer GANs with Graph Masked Modeling for Architectural Layout Generation [153.92387500677023]
We present a novel graph Transformer generative adversarial network (GTGAN) to learn effective graph node relations. The proposed graph Transformer encoder combines graph convolutions and self-attentions in a Transformer to model both local and global interactions. We also propose a novel self-guided pre-training method for graph representation learning.
arXiv Detail & Related papers (2024-01-15T14:36:38Z)
An FPGA-Based Accelerator for Graph Embedding using Sequential Training Algorithm [0.8403582577557918]
We propose to combine an online sequential training algorithm with node2vec to handle the changes of graph structures after the deployment. The proposed FPGA implementation achieves up to 205.25 times speedup compared to the original model on ARM Cortex-A53 CPU.
arXiv Detail & Related papers (2023-12-23T02:24:59Z)
Bures-Wasserstein Means of Graphs [60.42414991820453]
We propose a novel framework for defining a graph mean via embeddings in the space of smooth graph signal distributions. By finding a mean in this embedding space, we can recover a mean graph that preserves structural information. We establish the existence and uniqueness of the novel graph mean, and provide an iterative algorithm for computing it.
arXiv Detail & Related papers (2023-05-31T11:04:53Z)
Fisher Information Embedding for Node and Graph Learning [5.263910852465186]
We propose a novel attention-based node embedding framework for graphs. Our framework builds upon a hierarchical kernel for multisets of subgraphs around nodes. We provide theoretical insights into generalizability and expressivity of our embeddings.
arXiv Detail & Related papers (2023-05-12T16:15:30Z)
End-to-end Mapping in Heterogeneous Systems Using Graph Representation Learning [13.810753108848582]
We propose a unified, end-to-end, programmable graph representation learning framework. It is capable of mining the complexity of high-level programs down to the universal intermediate representation, extracting the specific computational patterns and predicting which code segments would run best on a specific core. In the evaluation, we demonstrate a maximum speedup of 6.42x compared to the thread-based execution, and 2.02x compared to the state-of-the-art technique.
arXiv Detail & Related papers (2022-04-25T22:13:13Z)
Joint Graph Learning and Matching for Semantic Feature Correspondence [69.71998282148762]
We propose a joint graph learning and matching network, named GLAM, to explore reliable graph structures for boosting graph matching. The proposed method is evaluated on three popular visual matching benchmarks (Pascal VOC, Willow Object and SPair-71k). It outperforms previous state-of-the-art graph matching methods by significant margins on all benchmarks.
arXiv Detail & Related papers (2021-09-01T08:24:02Z)
Effective and Efficient Graph Learning for Multi-view Clustering [173.8313827799077]
We propose an effective and efficient graph learning model for multi-view clustering. Our method exploits the similarity between graphs of different views by minimizing the tensor Schatten p-norm. Our proposed algorithm is time-economical, obtains stable results, and scales well with the data size.
arXiv Detail & Related papers (2021-08-15T13:14:28Z)
Group Contrastive Self-Supervised Learning on Graphs [101.45974132613293]
We study self-supervised learning on graphs using contrastive methods. We argue that contrasting graphs in multiple subspaces enables graph encoders to capture more abundant characteristics.
arXiv Detail & Related papers (2021-07-20T22:09:21Z)
A Robust and Generalized Framework for Adversarial Graph Embedding [73.37228022428663]
We propose a robust framework for adversarial graph embedding, named AGE. AGE generates the fake neighbor nodes as the enhanced negative samples from the implicit distribution. Based on this framework, we propose three models to handle three types of graph data.
arXiv Detail & Related papers (2021-05-22T07:05:48Z)
GraphOpt: Learning Optimization Models of Graph Formation [72.75384705298303]
We propose an end-to-end framework that learns an implicit model of graph structure formation and discovers an underlying optimization mechanism. The learned objective can serve as an explanation for the observed graph properties, thereby lending itself to transfer across different graphs within a domain. GraphOpt poses link formation in graphs as a sequential decision-making process and solves it using a maximum entropy inverse reinforcement learning algorithm.
arXiv Detail & Related papers (2020-07-07T16:51:39Z)
Ramanujan Bipartite Graph Products for Efficient Block Sparse Neural Networks [2.4235475271758076]
We propose a framework for generating structured multi-level block sparse neural networks by using the theory of graph products. We also propose to use products of Ramanujan graphs, which give the best connectivity for a given level of sparsity. We benchmark our approach by experimenting on an image classification task over the CIFAR dataset using VGG19 and WideResnet-40-4 networks.
arXiv Detail & Related papers (2020-06-24T05:08:17Z)
Heuristic Semi-Supervised Learning for Graph Generation Inspired by Electoral College [80.67842220664231]
We propose a novel pre-processing technique, namely ELectoral COllege (ELCO), which automatically expands new nodes and edges to refine the label similarity within a dense subgraph. In all setups tested, our method boosts the average score of base models by a large margin of 4.7 points and consistently outperforms the state of the art.
arXiv Detail & Related papers (2020-06-10T14:48:48Z)
Embedding Graph Auto-Encoder for Graph Clustering [90.8576971748142]
Graph auto-encoder (GAE) models are based on semi-supervised graph convolution networks (GCN). We design a specific GAE-based model for graph clustering to be consistent with the theory, namely Embedding Graph Auto-Encoder (EGAE). EGAE consists of one encoder and dual decoders.
arXiv Detail & Related papers (2020-02-20T09:53:28Z)

This list is automatically generated from the titles and abstracts of the papers on this site.