A Structure-Aware Framework for Learning Device Placements on Computation Graphs
- URL: http://arxiv.org/abs/2405.14185v2
- Date: Sun, 12 Jan 2025 04:56:19 GMT
- Title: A Structure-Aware Framework for Learning Device Placements on Computation Graphs
- Authors: Shukai Duan, Heng Ping, Nikos Kanakaris, Xiongye Xiao, Panagiotis Kyriakis, Nesreen K. Ahmed, Peiyu Zhang, Guixiang Ma, Mihai Capota, Shahin Nazarian, Theodore L. Willke, Paul Bogdan,
- Abstract summary: We propose a novel framework for the task of device placement, relying on smaller computation graphs extracted from the OpenVINO toolkit.
The framework consists of five steps, including graph coarsening, node representation learning and policy optimization.
To train the entire framework, we use reinforcement learning using the execution time of the placement as a reward.
- Score: 15.282882425920064
- License:
- Abstract: Computation graphs are Directed Acyclic Graphs (DAGs) where the nodes correspond to mathematical operations and are used widely as abstractions in optimizations of neural networks. The device placement problem aims to identify optimal allocations of those nodes to a set of (potentially heterogeneous) devices. Existing approaches rely on two types of architectures known as grouper-placer and encoder-placer, respectively. In this work, we bridge the gap between encoder-placer and grouper-placer techniques and propose a novel framework for the task of device placement, relying on smaller computation graphs extracted from the OpenVINO toolkit. The framework consists of five steps, including graph coarsening, node representation learning and policy optimization. It facilitates end-to-end training and takes into account the DAG nature of the computation graphs. We also propose a model variant, inspired by graph parsing networks and complex network analysis, enabling graph representation learning and jointed, personalized graph partitioning, using an unspecified number of groups. To train the entire framework, we use reinforcement learning using the execution time of the placement as a reward. We demonstrate the flexibility and effectiveness of our approach through multiple experiments with three benchmark models, namely Inception-V3, ResNet, and BERT. The robustness of the proposed framework is also highlighted through an ablation study. The suggested placements improve the inference speed for the benchmark models by up to 58.2% over CPU execution and by up to 60.24% compared to other commonly used baselines.
Related papers
- Amplify Graph Learning for Recommendation via Sparsity Completion [16.32861024767423]
Graph learning models have been widely deployed in collaborative filtering (CF) based recommendation systems.
Due to the issue of data sparsity, the graph structure of the original input lacks potential positive preference edges.
We propose an Amplify Graph Learning framework based on Sparsity Completion (called AGL-SC)
arXiv Detail & Related papers (2024-06-27T08:26:20Z) - Graph Transformer GANs with Graph Masked Modeling for Architectural
Layout Generation [153.92387500677023]
We present a novel graph Transformer generative adversarial network (GTGAN) to learn effective graph node relations.
The proposed graph Transformer encoder combines graph convolutions and self-attentions in a Transformer to model both local and global interactions.
We also propose a novel self-guided pre-training method for graph representation learning.
arXiv Detail & Related papers (2024-01-15T14:36:38Z) - An FPGA-Based Accelerator for Graph Embedding using Sequential Training Algorithm [0.8403582577557918]
We propose to combine an online sequential training algorithm with node2vec to handle the changes of graph structures after the deployment.
The proposed FPGA implementation achieves up to 205.25 times speedup compared to the original model on ARM Cortex-A53 CPU.
arXiv Detail & Related papers (2023-12-23T02:24:59Z) - Bures-Wasserstein Means of Graphs [60.42414991820453]
We propose a novel framework for defining a graph mean via embeddings in the space of smooth graph signal distributions.
By finding a mean in this embedding space, we can recover a mean graph that preserves structural information.
We establish the existence and uniqueness of the novel graph mean, and provide an iterative algorithm for computing it.
arXiv Detail & Related papers (2023-05-31T11:04:53Z) - Fisher Information Embedding for Node and Graph Learning [5.263910852465186]
We propose a novel attention-based node embedding framework for graphs.
Our framework builds upon a hierarchical kernel for multisets of subgraphs around nodes.
We provide theoretical insights into generalizability and expressivity of our embeddings.
arXiv Detail & Related papers (2023-05-12T16:15:30Z) - End-to-end Mapping in Heterogeneous Systems Using Graph Representation
Learning [13.810753108848582]
We propose a unified, end-to-end, programmable graph representation learning framework.
It is capable of mining the complexity of high-level programs down to the universal intermediate representation, extracting the specific computational patterns and predicting which code segments would run best on a specific core.
In the evaluation, we demonstrate a maximum speedup of 6.42x compared to the thread-based execution, and 2.02x compared to the state-of-the-art technique.
arXiv Detail & Related papers (2022-04-25T22:13:13Z) - Group Contrastive Self-Supervised Learning on Graphs [101.45974132613293]
We study self-supervised learning on graphs using contrastive methods.
We argue that contrasting graphs in multiple subspaces enables graph encoders to capture more abundant characteristics.
arXiv Detail & Related papers (2021-07-20T22:09:21Z) - A Robust and Generalized Framework for Adversarial Graph Embedding [73.37228022428663]
We propose a robust framework for adversarial graph embedding, named AGE.
AGE generates the fake neighbor nodes as the enhanced negative samples from the implicit distribution.
Based on this framework, we propose three models to handle three types of graph data.
arXiv Detail & Related papers (2021-05-22T07:05:48Z) - Ramanujan Bipartite Graph Products for Efficient Block Sparse Neural
Networks [2.4235475271758076]
We propose framework for generating structured multi level block sparse neural networks by using the theory of Graph products.
We also propose to use products of Ramanujan graphs which gives the best connectivity for a given level of sparsity.
We benchmark our approach by experimenting on image classification task over CIFAR dataset using VGG19 and WideResnet-40-4 networks.
arXiv Detail & Related papers (2020-06-24T05:08:17Z) - Heuristic Semi-Supervised Learning for Graph Generation Inspired by
Electoral College [80.67842220664231]
We propose a novel pre-processing technique, namely ELectoral COllege (ELCO), which automatically expands new nodes and edges to refine the label similarity within a dense subgraph.
In all setups tested, our method boosts the average score of base models by a large margin of 4.7 points, as well as consistently outperforms the state-of-the-art.
arXiv Detail & Related papers (2020-06-10T14:48:48Z) - Embedding Graph Auto-Encoder for Graph Clustering [90.8576971748142]
Graph auto-encoder (GAE) models are based on semi-supervised graph convolution networks (GCN)
We design a specific GAE-based model for graph clustering to be consistent with the theory, namely Embedding Graph Auto-Encoder (EGAE)
EGAE consists of one encoder and dual decoders.
arXiv Detail & Related papers (2020-02-20T09:53:28Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.