AGO: Boosting Mobile AI Inference Performance by Removing Constraints on Graph Optimization
- URL: http://arxiv.org/abs/2212.01005v1
- Date: Fri, 2 Dec 2022 07:16:49 GMT
- Title: AGO: Boosting Mobile AI Inference Performance by Removing Constraints on Graph Optimization
- Authors: Zhiying Xu, Hongding Peng, Wei Wang
- Abstract summary: AGO is a framework for graph optimization with arbitrary structures to boost the inference performance of deep models.
We propose intensive operator fusion to stitch multiple complex operators together for better performance.
We show that our system can improve the inference performance by up to 3.3x when compared with state-of-the-art deep compilers.
- Score: 6.4284258345779435
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Traditional deep learning compilers rely on heuristics for subgraph
generation, which impose extra constraints on graph optimization, e.g., each
subgraph can only contain at most one complex operator. In this paper, we
propose AGO, a framework for graph optimization with arbitrary structures to
boost the inference performance of deep models by removing such constraints. To
create new optimization opportunities for complicated subgraphs, we propose
intensive operator fusion, which can effectively stitch multiple complex
operators together for better performance. Further, we design a graph
partitioning scheme that allows an arbitrary structure for each subgraph while
guaranteeing the acyclic property among all generated subgraphs. Additionally,
to enable efficient performance tuning on complicated subgraphs, we devise a
novel divide-and-conquer tuning mechanism to orchestrate different system
components. Through extensive experiments on various neural networks and mobile
devices, we show that our system can improve the inference performance by up to
3.3x when compared with state-of-the-art deep compilers.
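To make the acyclicity constraint from the abstract concrete, below is a minimal sketch (not AGO's actual partitioning algorithm; the function name, operator names, and the partition encoding are illustrative assumptions) of the invariant the partitioner must preserve: after operators are grouped into subgraphs, the subgraph-level "condensation" graph must still be a DAG, otherwise the subgraphs cannot be executed in any valid order.

```python
# Illustrative sketch only (not from the paper): verify that a partition of a
# computation DAG keeps the subgraph-level condensation graph acyclic.
from collections import defaultdict

def condensation_is_acyclic(edges, partition):
    """edges: (producer, consumer) operator pairs; partition: operator -> subgraph id."""
    succ = defaultdict(set)
    for u, v in edges:
        su, sv = partition[u], partition[v]
        if su != sv:                      # edges inside one subgraph are irrelevant
            succ[su].add(sv)
    nodes = set(partition.values())
    indeg = {n: 0 for n in nodes}
    for vs in succ.values():
        for v in vs:
            indeg[v] += 1
    ready = [n for n in nodes if indeg[n] == 0]
    seen = 0
    while ready:                          # Kahn's algorithm: count orderable nodes
        n = ready.pop()
        seen += 1
        for v in succ[n]:
            indeg[v] -= 1
            if indeg[v] == 0:
                ready.append(v)
    return seen == len(nodes)             # all ordered <=> no subgraph-level cycle

# conv feeds relu and (via a skip connection) matmul.
edges = [("conv", "relu"), ("relu", "matmul"), ("conv", "matmul")]
print(condensation_is_acyclic(edges, {"conv": 0, "relu": 1, "matmul": 1}))  # True
print(condensation_is_acyclic(edges, {"conv": 0, "relu": 1, "matmul": 0}))  # False
```

The second partition fails because subgraph 0 both feeds and depends on subgraph 1, creating a subgraph-level cycle; the first remains valid even though it stitches two operators into one subgraph, which is exactly the kind of grouping the one-complex-operator heuristic would forbid.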
Related papers
- GraphCroc: Cross-Correlation Autoencoder for Graph Structural Reconstruction [6.817416560637197]
Graph autoencoders (GAEs) reconstruct graph structures from node embeddings.
We introduce a cross-correlation mechanism that significantly enhances the representational capabilities of GAEs.
We also propose GraphCroc, a new GAE that supports flexible encoder architectures tailored for various downstream tasks.
arXiv Detail & Related papers (2024-10-04T12:59:45Z)
- Bayesian Optimization of Functions over Node Subsets in Graphs [14.670181702535825]
We propose a novel framework for optimization on graphs.
We map each $k$-node subset in the original graph to a node in a new graph.
Experiments under both synthetic and real-world setups demonstrate the effectiveness of the proposed Bayesian optimization (BO) framework.
arXiv Detail & Related papers (2024-05-24T00:24:55Z)
- A structure-aware framework for learning device placements on computation graphs [15.282882425920064]
We propose a novel framework for the task of device placement, relying on smaller graphs extracted from the OpenVINO toolkit.
The framework consists of five steps, including graph coarsening, node representation learning and policy optimization.
We demonstrate the flexibility and effectiveness of our approach through multiple experiments with three benchmark models.
arXiv Detail & Related papers (2024-05-23T05:29:29Z)
- From Hypergraph Energy Functions to Hypergraph Neural Networks [94.88564151540459]
We present an expressive family of parameterized, hypergraph-regularized energy functions.
We then demonstrate how minimizers of these energies effectively serve as node embeddings.
We draw parallels between the proposed bilevel hypergraph optimization, and existing GNN architectures in common use.
arXiv Detail & Related papers (2023-06-16T04:40:59Z)
- Performance Embeddings: A Similarity-based Approach to Automatic Performance Optimization [71.69092462147292]
Performance embeddings enable knowledge transfer of performance tuning between applications.
We demonstrate this transfer tuning approach on case studies in deep neural networks, dense and sparse linear algebra compositions, and numerical weather prediction stencils.
arXiv Detail & Related papers (2023-03-14T15:51:35Z)
- Hector: An Efficient Programming and Compilation Framework for Implementing Relational Graph Neural Networks in GPU Architectures [24.841128441671234]
RGNNs are graph neural networks with dedicated structures for modeling the different types of nodes and edges in heterogeneous graphs.
We propose Hector, a novel two-level intermediate representation and its code generator framework, to capture the key properties of RGNN models.
Hector achieves up to 9.9x speed-up in inference and 43.7x speed-up in training compared with the state-of-the-art public systems.
arXiv Detail & Related papers (2023-01-16T06:53:18Z)
- ALT: Breaking the Wall between Graph and Operator Level Optimizations for Deep Learning Compilation [38.8918502461244]
ALT is a compiler that performs joint graph- and operator-level optimizations for deep models.
ALT significantly outperforms state-of-the-art compilers (e.g., Ansor) in terms of both single operator performance and end-to-end inference performance.
arXiv Detail & Related papers (2022-10-22T11:09:36Z)
- Graph Contrastive Learning Automated [94.41860307845812]
Graph contrastive learning (GraphCL) has emerged with promising representation learning performance.
The effectiveness of GraphCL hinges on ad-hoc data augmentations, which have to be manually picked per dataset.
This paper proposes a unified bi-level optimization framework to automatically, adaptively and dynamically select data augmentations when performing GraphCL on specific graph data.
arXiv Detail & Related papers (2021-06-10T16:35:27Z)
- A Robust and Generalized Framework for Adversarial Graph Embedding [73.37228022428663]
We propose a robust framework for adversarial graph embedding, named AGE.
AGE generates the fake neighbor nodes as the enhanced negative samples from the implicit distribution.
Based on this framework, we propose three models to handle three types of graph data.
arXiv Detail & Related papers (2021-05-22T07:05:48Z)
- Counting Substructures with Higher-Order Graph Neural Networks: Possibility and Impossibility Results [58.277290855841976]
We study tradeoffs between the computational cost and the expressive power of Graph Neural Networks (GNNs).
We show that a new model can count subgraphs of size $k$, thereby overcoming a known limitation of low-order GNNs.
In several cases, the proposed algorithm can greatly reduce computational complexity compared to the existing higher-order $k$-GNNs.
arXiv Detail & Related papers (2020-12-06T03:42:54Z)
- Structured Convolutions for Efficient Neural Network Design [65.36569572213027]
We tackle model efficiency by exploiting redundancy in the implicit structure of the building blocks of convolutional neural networks.
We show how this decomposition can be applied to 2D and 3D kernels as well as the fully-connected layers.
arXiv Detail & Related papers (2020-08-06T04:38:38Z)