PerfSAGE: Generalized Inference Performance Predictor for Arbitrary Deep
Learning Models on Edge Devices
- URL: http://arxiv.org/abs/2301.10999v1
- Date: Thu, 26 Jan 2023 08:59:15 GMT
- Title: PerfSAGE: Generalized Inference Performance Predictor for Arbitrary Deep
Learning Models on Edge Devices
- Authors: Yuji Chai, Devashree Tripathy, Chuteng Zhou, Dibakar Gope, Igor
Fedorov, Ramon Matas, David Brooks, Gu-Yeon Wei, Paul Whatmough
- Abstract summary: This paper describes PerfSAGE, a novel graph neural network (GNN) that predicts inference latency, energy, and memory footprint on an arbitrary DNN TFLite graph.
Using the accompanying EdgeDLPerf dataset, we train PerfSAGE and provide experimental results that demonstrate state-of-the-art prediction accuracy with a Mean Absolute Percentage Error of <5% across all targets and model search spaces.
- Score: 8.272409756443539
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The ability to accurately predict deep neural network (DNN) inference
performance metrics, such as latency, power, and memory footprint, for an
arbitrary DNN on a target hardware platform is essential to the design of DNN
based models. This ability is critical for the (manual or automatic) design,
optimization, and deployment of practical DNNs for a specific hardware
deployment platform. Unfortunately, these metrics are slow to evaluate using
simulators (where available) and typically require measurement on the target
hardware. This work describes PerfSAGE, a novel graph neural network (GNN) that
predicts inference latency, energy, and memory footprint on an arbitrary DNN
TFLite graph (TFL, 2017). In contrast, previously published performance
predictors can only predict latency and are restricted to pre-defined
construction rules or search spaces. This paper also describes the EdgeDLPerf
dataset of 134,912 DNNs randomly sampled from four task search spaces and
annotated with inference performance metrics from three edge hardware
platforms. Using this dataset, we train PerfSAGE and provide experimental
results that demonstrate state-of-the-art prediction accuracy with a Mean
Absolute Percentage Error of <5% across all targets and model search spaces.
These results: (1) Outperform previous state-of-the-art GNN-based predictors
(Dudziak et al., 2020), (2) Accurately predict performance on accelerators (a
shortfall of non-GNN-based predictors (Zhang et al., 2021)), and (3)
Demonstrate predictions on arbitrary input graphs without modifications to the
feature extractor.
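The headline accuracy metric above, Mean Absolute Percentage Error (MAPE), averages the relative error of each prediction against its measured value. As a minimal sketch of how such an evaluation is computed (the latency values below are hypothetical, not from the paper):

```python
def mape(y_true, y_pred):
    """Mean Absolute Percentage Error in percent: mean of |true - pred| / |true|."""
    assert len(y_true) == len(y_pred) and len(y_true) > 0
    return 100.0 * sum(abs(t - p) / abs(t) for t, p in zip(y_true, y_pred)) / len(y_true)

# Hypothetical measured vs. predicted inference latencies (ms) for three models
measured = [12.0, 8.5, 20.0]
predicted = [11.5, 8.8, 19.0]
print(round(mape(measured, predicted), 2))  # → 4.23, i.e. under the <5% threshold reported
```

A MAPE below 5% thus means the predictor's latency, energy, or memory estimates deviate from hardware measurements by less than 5% on average, relative to the measured values.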
Related papers
- RoCP-GNN: Robust Conformal Prediction for Graph Neural Networks in Node-Classification [0.0]
Graph Neural Networks (GNNs) have emerged as powerful tools for predicting outcomes in graph-structured data, but their point predictions come without rigorous uncertainty guarantees.
One way to address this issue is by providing prediction sets that contain the true label with a predefined probability margin.
We propose a novel approach termed Robust Conformal Prediction for GNNs (RoCP-GNN).
Our approach robustly predicts outcomes with any predictive GNN model while quantifying the uncertainty in predictions within the realm of graph-based semi-supervised learning (SSL).
arXiv Detail & Related papers (2024-08-25T12:51:19Z) - Anole: Adapting Diverse Compressed Models For Cross-Scene Prediction On Mobile Devices [17.542012577533015]
Anole is a lightweight scheme for local DNN model inference on mobile devices.
We implement Anole on different types of mobile devices and conduct extensive trace-driven and real-world experiments based on unmanned aerial vehicles (UAVs).
arXiv Detail & Related papers (2024-05-09T12:06:18Z) - FR-NAS: Forward-and-Reverse Graph Predictor for Efficient Neural Architecture Search [10.699485270006601]
We introduce a novel Graph Neural Network (GNN) predictor for Neural Architecture Search (NAS).
This predictor renders neural architectures into vector representations by combining both the conventional and inverse graph views.
The experimental results showcase a significant improvement in prediction accuracy, with a 3%--16% increase in Kendall-tau correlation.
arXiv Detail & Related papers (2024-04-24T03:22:49Z) - Inferring Data Preconditions from Deep Learning Models for Trustworthy
Prediction in Deployment [25.527665632625627]
It is important to reason about the trustworthiness of the model's predictions with unseen data during deployment.
Existing methods for specifying and verifying traditional software are insufficient for this task.
We propose a novel technique that uses rules derived from neural network computations to infer data preconditions.
arXiv Detail & Related papers (2024-01-26T03:47:18Z) - Uncertainty Quantification over Graph with Conformalized Graph Neural
Networks [52.20904874696597]
Graph Neural Networks (GNNs) are powerful machine learning prediction models on graph-structured data.
GNNs lack rigorous uncertainty estimates, limiting their reliable deployment in settings where the cost of errors is significant.
We propose conformalized GNN (CF-GNN), extending conformal prediction (CP) to graph-based models for guaranteed uncertainty estimates.
arXiv Detail & Related papers (2023-05-23T21:38:23Z) - Boosted Dynamic Neural Networks [53.559833501288146]
A typical early-exiting dynamic neural network (EDNN) has multiple prediction heads at different layers of the network backbone.
To optimize the model, these prediction heads together with the network backbone are trained on every batch of training data.
Treating training and testing inputs differently at the two phases causes a mismatch between training and testing data distributions.
We formulate an EDNN as an additive model inspired by gradient boosting, and propose multiple training techniques to optimize the model effectively.
arXiv Detail & Related papers (2022-11-30T04:23:12Z) - TEP-GNN: Accurate Execution Time Prediction of Functional Tests using
Graph Neural Networks [5.899031548148629]
We propose a predictive model, dubbed TEP-GNN, which demonstrates that high-accuracy performance prediction is possible.
TEP-GNN uses FA-ASTs, or flow-augmented ASTs, as a graph-based code representation approach.
We evaluate TEP-GNN using four real-life Java open source programs, based on 922 test files mined from the projects' public repositories.
arXiv Detail & Related papers (2022-08-25T09:08:32Z) - ANNETTE: Accurate Neural Network Execution Time Estimation with Stacked
Models [56.21470608621633]
We propose a time estimation framework to decouple the architectural search from the target hardware.
The proposed methodology extracts a set of models from micro-kernel and multi-layer benchmarks and generates a stacked model for mapping and network execution time estimation.
We compare estimation accuracy and fidelity of the generated mixed models, statistical models with the roofline model, and a refined roofline model for evaluation.
arXiv Detail & Related papers (2021-05-07T11:39:05Z) - The Surprising Power of Graph Neural Networks with Random Node
Initialization [54.4101931234922]
Graph neural networks (GNNs) are effective models for representation learning on relational data.
Standard GNNs are limited in their expressive power, as they cannot distinguish graphs beyond the capability of the Weisfeiler-Leman graph isomorphism test.
In this work, we analyze the expressive power of GNNs with random node initialization (RNI).
We prove that these models are universal, a first such result for GNNs not relying on computationally demanding higher-order properties.
arXiv Detail & Related papers (2020-10-02T19:53:05Z) - Distance Encoding: Design Provably More Powerful Neural Networks for
Graph Representation Learning [63.97983530843762]
Graph Neural Networks (GNNs) have achieved great success in graph representation learning.
GNNs generate identical representations for graph substructures that may in fact be very different.
More powerful GNNs, proposed recently by mimicking higher-order tests, are inefficient as they cannot exploit the sparsity of the underlying graph structure.
We propose Distance Encoding (DE) as a new class of graph representation learning methods.
arXiv Detail & Related papers (2020-08-31T23:15:40Z) - ProphetNet: Predicting Future N-gram for Sequence-to-Sequence
Pre-training [85.35910219651572]
We present a new sequence-to-sequence pre-training model called ProphetNet.
It introduces a novel self-supervised objective named future n-gram prediction.
We conduct experiments on CNN/DailyMail, Gigaword, and SQuAD 1.1 benchmarks for abstractive summarization and question generation tasks.
arXiv Detail & Related papers (2020-01-13T05:12:38Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.