LEAPER: Modeling Cloud FPGA-based Systems via Transfer Learning
- URL: http://arxiv.org/abs/2208.10606v1
- Date: Mon, 22 Aug 2022 21:25:56 GMT
- Title: LEAPER: Modeling Cloud FPGA-based Systems via Transfer Learning
- Authors: Gagandeep Singh, Dionysios Diamantopoulos, Juan Gómez-Luna, Sander
Stuijk, Henk Corporaal, Onur Mutlu
- Abstract summary: We propose LEAPER, a transfer learning-based approach for FPGA-based systems that adapts an existing ML-based model to a new, unknown environment.
Results show that our approach delivers, on average, 85% accuracy when we use our transferred model for prediction in a cloud environment with 5-shot learning.
- Score: 13.565689665335697
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Machine-learning-based models have recently gained traction as a way to
overcome the slow downstream implementation process of FPGAs by building models
that provide fast and accurate performance predictions. However, these models
suffer from two main limitations: (1) training requires large amounts of data
(features extracted from FPGA synthesis and implementation reports), which is
cost-inefficient because of the time-consuming FPGA design cycle; (2) a model
trained for a specific environment cannot predict for a new, unknown
environment. In a cloud system, where getting access to platforms is typically
costly, data collection for ML models can significantly increase the total
cost of ownership (TCO) of a system. To overcome these limitations, we propose
LEAPER, a transfer learning-based approach for FPGA-based systems that adapts
an existing ML-based model to a new, unknown environment to provide fast and
accurate performance and resource utilization predictions. Experimental results
show that our approach delivers, on average, 85% accuracy when we use our
transferred model for prediction in a cloud environment with 5-shot learning
and reduces design-space exploration time by 10x, from days to only a few
hours.
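The abstract describes adapting an existing ML model, trained in one environment, to a new cloud environment using only a handful of labeled samples (5-shot learning). A minimal numpy sketch of that general idea is below, using a base regressor trained on abundant source-environment data and a cheap affine correction fitted on 5 target samples. All features, data, and the adaptation scheme here are illustrative assumptions, not the actual LEAPER method.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical feature matrix: rows = FPGA designs, columns = features
# extracted from synthesis reports (e.g. LUT, FF, BRAM counts).
# These names and the data are illustrative, not from the paper.
X_src = rng.uniform(size=(200, 3))
w_true = np.array([2.0, -1.0, 0.5])
y_src = X_src @ w_true + 0.01 * rng.normal(size=200)  # source-env latency

# Step 1: train a base model on abundant source-environment data
# (closed-form ridge regression as a stand-in for the paper's ML model).
lam = 1e-3
w_base = np.linalg.solve(X_src.T @ X_src + lam * np.eye(3), X_src.T @ y_src)

# Step 2: the target (cloud) environment shifts the response; only
# 5 labeled samples are available ("5-shot" adaptation).
X_tgt = rng.uniform(size=(5, 3))
y_tgt = 1.3 * (X_tgt @ w_true) + 0.2  # unknown scale/offset shift

# Step 3: instead of retraining from scratch, fit a cheap affine
# correction (scale a, offset b) on top of the base model's predictions.
p = X_tgt @ w_base
A = np.column_stack([p, np.ones(5)])
a, b = np.linalg.lstsq(A, y_tgt, rcond=None)[0]

def predict_target(X):
    """Predict in the target environment via the transferred model."""
    return a * (X @ w_base) + b
```

The point of the sketch is the cost structure: the expensive step (training on many synthesis runs) happens once in the source environment, while adaptation to the new environment needs only a few samples and a tiny fit.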
Related papers
- Efficient Motion Prediction: A Lightweight & Accurate Trajectory Prediction Model With Fast Training and Inference Speed [56.27022390372502]
We propose a new efficient motion prediction model, which achieves highly competitive benchmark results while training only a few hours on a single GPU.
Its low inference latency makes it particularly suitable for deployment in autonomous applications with limited computing resources.
arXiv Detail & Related papers (2024-09-24T14:58:27Z) - rule4ml: An Open-Source Tool for Resource Utilization and Latency Estimation for ML Models on FPGA [0.0]
This paper introduces a novel method to predict the resource utilization and inference latency of Neural Networks (NNs) before their synthesis and implementation on FPGA.
We leverage HLS4ML, a tool-flow that helps translate NNs into high-level synthesis (HLS) code.
Our method uses trained regression models for immediate pre-synthesis predictions.
arXiv Detail & Related papers (2024-08-09T19:35:10Z) - Understanding the Potential of FPGA-Based Spatial Acceleration for Large Language Model Inference [11.614722231006695]
Large language models (LLMs) boasting billions of parameters have generated a significant demand for efficient deployment in inference workloads.
This paper investigates the feasibility and potential of model-specific spatial acceleration for LLM inference on FPGAs.
arXiv Detail & Related papers (2023-12-23T04:27:06Z) - Symbolic Regression on FPGAs for Fast Machine Learning Inference [2.0920303420933273]
The high-energy physics community is investigating the potential of deploying machine-learning-based solutions on Field-Programmable Gate Arrays (FPGAs).
We introduce a novel end-to-end procedure that utilizes a machine learning technique called symbolic regression (SR).
We show that our approach can approximate a 3-layer neural network using an inference model that achieves up to a 13-fold decrease in execution time, down to 5 ns, while still preserving more than 90% approximation accuracy.
arXiv Detail & Related papers (2023-05-06T17:04:02Z) - FlexMoE: Scaling Large-scale Sparse Pre-trained Model Training via
Dynamic Device Placement [19.639936387834677]
Mixture-of-Experts (MoEs) are becoming more popular and have demonstrated impressive pretraining scalability in various downstream tasks.
MoEs are becoming a new data analytics paradigm in the data life cycle and face unique challenges in scale, complexity, and granularity never before possible.
In this paper, we propose a novel DNN training framework, FlexMoE, which systematically and transparently addresses the inefficiency caused by dynamic dataflow.
arXiv Detail & Related papers (2023-04-08T07:34:26Z) - Online Evolutionary Neural Architecture Search for Multivariate
Non-Stationary Time Series Forecasting [72.89994745876086]
This work presents the Online Neuro-Evolution-based Neural Architecture Search (ONE-NAS) algorithm.
ONE-NAS is a novel neural architecture search method capable of automatically designing and dynamically training recurrent neural networks (RNNs) for online forecasting tasks.
Results demonstrate that ONE-NAS outperforms traditional statistical time series forecasting methods.
arXiv Detail & Related papers (2023-02-20T22:25:47Z) - Real-time Neural-MPC: Deep Learning Model Predictive Control for
Quadrotors and Agile Robotic Platforms [59.03426963238452]
We present Real-time Neural MPC, a framework to efficiently integrate large, complex neural network architectures as dynamics models within a model-predictive control pipeline.
We show the feasibility of our framework on real-world problems by reducing the positional tracking error by up to 82% when compared to state-of-the-art MPC approaches without neural network dynamics.
arXiv Detail & Related papers (2022-03-15T09:38:15Z) - Learning representations with end-to-end models for improved remaining
useful life prognostics [64.80885001058572]
The Remaining Useful Life (RUL) of equipment is defined as the duration between the current time and its failure.
We propose an end-to-end deep learning model based on multi-layer perceptron and long short-term memory layers (LSTM) to predict the RUL.
We will discuss how the proposed end-to-end model is able to achieve such good results and compare it to other deep learning and state-of-the-art methods.
arXiv Detail & Related papers (2021-04-11T16:45:18Z) - Semi-supervised on-device neural network adaptation for remote and
portable laser-induced breakdown spectroscopy [0.22843885788439797]
We introduce a lightweight multi-layer perceptron (MLP) model for LIBS that can be adapted on-device without requiring labels for new input data.
It shows 89.3% average accuracy during data streaming, and up to 2.1% better accuracy compared to a model that does not support adaptation.
arXiv Detail & Related papers (2021-04-08T00:20:36Z) - A Meta-Learning Approach to the Optimal Power Flow Problem Under
Topology Reconfigurations [69.73803123972297]
We propose a DNN-based OPF predictor that is trained using a meta-learning (MTL) approach.
The developed OPF-predictor is validated through simulations using benchmark IEEE bus systems.
arXiv Detail & Related papers (2020-12-21T17:39:51Z) - Fast-Convergent Federated Learning [82.32029953209542]
Federated learning is a promising solution for distributing machine learning tasks through modern networks of mobile devices.
We propose a fast-convergent federated learning algorithm, called FOLB, which performs intelligent sampling of devices in each round of model training.
arXiv Detail & Related papers (2020-07-26T14:37:51Z)
This list is automatically generated from the titles and abstracts of the papers in this site.