EdgeRL: Reinforcement Learning-driven Deep Learning Model Inference Optimization at Edge
- URL: http://arxiv.org/abs/2410.12221v1
- Date: Wed, 16 Oct 2024 04:31:39 GMT
- Title: EdgeRL: Reinforcement Learning-driven Deep Learning Model Inference Optimization at Edge
- Authors: Motahare Mounesan, Xiaojie Zhang, Saptarshi Debroy,
- Abstract summary: We propose EdgeRL framework that seeks to strike balance by using an Advantage Actor-Critic (A2C) Reinforcement Learning (RL) approach.
We evaluate the benefits of EdgeRL framework in terms of end device energy savings, inference accuracy improvement, and end-to-end inference latency reduction.
- Score: 2.8946323553477704
- License:
- Abstract: Balancing mutually diverging performance metrics, such as, processing latency, outcome accuracy, and end device energy consumption is a challenging undertaking for deep learning model inference in ad-hoc edge environments. In this paper, we propose EdgeRL framework that seeks to strike such balance by using an Advantage Actor-Critic (A2C) Reinforcement Learning (RL) approach that can choose optimal run-time DNN inference parameters and aligns the performance metrics based on the application requirements. Using real world deep learning model and a hardware testbed, we evaluate the benefits of EdgeRL framework in terms of end device energy savings, inference accuracy improvement, and end-to-end inference latency reduction.
Related papers
- Adaptive Layer Splitting for Wireless LLM Inference in Edge Computing: A Model-Based Reinforcement Learning Approach [18.153641696306707]
This study introduces a framework taking inspiration from model-based reinforcement learning (MBRL) to determine the optimal splitting point across the edge and user equipment (UE)
By incorporating a reward surrogate model, our approach significantly reduces the computational cost of frequent performance evaluations.
arXiv Detail & Related papers (2024-06-03T09:41:42Z) - End-to-End Learning for Fair Multiobjective Optimization Under
Uncertainty [55.04219793298687]
The Predict-Then-Forecast (PtO) paradigm in machine learning aims to maximize downstream decision quality.
This paper extends the PtO methodology to optimization problems with nondifferentiable Ordered Weighted Averaging (OWA) objectives.
It shows how optimization of OWA functions can be effectively integrated with parametric prediction for fair and robust optimization under uncertainty.
arXiv Detail & Related papers (2024-02-12T16:33:35Z) - etuner: A Redundancy-Aware Framework for Efficient Continual Learning Application on Edge Devices [47.365775210055396]
We propose ETuner, an efficient edge continual learning framework that optimize inference accuracy, fine-tuning execution time, and energy efficiency.
Experimental results show that, on average, ETuner reduces overall fine-tuning execution time by 64%, energy consumption by 56%, and improves average inference accuracy by 1.75% over the immediate model fine-tuning approach.
arXiv Detail & Related papers (2024-01-30T02:41:05Z) - End-to-End Meta-Bayesian Optimisation with Transformer Neural Processes [52.818579746354665]
This paper proposes the first end-to-end differentiable meta-BO framework that generalises neural processes to learn acquisition functions via transformer architectures.
We enable this end-to-end framework with reinforcement learning (RL) to tackle the lack of labelled acquisition data.
arXiv Detail & Related papers (2023-05-25T10:58:46Z) - Lyapunov-Driven Deep Reinforcement Learning for Edge Inference Empowered
by Reconfigurable Intelligent Surfaces [30.1512069754603]
We propose a novel algorithm for energy-efficient, low-latency, accurate inference at the wireless edge.
We consider a scenario where new data are continuously generated/collected by a set of devices and are handled through a dynamic queueing system.
arXiv Detail & Related papers (2023-05-18T12:46:42Z) - Optimal Goal-Reaching Reinforcement Learning via Quasimetric Learning [73.80728148866906]
Quasimetric Reinforcement Learning (QRL) is a new RL method that utilizes quasimetric models to learn optimal value functions.
On offline and online goal-reaching benchmarks, QRL also demonstrates improved sample efficiency and performance.
arXiv Detail & Related papers (2023-04-03T17:59:58Z) - AnycostFL: Efficient On-Demand Federated Learning over Heterogeneous
Edge Devices [20.52519915112099]
We propose a cost-adjustable FL framework, named AnycostFL, that enables diverse edge devices to efficiently perform local updates.
Experiment results indicate that, our learning framework can reduce up to 1.9 times of the training latency and energy consumption for realizing a reasonable global testing accuracy.
arXiv Detail & Related papers (2023-01-08T15:25:55Z) - Design and Prototyping Distributed CNN Inference Acceleration in Edge
Computing [85.74517957717363]
HALP accelerates inference by designing a seamless collaboration among edge devices (EDs) in Edge Computing.
Experiments show that the distributed inference HALP achieves 1.7x inference acceleration for VGG-16.
It is shown that the model selection with distributed inference HALP can significantly improve service reliability.
arXiv Detail & Related papers (2022-11-24T19:48:30Z) - Online Learning for Orchestration of Inference in Multi-User
End-Edge-Cloud Networks [3.6076391721440633]
Collaborative end-edge-cloud computing for deep learning provides a range of performance and efficiency.
We propose a reinforcement-learning-based computation offloading solution that learns optimal offloading policy.
Our solution provides 35% speedup in the average response time compared to the state-of-the-art with less than 0.9% accuracy reduction.
arXiv Detail & Related papers (2022-02-21T21:41:29Z) - Towards Deployment-Efficient Reinforcement Learning: Lower Bound and
Optimality [141.89413461337324]
Deployment efficiency is an important criterion for many real-world applications of reinforcement learning (RL)
We propose a theoretical formulation for deployment-efficient RL (DE-RL) from an "optimization with constraints" perspective.
arXiv Detail & Related papers (2022-02-14T01:31:46Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.