Real-Time Edge Classification: Optimal Offloading under Token Bucket
Constraints
- URL: http://arxiv.org/abs/2010.13737v2
- Date: Thu, 5 Nov 2020 22:48:21 GMT
- Title: Real-Time Edge Classification: Optimal Offloading under Token Bucket
Constraints
- Authors: Ayan Chakrabarti, Roch Guérin, Chenyang Lu, Jiangnan Liu
- Abstract summary: We introduce a Markov Decision Process-based framework to make offload decisions under strict latency constraints.
We also propose approaches to allow multiple devices connected to the same access switch to share their bursting allocation.
We evaluate and analyze the policies derived using our framework on the standard ImageNet image classification benchmark.
- Score: 13.583977689847433
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: To deploy machine learning-based algorithms for real-time applications with
strict latency constraints, we consider an edge-computing setting where a
subset of inputs are offloaded to the edge for processing by an accurate but
resource-intensive model, and the rest are processed only by a less-accurate
model on the device itself. Both models have computational costs that match
available compute resources, and process inputs with low latency. But
offloading incurs network delays, and to manage these delays to meet
application deadlines, we use a token bucket to constrain the average rate and
burst length of transmissions from the device. We introduce a Markov Decision
Process-based framework to make offload decisions under these constraints,
based on the local model's confidence and the token bucket state, with the goal
of minimizing a specified error measure for the application. Beyond isolated
decisions for individual devices, we also propose approaches to allow multiple
devices connected to the same access switch to share their bursting allocation.
We evaluate and analyze the policies derived using our framework on the
standard ImageNet image classification benchmark.
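As a rough illustration of the idea in the abstract, the sketch below pairs a token bucket with a small value-iteration loop that yields a per-state offload threshold. It is a simplified, hypothetical model and not the authors' formulation: the function names, the integer token units (`cap`, `cost`, `refill`), the discounted objective with factor `gamma`, and the use of a per-input "gain" (for example, one minus the local model's confidence) as the benefit of offloading are all assumptions made for the example.

```python
import numpy as np

def solve_offload_thresholds(gains, cap=8, cost=4, refill=1,
                             gamma=0.99, iters=2000):
    """Value iteration over token-bucket states 0..cap (hypothetical units).

    gains: samples of the assumed benefit of offloading one input.
    Returns theta, where theta[n] is the gain an input must exceed
    to be offloaded when the bucket holds n tokens.
    """
    gains = np.asarray(gains, dtype=float)
    V = np.zeros(cap + 1)                # value of each bucket state
    theta = np.full(cap + 1, np.inf)     # inf: offloading not allowed
    for _ in range(iters):
        new_V = np.empty_like(V)
        for n in range(cap + 1):
            # Future value if this input is handled locally.
            stay = gamma * V[min(n + refill, cap)]
            if n >= cost:
                # Future value if `cost` tokens are spent on an offload.
                off = gamma * V[min(n - cost + refill, cap)]
                theta[n] = stay - off
                new_V[n] = np.mean(np.maximum(stay, gains + off))
            else:
                new_V[n] = stay
        V = new_V
    return theta

def simulate(gains, theta, cap=8, cost=4, refill=1):
    """Apply the threshold policy to a stream of gains; count offloads."""
    n, sent = cap, 0                     # start with a full bucket
    for g in gains:
        if n >= cost and g > theta[n]:
            n -= cost
            sent += 1
        n = min(n + refill, cap)         # tokens accrue up to capacity
    return sent

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    sample_gains = rng.random(500)       # synthetic stand-in for real data
    thresholds = solve_offload_thresholds(sample_gains)
    print(thresholds)
    print(simulate(sample_gains, thresholds))
```

In this sketch the bucket capacity bounds the burst length and the refill-to-cost ratio bounds the long-run offload rate, so the number of offloads over T inputs can never exceed (cap + T * refill) / cost. The threshold depends only on the bucket state, which mirrors the abstract's description of decisions based on local confidence and token bucket state, but the actual error measure and state model used in the paper may differ.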
Related papers
- Resource Management for Low-latency Cooperative Fine-tuning of Foundation Models at the Network Edge [35.40849522296486]
Large-scale foundation models (FoMos) can perform human-like intelligence.
FoMos need to be adapted to specialized downstream tasks through fine-tuning techniques.
We advocate multi-device cooperation within the device-edge cooperative fine-tuning paradigm.
arXiv Detail & Related papers (2024-07-13T12:47:14Z)
- Fractional Deep Reinforcement Learning for Age-Minimal Mobile Edge Computing [11.403989519949173]
This work focuses on the timeliness of computational-intensive updates, measured by Age-of-Information (AoI).
We study how to jointly optimize the task updating and offloading policies for AoI with fractional form.
Experimental results show that our proposed algorithms reduce the average AoI by up to 57.6% compared with several non-fractional benchmarks.
arXiv Detail & Related papers (2023-12-16T11:13:40Z)
- Task-Oriented Over-the-Air Computation for Multi-Device Edge AI [57.50247872182593]
6G networks for supporting edge AI feature task-oriented techniques that focus on the effective and efficient execution of AI tasks.
A task-oriented over-the-air computation (AirComp) scheme is proposed in this paper for a multi-device split-inference system.
arXiv Detail & Related papers (2022-11-02T16:35:14Z)
- Adaptive Edge Offloading for Image Classification Under Rate Limit [18.029207345709413]
The paper develops a policy based on a Deep Q-Network (DQN), and demonstrates both its efficacy and the feasibility of its deployment on embedded devices.
The evaluation is carried out by performing image classification over a local testbed using synthetic traces generated from the ImageNet image classification benchmark.
arXiv Detail & Related papers (2022-07-31T18:06:33Z)
- Asynchronous Parallel Incremental Block-Coordinate Descent for Decentralized Machine Learning [55.198301429316125]
Machine learning (ML) is a key technique for big-data-driven modelling and analysis of massive Internet of Things (IoT) based intelligent and ubiquitous computing.
For fast-increasing applications and data amounts, distributed learning is a promising emerging paradigm since it is often impractical or inefficient to share/aggregate data.
This paper studies the problem of training an ML model over decentralized systems, where data are distributed over many user devices.
arXiv Detail & Related papers (2022-02-07T15:04:15Z)
- An Adaptive Device-Edge Co-Inference Framework Based on Soft Actor-Critic [72.35307086274912]
High-dimensional parameter models and large-scale mathematical computation restrict execution efficiency, especially for Internet of Things (IoT) devices.
We propose a new Deep Reinforcement Learning (DRL)-Soft Actor Critic for discrete (SAC-d), which generates the exit point, exit point, and compressing bits by soft policy iterations.
Based on the latency and accuracy aware reward design, such a computation can adapt well to complex environments like dynamic wireless channels and arbitrary processing, and is capable of supporting the 5G URL
arXiv Detail & Related papers (2022-01-09T09:31:50Z)
- Computational Intelligence and Deep Learning for Next-Generation Edge-Enabled Industrial IoT [51.68933585002123]
We investigate how to deploy computational intelligence and deep learning (DL) in edge-enabled industrial IoT networks.
In this paper, we propose a novel multi-exit-based federated edge learning (ME-FEEL) framework.
In particular, the proposed ME-FEEL can achieve an accuracy gain of up to 32.7% in industrial IoT networks with severely limited resources.
arXiv Detail & Related papers (2021-10-28T08:14:57Z)
- Adaptive Subcarrier, Parameter, and Power Allocation for Partitioned Edge Learning Over Broadband Channels [69.18343801164741]
Partitioned edge learning (PARTEL) implements parameter-server training, a well-known distributed learning method, in wireless networks.
We consider the case of deep neural network (DNN) models which can be trained using PARTEL by introducing some auxiliary variables.
arXiv Detail & Related papers (2020-10-08T15:27:50Z)
- Multi-scale Interaction for Real-time LiDAR Data Segmentation on an Embedded Platform [62.91011959772665]
Real-time semantic segmentation of LiDAR data is crucial for autonomously driving vehicles.
Current approaches that operate directly on the point cloud use complex spatial aggregation operations.
We propose a projection-based method, called Multi-scale Interaction Network (MINet), which is very efficient and accurate.
arXiv Detail & Related papers (2020-08-20T19:06:11Z)
- Dynamic Compression Ratio Selection for Edge Inference Systems with Hard Deadlines [9.585931043664363]
We propose a dynamic compression ratio selection scheme for edge inference systems with hard deadlines.
Information augmentation that retransmits less compressed data of tasks with erroneous inference is proposed to enhance the accuracy performance.
Considering the wireless transmission errors, we further design a retransmission scheme to reduce performance degradation due to packet losses.
arXiv Detail & Related papers (2020-05-25T17:11:53Z)
- Knowledge Distillation for Mobile Edge Computation Offloading [14.417463848473494]
We propose an edge computation offloading framework based on Deep Imitation Learning (DIL) and Knowledge Distillation (KD).
Our model has the shortest inference delay among all policies.
arXiv Detail & Related papers (2020-04-09T04:58:46Z)