Mobility and Cost Aware Inference Accelerating Algorithm for Edge
Intelligence
- URL: http://arxiv.org/abs/2312.16497v1
- Date: Wed, 27 Dec 2023 10:04:02 GMT
- Title: Mobility and Cost Aware Inference Accelerating Algorithm for Edge
Intelligence
- Authors: Xin Yuan, Ning Li, Kang Wei, Wenchao Xu, Quan Chen, Hao Chen, Song Guo
- Abstract summary: Edge intelligence (EI) has been widely applied recently. Splitting the model among the device, edge server, and cloud can greatly improve the performance of EI.
Model segmentation without user mobility has been investigated in depth by previous works.
We propose a mobility- and cost-aware model segmentation and resource allocation algorithm for accelerating inference at the edge.
- Score: 24.512525338942158
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Edge intelligence (EI) has been widely applied recently. Splitting
the model among the device, edge server, and cloud can greatly improve the
performance of EI. Model segmentation without user mobility has been
investigated in depth by previous works. However, in most EI use cases, the
end devices are mobile, and only a few works have addressed this aspect.
These works still have many issues, such as ignoring the energy consumption
of the mobile device, making inappropriate network assumptions, and adapting
poorly to user mobility. Therefore, to address the shortcomings of model
segmentation and resource allocation in previous works, we propose a
mobility- and cost-aware model segmentation and resource allocation algorithm
for accelerating inference at the edge (MCSA). Specifically, in the scenario
without user mobility, a loop iteration gradient descent (Li-GD) algorithm is
provided. When a mobile user has a large model inference task to compute,
Li-GD takes the energy consumption of the mobile user, the communication and
computing resource renting cost, and the inference delay into account to find
the optimal model segmentation and resource allocation strategy. In the
scenario with user mobility, a mobility-aware Li-GD (MLi-GD) algorithm is
proposed to calculate the optimal strategy. The properties of the proposed
algorithms, including convergence, complexity, and approximation ratio, are
then investigated. Experimental results demonstrate the effectiveness of the
proposed algorithms.
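The abstract gives no pseudocode, so the following is only a minimal sketch of the kind of search Li-GD appears to perform: an outer loop over candidate split points with an inner gradient descent on the continuous resource variables, minimizing a weighted sum of device energy, renting cost, and inference delay. The layer profile, cost model, weights, and prices below are illustrative assumptions, not the paper's actual formulation.

```python
# Hypothetical Li-GD-style search: for every candidate split point, run
# gradient descent on the continuous resource variables, then keep the
# split with the lowest weighted cost. All constants are made up.
import numpy as np

# Per-layer compute demand (GFLOPs) and output size (MB) of a toy model.
FLOPS = np.array([0.5, 1.2, 2.0, 1.8, 0.9, 0.3])
OUT_MB = np.array([4.0, 2.0, 1.0, 0.5, 0.2, 0.01])

W_ENERGY, W_RENT, W_DELAY = 0.3, 0.3, 0.4  # assumed objective weights
F_DEV = 1.0     # device compute speed, GFLOPs/s
KAPPA = 0.8     # device energy per GFLOP, J
PRICE_F, PRICE_B = 0.05, 0.02  # assumed renting prices (compute, bandwidth)

def cost(split, f_edge, bw):
    """Weighted sum of device energy, renting cost, and inference delay
    when layers [0, split) run on the device and the rest on the edge."""
    dev_flops = FLOPS[:split].sum()
    edge_flops = FLOPS[split:].sum()
    tx_mb = OUT_MB[split - 1]          # activation crossing the split
    energy = KAPPA * dev_flops
    rent = PRICE_F * f_edge + PRICE_B * bw
    delay = dev_flops / F_DEV + tx_mb / bw + edge_flops / max(f_edge, 1e-9)
    return W_ENERGY * energy + W_RENT * rent + W_DELAY * delay

def inner_gd(split, steps=200, lr=0.05, eps=1e-4):
    """Projected gradient descent on (f_edge, bw) for a fixed split point."""
    x = np.array([1.0, 1.0])           # initial edge CPU share and bandwidth
    for _ in range(steps):
        grad = np.zeros(2)
        for i in range(2):             # numerical gradient, for brevity
            d = np.zeros(2)
            d[i] = eps
            grad[i] = (cost(split, *(x + d)) - cost(split, *(x - d))) / (2 * eps)
        x = np.maximum(x - lr * grad, 1e-3)  # keep resources positive
    return x, cost(split, *x)

# Outer loop over candidate split points (the "loop iteration" part).
best_x, best_cost = min((inner_gd(s) for s in range(1, len(FLOPS) + 1)),
                        key=lambda r: r[1])
print("best (f_edge, bw):", best_x, "cost:", best_cost)
```

The MLi-GD variant extends this search to the scenario with user mobility; the abstract does not describe its update rule, so it is not sketched here.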
Related papers
- Task-Oriented Real-time Visual Inference for IoVT Systems: A Co-design Framework of Neural Networks and Edge Deployment [61.20689382879937]
Task-oriented edge computing addresses the demands of real-time visual inference by shifting data analysis to the edge.
Existing methods struggle to balance high model performance with low resource consumption.
We propose a novel co-design framework to optimize the neural network architecture and its edge deployment.
arXiv Detail & Related papers (2024-10-29T19:02:54Z)
- A QoE-Aware Split Inference Accelerating Algorithm for NOMA-based Edge Intelligence [20.67035066213381]
An effective resource allocation (ERA) algorithm is proposed for accelerating split inference in edge intelligence.
ERA takes the resource consumption, QoE, and inference latency into account to find the optimal model split strategy and resource allocation strategy.
Experimental results demonstrate that ERA performs much better than previous approaches.
arXiv Detail & Related papers (2024-09-25T01:09:45Z)
- Resource Management for Low-latency Cooperative Fine-tuning of Foundation Models at the Network Edge [35.40849522296486]
Large-scale foundation models (FoMos) can exhibit human-like intelligence.
FoMos need to be adapted to specialized downstream tasks through fine-tuning techniques.
We advocate multi-device cooperation within the device-edge cooperative fine-tuning paradigm.
arXiv Detail & Related papers (2024-07-13T12:47:14Z)
- High Efficiency Inference Accelerating Algorithm for NOMA-based Mobile Edge Computing [23.88527790721402]
Splitting the inference model between device, edge server, and cloud can improve the performance of EI greatly.
NOMA, one of the key supporting technologies of B5G/6G, can achieve massive connectivity and high spectrum efficiency.
We propose an effective communication and computing resource allocation algorithm to accelerate model inference at the edge.
arXiv Detail & Related papers (2023-12-26T02:05:52Z)
- A Multi-Head Ensemble Multi-Task Learning Approach for Dynamical Computation Offloading [62.34538208323411]
We propose a multi-head ensemble multi-task learning (MEMTL) approach with a shared backbone and multiple prediction heads (PHs).
MEMTL outperforms benchmark methods in both inference accuracy and mean squared error without requiring additional training data.
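The entry above names the architecture only at a high level; the following is a minimal NumPy sketch of a generic shared-backbone, multi-head pattern of the kind it describes. The dimensions, ReLU backbone, and head-averaging ensemble are illustrative assumptions, not MEMTL's actual design or training procedure.

```python
# Generic shared-backbone / multi-head pattern suggested by the MEMTL
# summary above; all sizes and the averaging ensemble are assumptions.
import numpy as np

rng = np.random.default_rng(0)

class MultiHeadModel:
    def __init__(self, in_dim=16, hid_dim=32, out_dim=4, n_heads=3):
        # One backbone shared by every task, plus one linear head per task.
        self.backbone = rng.normal(size=(in_dim, hid_dim)) / np.sqrt(in_dim)
        self.heads = [rng.normal(size=(hid_dim, out_dim)) / np.sqrt(hid_dim)
                      for _ in range(n_heads)]

    def forward(self, x):
        h = np.maximum(x @ self.backbone, 0.0)  # shared features (ReLU)
        preds = [h @ w for w in self.heads]     # one prediction per head
        return np.mean(preds, axis=0), preds    # ensemble = head average

model = MultiHeadModel()
x = rng.normal(size=(2, 16))                    # a batch of two inputs
ensemble, per_head = model.forward(x)
print(ensemble.shape, len(per_head))            # (2, 4) 3
```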
arXiv Detail & Related papers (2023-09-02T11:01:16Z)
- Energy-efficient Task Adaptation for NLP Edge Inference Leveraging Heterogeneous Memory Architectures [68.91874045918112]
adapter-ALBERT is an efficient model optimization that enables maximal data reuse across different tasks.
We demonstrate the advantage of mapping the model to a heterogeneous on-chip memory architecture by performing simulations on a validated NLP edge accelerator.
arXiv Detail & Related papers (2023-03-25T14:40:59Z)
- Fast and computationally efficient generative adversarial network algorithm for unmanned aerial vehicle-based network coverage optimization [1.2853186701496802]
The challenge of dynamic traffic demand in mobile networks is tackled by moving cells based on unmanned aerial vehicles.
Considering the tremendous potential of unmanned aerial vehicles in the future, we propose a new algorithm for coverage optimization.
The proposed algorithm is implemented based on a conditional generative adversarial neural network, with a unique multilayer sum-pooling loss function.
arXiv Detail & Related papers (2022-03-25T12:13:21Z)
- Latency-Memory Optimized Splitting of Convolution Neural Networks for Resource Constrained Edge Devices [1.6873748786804317]
We argue that running CNNs split between an edge device and the cloud is equivalent to solving a resource-constrained optimization problem.
Experiments on real-world edge devices show that LMOS ensures feasible execution of different CNN models at the edge.
arXiv Detail & Related papers (2021-07-19T19:39:56Z)
- Reconfigurable Intelligent Surface Assisted Mobile Edge Computing with Heterogeneous Learning Tasks [53.1636151439562]
Mobile edge computing (MEC) provides a natural platform for AI applications.
We present an infrastructure to perform machine learning tasks at an MEC with the assistance of a reconfigurable intelligent surface (RIS).
Specifically, we minimize the learning error of all participating users by jointly optimizing transmit power of mobile users, beamforming vectors of the base station, and the phase-shift matrix of the RIS.
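Joint designs like the one above are commonly solved by alternating (block-coordinate) updates; the toy sketch below shows that skeleton for a single-antenna, single-user case with an SNR proxy as the objective. The channel model, closed-form phase alignment, and full-power step are textbook simplifications, not the paper's actual beamforming design or learning-error objective.

```python
# Toy alternating optimization in the spirit of the RIS entry above:
# phase shifts and transmit power are updated in turn to raise a simple
# SNR proxy. All channels and budgets are synthetic assumptions.
import numpy as np

rng = np.random.default_rng(1)
N = 32                                     # RIS elements
h_d = rng.normal() + 1j * rng.normal()     # direct BS-user channel
cascade = (rng.normal(size=N) + 1j * rng.normal(size=N)) / np.sqrt(N)

p, p_max, noise = 0.1, 1.0, 1e-2
theta = np.ones(N, dtype=complex)          # unit-modulus phase shifts

for _ in range(5):                         # block-coordinate rounds
    # Phase step: align every reflected path with the direct path
    # (the closed-form optimum for this scalar special case).
    theta = np.exp(1j * (np.angle(h_d) - np.angle(cascade)))
    # Power step: the SNR proxy is increasing in p, so use the budget.
    p = p_max
    snr = p * abs(h_d + cascade @ theta) ** 2 / noise
print("SNR proxy:", snr)
```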
arXiv Detail & Related papers (2020-12-25T07:08:50Z)
- Adaptive Subcarrier, Parameter, and Power Allocation for Partitioned Edge Learning Over Broadband Channels [69.18343801164741]
Partitioned edge learning (PARTEL) implements parameter-server training, a well-known distributed learning method, in a wireless network.
We consider the case of deep neural network (DNN) models, which can be trained using PARTEL by introducing auxiliary variables.
arXiv Detail & Related papers (2020-10-08T15:27:50Z)
- Joint Parameter-and-Bandwidth Allocation for Improving the Efficiency of Partitioned Edge Learning [73.82875010696849]
Machine learning algorithms are deployed at the network edge for training artificial intelligence (AI) models.
This paper focuses on the novel joint design of parameter (computation load) allocation and bandwidth allocation.
arXiv Detail & Related papers (2020-03-10T05:52:15Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences arising from its use.