Joint Multi-User DNN Partitioning and Computational Resource Allocation
for Collaborative Edge Intelligence
- URL: http://arxiv.org/abs/2007.09072v1
- Date: Wed, 15 Jul 2020 09:40:13 GMT
- Title: Joint Multi-User DNN Partitioning and Computational Resource Allocation
for Collaborative Edge Intelligence
- Authors: Xin Tang and Xu Chen and Liekang Zeng and Shuai Yu and Lin Chen
- Abstract summary: Mobile Edge Computing (MEC) has emerged as a promising supporting architecture providing a variety of resources to the network edge.
With the assistance of edge servers, user equipments (UEs) are able to run deep neural network (DNN) based AI applications.
We propose an algorithm called Iterative Alternating Optimization (IAO) that can achieve the optimal solution in polynomial time.
- Score: 21.55340197267767
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Mobile Edge Computing (MEC) has emerged as a promising supporting
architecture providing a variety of resources to the network edge, thus acting
as an enabler for edge intelligence services empowering massive mobile and
Internet of Things (IoT) devices with AI capability. With the assistance of
edge servers, user equipments (UEs) are able to run deep neural network (DNN)
based AI applications, which are generally resource-hungry and
compute-intensive, which an individual UE can hardly afford to run by itself in
real time. However, the resources in each individual edge server are typically
limited. Therefore, any resource optimization involving edge servers is by
nature a resource-constrained optimization problem and needs to be tackled in
such realistic context. Motivated by this observation, we investigate the
optimization problem of DNN partitioning (an emerging DNN offloading scheme) in
a realistic multi-user resource-constrained setting that is rarely considered in
previous works. Despite the extremely large solution space, we reveal several
properties of this specific optimization problem of joint multi-UE DNN
partitioning and computational resource allocation. We propose an algorithm
called Iterative Alternating Optimization (IAO) that can achieve the optimal
solution in polynomial time. In addition, we present rigorous theoretic
analysis of our algorithm in terms of time complexity and performance under
realistic estimation error. Moreover, we build a prototype that implements our
framework and conduct extensive experiments using realistic DNN models, whose
results demonstrate its effectiveness and efficiency.
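For intuition only, the following is a minimal Python sketch of the alternating structure described above: with edge CPU shares fixed, each UE picks its best partition point; with partition points fixed, the edge CPU is re-split across the offloaded workloads. The latency model, the square-root allocation rule, and the data layout are illustrative assumptions, not the paper's exact IAO algorithm.

```python
# Hypothetical sketch of iterative alternating optimization for joint multi-user
# DNN partitioning and edge CPU allocation. The latency model, the
# sqrt-proportional allocation rule, and the data layout are assumptions for
# illustration, not the paper's exact IAO algorithm.
import math

def ue_latency(ue, cut, f_edge):
    """Latency when the first `cut` layers run on the device and the rest run on
    the edge server with CPU share `f_edge` (cycles per second)."""
    local = sum(ue["local_cycles"][:cut]) / ue["f_local"]    # on-device compute
    upload = ue["out_bytes"][cut] / ue["bandwidth"]          # tensor crossing the cut
    edge = sum(ue["edge_cycles"][cut:]) / max(f_edge, 1e-9)  # server-side compute
    return local + upload + edge

def iao_sketch(ues, f_total, iters=20):
    """`ues` is a list of dicts with per-layer `local_cycles`/`edge_cycles`
    (length L), `out_bytes` (length L + 1), plus scalars `f_local`, `bandwidth`."""
    n_layers = len(ues[0]["local_cycles"])
    shares = [f_total / len(ues)] * len(ues)                 # start from equal shares
    cuts = [0] * len(ues)
    for _ in range(iters):
        # Step 1: with CPU shares fixed, each UE picks its best partition point.
        cuts = [min(range(n_layers + 1), key=lambda c: ue_latency(u, c, s))
                for u, s in zip(ues, shares)]
        # Step 2: with partition points fixed, re-split the edge CPU. Minimizing
        # the sum of offloaded delays w_i / f_i under sum(f_i) = f_total gives
        # f_i proportional to sqrt(w_i).
        w = [sum(u["edge_cycles"][c:]) for u, c in zip(ues, cuts)]
        denom = sum(math.sqrt(x) for x in w) or 1.0
        shares = [f_total * math.sqrt(x) / denom for x in w]
    return cuts, shares
```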
Related papers
- DNN Partitioning, Task Offloading, and Resource Allocation in Dynamic Vehicular Networks: A Lyapunov-Guided Diffusion-Based Reinforcement Learning Approach [49.56404236394601]
We formulate the problem of joint DNN partitioning, task offloading, and resource allocation in Vehicular Edge Computing.
Our objective is to minimize the DNN-based task completion time while guaranteeing the system stability over time.
We propose a Multi-Agent Diffusion-based Deep Reinforcement Learning (MAD2RL) algorithm, incorporating the innovative use of diffusion models.
arXiv Detail & Related papers (2024-06-11T06:31:03Z)
- Edge Intelligence Optimization for Large Language Model Inference with Batching and Quantization [20.631476379056892]
Large Language Models (LLMs) are at the forefront of this movement.
LLMs require cloud hosting, which raises issues regarding privacy, latency, and usage limitations.
We present an edge intelligence optimization problem tailored for LLM inference.
arXiv Detail & Related papers (2024-05-12T02:38:58Z)
- Towards Leveraging AutoML for Sustainable Deep Learning: A Multi-Objective HPO Approach on Deep Shift Neural Networks [16.314030132923026]
We study the impact of hyperparameter optimization (HPO) to maximize DSNN performance while minimizing resource consumption.
Experimental results demonstrate the effectiveness of our approach, resulting in models with over 80% accuracy and low computational cost.
arXiv Detail & Related papers (2024-04-02T14:03:37Z)
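To illustrate the multi-objective HPO idea in the entry above (accuracy versus resource cost), here is a hypothetical random-search sketch that keeps the Pareto-optimal configurations; the search space, the evaluate() stub, and the cost proxy are assumptions, not the paper's actual DSNN setup.

```python
# Hypothetical multi-objective HPO sketch: random search over a toy configuration
# space, keeping the Pareto front of (accuracy, resource cost). The evaluate()
# stub stands in for training and profiling a Deep Shift Neural Network.
import random

SPACE = {"shift_bits": [2, 3, 4, 5], "depth": [4, 8, 16], "lr": [1e-3, 1e-2, 1e-1]}

def evaluate(cfg):
    # Stand-in objective: deeper models with more shift bits score higher but
    # cost more. Replace with real training + measurement in practice.
    acc = 0.6 + 0.02 * cfg["depth"] ** 0.5 + 0.01 * cfg["shift_bits"] + random.gauss(0, 0.01)
    cost = cfg["depth"] * cfg["shift_bits"]        # proxy for energy / operations
    return min(acc, 0.99), cost

def pareto_front(points):
    # Keep configurations not dominated by another point that is at least as
    # accurate and at least as cheap (and not identical).
    return [(cfg, (acc, cost)) for cfg, (acc, cost) in points
            if not any(a >= acc and c <= cost and (a, c) != (acc, cost)
                       for _, (a, c) in points)]

def run_hpo(trials=50, seed=0):
    random.seed(seed)
    points = []
    for _ in range(trials):
        cfg = {k: random.choice(v) for k, v in SPACE.items()}
        points.append((cfg, evaluate(cfg)))
    return pareto_front(points)
```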
- A Multi-Head Ensemble Multi-Task Learning Approach for Dynamical Computation Offloading [62.34538208323411]
We propose a multi-head ensemble multi-task learning (MEMTL) approach with a shared backbone and multiple prediction heads (PHs).
MEMTL outperforms benchmark methods in both the inference accuracy and mean square error without requiring additional training data.
arXiv Detail & Related papers (2023-09-02T11:01:16Z)
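To make the shared-backbone, multi-head design in the MEMTL entry above concrete, here is a minimal hypothetical PyTorch module; the layer sizes, head count, and ensemble-averaging rule are illustrative guesses, not the MEMTL architecture.

```python
# Hypothetical shared-backbone, multi-head model in the spirit of ensemble
# multi-task learning for offloading decisions. Dimensions, head count, and the
# averaging rule are illustrative, not the MEMTL design.
import torch
import torch.nn as nn

class MultiHeadOffloader(nn.Module):
    def __init__(self, in_dim=16, hidden=64, n_actions=8, n_heads=3):
        super().__init__()
        self.backbone = nn.Sequential(             # shared feature extractor
            nn.Linear(in_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
        )
        self.heads = nn.ModuleList(                # independent prediction heads
            [nn.Linear(hidden, n_actions) for _ in range(n_heads)]
        )

    def forward(self, x):
        z = self.backbone(x)
        logits = torch.stack([head(z) for head in self.heads])  # (heads, batch, actions)
        return logits.mean(dim=0)                                # simple ensemble average

# Toy usage: score offloading actions for a batch of 4 hypothetical state vectors.
model = MultiHeadOffloader()
scores = model(torch.randn(4, 16))    # shape (4, 8)
```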
- GNN at the Edge: Cost-Efficient Graph Neural Network Processing over Distributed Edge Servers [24.109721494781592]
Graph Neural Networks (GNNs) at the edge are still under exploration, presenting a stark disparity to their broad edge adoption.
This paper studies the cost optimization for distributed GNN processing over a multi-tier heterogeneous edge network.
We show that our approach achieves superior performance over de facto baselines, with more than 95.8% cost reduction and fast convergence.
arXiv Detail & Related papers (2022-10-31T13:03:16Z)
- Computational Intelligence and Deep Learning for Next-Generation Edge-Enabled Industrial IoT [51.68933585002123]
We investigate how to deploy computational intelligence and deep learning (DL) in edge-enabled industrial IoT networks.
In this paper, we propose a novel multi-exit-based federated edge learning (ME-FEEL) framework.
In particular, the proposed ME-FEEL can achieve an accuracy gain of up to 32.7% in industrial IoT networks with severely limited resources.
arXiv Detail & Related papers (2021-10-28T08:14:57Z)
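The multi-exit idea in the ME-FEEL entry above can be sketched as early-exit inference: attach classifiers at intermediate depths and stop as soon as one is confident enough. The sketch below is a hypothetical illustration of that general mechanism, not the ME-FEEL framework or its federated training.

```python
# Hypothetical early-exit inference sketch: classifiers attached at intermediate
# depths let resource-limited devices stop early when confidence is high.
# Layer sizes, exit placement, and the threshold are illustrative only.
import torch
import torch.nn as nn
import torch.nn.functional as F

class MultiExitNet(nn.Module):
    def __init__(self, in_dim=32, hidden=64, n_classes=10, n_blocks=3):
        super().__init__()
        self.blocks = nn.ModuleList()
        self.exits = nn.ModuleList()
        dim = in_dim
        for _ in range(n_blocks):
            self.blocks.append(nn.Sequential(nn.Linear(dim, hidden), nn.ReLU()))
            self.exits.append(nn.Linear(hidden, n_classes))   # one classifier per block
            dim = hidden

    @torch.no_grad()
    def infer(self, x, threshold=0.9):
        # Return (predictions, exit index) at the first exit whose least-confident
        # sample clears the threshold; otherwise fall through to the final exit.
        for i, (block, exit_head) in enumerate(zip(self.blocks, self.exits)):
            x = block(x)
            probs = F.softmax(exit_head(x), dim=-1)
            if probs.max(dim=-1).values.min().item() >= threshold or i == len(self.blocks) - 1:
                return probs.argmax(dim=-1), i

# Toy usage: an untrained network will usually fall through to the last exit.
net = MultiExitNet()
preds, used_exit = net.infer(torch.randn(4, 32))
```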
- Adaptive Anomaly Detection for Internet of Things in Hierarchical Edge Computing: A Contextual-Bandit Approach [81.5261621619557]
We propose an adaptive anomaly detection scheme with hierarchical edge computing (HEC).
We first construct multiple anomaly detection DNN models of increasing complexity and associate each of them with a corresponding HEC layer.
Then, we design an adaptive model selection scheme that is formulated as a contextual-bandit problem and solved by using a reinforcement learning policy network.
arXiv Detail & Related papers (2021-08-09T08:45:47Z)
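A hypothetical LinUCB-style sketch of the contextual-bandit model selection described in the entry above: a cheap context vector determines which detection model (and hence which HEC layer) handles an input. The linear reward model, the features, and the reward definition are stand-ins, not the paper's policy-network design.

```python
# Hypothetical LinUCB-style contextual bandit for picking which anomaly-detection
# model (arm), and hence which HEC layer, should handle an input given a cheap
# context vector. The features and reward definition are stand-ins.
import numpy as np

class LinUCB:
    def __init__(self, n_arms, dim, alpha=1.0):
        self.alpha = alpha
        self.A = [np.eye(dim) for _ in range(n_arms)]     # per-arm design matrices
        self.b = [np.zeros(dim) for _ in range(n_arms)]   # per-arm reward vectors

    def select(self, x):
        scores = []
        for A, b in zip(self.A, self.b):
            A_inv = np.linalg.inv(A)
            theta = A_inv @ b                             # ridge-regression estimate
            scores.append(theta @ x + self.alpha * np.sqrt(x @ A_inv @ x))
        return int(np.argmax(scores))                     # chosen model / HEC layer

    def update(self, arm, x, reward):
        self.A[arm] += np.outer(x, x)
        self.b[arm] += reward * x

# Toy usage: the reward could combine detection accuracy with a latency penalty
# (an assumption, not the paper's reward design).
bandit = LinUCB(n_arms=3, dim=8)
x = np.random.rand(8)
arm = bandit.select(x)
bandit.update(arm, x, reward=0.7)
```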
- Latency-Memory Optimized Splitting of Convolution Neural Networks for Resource Constrained Edge Devices [1.6873748786804317]
We argue that splitting CNN execution between an edge device and the cloud is equivalent to solving a resource-constrained optimization problem.
Experiments on real-world edge devices show that LMOS ensures feasible execution of different CNN models at the edge.
arXiv Detail & Related papers (2021-07-19T19:39:56Z)
- Resource Allocation via Model-Free Deep Learning in Free Space Optical Communications [119.81868223344173]
The paper investigates the general problem of resource allocation for mitigating channel fading effects in Free Space Optical (FSO) communications.
Under this framework, we propose two algorithms that solve FSO resource allocation problems.
arXiv Detail & Related papers (2020-07-27T17:38:51Z)
- Self-Directed Online Machine Learning for Topology Optimization [58.920693413667216]
Self-directed Online Learning Optimization integrates a Deep Neural Network (DNN) with Finite Element Method (FEM) calculations.
Our algorithm was tested on four types of problems: compliance minimization, fluid-structure optimization, heat transfer enhancement, and truss optimization.
It reduced the computational time by 2 to 5 orders of magnitude compared with directly using conventional methods, and outperformed all state-of-the-art algorithms tested in our experiments.
arXiv Detail & Related papers (2020-02-04T20:00:28Z)
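As a rough illustration of the DNN-plus-FEM loop in the entry above, here is a hypothetical surrogate-assisted optimization sketch: a cheap learned model (a quadratic least-squares fit standing in for the DNN, to keep the example self-contained) proposes candidates, an expensive solver stub labels them, and the model is refit each round. The stub objective, the search space, and all names are assumptions, not the paper's algorithm.

```python
# Hypothetical surrogate-assisted optimization sketch: a cheap learned model
# (a quadratic least-squares fit standing in for the paper's DNN) proposes
# candidates, an expensive solver stub labels them, and the model is refit each
# round. The stub objective and the search space are assumptions.
import numpy as np

def expensive_solver(x):
    # Stand-in for an FEM evaluation of a design vector x.
    return float(np.sum((x - 0.3) ** 2) + 0.1 * np.sum(np.sin(5 * x)))

def fit_surrogate(X, y):
    # Least-squares fit of y ~ a + b.x + c.x^2 as a cheap surrogate model.
    feats = np.hstack([np.ones((len(X), 1)), X, X ** 2])
    coef, *_ = np.linalg.lstsq(feats, np.asarray(y), rcond=None)
    return lambda x: float(np.hstack([[1.0], x, x ** 2]) @ coef)

def self_directed_loop(dim=4, rounds=10, pool=256, seed=0):
    rng = np.random.default_rng(seed)
    X = rng.uniform(0, 1, (8, dim))
    y = [expensive_solver(x) for x in X]
    for _ in range(rounds):
        surrogate = fit_surrogate(X, y)
        candidates = rng.uniform(0, 1, (pool, dim))
        best = min(candidates, key=surrogate)              # cheap search on the surrogate
        X = np.vstack([X, best])
        y.append(expensive_solver(best))                   # one expensive evaluation per round
    i = int(np.argmin(y))
    return X[i], y[i]
```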