Multi-Agent Collaborative Inference via DNN Decoupling: Intermediate
Feature Compression and Edge Learning
- URL: http://arxiv.org/abs/2205.11854v1
- Date: Tue, 24 May 2022 07:29:33 GMT
- Title: Multi-Agent Collaborative Inference via DNN Decoupling: Intermediate
Feature Compression and Edge Learning
- Authors: Zhiwei Hao, Guanyu Xu, Yong Luo, Han Hu, Jianping An, Shiwen Mao
- Abstract summary: We study the multi-agent collaborative inference scenario, where a single edge server coordinates the inference of multiple user equipments (UEs).
To achieve fast and energy-efficient inference for all UEs, we first design a lightweight autoencoder-based method to compress the large intermediate feature.
Then we define tasks according to the inference overhead of DNNs and formulate the problem as a Markov decision process.
- Score: 31.291738577705257
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Recently, deploying deep neural network (DNN) models via collaborative
inference, which splits a pre-trained model into two parts and executes them on
user equipment (UE) and an edge server, respectively, has become attractive. However,
the large intermediate feature of DNN impedes flexible decoupling, and existing
approaches either focus on the single UE scenario or simply define tasks
considering the required CPU cycles, but ignore the indivisibility of a single
DNN layer. In this paper, we study the multi-agent collaborative inference
scenario, where a single edge server coordinates the inference of multiple UEs.
Our goal is to achieve fast and energy-efficient inference for all UEs. To
achieve this goal, we first design a lightweight autoencoder-based method to
compress the large intermediate feature. Then we define tasks according to the
inference overhead of DNNs and formulate the problem as a Markov decision
process (MDP). Finally, we propose a multi-agent hybrid proximal policy
optimization (MAHPPO) algorithm to solve the optimization problem with a hybrid
action space. We conduct extensive experiments with different types of
networks, and the results show that our method can reduce inference latency
by up to 56% and energy consumption by up to 72%.
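The trade-off the abstract describes, splitting a DNN only at layer boundaries and compressing the intermediate feature before sending it, can be illustrated with a minimal sketch. This is not the paper's MAHPPO algorithm: it is a brute-force search over split points for a single UE, and every number, name, and the simple latency model (UE compute + uplink transmission + server compute) is a hypothetical assumption. It also ignores energy and the cost of running the autoencoder itself.

```python
from dataclasses import dataclass


@dataclass
class Layer:
    ue_ms: float      # hypothetical time to run this layer on the UE
    server_ms: float  # hypothetical time to run this layer on the edge server
    out_kbit: float   # hypothetical size of this layer's output feature


def split_latency_ms(layers, split, input_kbit, rate_kbit_per_ms, compression=1.0):
    """Latency when the first `split` layers run on the UE and the rest on the server.

    split == 0 offloads everything (the raw input is sent uncompressed);
    split == len(layers) runs everything locally (nothing is sent).
    Layers are indivisible, so `split` is always a whole layer index.
    """
    ue = sum(layer.ue_ms for layer in layers[:split])
    server = sum(layer.server_ms for layer in layers[split:])
    if split == len(layers):
        tx = 0.0
    elif split == 0:
        tx = input_kbit / rate_kbit_per_ms
    else:
        # Only the intermediate feature is compressed (assumed fixed ratio).
        tx = layers[split - 1].out_kbit / compression / rate_kbit_per_ms
    return ue + tx + server


def best_split(layers, input_kbit, rate_kbit_per_ms, compression=1.0):
    """Return the split index with the lowest end-to-end latency."""
    return min(
        range(len(layers) + 1),
        key=lambda s: split_latency_ms(layers, s, input_kbit, rate_kbit_per_ms, compression),
    )


# Hypothetical four-layer network and link (10 kbit/ms uplink).
layers = [
    Layer(ue_ms=4.0, server_ms=1.0, out_kbit=800.0),
    Layer(ue_ms=6.0, server_ms=1.5, out_kbit=400.0),
    Layer(ue_ms=10.0, server_ms=2.0, out_kbit=100.0),
    Layer(ue_ms=3.0, server_ms=0.5, out_kbit=1.0),
]

no_compression = best_split(layers, input_kbit=1200.0, rate_kbit_per_ms=10.0)
with_compression = best_split(layers, input_kbit=1200.0, rate_kbit_per_ms=10.0, compression=8.0)
print(no_compression, with_compression)
```

With these made-up numbers, the uncompressed features are so large that running everything on the UE is fastest, while an assumed 8x compression makes a mid-network split the best choice. The paper's setting is harder than this sketch: several UEs share one server and channel, and the decision mixes discrete and continuous actions, which is why it is posed as an MDP rather than solved by enumeration.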
Related papers
- A Bayesian Framework of Deep Reinforcement Learning for Joint O-RAN/MEC
Orchestration [12.914011030970814]
Multi-access Edge Computing (MEC) can be implemented together with Open Radio Access Network (O-RAN) over commodity platforms to offer low-cost deployment.
In this paper, a joint O-RAN/MEC orchestration using a Bayesian deep reinforcement learning (RL)-based framework is proposed.
arXiv Detail & Related papers (2023-12-26T18:04:49Z)
- A Multi-Head Ensemble Multi-Task Learning Approach for Dynamical Computation Offloading [62.34538208323411]
We propose a multi-head ensemble multi-task learning (MEMTL) approach with a shared backbone and multiple prediction heads (PHs).
MEMTL outperforms benchmark methods in both the inference accuracy and mean square error without requiring additional training data.
arXiv Detail & Related papers (2023-09-02T11:01:16Z)
- A Multi-objective Complex Network Pruning Framework Based on Divide-and-conquer and Global Performance Impairment Ranking [40.59001171151929]
A multi-objective complex network pruning framework based on divide-and-conquer and global performance impairment ranking is proposed in this paper.
The proposed algorithm achieves performance comparable to state-of-the-art pruning methods.
arXiv Detail & Related papers (2023-03-28T12:05:15Z)
- Multi-Prompt Alignment for Multi-Source Unsupervised Domain Adaptation [86.02485817444216]
We introduce Multi-Prompt Alignment (MPA), a simple yet efficient framework for multi-source UDA.
MPA denoises the learned prompts through an auto-encoding process and aligns them by maximizing the agreement of all the reconstructed prompts.
Experiments show that MPA achieves state-of-the-art results on three popular datasets with an impressive average accuracy of 54.1% on DomainNet.
arXiv Detail & Related papers (2022-09-30T03:40:10Z)
- Receptive Field-based Segmentation for Distributed CNN Inference Acceleration in Collaborative Edge Computing [93.67044879636093]
We study inference acceleration using distributed convolutional neural networks (CNNs) in a collaborative edge computing network.
We propose a novel collaborative edge computing approach that uses fused-layer parallelization to partition a CNN model into multiple blocks of convolutional layers.
arXiv Detail & Related papers (2022-07-22T18:38:11Z)
- Decoupled and Memory-Reinforced Networks: Towards Effective Feature Learning for One-Step Person Search [65.51181219410763]
One-step methods have been developed to handle pedestrian detection and identification sub-tasks using a single network.
There are two major challenges in the current one-step approaches.
We propose a decoupled and memory-reinforced network (DMRNet) to overcome these problems.
arXiv Detail & Related papers (2021-02-22T06:19:45Z)
- Boundary-assisted Region Proposal Networks for Nucleus Segmentation [89.69059532088129]
Machine learning models cannot perform well because of the large number of crowded nuclei.
We devise a Boundary-assisted Region Proposal Network (BRP-Net) that achieves robust instance-level nucleus segmentation.
arXiv Detail & Related papers (2020-06-04T08:26:38Z)
- Communication-Efficient Distributed Stochastic AUC Maximization with Deep Neural Networks [50.42141893913188]
We study distributed algorithms for large-scale AUC maximization with a deep neural network as the predictive model.
Our algorithm requires far fewer communication rounds and still achieves a linear speedup in theory.
Our experiments on several datasets show the effectiveness of our algorithm and confirm our theory.
arXiv Detail & Related papers (2020-05-05T18:08:23Z)
- Multiple Access in Dynamic Cell-Free Networks: Outage Performance and Deep Reinforcement Learning-Based Design [24.632250413917816]
In future cell-free (or cell-less) wireless networks, a large number of devices in a geographical area will be served simultaneously by a large number of distributed access points (APs).
We propose a novel dynamic cell-free network architecture to reduce the complexity of joint processing of users' signals in the presence of a large number of devices and APs.
In our system setting, the proposed DDPG-DDQN scheme is found to achieve around 78% of the rate achievable through an exhaustive search-based design.
arXiv Detail & Related papers (2020-01-29T03:00:22Z)