Learning the Optimal Path and DNN Partition for Collaborative Edge Inference
- URL: http://arxiv.org/abs/2410.01857v1
- Date: Wed, 2 Oct 2024 01:12:16 GMT
- Title: Learning the Optimal Path and DNN Partition for Collaborative Edge Inference
- Authors: Yin Huang, Letian Zhang, Jie Xu
- Abstract summary: Deep Neural Networks (DNNs) have catalyzed the development of numerous intelligent mobile applications and services.
To address the computational burden that DNNs place on resource-constrained mobile devices, collaborative edge inference has been proposed.
This method involves partitioning a DNN inference task into several subtasks and distributing these across multiple network nodes.
We introduce a new bandit algorithm, B-EXPUCB, which combines elements of the classical blocked EXP3 and LinUCB algorithms, and demonstrate its sublinear regret.
- Score: 4.368333109035076
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Recent advancements in Deep Neural Networks (DNNs) have catalyzed the development of numerous intelligent mobile applications and services. However, they also introduce significant computational challenges for resource-constrained mobile devices. To address this, collaborative edge inference has been proposed. This method involves partitioning a DNN inference task into several subtasks and distributing these across multiple network nodes. Despite its potential, most current approaches presume known network parameters -- like node processing speeds and link transmission rates -- or rely on a fixed sequence of nodes for processing the DNN subtasks. In this paper, we tackle a more complex scenario where network parameters are unknown and must be learned, and multiple network paths are available for distributing inference tasks. Specifically, we explore the learning problem of selecting the optimal network path and assigning DNN layers to nodes along this path, considering potential security threats and the costs of switching paths. We begin by deriving structural insights from the DNN layer assignment with complete network information, which narrows down the decision space and provides crucial understanding of optimal assignments. We then cast the learning problem with incomplete network information as a novel adversarial group linear bandits problem with switching costs, featuring rewards generation through a combined stochastic and adversarial process. We introduce a new bandit algorithm, B-EXPUCB, which combines elements of the classical blocked EXP3 and LinUCB algorithms, and demonstrate its sublinear regret. Extensive simulations confirm B-EXPUCB's superior performance in learning for collaborative edge inference over existing algorithms.
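The complete-information layer assignment mentioned in the abstract can be illustrated with a small dynamic program. This is a minimal sketch under assumptions that are not taken from the paper: a single fixed path, layers executed in order, latency equal to compute time plus the time to forward intermediate data over each link, input data starting at the first node, and no cost for returning the result. The function name `best_partition` and all parameter names are hypothetical.

```python
from itertools import accumulate


def best_partition(compute, out_size, in_size, speed, rate):
    """Assign consecutive DNN layers to nodes along one path (illustrative model).

    compute[i]  : work of layer i (assumed known)
    out_size[i] : size of layer i's output
    in_size     : size of the raw input, which starts at node 0
    speed[k]    : processing speed of node k
    rate[k]     : transmission rate of the link from node k to node k + 1
    Returns (latency, cuts), where cuts[k] is the number of layers finished
    once node k has done its share.
    """
    n_layers, n_nodes = len(compute), len(speed)
    prefix = [0.0, *accumulate(compute)]   # prefix[j]: total work of the first j layers
    size = [in_size, *out_size]            # size[j]: data forwarded after j layers
    inf = float("inf")

    # best[k][j]: minimum latency to finish j layers with the data at node k
    best = [[inf] * (n_layers + 1) for _ in range(n_nodes)]
    choice = [[0] * (n_layers + 1) for _ in range(n_nodes)]
    for j in range(n_layers + 1):
        best[0][j] = prefix[j] / speed[0]

    for k in range(1, n_nodes):
        for j in range(n_layers + 1):
            for i in range(j + 1):  # node k runs layers i .. j-1
                cost = (best[k - 1][i]
                        + size[i] / rate[k - 1]
                        + (prefix[j] - prefix[i]) / speed[k])
                if cost < best[k][j]:
                    best[k][j] = cost
                    choice[k][j] = i

    # The inference may finish at any node; pick the cheapest one.
    end = min(range(n_nodes), key=lambda k: best[k][n_layers])

    # Backtrack to recover how many layers each node completed.
    cuts = [n_layers] * n_nodes
    j = n_layers
    for k in range(end, 0, -1):
        cuts[k] = j
        j = choice[k][j]
    cuts[0] = j
    return best[end][n_layers], cuts
```

For example, with two nodes of speeds 1.0 and 4.0, a link rate of 2.0, layer workloads [4.0, 4.0, 2.0], output sizes [8.0, 1.0, 1.0], and an input of size 16.0, the sketch returns a latency of 9.0 with cuts [2, 3]: the slow first node runs two layers so that only the small intermediate output crosses the link. The setting studied in the paper is harder, since speeds and rates are unknown and must be learned while also choosing among multiple paths.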
Related papers
- Learning State-Augmented Policies for Information Routing in Communication Networks [92.59624401684083]
We develop a novel State Augmentation (SA) strategy to maximize the aggregate information at source nodes using graph neural network (GNN) architectures.
We leverage an unsupervised learning procedure to convert the output of the GNN architecture to optimal information routing strategies.
In the experiments, we perform the evaluation on real-time network topologies to validate our algorithms.
arXiv Detail & Related papers (2023-09-30T04:34:25Z)
- Scalable Resource Management for Dynamic MEC: An Unsupervised Link-Output Graph Neural Network Approach [36.32772317151467]
Deep learning has been successfully adopted in mobile edge computing (MEC) to optimize task offloading and resource allocation.
The dynamics of edge networks raise two challenges in neural network (NN)-based optimization methods: low scalability and high training costs.
In this paper, a novel link-output GNN (LOGNN)-based resource management approach is proposed to flexibly optimize the resource allocation in MEC.
arXiv Detail & Related papers (2023-06-15T08:21:41Z)
- Semi-supervised Network Embedding with Differentiable Deep Quantisation [81.49184987430333]
We develop d-SNEQ, a differentiable quantisation method for network embedding.
d-SNEQ incorporates a rank loss to equip the learned quantisation codes with rich high-order information.
It is able to substantially compress the size of trained embeddings, thus reducing storage footprint and accelerating retrieval speed.
arXiv Detail & Related papers (2021-08-20T11:53:05Z)
- Deep Neural Networks and PIDE discretizations [2.4063592468412276]
We propose neural networks that tackle the problems of stability and field-of-view of a Convolutional Neural Network (CNN).
We propose integral-based spatially nonlocal operators which are related to global weighted Laplacian, fractional Laplacian and fractional inverse Laplacian operators.
We test the effectiveness of the proposed neural architectures on benchmark image classification datasets and semantic segmentation tasks in autonomous driving.
arXiv Detail & Related papers (2021-08-05T08:03:01Z)
- Algorithm Unrolling for Massive Access via Deep Neural Network with Theoretical Guarantee [30.86806523281873]
Massive access is a critical design challenge of Internet of Things (IoT) networks.
We consider the grant-free uplink transmission of an IoT network with a multiple-antenna base station (BS) and a large number of single-antenna IoT devices.
We propose a novel algorithm unrolling framework based on the deep neural network to simultaneously achieve low computational complexity and high robustness.
arXiv Detail & Related papers (2021-06-19T05:23:05Z)
- Learning Autonomy in Management of Wireless Random Networks [102.02142856863563]
This paper presents a machine learning strategy that tackles a distributed optimization task in a wireless network with an arbitrary number of randomly interconnected nodes.
We develop a flexible deep neural network formalism termed distributed message-passing neural network (DMPNN) with forward and backward computations independent of the network topology.
arXiv Detail & Related papers (2021-06-15T09:03:28Z)
- Joint Coding and Scheduling Optimization for Distributed Learning over Wireless Edge Networks [21.422040036286536]
This article addresses problems by leveraging recent advances in coded computing and the deep dueling neural network architecture.
By introducing coded structures/redundancy, a distributed learning task can be completed without waiting for straggling nodes.
Simulations show that the proposed framework reduces the average learning delay in wireless edge computing by up to 66% compared with other DL approaches.
arXiv Detail & Related papers (2021-03-07T08:57:09Z)
- All at Once Network Quantization via Collaborative Knowledge Transfer [56.95849086170461]
We develop a novel collaborative knowledge transfer approach for efficiently training the all-at-once quantization network.
Specifically, we propose an adaptive selection strategy to choose a high-precision "teacher" for transferring knowledge to the low-precision student.
To effectively transfer knowledge, we develop a dynamic block swapping method by randomly replacing the blocks in the lower-precision student network with the corresponding blocks in the higher-precision teacher network.
arXiv Detail & Related papers (2021-03-02T03:09:03Z)
- Encoding the latent posterior of Bayesian Neural Networks for uncertainty quantification [10.727102755903616]
We aim for efficient deep BNNs amenable to complex computer vision architectures.
We achieve this by leveraging variational autoencoders (VAEs) to learn the interaction and the latent distribution of the parameters at each network layer.
Our approach, Latent-Posterior BNN (LP-BNN), is compatible with the recent BatchEnsemble method, leading to highly efficient (in terms of computation and memory during both training and testing) ensembles.
arXiv Detail & Related papers (2020-12-04T19:50:09Z)
- Deep Multi-Task Learning for Cooperative NOMA: System Design and Principles [52.79089414630366]
We develop a novel deep cooperative NOMA scheme, drawing upon the recent advances in deep learning (DL).
We develop a novel hybrid-cascaded deep neural network (DNN) architecture such that the entire system can be optimized in a holistic manner.
arXiv Detail & Related papers (2020-07-27T12:38:37Z)
- Binary Neural Networks: A Survey [126.67799882857656]
The binary neural network serves as a promising technique for deploying deep models on resource-limited devices.
The binarization inevitably causes severe information loss, and even worse, its discontinuity brings difficulty to the optimization of the deep network.
We present a survey of these algorithms, mainly categorized into the native solutions directly conducting binarization, and the optimized ones using techniques like minimizing the quantization error, improving the network loss function, and reducing the gradient error.
arXiv Detail & Related papers (2020-03-31T16:47:20Z)
This list is automatically generated from the titles and abstracts of the papers in this site.