Joint Parameter-and-Bandwidth Allocation for Improving the Efficiency of
Partitioned Edge Learning
- URL: http://arxiv.org/abs/2003.04544v3
- Date: Tue, 30 Jun 2020 01:47:27 GMT
- Title: Joint Parameter-and-Bandwidth Allocation for Improving the Efficiency of
Partitioned Edge Learning
- Authors: Dingzhu Wen, Mehdi Bennis, and Kaibin Huang
- Abstract summary: Machine learning algorithms are deployed at the network edge for training artificial intelligence (AI) models.
This paper focuses on the novel joint design of parameter (computation load) allocation and bandwidth allocation.
- Score: 73.82875010696849
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: To leverage data and computation capabilities of mobile devices, machine
learning algorithms are deployed at the network edge for training artificial
intelligence (AI) models, resulting in the new paradigm of edge learning. In
this paper, we consider the framework of partitioned edge learning for
iteratively training a large-scale model using many resource-constrained
devices (called workers). To this end, in each iteration, the model is
dynamically partitioned into parametric blocks, which are downloaded to worker
groups for updating using data subsets. Then, the local updates are uploaded to
and cascaded by the server for updating a global model. To reduce resource
usage by minimizing the total learning-and-communication latency, this work
focuses on the novel joint design of parameter (computation load) allocation
and bandwidth allocation (for downloading and uploading). Two design approaches
are adopted. First, a practical sequential approach, called partially
integrated parameter-and-bandwidth allocation (PABA), yields two schemes,
namely bandwidth aware parameter allocation and parameter aware bandwidth
allocation. The former minimizes the load on the slowest (in computing) of the
worker groups, each of which trains a common parametric block. The latter allocates the
largest bandwidth to the worker that is the latency bottleneck. Second, the two allocations are
jointly optimized. Although the joint problem is nonconvex, an efficient and
optimal solution algorithm is derived by intelligently nesting a bisection
search and solving a convex problem. Experimental results using real data
demonstrate that integrating PABA can substantially improve the performance of
partitioned edge learning in terms of latency (e.g., by 46%) and accuracy (e.g., by 4%).
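
The two partially integrated PABA schemes above admit a compact illustration. The Python sketch below uses a simplified per-round latency model (download plus local computation plus upload for each worker group) and simple heuristics; the latency model, the 32-bit parameter width, and the specific allocation rules are assumptions made here for exposition, not the paper's exact formulation.

```python
# Illustrative sketch of the two partially integrated PABA schemes.
# Latency model, bit width, and heuristics are simplifying assumptions,
# not the paper's exact formulation.

def group_latency(block_size, bandwidth, compute_speed, bits_per_param=32.0):
    """One-round latency of a worker group training one parametric block."""
    comm = 2.0 * block_size * bits_per_param / bandwidth   # download + upload
    comp = block_size / compute_speed                       # local update
    return comm + comp

def bandwidth_aware_parameter_allocation(total_params, bandwidths, compute_speeds):
    """Assign smaller parametric blocks to slower groups so the slowest
    group's load (and hence the per-round latency) is reduced."""
    per_param = [group_latency(1.0, b, c) for b, c in zip(bandwidths, compute_speeds)]
    weights = [1.0 / t for t in per_param]
    total = sum(weights)
    return [total_params * w / total for w in weights]

def parameter_aware_bandwidth_allocation(block_sizes, total_bandwidth, compute_speeds,
                                         n_steps=500):
    """Iteratively shift bandwidth toward the current latency-bottleneck group."""
    n = len(block_sizes)
    bw = [total_bandwidth / n] * n              # start from an even split
    step = total_bandwidth / (20.0 * n)
    for _ in range(n_steps):
        lat = [group_latency(s, b, c)
               for s, b, c in zip(block_sizes, bw, compute_speeds)]
        slow, fast = lat.index(max(lat)), lat.index(min(lat))
        if slow == fast or bw[fast] - step <= 0.0:
            break
        bw[fast] -= step                        # take from the fastest group
        bw[slow] += step                        # give to the bottleneck group
    return bw
```

Under this toy model, the first routine hands smaller parametric blocks to slower groups, while the second progressively moves bandwidth toward whichever group currently dominates the round latency.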
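The jointly optimized PABA is described in the abstract as nesting a bisection search around a convex problem. The sketch below shows that generic pattern: bisect on a latency target T and, for each trial value, query a feasibility oracle that stands in for the inner convex resource-allocation subproblem. The oracle shown here, and all numbers in the example, are hypothetical placeholders rather than the paper's actual subproblem.

```python
# Generic "bisection over a latency target + inner feasibility check" pattern
# suggested by the abstract. `feasible(T)` is a toy stand-in for solving the
# paper's convex subproblem at a fixed latency target T.

def bisect_min_latency(feasible, t_lo, t_hi, tol=1e-6):
    """Smallest latency target T (within tol) for which the inner problem is
    feasible. Requires monotonicity: if T is feasible, so is any T' > T."""
    assert feasible(t_hi), "upper bound must be feasible"
    while t_hi - t_lo > tol:
        t_mid = 0.5 * (t_lo + t_hi)
        if feasible(t_mid):
            t_hi = t_mid        # try to tighten the latency target
        else:
            t_lo = t_mid        # optimum lies above t_mid
    return t_hi

def make_toy_feasibility_oracle(block_sizes, compute_speeds, total_bandwidth,
                                bits_per_param=32.0):
    """Toy inner problem: can every group finish within T while the bandwidth
    it needs stays within the shared budget? (Assumed stand-in only.)"""
    def feasible(T):
        needed = 0.0
        for s, c in zip(block_sizes, compute_speeds):
            slack = T - s / c                    # time left for communication
            if slack <= 0.0:
                return False
            needed += 2.0 * s * bits_per_param / slack
        return needed <= total_bandwidth
    return feasible

# Example with hypothetical numbers: three groups sharing a 1e7 bit/s budget.
oracle = make_toy_feasibility_oracle([1e5, 2e5, 1.5e5], [1e6, 2e6, 1.5e6], 1e7)
t_star = bisect_min_latency(oracle, t_lo=0.0, t_hi=1e4)
```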
Related papers
- Split Federated Learning Over Heterogeneous Edge Devices: Algorithm and Optimization [7.013344179232109]
Split Learning (SL) is a promising collaborative machine learning approach, enabling resource-constrained devices to train models without sharing raw data.
Current SL algorithms face limitations in training efficiency and suffer from prolonged latency.
We propose the Heterogeneous Split Federated Learning framework, which allows resource-constrained clients to train their personalized client-side models in parallel.
arXiv Detail & Related papers (2024-11-21T07:46:01Z) - SHERL: Synthesizing High Accuracy and Efficient Memory for Resource-Limited Transfer Learning [63.93193829913252]
We propose an innovative METL strategy called SHERL for resource-limited scenarios.
In the early route, intermediate outputs are consolidated via an anti-redundancy operation.
In the late route, using a minimal number of late pre-trained layers can alleviate the peak memory overhead.
arXiv Detail & Related papers (2024-07-10T10:22:35Z) - A Multi-Head Ensemble Multi-Task Learning Approach for Dynamical
Computation Offloading [62.34538208323411]
We propose a multi-head ensemble multi-task learning (MEMTL) approach with a shared backbone and multiple prediction heads (PHs).
MEMTL outperforms benchmark methods in both the inference accuracy and mean square error without requiring additional training data.
arXiv Detail & Related papers (2023-09-02T11:01:16Z) - Gradient Sparsification for Efficient Wireless Federated Learning with
Differential Privacy [25.763777765222358]
Federated learning (FL) enables distributed clients to collaboratively train a machine learning model without sharing raw data with each other.
As the model size grows, the training latency increases due to limited transmission bandwidth, and model performance degrades when differential privacy (DP) protection is applied.
We propose a gradient sparsification empowered FL framework over wireless channels to improve training efficiency without sacrificing convergence performance.
arXiv Detail & Related papers (2023-04-09T05:21:15Z) - Time Minimization in Hierarchical Federated Learning [11.678121177730718]
Federated learning is a modern decentralized machine learning technique where user equipments perform machine learning tasks locally and then upload the model parameters to a central server.
In this paper, we consider a 3-layer hierarchical federated learning system which involves model parameter exchanges between the cloud and edge servers.
arXiv Detail & Related papers (2022-10-07T13:53:20Z) - Latency Optimization for Blockchain-Empowered Federated Learning in
Multi-Server Edge Computing [24.505675843652448]
In this paper, we study a new latency optimization problem for blockchain-empowered federated learning (BFL) in multi-server edge computing.
In this system model, distributed mobile devices (MDs) communicate with a set of edge servers (ESs) to handle both machine learning (ML) model training and block mining simultaneously.
arXiv Detail & Related papers (2022-03-18T00:38:29Z) - Adaptive Subcarrier, Parameter, and Power Allocation for Partitioned
Edge Learning Over Broadband Channels [69.18343801164741]
Partitioned edge learning (PARTEL) implements parameter-server training, a well-known distributed learning method, in a wireless network.
We consider the case of deep neural network (DNN) models which can be trained using PARTEL by introducing some auxiliary variables.
arXiv Detail & Related papers (2020-10-08T15:27:50Z) - Coded Stochastic ADMM for Decentralized Consensus Optimization with Edge
Computing [113.52575069030192]
Big data, including applications with high security requirements, are often collected and stored on multiple heterogeneous devices, such as mobile devices, drones and vehicles.
Due to the limitations of communication costs and security requirements, it is of paramount importance to extract information in a decentralized manner instead of aggregating data to a fusion center.
We consider the problem of learning model parameters in a multi-agent system with data locally processed via distributed edge nodes.
A class of mini-batch alternating direction method of multipliers (ADMM) algorithms is explored to develop the distributed learning model.
arXiv Detail & Related papers (2020-10-02T10:41:59Z) - Communication-Efficient Distributed Stochastic AUC Maximization with
Deep Neural Networks [50.42141893913188]
We study distributed algorithms for large-scale AUC maximization with a deep neural network as the predictive model.
Our method requires a much smaller number of communication rounds in theory.
Experiments on several datasets demonstrate the effectiveness of the proposed method and corroborate the theory.
arXiv Detail & Related papers (2020-05-05T18:08:23Z)