Multi-Task Federated Learning for Personalised Deep Neural Networks in
Edge Computing
- URL: http://arxiv.org/abs/2007.09236v3
- Date: Thu, 22 Jul 2021 09:42:08 GMT
- Title: Multi-Task Federated Learning for Personalised Deep Neural Networks in
Edge Computing
- Authors: Jed Mills, Jia Hu, Geyong Min
- Abstract summary: Federated Learning (FL) is an emerging approach for collaboratively training Deep Neural Networks (DNNs) on mobile devices.
Previous works have shown that non-Independent and Identically Distributed (non-IID) user data harms the convergence speed of the FL algorithms.
We propose a Multi-Task FL (MTFL) algorithm that introduces non-federated Batch-Normalization layers into the federated DNN.
- Score: 23.447883712141422
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Federated Learning (FL) is an emerging approach for collaboratively training
Deep Neural Networks (DNNs) on mobile devices, without private user data
leaving the devices. Previous works have shown that non-Independent and
Identically Distributed (non-IID) user data harms the convergence speed of the
FL algorithms. Furthermore, most existing work on FL measures global-model
accuracy, but in many cases, such as user content-recommendation, improving
individual User model Accuracy (UA) is the real objective. To address these
issues, we propose a Multi-Task FL (MTFL) algorithm that introduces
non-federated Batch-Normalization (BN) layers into the federated DNN. MTFL
benefits UA and convergence speed by allowing users to train models
personalised to their own data. MTFL is compatible with popular iterative FL
optimisation algorithms such as Federated Averaging (FedAvg), and we show
empirically that a distributed form of Adam optimisation (FedAvg-Adam) benefits
convergence speed even further when used as the optimisation strategy within
MTFL. Experiments using MNIST and CIFAR10 demonstrate that MTFL is able to
significantly reduce the number of rounds required to reach a target UA, by up
to $5\times$ when using existing FL optimisation strategies, and with a further
$3\times$ improvement when using FedAvg-Adam. We compare MTFL to competing
personalised FL algorithms, showing that it is able to achieve the best UA for
MNIST and CIFAR10 in all considered scenarios. Finally, we evaluate MTFL with
FedAvg-Adam on an edge-computing testbed, showing that its convergence and UA
benefits outweigh its overhead.
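A minimal sketch of the private-BN idea described in the abstract, assuming a dict of named parameter arrays per client; the is_bn() naming heuristic and the size-weighted averaging are illustrative assumptions, not the authors' exact implementation.

    import numpy as np

    def is_bn(name):
        # Assumption: BN parameters can be identified from their names.
        return "bn" in name

    def fedavg_aggregate(client_models, client_sizes):
        # Weighted average of the federated (non-BN) parameters only;
        # BN statistics and affine parameters never leave the clients.
        total = float(sum(client_sizes))
        global_params = {}
        for name in client_models[0]:
            if is_bn(name):
                continue
            global_params[name] = sum(
                (n / total) * m[name] for m, n in zip(client_models, client_sizes)
            )
        return global_params

    def apply_global(local_model, global_params):
        # Client side: overwrite federated layers, keep private BN layers untouched.
        for name, value in global_params.items():
            local_model[name] = value.copy()
        return local_model

Under FedAvg-Adam, the clients' local optimisation step would use a distributed form of Adam rather than SGD; the private-BN treatment sketched here is unchanged.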
Related papers
- Joint Energy and Latency Optimization in Federated Learning over Cell-Free Massive MIMO Networks [36.6868658064971]
Federated learning (FL) is a distributed learning paradigm wherein users exchange FL models with a server instead of raw datasets.
Cell-free massive multiple-input multiple-output (CFmMIMO) is a promising architecture for implementing FL because it serves many users on the same time/frequency resources.
We propose an uplink power allocation scheme in FL over CFmMIMO by considering the effect of each user's power on the energy and latency of other users.
arXiv Detail & Related papers (2024-04-28T19:24:58Z)
- Semi-Federated Learning: Convergence Analysis and Optimization of A Hybrid Learning Framework [70.83511997272457]
We propose a semi-federated learning (SemiFL) paradigm to leverage both the base station (BS) and devices for a hybrid implementation of centralized learning (CL) and FL.
We propose a two-stage algorithm to solve the resulting intractable problem, providing closed-form solutions for the beamformers.
arXiv Detail & Related papers (2023-10-04T03:32:39Z)
- Adaptive Model Pruning and Personalization for Federated Learning over Wireless Networks [72.59891661768177]
Federated learning (FL) enables distributed learning across edge devices while protecting data privacy.
We consider an FL framework with partial model pruning and personalization to overcome these challenges.
This framework splits the learning model into a global part with model pruning shared with all devices to learn data representations and a personalized part to be fine-tuned for a specific device.
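A hedged sketch of this global/personalized split, assuming magnitude-based pruning and a naming prefix (here "head.") marking the personalized part; both choices are illustrative, not the paper's exact scheme.

    import numpy as np

    def prune_mask(weights, sparsity=0.5):
        # Keep the largest-magnitude entries of the shared global part; zero out the rest.
        threshold = np.quantile(np.abs(weights), sparsity)
        return (np.abs(weights) >= threshold).astype(weights.dtype)

    def split_model(model, personalized_prefixes=("head.",)):
        # Global part: pruned and shared with all devices; personalized part: fine-tuned locally.
        global_part = {k: v for k, v in model.items() if not k.startswith(personalized_prefixes)}
        personal_part = {k: v for k, v in model.items() if k.startswith(personalized_prefixes)}
        return global_part, personal_part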
arXiv Detail & Related papers (2023-09-04T21:10:45Z)
- Joint Age-based Client Selection and Resource Allocation for Communication-Efficient Federated Learning over NOMA Networks [8.030674576024952]
In federated learning (FL), distributed clients can collaboratively train a shared global model while retaining their own training data locally.
In this paper, a joint optimization problem of client selection and resource allocation is formulated, aiming to minimize the total time consumption of each round in FL over a non-orthogonal multiple access (NOMA) enabled wireless network.
In addition, a server-side artificial neural network (ANN) is proposed to predict the FL models of clients who are not selected at each round to further improve FL performance.
arXiv Detail & Related papers (2023-04-18T13:58:16Z)
- Automated Federated Learning in Mobile Edge Networks -- Fast Adaptation and Convergence [83.58839320635956]
Federated Learning (FL) can be used in mobile edge networks to train machine learning models in a distributed manner.
Recently, FL has been interpreted within a Model-Agnostic Meta-Learning (MAML) framework, which brings FL significant advantages in fast adaptation and convergence over heterogeneous datasets.
This paper addresses how much benefit MAML brings to FL and how to maximize such benefit over mobile edge networks.
arXiv Detail & Related papers (2023-03-23T02:42:10Z)
- Sparse Federated Learning with Hierarchical Personalized Models [24.763028713043468]
Federated learning (FL) can achieve privacy-safe and reliable collaborative training without collecting users' private data.
We propose a personalized FL algorithm using a hierarchical proximal mapping based on the Moreau envelope, named sparse federated learning with hierarchical personalized models (sFedHP).
A continuously differentiable approximation of the L1-norm is also used as the sparse constraint to reduce the communication cost.
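For reference, the Moreau envelope referred to above is, for a client loss $f_i$ and regularisation weight $\lambda$ (scaling conventions vary; sFedHP's exact hierarchical objective and sparse constraint are given in the paper):

    $ f_{i,\lambda}(\theta) = \min_{w} \big\{ f_i(w) + \tfrac{\lambda}{2} \lVert w - \theta \rVert^2 \big\} $

where $w$ is the client's personalized model and $\theta$ the shared global model.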
arXiv Detail & Related papers (2022-03-25T09:06:42Z)
- Achieving Personalized Federated Learning with Sparse Local Models [75.76854544460981]
Federated learning (FL) is vulnerable to heterogeneously distributed data.
To counter this issue, personalized FL (PFL) was proposed to produce dedicated local models for each individual user.
Existing PFL solutions either demonstrate unsatisfactory generalization towards different model architectures or cost enormous extra computation and memory.
We propose FedSpa, a novel PFL scheme that employs personalized sparse masks to customize sparse local models on the edge.
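An illustrative sketch of per-client sparse masks, assuming a simple top-k magnitude rule; the actual mask construction in FedSpa may differ.

    import numpy as np

    def personalized_mask(local_weights, keep_ratio=0.2):
        # Binary mask keeping each client's largest-magnitude weights.
        k = max(1, int(keep_ratio * local_weights.size))
        threshold = np.partition(np.abs(local_weights).ravel(), -k)[-k]
        return (np.abs(local_weights) >= threshold).astype(np.float32)

    def masked_model(dense_weights, mask):
        # The client trains and communicates only the unmasked entries.
        return dense_weights * mask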
arXiv Detail & Related papers (2022-01-27T08:43:11Z)
- Joint Superposition Coding and Training for Federated Learning over Multi-Width Neural Networks [52.93232352968347]
This paper aims to integrate two synergetic technologies: federated learning (FL) and width-adjustable slimmable neural networks (SNNs).
FL preserves data privacy by exchanging the locally trained models of mobile devices. Training SNNs is, however, non-trivial, particularly under wireless connections with time-varying channel conditions.
We propose a communication and energy-efficient SNN-based FL (named SlimFL) that jointly utilizes superposition coding (SC) for global model aggregation and superposition training (ST) for updating local models.
arXiv Detail & Related papers (2021-12-05T11:17:17Z)
- FedFog: Network-Aware Optimization of Federated Learning over Wireless Fog-Cloud Systems [40.421253127588244]
Federated learning (FL) is capable of performing large distributed machine learning tasks across multiple edge users by periodically aggregating trained local parameters.
We first propose an efficient FL algorithm (called FedFog) to perform the local aggregation of gradient parameters at fog servers and global training update at the cloud.
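A rough sketch of the two-level aggregation described above (fog servers average the updates of their attached users, then the cloud averages the fog results); weighting by user data size is an assumption.

    def aggregate(updates, weights):
        # Weighted average of dicts of parameter arrays.
        total = float(sum(weights))
        return {k: sum((w / total) * u[k] for u, w in zip(updates, weights))
                for k in updates[0]}

    def fedfog_round(fog_groups):
        # fog_groups: list of (user_updates, user_sizes), one entry per fog server.
        fog_models, fog_sizes = [], []
        for user_updates, user_sizes in fog_groups:
            fog_models.append(aggregate(user_updates, user_sizes))  # aggregation at the fog
            fog_sizes.append(sum(user_sizes))
        return aggregate(fog_models, fog_sizes)                     # global update at the cloud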
arXiv Detail & Related papers (2021-07-04T08:03:15Z)
- Convergence Time Optimization for Federated Learning over Wireless Networks [160.82696473996566]
A wireless network is considered in which wireless users transmit their local FL models (trained using their locally collected data) to a base station (BS).
The BS, acting as a central controller, generates a global FL model using the received local FL models and broadcasts it back to all users.
Due to the limited number of resource blocks (RBs) in a wireless network, only a subset of users can be selected to transmit their local FL model parameters to the BS.
Since each user has unique training data samples, the BS prefers to include all local user FL models to generate a converged global FL model.
arXiv Detail & Related papers (2020-01-22T01:55:12Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.