Asynchronous Multi-Server Federated Learning for Geo-Distributed Clients
- URL: http://arxiv.org/abs/2406.01439v2
- Date: Thu, 20 Jun 2024 12:04:28 GMT
- Title: Asynchronous Multi-Server Federated Learning for Geo-Distributed Clients
- Authors: Yuncong Zuo, Bart Cox, Lydia Y. Chen, Jérémie Decouchant
- Abstract summary: Federated learning (FL) systems enable multiple clients to train a machine learning model iteratively by synchronously exchanging intermediate model weights with a single server.
The scalability of such FL systems can be limited by two factors: server idle time due to synchronous communication and the risk of a single server becoming the bottleneck.
We propose a new FL architecture that is entirely asynchronous, and therefore addresses these two limitations simultaneously.
- Score: 4.6792910030704515
- License: http://creativecommons.org/licenses/by-sa/4.0/
- Abstract: Federated learning (FL) systems enable multiple clients to train a machine learning model iteratively by synchronously exchanging intermediate model weights with a single server. The scalability of such FL systems can be limited by two factors: server idle time due to synchronous communication and the risk of a single server becoming the bottleneck. In this paper, we propose a new FL architecture that is, to our knowledge, the first entirely asynchronous multi-server FL system, and therefore addresses these two limitations simultaneously. Our solution keeps both servers and clients continuously active. As in previous multi-server methods, clients interact solely with their nearest server, ensuring efficient integration of updates into the model. Differently, however, servers also periodically update each other asynchronously, and never postpone interactions with clients. We compare our solution to three representative baselines (FedAvg, FedAsync and HierFAVG) on the MNIST and CIFAR-10 image classification datasets and on the WikiText-2 language modeling dataset. Our solution converges to similar or higher accuracy levels than the baselines and requires 61% less time to do so in geo-distributed settings.
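The architecture can be pictured with a small simulation. The sketch below is illustrative only: the weighted-averaging merge rule, the gossip period, and the exponential client timing are assumptions rather than the paper's exact design. It shows the two key ideas, namely that each server merges client updates the moment they arrive and exchanges models with peer servers asynchronously, so nobody ever waits.

```python
import heapq
import numpy as np

rng = np.random.default_rng(0)
DIM, SERVERS, CLIENTS = 10, 3, 12
GOSSIP_EVERY, HORIZON = 5.0, 50.0

models = [np.zeros(DIM) for _ in range(SERVERS)]
home = [c % SERVERS for c in range(CLIENTS)]  # each client's "nearest" server

def local_update(model):
    # Stand-in for local SGD: a noisy step toward an arbitrary optimum (all ones).
    return model + 0.1 * (np.ones(DIM) - model) + rng.normal(0, 0.02, DIM)

events = [(rng.exponential(2.0), "client", c) for c in range(CLIENTS)]
events += [(GOSSIP_EVERY, "gossip", s) for s in range(SERVERS)]
heapq.heapify(events)

while events:
    t, kind, idx = heapq.heappop(events)
    if t > HORIZON:
        break
    if kind == "client":
        s = home[idx]
        # The server merges the client's update the moment it arrives: no waiting.
        models[s] = 0.5 * models[s] + 0.5 * local_update(models[s])
        heapq.heappush(events, (t + rng.exponential(2.0), "client", idx))
    else:
        # Asynchronous server-to-server exchange: pairwise model averaging.
        peer = (idx + 1) % SERVERS
        avg = 0.5 * (models[idx] + models[peer])
        models[idx], models[peer] = avg, avg.copy()
        heapq.heappush(events, (t + GOSSIP_EVERY, "gossip", idx))

print("server model means:", [round(float(m.mean()), 3) for m in models])
```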
Related papers
- Asynchronous Byzantine Federated Learning [4.6792910030704515]
Federated learning (FL) enables a set of geographically distributed clients to collectively train a model through a server.
Our solution is one of the first Byzantine-resilient and asynchronous FL algorithms.
We compare the performance of our solution with state-of-the-art algorithms on both image and text datasets.
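The summary does not state the aggregation rule, so the sketch below illustrates one standard Byzantine-resilient choice, a coordinate-wise median over buffered asynchronous updates; it is not necessarily the paper's algorithm.

```python
import numpy as np

rng = np.random.default_rng(1)
DIM = 5

def robust_merge(buffered_updates):
    # Coordinate-wise median over a buffer of asynchronously received updates:
    # a standard Byzantine-resilient rule (illustrative, not necessarily the paper's).
    return np.median(np.stack(buffered_updates), axis=0)

honest = [np.ones(DIM) + rng.normal(0, 0.05, DIM) for _ in range(6)]
byzantine = [np.full(DIM, 100.0) for _ in range(2)]  # poisoned updates
print(robust_merge(honest + byzantine).round(2))  # stays near the honest consensus
```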
arXiv Detail & Related papers (2024-06-03T15:29:38Z)
- FedAST: Federated Asynchronous Simultaneous Training [27.492821176616815]
Federated Learning (FL) enables devices or clients to collaboratively train machine learning (ML) models without sharing their private data.
Much of the existing work in FL focuses on efficiently learning a model for a single task.
In this paper, we propose simultaneous training of multiple FL models using a common set of datasets.
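A minimal sketch of training several FL models at once over a shared client pool, assuming a round-robin assignment of idle clients to tasks and a buffered asynchronous merge; both policies are illustrative assumptions, not FedAST's exact design.

```python
import numpy as np

rng = np.random.default_rng(2)
DIM, MODELS, CLIENTS, STEPS = 4, 3, 9, 60

models = [np.zeros(DIM) for _ in range(MODELS)]
targets = [np.full(DIM, k + 1.0) for k in range(MODELS)]  # one task per model

for step in range(STEPS):
    c = step % CLIENTS              # next idle client
    m = c % MODELS                  # round-robin task assignment (assumed policy)
    grad = models[m] - targets[m] + rng.normal(0, 0.05, DIM)
    update = models[m] - 0.5 * grad             # client's local step on its task
    models[m] = 0.7 * models[m] + 0.3 * update  # buffered asynchronous merge

print([round(float(m.mean()), 2) for m in models])  # each approaches 1.0, 2.0, 3.0
```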
arXiv Detail & Related papers (2024-06-01T05:14:20Z)
- Communication Efficient ConFederated Learning: An Event-Triggered SAGA Approach [67.27031215756121]
Federated learning (FL) is a machine learning paradigm that trains a model without gathering the local data scattered over various data sources.
Standard FL, which employs a single server, can only support a limited number of users, leading to degraded learning capability.
In this work, we consider a multi-server FL framework, referred to as Confederated Learning (CFL), in order to accommodate a larger number of users.
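The event-triggered idea can be shown in a few lines: a client transmits only when its local model has drifted far enough from what the server last received. The threshold rule below is a generic illustration; the paper's SAGA-based variance reduction is omitted.

```python
import numpy as np

rng = np.random.default_rng(3)
DIM, THRESHOLD = 6, 0.5

last_sent = np.zeros(DIM)  # what the server last received from this client
local = np.zeros(DIM)
transmissions = 0

for _ in range(100):
    local = local + rng.normal(0.02, 0.05, DIM)  # local training progress
    if np.linalg.norm(local - last_sent) > THRESHOLD:
        # Event triggered: the local change is large enough to be worth sending.
        last_sent = local.copy()
        transmissions += 1

print(f"{transmissions} transmissions instead of 100")
```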
arXiv Detail & Related papers (2024-02-28T03:27:10Z)
- Timely Asynchronous Hierarchical Federated Learning: Age of Convergence [59.96266198512243]
We consider an asynchronous hierarchical federated learning setting with a client-edge-cloud framework.
The clients exchange the trained parameters with their corresponding edge servers, which update the locally aggregated model.
The goal of each client is to converge to the global model while remaining timely, i.e., working with fresh model information.
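A rough sketch of the client-edge-cloud flow, assuming edges report to the cloud one at a time and simple averaging at both tiers; this is illustrative, not the paper's exact update rule.

```python
import numpy as np

rng = np.random.default_rng(4)
DIM, EDGES, CLIENTS_PER_EDGE, ROUNDS = 4, 2, 3, 20

cloud = np.zeros(DIM)
edges = [np.zeros(DIM) for _ in range(EDGES)]

for r in range(ROUNDS):
    e = r % EDGES  # edges reach the cloud asynchronously, one at a time
    # The edge first aggregates fresh updates from its own clients.
    client_models = [edges[e] + 0.3 * (np.ones(DIM) - edges[e])
                     + rng.normal(0, 0.02, DIM)
                     for _ in range(CLIENTS_PER_EDGE)]
    edges[e] = np.mean(client_models, axis=0)
    # The cloud then mixes in this edge's locally aggregated model.
    cloud = 0.5 * cloud + 0.5 * edges[e]
    edges[e] = cloud.copy()  # the edge syncs back to the latest global model

print("cloud model:", cloud.round(2))
```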
arXiv Detail & Related papers (2023-06-21T17:39:16Z)
- Scalable Collaborative Learning via Representation Sharing [53.047460465980144]
Federated learning (FL) and Split Learning (SL) are two frameworks that enable collaborative learning while keeping the data private (on device).
In FL, each data holder trains a model locally and releases it to a central server for aggregation.
In SL, the clients must release individual cut-layer activations (smashed data) to the server and wait for its response (during both inference and back propagation).
In this work, we present a novel approach for privacy-preserving machine learning, where the clients collaborate via online knowledge distillation using a contrastive loss.
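The sketch below shows the kind of contrastive objective such representation sharing relies on: an InfoNCE-style loss in which a client's embedding of a sample should match a peer's embedding of the same sample. It is a generic illustration, not the paper's exact loss.

```python
import numpy as np

rng = np.random.default_rng(5)

def info_nce(local_emb, peer_emb, tau=0.1):
    # Each local embedding should match the peer's embedding of the same
    # sample (the positive, on the diagonal) over all other samples (negatives).
    a = local_emb / np.linalg.norm(local_emb, axis=1, keepdims=True)
    b = peer_emb / np.linalg.norm(peer_emb, axis=1, keepdims=True)
    logits = a @ b.T / tau
    logits -= logits.max(axis=1, keepdims=True)  # numerical stability
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return -np.mean(np.diag(log_probs))

peer = rng.normal(size=(8, 16))                # embeddings shared by a peer
aligned = peer + rng.normal(0, 0.01, (8, 16))  # a well-aligned local model
random_ = rng.normal(size=(8, 16))             # an unaligned local model
print(f"aligned: {info_nce(aligned, peer):.3f}  unaligned: {info_nce(random_, peer):.3f}")
```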
arXiv Detail & Related papers (2022-11-20T10:49:22Z)
- Optimizing Server-side Aggregation For Robust Federated Learning via Subspace Training [80.03567604524268]
Non-IID data distribution across clients and poisoning attacks are two main challenges in real-world federated learning systems.
We propose SmartFL, a generic approach that optimizes the server-side aggregation process.
We provide theoretical analyses of the convergence and generalization capacity for SmartFL.
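A toy version of optimized server-side aggregation: the server learns softmax-parameterized weights over client models by descending a proxy objective, here a quadratic loss standing in for a small server-side dataset. SmartFL's subspace-training details are not reproduced.

```python
import numpy as np

rng = np.random.default_rng(6)
DIM, CLIENTS = 5, 4

target = np.ones(DIM)  # stand-in for what a small server-side proxy dataset prefers

client_models = [target + rng.normal(0, 0.3, DIM) for _ in range(CLIENTS - 1)]
client_models.append(rng.normal(0, 1.0, DIM))  # one badly drifted / poisoned client
M = np.stack(client_models)                    # (CLIENTS, DIM)

theta = np.zeros(CLIENTS)  # aggregation weights, softmax-parameterized
for _ in range(300):
    w = np.exp(theta) / np.exp(theta).sum()
    agg = w @ M                # aggregated model, in the span of client models
    grad_agg = agg - target    # gradient of the proxy loss 0.5 * ||agg - target||^2
    grad_w = M @ grad_agg
    theta -= 0.5 * w * (grad_w - np.dot(w, grad_w))  # chain rule through softmax

print("learned aggregation weights:", (np.exp(theta) / np.exp(theta).sum()).round(2))
```

The drifted client ends up with a small weight: the server discovers, from the proxy objective alone, which updates help and which hurt.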
arXiv Detail & Related papers (2022-11-10T13:20:56Z)
- Latency Aware Semi-synchronous Client Selection and Model Aggregation for Wireless Federated Learning [0.6882042556551609]
Federated learning (FL) is a collaborative machine learning framework that requires different clients (e.g., Internet of Things devices) to participate in the machine learning model training process.
Traditional FL process may suffer from the straggler problem in heterogeneous client settings.
We propose LESSON, a latency-aware semi-synchronous client selection and model aggregation method that allows all clients to participate in the whole FL process, but with different frequencies.
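A minimal sketch of latency-tiered participation, assuming clients are clustered into tiers by round latency and tier t participates every t+1 rounds; the thresholds and schedule are illustrative assumptions.

```python
import numpy as np

latencies = np.array([0.2, 0.3, 1.1, 1.4, 4.0, 5.2])  # per-round latency (s)

# Cluster clients into tiers by latency; tier t participates every t+1 rounds,
# so slow clients still contribute without stalling the fast ones.
tiers = np.digitize(latencies, bins=[1.0, 3.0])  # 0 = fast, 1 = medium, 2 = slow

for rnd in range(1, 7):
    participants = [c for c, t in enumerate(tiers) if rnd % (t + 1) == 0]
    print(f"round {rnd}: clients {participants}")
```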
arXiv Detail & Related papers (2022-10-19T05:59:22Z)
- A Bayesian Federated Learning Framework with Online Laplace Approximation [144.7345013348257]
Federated learning allows multiple clients to collaboratively learn a globally shared model.
We propose a novel FL framework that uses online Laplace approximation to approximate posteriors on both the client and server side.
We achieve state-of-the-art results on several benchmarks, clearly demonstrating the advantages of the proposed method.
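If each client's posterior is approximated by a diagonal Gaussian, the natural server-side aggregate is the product of those Gaussians: precisions add and the mean is precision-weighted. The sketch below shows that rule; the paper's online variant is not reproduced.

```python
import numpy as np

rng = np.random.default_rng(7)
DIM, CLIENTS = 4, 3

# Each client reports a Gaussian (Laplace) approximation of its local posterior:
# a mean vector and a diagonal precision (inverse variance).
means = [rng.normal(1.0, 0.2, DIM) for _ in range(CLIENTS)]
precisions = [rng.uniform(0.5, 2.0, DIM) for _ in range(CLIENTS)]

# The product of Gaussians is Gaussian: precisions add, and the global mean
# is the precision-weighted average of the client means.
global_prec = np.sum(precisions, axis=0)
global_mean = np.sum([p * m for p, m in zip(precisions, means)], axis=0) / global_prec

print("global mean:", global_mean.round(3))
print("global variance:", (1.0 / global_prec).round(3))
```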
arXiv Detail & Related papers (2021-02-03T08:36:58Z)
- Timely Communication in Federated Learning [65.1253801733098]
We consider a global learning framework in which a parameter server (PS) trains a global model by using $n$ clients without actually storing the client data centrally at a cloud server.
Under the proposed scheme, at each iteration, the PS waits for $m$ available clients and sends them the current model.
The PS then aggregates the earliest $k$ of the $m$ returned local updates into the next global model. We find the average age of information experienced by each client and numerically characterize the age-optimal $m$ and $k$ values for a given $n$.
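A simplified simulation of the waiting rule: the PS picks $m$ clients, their age resets when they receive the current model, and a round lasts until the $k$-th earliest update arrives. The age accounting here is cruder than the paper's analysis and only illustrates the mechanics.

```python
import numpy as np

rng = np.random.default_rng(8)
N, M, K, ROUNDS = 10, 6, 3, 5000

age = np.zeros(N)  # time since each client last received a fresh global model
samples = []
for _ in range(ROUNDS):
    chosen = rng.choice(N, size=M, replace=False)  # PS waits for m available clients
    age[chosen] = 0.0                              # they receive the current model
    durations = np.sort(rng.exponential(1.0, M))   # local training times
    age += durations[K - 1]  # the round ends once the earliest k updates arrive
    samples.append(age.mean())

print(f"average client age with m={M}, k={K}: {np.mean(samples):.2f}")
```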
arXiv Detail & Related papers (2020-12-31T18:52:08Z)
- Flow-FL: Data-Driven Federated Learning for Spatio-Temporal Predictions in Multi-Robot Systems [16.887485428725043]
We show how the Federated Learning framework enables learning collectively from distributed data in connected robot teams.
This framework typically works with clients collecting data locally, updating neural network weights of their model, and sending updates to a server for aggregation into a global model.
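That workflow is the canonical FedAvg loop, sketched below with noisy local steps standing in for training on each robot's locally collected data.

```python
import numpy as np

rng = np.random.default_rng(9)
DIM, ROBOTS, ROUNDS = 4, 5, 15

global_model = np.zeros(DIM)
for _ in range(ROUNDS):
    local_models = []
    for _ in range(ROBOTS):
        # Each robot trains on its locally collected data; a noisy step toward
        # a shared optimum stands in for real gradient descent here.
        local = (global_model + 0.4 * (np.ones(DIM) - global_model)
                 + rng.normal(0, 0.05, DIM))
        local_models.append(local)
    # The server averages the robots' weights into the new global model.
    global_model = np.mean(local_models, axis=0)

print("global model:", global_model.round(2))
```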
arXiv Detail & Related papers (2020-10-16T19:09:57Z)