AFBS: Buffer Gradient Selection in Semi-asynchronous Federated Learning
- URL: http://arxiv.org/abs/2506.12754v2
- Date: Mon, 23 Jun 2025 05:27:00 GMT
- Title: AFBS: Buffer Gradient Selection in Semi-asynchronous Federated Learning
- Authors: Chaoyi Lu, Yiding Sun, Jinqian Chen, Zhichuan Yang, Jiangming Pan, Jihua Zhu
- Abstract summary: Asynchronous federated learning (AFL) accelerates training by eliminating the need to wait for stragglers. Existing solutions address this issue with gradient buffers, forming a semi-asynchronous framework. We propose AFBS (Asynchronous FL Buffer Selection), the first algorithm to perform gradient selection within buffers while ensuring privacy protection.
- Score: 11.478349728899705
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Asynchronous federated learning (AFL) accelerates training by eliminating the need to wait for stragglers, but its asynchronous nature introduces gradient staleness, where outdated gradients degrade performance. Existing solutions address this issue with gradient buffers, forming a semi-asynchronous framework. However, this approach struggles when buffers accumulate numerous stale gradients, as blindly aggregating all of them can harm training. To address this, we propose AFBS (Asynchronous FL Buffer Selection), the first algorithm to perform gradient selection within buffers while ensuring privacy protection. Specifically, each client sends a label distribution matrix encrypted via random projection before training, and the server clusters clients based on it. During training, the server scores and selects gradients within each cluster according to their informational value, discarding low-value gradients to enhance semi-asynchronous federated learning. Extensive experiments in highly heterogeneous system and data environments demonstrate AFBS's superior performance compared to state-of-the-art methods. Notably, on the most challenging task, CIFAR-100, AFBS improves accuracy by up to 4.8% over the previous best algorithm and reduces the time to reach target accuracy by 75%.
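Below is a minimal sketch of the pipeline the abstract describes: random-projection encoding of client label distributions, server-side clustering, and per-cluster gradient selection inside the buffer. The helper names (project_labels, select_gradients), the shared projection seed, and the staleness-discounted norm used as the score are assumptions for illustration; the paper's actual scoring rule and privacy mechanism are not reproduced here.

```python
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)

def project_labels(label_dist, proj_dim=8, seed=42):
    """Encode a client's label-distribution vector with a random projection.
    (Hypothetical helper; the shared seed stands in for an agreed projection matrix.)"""
    d = label_dist.shape[0]
    proj = np.random.default_rng(seed).normal(size=(d, proj_dim)) / np.sqrt(proj_dim)
    return label_dist @ proj

def cluster_clients(projected, n_clusters=3):
    """Server-side clustering on the projected label distributions."""
    return KMeans(n_clusters=n_clusters, n_init=10, random_state=0).fit_predict(projected)

def select_gradients(buffer, clusters, keep_ratio=0.5):
    """Within each cluster, keep only the highest-scoring buffered gradients.
    The score here is a staleness-discounted gradient norm (an assumption)."""
    selected = []
    for c in np.unique(clusters[[cid for cid, _, _ in buffer]]):
        entries = [(cid, g, s) for cid, g, s in buffer if clusters[cid] == c]
        scores = [np.linalg.norm(g) / (1.0 + s) for _, g, s in entries]
        k = max(1, int(len(entries) * keep_ratio))
        top = np.argsort(scores)[-k:]
        selected.extend(entries[i] for i in top)
    return selected

# Toy run: 10 clients, 10 classes, buffer entries are (client_id, gradient, staleness).
label_dists = rng.dirichlet(np.ones(10) * 0.3, size=10)
clusters = cluster_clients(np.stack([project_labels(ld) for ld in label_dists]))
buffer = [(cid, rng.normal(size=100), int(rng.integers(0, 5))) for cid in range(10)]
kept = select_gradients(buffer, clusters)
update = np.mean([g for _, g, _ in kept], axis=0)   # aggregate only the kept gradients
print(f"kept {len(kept)} of {len(buffer)} buffered gradients")
```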
Related papers
- Adaptive Deadline and Batch Layered Synchronized Federated Learning [66.93447103966439]
Federated learning (FL) enables collaborative model training across distributed edge devices while preserving data privacy, and typically operates in a round-based synchronous manner. We propose ADEL-FL, a novel framework that jointly optimizes per-round deadlines and user-specific batch sizes for layer-wise aggregation.
arXiv Detail & Related papers (2025-05-29T19:59:18Z)
- Buffer-based Gradient Projection for Continual Federated Learning [16.879024856283323]
Fed-A-GEM mitigates catastrophic forgetting by leveraging local buffer samples and aggregated buffer gradients.
Our experiments on standard benchmarks show consistent performance improvements across diverse scenarios.
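A small sketch of the A-GEM-style projection that buffer-based approaches like Fed-A-GEM build on: when the new gradient conflicts with the reference gradient aggregated from buffer samples, the conflicting component is removed. The function name and toy data are illustrative, not the paper's implementation.

```python
import numpy as np

def project_gradient(g, g_ref, eps=1e-12):
    """A-GEM-style projection: if g conflicts with the buffer gradient g_ref
    (negative inner product), project out the conflicting component."""
    dot = g @ g_ref
    if dot < 0:
        g = g - (dot / (g_ref @ g_ref + eps)) * g_ref
    return g

rng = np.random.default_rng(1)
g_new = rng.normal(size=1000)   # gradient on the current task/batch
g_buf = rng.normal(size=1000)   # aggregated gradient from buffer samples
g_used = project_gradient(g_new, g_buf)
print("conflict removed:", g_used @ g_buf >= -1e-9)
```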
arXiv Detail & Related papers (2024-09-03T03:50:19Z)
- Asynchronous Federated Stochastic Optimization for Heterogeneous Objectives Under Arbitrary Delays [0.0]
Federated learning (FL) was recently proposed to securely train models with data held over multiple locations ("clients").
Two major challenges hindering the performance of FL algorithms are long training times caused by straggling clients, and a decline in model accuracy under non-iid local data distributions ("client drift").
We propose and analyze Asynchronous Exact Averaging (AREA), a new (sub)gradient algorithm that utilizes communication to speed up convergence and enhance scalability, and employs client memory to correct the client drift caused by variations in client update frequencies.
arXiv Detail & Related papers (2024-05-16T14:22:49Z)
- Provably Personalized and Robust Federated Learning [47.50663360022456]
We propose simple algorithms which identify clusters of similar clients and train a personalized model per cluster.
The convergence rates of our algorithms asymptotically match those obtained if we knew the true underlying clustering of the clients, and they are provably robust in the Byzantine setting.
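A toy illustration of the clustering idea: group clients by the cosine similarity of their model updates and aggregate separately per cluster. The greedy threshold-based grouping is an assumption for illustration, not the paper's actual criterion or convergence-optimal procedure.

```python
import numpy as np

def cluster_by_updates(updates, threshold=0.3):
    """Greedy clustering of client updates by cosine similarity (illustrative)."""
    normed = [u / (np.linalg.norm(u) + 1e-12) for u in updates]
    clusters = []                      # each cluster is a list of client indices
    for i, u in enumerate(normed):
        for c in clusters:
            if normed[c[0]] @ u > threshold:
                c.append(i)
                break
        else:
            clusters.append([i])
    return clusters

rng = np.random.default_rng(2)
base_a, base_b = rng.normal(size=50), rng.normal(size=50)
updates = [base_a + 0.1 * rng.normal(size=50) for _ in range(5)] + \
          [base_b + 0.1 * rng.normal(size=50) for _ in range(5)]
clusters = cluster_by_updates(updates)
# one personalized model per cluster: average only the updates inside each cluster
per_cluster_models = [np.mean([updates[i] for i in c], axis=0) for c in clusters]
print([len(c) for c in clusters])
```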
arXiv Detail & Related papers (2023-06-14T09:37:39Z)
- Unbounded Gradients in Federated Learning with Buffered Asynchronous Aggregation [0.6526824510982799]
The FedBuff algorithm allows asynchronous updates while preserving privacy via secure aggregation.
This paper presents a theoretical analysis of the convergence rate of this algorithm when heterogeneity in data, batch size, and delay are considered.
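A compact sketch of the buffered asynchronous pattern analyzed here: the server applies an update only once K client contributions have accumulated in the buffer. The staleness-based down-weighting shown is a common choice rather than the exact rule in the paper, and secure aggregation is omitted.

```python
import numpy as np

class BufferedServer:
    """Aggregate asynchronously arriving updates once the buffer holds K of them."""
    def __init__(self, model, buffer_size=4, lr=1.0):
        self.model, self.K, self.lr = model, buffer_size, lr
        self.buffer, self.version = [], 0

    def receive(self, delta, client_version):
        staleness = self.version - client_version
        weight = 1.0 / np.sqrt(1.0 + staleness)     # down-weight stale updates (assumption)
        self.buffer.append(weight * delta)
        if len(self.buffer) >= self.K:
            self.model += self.lr * np.mean(self.buffer, axis=0)
            self.buffer.clear()
            self.version += 1

server = BufferedServer(model=np.zeros(10))
rng = np.random.default_rng(3)
for t in range(12):                                  # 12 updates arriving with random staleness
    server.receive(rng.normal(size=10),
                   client_version=max(0, server.version - int(rng.integers(0, 3))))
print("server version:", server.version)
```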
arXiv Detail & Related papers (2022-10-03T18:20:48Z)
- Semi-Synchronous Personalized Federated Learning over Mobile Edge Networks [88.50555581186799]
We propose a semi-synchronous PFL algorithm, termed Semi-Synchronous Personalized Federated Averaging (PerFedS2), over mobile edge networks.
We derive an upper bound of the convergence rate of PerFedS2 in terms of the number of participants per global round and the number of rounds.
Experimental results verify the effectiveness of PerFedS2 in saving training time as well as guaranteeing the convergence of training loss.
arXiv Detail & Related papers (2022-09-27T02:12:43Z)
- TCT: Convexifying Federated Learning using Bootstrapped Neural Tangent Kernels [141.29156234353133]
State-of-the-art federated learning methods can perform far worse than their centralized counterparts when clients have dissimilar data distributions.
We show this disparity can largely be attributed to challenges presented by nonconvexity.
We propose a Train-Convexify-Train (TCT) procedure to sidestep this issue.
arXiv Detail & Related papers (2022-07-13T16:58:22Z)
- GBA: A Tuning-free Approach to Switch between Synchronous and Asynchronous Training for Recommendation Model [19.65557684234458]
We propose Global Batch gradients Aggregation (GBA) over a parameter server (PS).
A token-control process is implemented to assemble the gradients and decay the gradients with severe staleness.
Experiments on three industrial-scale recommendation tasks show that GBA is an effective tuning-free approach for switching.
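An illustrative take on the staleness decay in the token-control step described above: each buffered gradient is scaled down exponentially with its staleness before the global batch is assembled. The decay factor and the bookkeeping are assumptions, not GBA's published constants.

```python
import numpy as np

def assemble_global_batch(grads_with_staleness, decay=0.7):
    """Scale each gradient by decay**staleness, then average into one global-batch gradient."""
    weighted, weights = [], []
    for g, staleness in grads_with_staleness:
        w = decay ** staleness            # severe staleness -> near-zero contribution
        weighted.append(w * g)
        weights.append(w)
    return np.sum(weighted, axis=0) / (np.sum(weights) + 1e-12)

rng = np.random.default_rng(4)
grads = [(rng.normal(size=20), s) for s in [0, 0, 1, 3, 7]]   # staleness measured in steps
global_grad = assemble_global_batch(grads)
print(global_grad.shape)
```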
arXiv Detail & Related papers (2022-05-23T05:22:42Z)
- Killing Two Birds with One Stone: Efficient and Robust Training of Face Recognition CNNs by Partial FC [66.71660672526349]
We propose a sparsely updating variant of the Fully Connected (FC) layer, named Partial FC (PFC).
In each iteration, positive class centers and a random subset of negative class centers are selected to compute the margin-based softmax loss.
The computing requirement, the probability of inter-class conflict, and the frequency of passive update on tail class centers, are dramatically reduced.
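A simplified sketch of the sampling idea: keep the positive class centers for the batch plus a random subset of the negative centers, and compute logits only over that subset. The margin term and the distributed partitioning of centers in the real Partial FC are left out, and all names here are illustrative.

```python
import numpy as np

def partial_fc_logits(features, labels, centers, sample_ratio=0.1, rng=None):
    """Select positive centers plus a random subset of negatives; return logits over them."""
    rng = rng or np.random.default_rng(0)
    n_classes = centers.shape[0]
    positives = np.unique(labels)
    negatives = np.setdiff1d(np.arange(n_classes), positives)
    n_neg = int(sample_ratio * n_classes)
    sampled = np.concatenate([positives, rng.choice(negatives, size=n_neg, replace=False)])
    remap = {c: i for i, c in enumerate(sampled)}     # remap labels into the sampled columns
    new_labels = np.array([remap[l] for l in labels])
    logits = features @ centers[sampled].T            # only the sampled centers are touched
    return logits, new_labels

rng = np.random.default_rng(5)
feats = rng.normal(size=(32, 128))                    # a batch of 32 embeddings
cents = rng.normal(size=(10_000, 128))                # 10k identity centers
labels = rng.integers(0, 10_000, size=32)
logits, new_labels = partial_fc_logits(feats, labels, cents, rng=rng)
print(logits.shape)                                   # (32, |positives| + sampled negatives)
```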
arXiv Detail & Related papers (2022-03-28T14:33:21Z)
- Acceleration of Federated Learning with Alleviated Forgetting in Local Training [61.231021417674235]
Federated learning (FL) enables distributed optimization of machine learning models while protecting privacy.
We propose FedReg, an algorithm to accelerate FL with alleviated knowledge forgetting in the local training stage.
Our experiments demonstrate that FedReg significantly improves the convergence rate of FL, especially when the neural network architecture is deep.
arXiv Detail & Related papers (2022-03-05T02:31:32Z)
- Speeding up Heterogeneous Federated Learning with Sequentially Trained Superclients [19.496278017418113]
Federated Learning (FL) allows training machine learning models in privacy-constrained scenarios by enabling the cooperation of edge devices without requiring local data sharing.
This approach raises several challenges due to the different statistical distribution of the local datasets and the clients' computational heterogeneity.
We propose FedSeq, a novel framework leveraging the sequential training of subgroups of heterogeneous clients, i.e. superclients, to emulate the centralized paradigm in a privacy-compliant way.
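A rough sketch of the superclient idea: clients are grouped, each group trains sequentially (each client continues from the previous one's model), and the server averages the resulting superclient models. The grouping and the local step are trivialized here, using a least-squares toy objective purely for illustration.

```python
import numpy as np

rng = np.random.default_rng(6)
true_w = rng.normal(size=5)

def make_client():
    """Toy client: a small linear-regression dataset (illustrative stand-in for local data)."""
    X = rng.normal(size=(20, 5))
    return X, X @ true_w + 0.1 * rng.normal(size=20)

def local_step(model, client_data, lr=0.1):
    """One illustrative local update: a gradient step on a least-squares objective."""
    X, y = client_data
    grad = X.T @ (X @ model - y) / len(y)
    return model - lr * grad

def train_superclient(model, clients, lr=0.1):
    """Sequential training inside a superclient: each client starts from the previous model."""
    for data in clients:
        model = local_step(model, data, lr)
    return model

superclients = [[make_client() for _ in range(3)] for _ in range(4)]   # 4 groups of 3 clients
global_model = np.zeros(5)
for rnd in range(50):
    # each superclient trains sequentially from the current global model; server averages
    models = [train_superclient(global_model.copy(), sc) for sc in superclients]
    global_model = np.mean(models, axis=0)
print("error:", np.linalg.norm(global_model - true_w))
```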
arXiv Detail & Related papers (2022-01-26T12:33:23Z)
- Sparse Communication for Training Deep Networks [56.441077560085475]
Synchronous stochastic gradient descent (SGD) is the most common method used for distributed training of deep learning models.
In this algorithm, each worker shares its local gradients with others and updates the parameters using the average gradients of all workers.
We study several compression schemes and identify how three key parameters affect the performance.
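A brief sketch of one compression scheme such studies typically consider: top-k sparsification, where each worker communicates only its largest-magnitude gradient entries and the parameters are updated with the average of the sparse gradients. The error-feedback residual shown is a common companion technique, included here as an assumption.

```python
import numpy as np

def top_k_sparsify(grad, k):
    """Keep only the k largest-magnitude entries of the gradient; zero out the rest."""
    sparse = np.zeros_like(grad)
    idx = np.argpartition(np.abs(grad), -k)[-k:]
    sparse[idx] = grad[idx]
    return sparse

rng = np.random.default_rng(7)
n_workers, dim, k = 4, 1000, 50
residuals = [np.zeros(dim) for _ in range(n_workers)]   # error-feedback memory per worker
params = np.zeros(dim)

for step in range(10):
    shared = []
    for w in range(n_workers):
        local_grad = rng.normal(size=dim) + params        # stand-in for a real gradient
        corrected = local_grad + residuals[w]             # add back what was not sent before
        sparse = top_k_sparsify(corrected, k)
        residuals[w] = corrected - sparse                  # remember the dropped part
        shared.append(sparse)                              # only k values are communicated
    params -= 0.1 * np.mean(shared, axis=0)                # update with the average sparse gradient
print("nonzero entries sent per worker per step:", k)
```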
arXiv Detail & Related papers (2020-09-19T17:28:11Z)