Delayed Momentum Aggregation: Communication-efficient Byzantine-robust Federated Learning with Partial Participation
- URL: http://arxiv.org/abs/2509.02970v1
- Date: Wed, 03 Sep 2025 03:14:58 GMT
- Title: Delayed Momentum Aggregation: Communication-efficient Byzantine-robust Federated Learning with Partial Participation
- Authors: Kaoru Otsuka, Yuki Takezawa, Makoto Yamada
- Abstract summary: Federated Learning (FL) allows distributed model training across multiple clients while preserving data privacy. Current Byzantine-robust FL methods assume full client participation, which is unrealistic due to communication constraints and client availability. We introduce delayed momentum aggregation, a novel principle where the server aggregates the most recently received momentum from non-participating clients. Experiments on deep learning tasks validated our theoretical findings, showing stable and robust training under various Byzantine attacks.
- Score: 18.755022483546323
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Federated Learning (FL) allows distributed model training across multiple clients while preserving data privacy, but it remains vulnerable to Byzantine clients that exhibit malicious behavior. While existing Byzantine-robust FL methods provide strong convergence guarantees (e.g., to a stationary point in expectation) under Byzantine attacks, they typically assume full client participation, which is unrealistic due to communication constraints and client availability. Under partial participation, existing methods fail immediately after the sampled clients contain a Byzantine majority, creating a fundamental challenge for sparse communication. First, we introduce delayed momentum aggregation, a novel principle where the server aggregates the most recently received gradients from non-participating clients alongside fresh momentum from active clients. Our optimizer D-Byz-SGDM (Delayed Byzantine-robust SGD with Momentum) implements this delayed momentum aggregation principle for Byzantine-robust FL with partial participation. Then, we establish convergence guarantees that recover previous full participation results and match the fundamental lower bounds we prove for the partial participation setting. Experiments on deep learning tasks validated our theoretical findings, showing stable and robust training under various Byzantine attacks.
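The delayed momentum aggregation principle described in the abstract can be illustrated with a toy sketch. This is not the authors' D-Byz-SGDM; it only shows the idea of caching each client's last message and aggregating over all clients every round. Everything below is an assumption made for illustration: the class name `DelayedMomentumServer`, the coordinate-wise median as the robust aggregator, the quadratic objective, the fixed attack vector, and all hyperparameters.

```python
import random
from statistics import median

TARGET = [1.0, -2.0]  # minimizer of the toy objective 0.5 * ||x - TARGET||^2


def coordinate_median(vectors):
    """Coordinate-wise median, used here as one simple robust aggregator."""
    return [median(coords) for coords in zip(*vectors)]


class DelayedMomentumServer:
    """Hypothetical server that keeps the last momentum received per client."""

    def __init__(self, num_clients, dim, lr=0.1):
        self.lr = lr
        self.params = [0.0] * dim
        # Cached momentum per client; zeros until a client first participates.
        self.cache = [[0.0] * dim for _ in range(num_clients)]

    def step(self, fresh_momenta):
        # Overwrite the cache entries of this round's participants only.
        for cid, momentum in fresh_momenta.items():
            self.cache[cid] = list(momentum)
        # Aggregate over ALL clients: fresh entries plus stale (delayed) ones.
        direction = coordinate_median(self.cache)
        self.params = [p - self.lr * d for p, d in zip(self.params, direction)]
        return self.params


def run_demo(rounds=500, num_clients=10, num_byzantine=3, sample_size=4,
             beta=0.9, seed=0):
    rng = random.Random(seed)
    server = DelayedMomentumServer(num_clients, dim=len(TARGET))
    # Local momentum state kept by each client (assumed client-side momentum).
    local = [[0.0] * len(TARGET) for _ in range(num_clients)]
    for _ in range(rounds):
        sampled = rng.sample(range(num_clients), sample_size)
        messages = {}
        for cid in sampled:
            if cid < num_byzantine:
                # Byzantine client: sends an arbitrary large vector.
                messages[cid] = [100.0] * len(TARGET)
            else:
                # Honest client: exact gradient of the toy quadratic.
                grad = [p - t for p, t in zip(server.params, TARGET)]
                local[cid] = [beta * m + (1 - beta) * g
                              for m, g in zip(local[cid], grad)]
                messages[cid] = local[cid]
        server.step(messages)
    return server.params


if __name__ == "__main__":
    print(run_demo())
```

Because the aggregator always sees one cached vector per client, a round in which the sampled subset happens to contain mostly Byzantine clients still aggregates over an honest majority of cached entries, which is the failure case the abstract attributes to existing methods under partial participation.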
Related papers
- First Provable Guarantees for Practical Private FL: Beyond Restrictive Assumptions [52.82254388526969]
Fed-$\alpha$-NormEC is the first differentially private FL framework providing provable convergence and DP guarantees under standard assumptions.
Fed-$\alpha$-NormEC integrates local updates, separate server and client stepsizes, and, crucially, partial client participation.
arXiv Detail & Related papers (2025-12-25T06:05:15Z)
- MURIM: Multidimensional Reputation-based Incentive Mechanism for Federated Learning [3.8054072718666574]
Federated Learning (FL) has emerged as a leading privacy-preserving machine learning paradigm.
FL continues to face key challenges, including weak client incentives, privacy risks, and resource constraints.
We propose MURIM, a Reputation-based Incentive Mechanism that jointly considers client reliability, privacy, resource capacity, and fairness.
arXiv Detail & Related papers (2025-12-15T23:18:32Z)
- Stragglers Can Contribute More: Uncertainty-Aware Distillation for Asynchronous Federated Learning [61.249748418757946]
Asynchronous federated learning (FL) has recently gained attention for its enhanced efficiency and scalability.
We propose FedEcho, a novel framework that incorporates uncertainty-aware distillation to enhance asynchronous FL performance.
We demonstrate that FedEcho consistently outperforms existing asynchronous federated learning baselines.
arXiv Detail & Related papers (2025-11-25T06:25:25Z)
- FedGreed: A Byzantine-Robust Loss-Based Aggregation Method for Federated Learning [1.3853653640712935]
Federated Learning (FL) enables collaborative model training across multiple clients while preserving data privacy by keeping local datasets on-device.
In this work, we address FL settings where clients may behave adversarially, exhibiting Byzantine attacks, while the central server is trusted and equipped with a reference dataset.
We propose FedGreed, a resilient aggregation strategy for federated learning that does not require any assumptions about the fraction of adversarial participants.
arXiv Detail & Related papers (2025-08-25T14:20:19Z)
- Byzantine-Robust Federated Learning Using Generative Adversarial Networks [1.4091801425319963]
Federated learning (FL) enables collaborative model training across distributed clients without sharing raw data, but its robustness is threatened by Byzantine behaviors such as data and model poisoning.
We present a defense framework that addresses these challenges by leveraging a conditional generative adversarial network (cGAN) at the server to synthesize representative data for validating client updates.
This approach eliminates reliance on external datasets, adapts to diverse attack strategies, and integrates seamlessly into standard FL.
arXiv Detail & Related papers (2025-03-26T18:00:56Z)
- Rethinking Byzantine Robustness in Federated Recommendation from Sparse Aggregation Perspective [65.65471972217814]
Federated recommendation (FR) based on federated learning (FL) emerges, keeping the personal data on the local client and updating a model collaboratively.
FR has a unique sparse aggregation mechanism, where the embedding of each item is updated by only partial clients, instead of full clients in a dense aggregation of general FL.
In this paper, we reformulate the Byzantine robustness under sparse aggregation by defining the aggregation for a single item as the smallest execution unit.
We propose a family of effective attack strategies, named Spattack, which exploit the vulnerability in sparse aggregation and are categorized along the adversary's knowledge and capability.
arXiv Detail & Related papers (2025-01-06T15:19:26Z)
- Relaxed Contrastive Learning for Federated Learning [48.96253206661268]
We propose a novel contrastive learning framework to address the challenges of data heterogeneity in federated learning.
Our framework outperforms all existing federated learning approaches by huge margins on the standard benchmarks.
arXiv Detail & Related papers (2024-01-10T04:55:24Z)
- Byzantine Robustness and Partial Participation Can Be Achieved at Once: Just Clip Gradient Differences [61.74021364776313]
Distributed learning has emerged as a leading paradigm for training large machine learning models.
In real-world scenarios, participants may be unreliable or malicious, posing a significant challenge to the integrity and accuracy of the trained models.
We propose the first distributed method with client sampling and provable tolerance to Byzantine workers.
arXiv Detail & Related papers (2023-11-23T17:50:30Z)
- Robust Federated Learning via Over-The-Air Computation [48.47690125123958]
Simple averaging of model updates via over-the-air computation makes the learning task vulnerable to random or intended modifications of the local model updates of some malicious clients.
We propose a transmission and aggregation framework that is robust to such attacks while preserving the benefits of over-the-air computation for federated learning.
arXiv Detail & Related papers (2021-11-01T19:21:21Z)
- Learning from History for Byzantine Robust Optimization [52.68913869776858]
Byzantine robustness has received significant attention recently given its importance for distributed learning.
We show that most existing robust aggregation rules may not converge even in the absence of any Byzantine attackers.
arXiv Detail & Related papers (2020-12-18T16:22:32Z)
This list is automatically generated from the titles and abstracts of the papers on this site.