Trade-off in Estimating the Number of Byzantine Clients in Federated Learning
- URL: http://arxiv.org/abs/2510.04432v1
- Date: Mon, 06 Oct 2025 02:01:56 GMT
- Title: Trade-off in Estimating the Number of Byzantine Clients in Federated Learning
- Authors: Ziyi Chen, Su Zhang, Heng Huang
- Abstract summary: Federated learning is vulnerable to Byzantine clients that can send arbitrary erroneous signals. We show that underestimation ($\hat{f}<f$) can lead to arbitrarily poor performance for both aggregators and federated learning. All these optimal bounds are proportional to $\hat{f}/(n-f-\hat{f})$ with $n$ clients, which monotonically increases with larger $\hat{f}$.
- Score: 51.015234193544636
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Federated learning has attracted increasing attention in recent large-scale optimization and machine learning research and applications, but it is vulnerable to Byzantine clients that can send arbitrary erroneous signals. Robust aggregators are commonly used to resist Byzantine clients. This usually requires estimating the unknown number $f$ of Byzantine clients, and accordingly selecting an aggregator with the proper degree of robustness (i.e., the maximum number $\hat{f}$ of Byzantine clients the aggregator tolerates). Such an estimate has an important effect on performance, which to our knowledge has not been systematically studied. This work fills the gap by theoretically analyzing the worst-case error of aggregators, as well as of the induced federated learning algorithms, for all cases of $\hat{f}$ and $f$. Specifically, we show that underestimation ($\hat{f}<f$) can lead to arbitrarily poor performance for both aggregators and federated learning. For non-underestimation ($\hat{f}\ge f$), we prove optimal lower and upper bounds of the same order on the errors of both aggregators and federated learning. All these optimal bounds are proportional to $\hat{f}/(n-f-\hat{f})$ with $n$ clients, and monotonically increase with larger $\hat{f}$. This indicates a fundamental trade-off: while an aggregator with a larger robustness degree $\hat{f}$ can solve federated learning problems over the wider range $f\in [0,\hat{f}]$, its performance can deteriorate when there are actually fewer or even no Byzantine clients (i.e., $f\in [0,\hat{f})$).
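To make the trade-off concrete, below is a minimal sketch (not the paper's specific construction) of a standard coordinate-wise trimmed-mean aggregator with robustness degree $\hat{f}$, plus a helper that evaluates the bound factor $\hat{f}/(n-f-\hat{f})$ from the abstract; the function names and NumPy implementation are illustrative assumptions.

```python
import numpy as np

def trimmed_mean(updates, f_hat):
    """Coordinate-wise trimmed mean with robustness degree f_hat:
    drop the f_hat largest and f_hat smallest values in each coordinate,
    then average the remaining n - 2*f_hat values."""
    x = np.sort(np.stack(updates), axis=0)  # shape (n, d), sorted per coordinate
    n = x.shape[0]
    assert n > 2 * f_hat, "need n > 2*f_hat clients to trim safely"
    return x[f_hat:n - f_hat].mean(axis=0)

def tradeoff_factor(n, f, f_hat):
    """Order of the worst-case error bound from the abstract: f_hat/(n-f-f_hat).
    Larger f_hat tolerates more Byzantine clients but inflates this factor."""
    assert f <= f_hat and n - f - f_hat > 0  # non-underestimation regime
    return f_hat / (n - f - f_hat)
```

For example, with $n=100$ clients and no actual attackers ($f=0$), raising $\hat{f}$ from 10 to 40 grows the factor from $10/90\approx 0.11$ to $40/60\approx 0.67$, illustrating why over-provisioned robustness costs accuracy.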
Related papers
- Proving the Limited Scalability of Centralized Distributed Optimization via a New Lower Bound Construction [57.93371273485736]
We consider a centralized distributed learning setup where all workers, using unbiased stochastic gradients, jointly find an $\epsilon$-stationary point. We prove a new lower bound showing that the $L\Delta/\epsilon^2$ complexity cannot be improved by more than poly-logarithmic factors in $n$, even in the homogeneous (i.i.d.) case, where all workers access the same distribution.
arXiv Detail & Related papers (2025-06-30T13:27:39Z) - Byzantine Failures Harm the Generalization of Robust Distributed Learning Algorithms More Than Data Poisoning [19.624245500772027]
A robust distributed learning algorithm aims to maintain reliable performance despite the presence of misbehaving workers. Such misbehaviors are commonly modeled as *Byzantine failures*, allowing arbitrarily corrupted communication, or as *data poisoning*, a weaker form of corruption restricted to local training data. We show, for the first time, a fundamental gap in generalization guarantees between the two threat models: Byzantine failures yield strictly worse rates than those achievable under data poisoning.
arXiv Detail & Related papers (2025-06-22T12:59:15Z) - Centroid Approximation for Byzantine-Tolerant Federated Learning [11.477670199123335]
Federated learning allows each client to keep its data locally when training machine learning models in a distributed setting. We show that the various validity conditions alone do not guarantee a good approximation of the average. We present a new algorithm that achieves a $\sqrt{2d}$-approximation under convex validity.
arXiv Detail & Related papers (2025-06-18T08:40:49Z) - Beyond likelihood ratio bias: Nested multi-time-scale stochastic approximation for likelihood-free parameter estimation [49.78792404811239]
We study inference in simulation-based models where the analytical form of the likelihood is unknown. We use a ratio-free nested multi-time-scale stochastic approximation (SA) method that simultaneously tracks the score and drives the parameter update. We show that our algorithm can eliminate the original $O\big(\sqrt{1/N}\big)$ bias and accelerate the $O\big(\beta_k+\sqrt{\alpha_k/N}\big)$ convergence rate.
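The multi-time-scale idea can be illustrated with a generic two-time-scale stochastic approximation loop; this is an assumed toy, not the authors' ratio-free estimator. A fast iterate tracks an unknown statistic while a slow iterate consumes the tracked value.

```python
import numpy as np

rng = np.random.default_rng(1)
theta, w = 0.0, 0.0  # slow parameter iterate, fast tracking iterate
for k in range(1, 10_000):
    x = rng.normal(loc=3.0, scale=1.0)  # one simulator draw
    alpha_k = 1.0 / k ** 0.6            # fast step size
    beta_k = 1.0 / k                    # slow step size (beta_k << alpha_k)
    w += alpha_k * (x - w)              # fast scale: track the unknown mean
    theta += beta_k * (w - theta)       # slow scale: update uses the tracker
print(theta)  # converges near 3.0
```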
arXiv Detail & Related papers (2024-11-20T02:46:15Z) - Certifiably Robust Model Evaluation in Federated Learning under Meta-Distributional Shifts [8.700087812420687]
We provide guarantees for the model's performance on a different, unseen network "B". We show how the principled vanilla DKW bound enables certification of the model's true performance on unseen clients within the same (source) network.
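The Dvoretzky-Kiefer-Wolfowitz (DKW) inequality referenced here is standard: with probability at least $1-\delta$, the empirical CDF of $n$ i.i.d. samples deviates from the true CDF by at most $\sqrt{\ln(2/\delta)/(2n)}$ uniformly. A minimal helper (illustrative, not the paper's certification pipeline):

```python
import math

def dkw_half_width(n: int, delta: float) -> float:
    """Uniform CDF confidence half-width: with prob >= 1 - delta,
    sup_x |F_n(x) - F(x)| <= sqrt(ln(2/delta) / (2n))."""
    return math.sqrt(math.log(2.0 / delta) / (2.0 * n))

# e.g. n = 2000 client-level scores at 95% confidence:
# dkw_half_width(2000, 0.05) ≈ 0.030
```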
arXiv Detail & Related papers (2024-10-26T18:45:15Z) - Efficient Federated Learning against Byzantine Attacks and Data Heterogeneity via Aggregating Normalized Gradients [27.433334322019675]
Federated Learning (FL) enables clients to collaboratively train models without sharing raw data. FL is vulnerable to Byzantine attacks and data heterogeneity, which can severely degrade performance. We propose an effective Federated Normalized Gradients Algorithm (NGA). Experimental results on benchmark datasets demonstrate improved convergence over existing methods.
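One natural reading of "aggregating normalized gradients" is sketched below; the authors' NGA may differ in its details, so treat this as an assumed form. Normalizing each client gradient to unit length caps the influence any single (possibly Byzantine) client can exert on the update.

```python
import numpy as np

def normalized_gradient_step(model, client_grads, lr, eps=1e-12):
    """Average unit-norm client gradients g_i / ||g_i||, so a Byzantine
    client cannot inflate the aggregate by sending a huge vector."""
    unit = [g / (np.linalg.norm(g) + eps) for g in client_grads]
    return model - lr * np.mean(unit, axis=0)
```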
arXiv Detail & Related papers (2024-08-18T16:50:39Z) - Byzantine-Robust Federated Learning: Impact of Client Subsampling and Local Updates [11.616782769625003]
The presence of adversarial (a.k.a. *Byzantine*) clients makes federated learning (FL) prone to arbitrary manipulation.
We show that the rate of improvement in learning accuracy *diminishes* with respect to the number of subsampled clients.
We also observe that under a careful choice of step size, the learning error due to Byzantine clients decreases with the number of local steps.
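A schematic round with client subsampling and local updates might look like the following hedged sketch; `grad` (a per-client stochastic gradient oracle) and the reuse of the `trimmed_mean` sketch above are illustrative assumptions, not the paper's algorithm.

```python
import numpy as np

rng = np.random.default_rng(0)

def fl_round(model, client_data, k_sub, local_steps, lr, f_hat, grad):
    """One round: subsample k_sub clients, run several local SGD steps on
    each, then robustly aggregate the model deltas at the server."""
    chosen = rng.choice(len(client_data), size=k_sub, replace=False)
    deltas = []
    for c in chosen:
        x = model.copy()
        for _ in range(local_steps):
            x -= lr * grad(x, client_data[c])  # local SGD step on client data
        deltas.append(x - model)
    return model + trimmed_mean(deltas, f_hat)  # robust aggregation of deltas
```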
arXiv Detail & Related papers (2024-02-20T07:40:11Z) - There is no Accuracy-Interpretability Tradeoff in Reinforcement Learning for Mazes [64.05903267230467]
Interpretability is an essential building block for trustworthiness in reinforcement learning systems.
We show that in certain cases, one can achieve policy interpretability while maintaining its optimality.
arXiv Detail & Related papers (2022-06-09T04:23:26Z) - Federated Optimization of Smooth Loss Functions [35.944772011421264]
In this work, we study empirical risk minimization (ERM) within a federated learning framework.
We propose the Federated Low Rank Gradient Descent (FedLRGD) algorithm.
Our method solves the ERM problem at the server using inexact gradient descent.
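The phrase "inexact gradient descent" admits a simple generic template, shown below under the assumption of an approximate-gradient oracle (e.g., one reconstructed at the server from low-rank structure in sampled client gradients); this is a sketch of the general idea, not the FedLRGD algorithm itself.

```python
def inexact_gd(x0, approx_grad, lr, iters):
    """Gradient descent driven by an approximate oracle: approx_grad(x)
    returns grad f(x) plus a bounded approximation error."""
    x = x0
    for _ in range(iters):
        x = x - lr * approx_grad(x)
    return x
```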
arXiv Detail & Related papers (2022-01-06T07:40:51Z) - Linear Speedup in Personalized Collaborative Learning [69.45124829480106]
Personalization in federated learning can improve the accuracy of a model for a user by trading off the model's bias against its variance.
We formalize the personalized collaborative learning problem as optimization of a user's objective.
We explore conditions under which we can optimally trade-off their bias for a reduction in variance.
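For a scalar caricature of this bias-variance trade-off (an illustration, not the paper's formulation), mix an unbiased local estimate with variance `v_local` and a collaborators' estimate with bias `b` and variance `v_others`; minimizing MSE(alpha) = (alpha*b)^2 + (1-alpha)^2 * v_local + alpha^2 * v_others in closed form gives:

```python
def optimal_mixing_weight(b, v_local, v_others):
    """Weight alpha* on the collaborators' estimate minimizing
    MSE(alpha) = (alpha*b)**2 + (1-alpha)**2*v_local + alpha**2*v_others,
    obtained by setting the derivative in alpha to zero."""
    return v_local / (b ** 2 + v_local + v_others)

# No bias, equal variances -> alpha* = 1/2 (plain averaging);
# large bias b -> alpha* -> 0 (rely on local data only).
```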
arXiv Detail & Related papers (2021-11-10T22:12:52Z) - Towards an Understanding of Benign Overfitting in Neural Networks [104.2956323934544]
Modern machine learning models often employ a huge number of parameters and are typically optimized to have zero training loss.
We examine how these benign overfitting phenomena occur in a two-layer neural network setting.
We show that it is possible for the two-layer ReLU network interpolator to achieve a near minimax-optimal learning rate.
arXiv Detail & Related papers (2021-06-06T19:08:53Z) - Blockchain Assisted Decentralized Federated Learning (BLADE-FL) with Lazy Clients [124.48732110742623]
We propose a novel framework by integrating blockchain into Federated Learning (FL).
BLADE-FL achieves good performance in terms of privacy preservation, tamper resistance, and effective cooperation of learning.
It gives rise to a new problem of training deficiency, caused by lazy clients who plagiarize others' trained models and add artificial noises to conceal their cheating behaviors.
arXiv Detail & Related papers (2020-12-02T12:18:27Z)