Byzantine Failures Harm the Generalization of Robust Distributed Learning Algorithms More Than Data Poisoning
- URL: http://arxiv.org/abs/2506.18020v2
- Date: Thu, 16 Oct 2025 16:05:21 GMT
- Title: Byzantine Failures Harm the Generalization of Robust Distributed Learning Algorithms More Than Data Poisoning
- Authors: Thomas Boudou, Batiste Le Bars, Nirupam Gupta, Aurélien Bellet
- Abstract summary: A robust distributed learning algorithm aims to maintain reliable performance despite the presence of misbehaving workers. Such misbehaviors are commonly modeled as $\textit{Byzantine failures}$, allowing arbitrarily corrupted communication, or as $\textit{data poisoning}$, a weaker form of corruption restricted to local training data. We show, for the first time, a fundamental gap in generalization guarantees between the two threat models: Byzantine failures yield strictly worse rates than those achievable under data poisoning.
- Score: 19.624245500772027
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Robust distributed learning algorithms aim to maintain reliable performance despite the presence of misbehaving workers. Such misbehaviors are commonly modeled as $\textit{Byzantine failures}$, allowing arbitrarily corrupted communication, or as $\textit{data poisoning}$, a weaker form of corruption restricted to local training data. While prior work shows similar optimization guarantees for both models, an important question remains: $\textit{How do these threat models impact generalization?}$ Empirical evidence suggests a gap, yet it remains unclear whether it is unavoidable or merely an artifact of suboptimal attacks. We show, for the first time, a fundamental gap in generalization guarantees between the two threat models: Byzantine failures yield strictly worse rates than those achievable under data poisoning. Our findings leverage a tight algorithmic stability analysis of robust distributed learning. Specifically, we prove that: $\textit{(i)}$ under data poisoning, the uniform algorithmic stability of an algorithm with optimal optimization guarantees degrades by an additive factor of $\varTheta ( \frac{f}{n-f} )$, with $f$ out of $n$ workers misbehaving; whereas $\textit{(ii)}$ under Byzantine failures, the degradation is in $\Omega \big( \sqrt{ \frac{f}{n-2f}} \big)$.
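To make the stated gap concrete, here is a minimal numeric sketch (illustrative values of $n$ and $f$, not the paper's code or experiments) comparing the two stability-degradation rates from the abstract:

```python
# Compare the additive stability degradation under data poisoning,
# Theta(f / (n - f)), with the Byzantine lower bound, Omega(sqrt(f / (n - 2f))).
import math

def poisoning_rate(n: int, f: int) -> float:
    """Degradation under data poisoning: Theta(f / (n - f))."""
    return f / (n - f)

def byzantine_rate(n: int, f: int) -> float:
    """Lower bound under Byzantine failures: sqrt(f / (n - 2f))."""
    return math.sqrt(f / (n - 2 * f))

if __name__ == "__main__":
    n = 100  # total number of workers (illustrative)
    for f in (1, 5, 10, 20):  # number of misbehaving workers
        p, b = poisoning_rate(n, f), byzantine_rate(n, f)
        print(f"n={n}, f={f:2d}: poisoning ~ {p:.3f}, Byzantine >= {b:.3f}")
```

For small $f/n$ the Byzantine term dominates by roughly a square-root factor, which is the gap the paper proves is fundamental.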
Related papers
- High-Dimensional Robust Mean Estimation with Untrusted Batches [38.14592862692954]
We study high-dimensional mean estimation in a collaborative setting where data is contributed by $N$ users in batches of size $n$. We formalize this challenge through a double corruption landscape: an $\varepsilon$-fraction of users are entirely adversarial, while the remaining ``good'' users provide data from distributions that are related to $P$ but deviate by a proximity parameter. Our algorithms achieve the minimax-optimal error rate $O\big(\sqrt{\varepsilon/n} + \sqrt{d/nN} + \dots\big)$.
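For intuition about the batch setting, here is a minimal one-dimensional median-of-batch-means sketch, a classical baseline when an $\varepsilon$-fraction of users is adversarial; the paper's high-dimensional, minimax-optimal algorithms are considerably more involved, and all values below are illustrative assumptions:

```python
# Median of batch means: average each user's batch, then take the median
# across users, so a minority of corrupted batches cannot move the estimate far.
import random
import statistics

def median_of_batch_means(batches: list[list[float]]) -> float:
    return statistics.median(sum(b) / len(b) for b in batches)

if __name__ == "__main__":
    random.seed(0)
    good = [[random.gauss(1.0, 1.0) for _ in range(50)] for _ in range(90)]
    bad = [[1e6] * 50 for _ in range(10)]  # adversarial users' batches
    print(median_of_batch_means(good + bad))  # stays close to the true mean 1.0
```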
arXiv Detail & Related papers (2026-02-24T08:59:37Z)
- Trade-off in Estimating the Number of Byzantine Clients in Federated Learning [51.015234193544636]
Federated learning is vulnerable to Byzantine clients that can send arbitrary erroneous signals. We show that underestimation ($\hat{f} < f$) can lead to arbitrarily poor performance for both aggregators and federated learning. All of these optimal bounds are proportional to $\hat{f}/(n-f-\hat{f})$ with $n$ clients, which monotonically increases with larger $\hat{f}$.
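A quick numeric sanity check (illustrative values, not the paper's experiments) that a bound proportional to $\hat{f}/(n-f-\hat{f})$ grows monotonically in $\hat{f}$:

```python
# The bound factor fhat / (n - f - fhat) increases with the estimate fhat,
# so overestimating the number of Byzantine clients has a quantifiable cost.
def bound_factor(n: int, f: int, fhat: int) -> float:
    return fhat / (n - f - fhat)

if __name__ == "__main__":
    n, f = 50, 5
    factors = [bound_factor(n, f, fhat) for fhat in range(f, 20)]
    assert all(a < b for a, b in zip(factors, factors[1:]))  # strictly increasing
    print([round(x, 3) for x in factors])
```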
arXiv Detail & Related papers (2025-10-06T02:01:56Z)
- Distributionally Robust Optimization with Adversarial Data Contamination [49.89480853499918]
We focus on optimizing Wasserstein-1 DRO objectives for generalized linear models with convex Lipschitz loss functions. Our primary contribution lies in a novel modeling framework that integrates robustness against training data contamination with robustness against distributional shifts. This work establishes the first rigorous guarantees, supported by efficient computation, for learning under the dual challenges of data contamination and distributional shifts.
arXiv Detail & Related papers (2025-07-14T18:34:10Z)
- Towards Model Resistant to Transferable Adversarial Examples via Trigger Activation [95.3977252782181]
Adversarial examples, characterized by imperceptible perturbations, pose significant threats to deep neural networks by misleading their predictions. We introduce a novel training paradigm aimed at enhancing robustness against transferable adversarial examples (TAEs) in a more efficient and effective way.
arXiv Detail & Related papers (2025-04-20T09:07:10Z)
- Efficient Federated Learning against Byzantine Attacks and Data Heterogeneity via Aggregating Normalized Gradients [27.433334322019675]
Federated Learning (FL) enables clients to collaboratively train models without sharing raw data. FL is vulnerable to Byzantine attacks and data heterogeneity, which can severely degrade performance. We propose an effective Federated Normalized Gradients Algorithm (NGA). Experimental results on benchmark datasets demonstrate improved convergence over existing methods.
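A minimal sketch of the core idea, scaling each client gradient to unit norm before averaging so that no single Byzantine client can blow up the update; the function and parameter names are my assumptions, and the paper's algorithm may differ in details:

```python
# Normalize client gradients to unit norm, then average.
import numpy as np

def aggregate_normalized(grads: list[np.ndarray], eps: float = 1e-12) -> np.ndarray:
    normalized = [g / (np.linalg.norm(g) + eps) for g in grads]
    return np.mean(normalized, axis=0)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    honest = [rng.normal(1.0, 0.1, size=4) for _ in range(9)]
    byzantine = [np.full(4, 1e9)]  # one client sends a huge malicious gradient
    print(aggregate_normalized(honest + byzantine))  # update stays bounded
```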
arXiv Detail & Related papers (2024-08-18T16:50:39Z)
- $k$-Submodular Interdiction Problems under Distributional Risk-Receptiveness and Robustness: Application to Machine Learning [0.8233493213841317]
We study submodular optimization in an adversarial context, applicable to machine learning problems such as feature selection. We focus on Stackelberg games between an attacker (or interdictor) and a defender, where the attacker aims to minimize the defender's objective of maximizing a $k$-submodular function. We introduce Distributionally Robust $k$-SIP and Distributionally Risk-Receptive $k$-SIP, along with finitely convergent exact algorithms for solving them.
arXiv Detail & Related papers (2024-06-18T19:30:46Z)
- Advancing Generalized Transfer Attack with Initialization Derived Bilevel Optimization and Dynamic Sequence Truncation [49.480978190805125]
Transfer attacks are of significant interest for black-box applications.
Existing works essentially optimize a single-level objective directly w.r.t. the surrogate model.
We propose a bilevel optimization paradigm that explicitly formulates the nested relationship between the Upper-Level (UL) pseudo-victim attacker and the Lower-Level (LL) surrogate attacker.
arXiv Detail & Related papers (2024-06-04T07:45:27Z)
- Federated Learning Resilient to Byzantine Attacks and Data Heterogeneity [59.17297282373628]
This paper addresses federated learning (FL) in the context of malicious Byzantine attacks and data heterogeneity. We introduce a novel Robust Average Gradient Algorithm (RAGA), which uses the geometric median for aggregation and admits convergence analysis for both strongly-convex and non-convex loss functions.
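A generic Weiszfeld-iteration sketch of geometric-median aggregation, the robust aggregation idea the summary attributes to RAGA; this is a textbook implementation, not the authors' code:

```python
# Geometric median of client gradients via Weiszfeld iterations.
import numpy as np

def geometric_median(points: np.ndarray, iters: int = 100, eps: float = 1e-8) -> np.ndarray:
    """points: (m, d) array, one row per client gradient."""
    z = points.mean(axis=0)  # initialize at the coordinate-wise mean
    for _ in range(iters):
        dist = np.linalg.norm(points - z, axis=1)
        w = 1.0 / np.maximum(dist, eps)  # inverse-distance weights
        z = (w[:, None] * points).sum(axis=0) / w.sum()
    return z

if __name__ == "__main__":
    rng = np.random.default_rng(1)
    grads = np.vstack([rng.normal(0.0, 0.1, (9, 3)), np.full((1, 3), 1e6)])
    print(geometric_median(grads))  # stays near the nine honest gradients
```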
arXiv Detail & Related papers (2024-03-20T08:15:08Z)
- Corruption-Robust Offline Reinforcement Learning with General Function Approximation [60.91257031278004]
We investigate the problem of corruption in offline reinforcement learning (RL) with general function approximation.
Our goal is to find a policy that is robust to such corruption and minimizes the suboptimality gap with respect to the optimal policy for the uncorrupted Markov decision processes (MDPs).
arXiv Detail & Related papers (2023-10-23T04:07:26Z)
- An Improved Analysis of Gradient Tracking for Decentralized Machine Learning [34.144764431505486]
We consider decentralized machine learning over a network where the training data is distributed across $n$ agents.
The agents' common goal is to find a model that minimizes the average of all local loss functions.
We improve the dependency on $p$ from $\mathcal{O}(p^{-2})$ to $\mathcal{O}(p^{-1})$ in the noiseless case.
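For reference, a minimal sketch of the gradient-tracking update analyzed in this line of work, in its generic textbook form; the mixing matrix, step size, and toy quadratic problem below are my assumptions, not the paper's setup:

```python
# Gradient tracking: agents mix iterates with neighbors via W and maintain
# trackers y_i that estimate the network-average gradient.
import numpy as np

def gradient_tracking(W, grads, x0, lr=0.2, steps=200):
    """W: (n, n) mixing matrix; grads: per-agent gradient functions."""
    x = x0.copy()                                      # (n, d) agent iterates
    y = np.stack([g(xi) for g, xi in zip(grads, x)])   # gradient trackers
    for _ in range(steps):
        x_new = W @ (x - lr * y)
        y = (W @ y
             + np.stack([g(xi) for g, xi in zip(grads, x_new)])
             - np.stack([g(xi) for g, xi in zip(grads, x)]))
        x = x_new
    return x

if __name__ == "__main__":
    n, targets = 4, [0.0, 1.0, 2.0, 3.0]
    W = np.full((n, n), 1.0 / n)                       # fully connected mixing
    grads = [lambda xi, t=t: xi - t for t in targets]  # f_i(x) = (x - t_i)^2 / 2
    x = gradient_tracking(W, grads, x0=np.zeros((n, 1)))
    print(x.ravel())  # all agents converge near the average target 1.5
```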
arXiv Detail & Related papers (2022-02-08T12:58:14Z)
- Towards Variable-Length Textual Adversarial Attacks [68.27995111870712]
It is non-trivial to conduct textual adversarial attacks on natural language processing tasks due to the discreteness of data.
In this paper, we propose variable-length textual adversarial attacks (VL-Attack).
Our method can achieve a $33.18$ BLEU score on IWSLT14 German-English translation, an improvement of $1.47$ over the baseline model.
arXiv Detail & Related papers (2021-04-16T14:37:27Z)
- Online Adversarial Attacks [57.448101834579624]
We formalize the online adversarial attack problem, emphasizing two key elements found in real-world use-cases.
We first rigorously analyze a deterministic variant of the online threat model.
We then propose a simple yet practical algorithm yielding a provably better competitive ratio for $k=2$ over the current best single-threshold algorithm.
arXiv Detail & Related papers (2021-03-02T20:36:04Z)
- Byzantine-Resilient Non-Convex Stochastic Gradient Descent [61.6382287971982]
We study adversary-resilient stochastic distributed optimization, in which machines can independently compute gradients and cooperate to jointly optimize over their local data.
Our algorithm is based on a new concentration technique, and its sample complexity matches the best known bounds.
It is very practical: it improves upon the performance of all prior methods even when no Byzantine machines are present.
arXiv Detail & Related papers (2020-12-28T17:19:32Z)
- Byzantine-Robust Learning on Heterogeneous Datasets via Bucketing [55.012801269326594]
In Byzantine robust distributed learning, a central server wants to train a machine learning model over data distributed across multiple workers.
A fraction of these workers may deviate from the prescribed algorithm and send arbitrary messages.
We propose a simple bucketing scheme that adapts existing robust algorithms to heterogeneous datasets at a negligible computational cost.
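A minimal sketch of the bucketing idea just described: shuffle the worker updates, average them within small buckets, and hand the bucket means to any existing robust aggregator (coordinate-wise median is used below as a stand-in). This is an illustration, not the authors' implementation:

```python
# Bucketing: random buckets reduce the heterogeneity seen by the aggregator.
import numpy as np

def bucketing(updates: np.ndarray, s: int, rng: np.random.Generator) -> np.ndarray:
    """updates: (n, d) worker gradients; s: bucket size."""
    perm = rng.permutation(len(updates))
    buckets = np.array_split(updates[perm], max(1, len(updates) // s))
    return np.stack([b.mean(axis=0) for b in buckets])

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    updates = rng.normal(0.0, 1.0, size=(20, 3))  # toy worker updates
    means = bucketing(updates, s=2, rng=rng)
    print(np.median(means, axis=0))  # robust aggregation of bucket means
```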
arXiv Detail & Related papers (2020-06-16T17:58:53Z)
- Estimating Principal Components under Adversarial Perturbations [25.778123431786653]
We study a natural model of robustness for high-dimensional statistical estimation problems.
Our model is motivated by emerging paradigms such as low precision machine learning and adversarial training.
arXiv Detail & Related papers (2020-05-31T20:27:19Z)
- Toward Adversarial Robustness via Semi-supervised Robust Training [93.36310070269643]
Adversarial examples have been shown to be a severe threat to deep neural networks (DNNs).
We propose a novel defense method, robust training (RT), which jointly minimizes two separate risks ($R_{stand}$ and $R_{rob}$).
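A minimal sketch of jointly minimizing a standard risk and a robustness risk; the concrete loss definitions below are generic assumptions, not the paper's exact $R_{stand}$ and $R_{rob}$:

```python
# Total loss = R_stand (fit on clean inputs) + lambda * R_rob (stability of
# predictions under perturbed inputs).
import numpy as np

def total_risk(model, x, y, x_adv, lam: float = 1.0) -> float:
    r_stand = np.mean((model(x) - y) ** 2)           # standard risk
    r_rob = np.mean((model(x_adv) - model(x)) ** 2)  # robustness risk
    return r_stand + lam * r_rob

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    w = np.array([0.5, -0.2])
    model = lambda x: x @ w
    x = rng.normal(size=(8, 2))
    y = rng.normal(size=8)
    x_adv = x + 0.1 * np.sign(rng.normal(size=x.shape))  # crude perturbation
    print(total_risk(model, x, y, x_adv))
```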
arXiv Detail & Related papers (2020-03-16T02:14:08Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.