Byzantine Failures Harm the Generalization of Robust Distributed Learning Algorithms More Than Data Poisoning
- URL: http://arxiv.org/abs/2506.18020v2
- Date: Thu, 16 Oct 2025 16:05:21 GMT
- Title: Byzantine Failures Harm the Generalization of Robust Distributed Learning Algorithms More Than Data Poisoning
- Authors: Thomas Boudou, Batiste Le Bars, Nirupam Gupta, Aurélien Bellet
- Abstract summary: A robust distributed learning algorithm aims to maintain reliable performance despite the presence of misbehaving workers. Such misbehaviors are commonly modeled as $\textit{Byzantine failures}$, allowing arbitrarily corrupted communication, or as $\textit{data poisoning}$, a weaker form of corruption restricted to local training data. We show, for the first time, a fundamental gap in generalization guarantees between the two threat models: Byzantine failures yield strictly worse rates than those achievable under data poisoning.
- Score: 19.624245500772027
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Robust distributed learning algorithms aim to maintain reliable performance despite the presence of misbehaving workers. Such misbehaviors are commonly modeled as $\textit{Byzantine failures}$, allowing arbitrarily corrupted communication, or as $\textit{data poisoning}$, a weaker form of corruption restricted to local training data. While prior work shows similar optimization guarantees for both models, an important question remains: $\textit{How do these threat models impact generalization?}$ Empirical evidence suggests a gap, yet it remains unclear whether it is unavoidable or merely an artifact of suboptimal attacks. We show, for the first time, a fundamental gap in generalization guarantees between the two threat models: Byzantine failures yield strictly worse rates than those achievable under data poisoning. Our findings leverage a tight algorithmic stability analysis of robust distributed learning. Specifically, we prove that: $\textit{(i)}$ under data poisoning, the uniform algorithmic stability of an algorithm with optimal optimization guarantees degrades by an additive factor of $\varTheta ( \frac{f}{n-f} )$, with $f$ out of $n$ workers misbehaving; whereas $\textit{(ii)}$ under Byzantine failures, the degradation is in $\Omega \big( \sqrt{ \frac{f}{n-2f}} \big)$.
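To make the stated gap concrete, here is a minimal numeric sketch (illustrative values of $n$ and $f$, not the paper's code or experiments) comparing the two stability-degradation rates from the abstract:

```python
# Compare the additive stability degradation under data poisoning,
# Theta(f / (n - f)), with the Byzantine lower bound, Omega(sqrt(f / (n - 2f))).
import math

def poisoning_rate(n: int, f: int) -> float:
    """Degradation under data poisoning: Theta(f / (n - f))."""
    return f / (n - f)

def byzantine_rate(n: int, f: int) -> float:
    """Lower bound under Byzantine failures: sqrt(f / (n - 2f))."""
    return math.sqrt(f / (n - 2 * f))

if __name__ == "__main__":
    n = 100  # total number of workers (illustrative)
    for f in (1, 5, 10, 20):  # number of misbehaving workers
        p, b = poisoning_rate(n, f), byzantine_rate(n, f)
        print(f"n={n}, f={f:2d}: poisoning ~ {p:.3f}, Byzantine >= {b:.3f}")
```

For small $f/n$ the Byzantine term dominates by roughly a square-root factor, which is the gap the paper proves is fundamental.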
Related papers
- High-Dimensional Robust Mean Estimation with Untrusted Batches [38.14592862692954]
We study high-dimensional mean estimation in a collaborative setting where data is contributed by $N$ users in batches of size $n$. We formalize this challenge through a double corruption landscape: an $\varepsilon$-fraction of users are entirely adversarial, while the remaining ``good'' users provide data from distributions that are related to $P$ but deviate by a proximity parameter. Our algorithms achieve the minimax-optimal error rate $O\big(\sqrt{\varepsilon/n} + \sqrt{d/nN} + \dots\big)$.
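For intuition about the batch setting, here is a minimal one-dimensional median-of-batch-means sketch, a classical baseline when an $\varepsilon$-fraction of users is adversarial; the paper's high-dimensional, minimax-optimal algorithms are considerably more involved, and all values below are illustrative assumptions:

```python
# Median of batch means: average each user's batch, then take the median
# across users, so a minority of corrupted batches cannot move the estimate far.
import random
import statistics

def median_of_batch_means(batches: list[list[float]]) -> float:
    return statistics.median(sum(b) / len(b) for b in batches)

if __name__ == "__main__":
    random.seed(0)
    good = [[random.gauss(1.0, 1.0) for _ in range(50)] for _ in range(90)]
    bad = [[1e6] * 50 for _ in range(10)]  # adversarial users' batches
    print(median_of_batch_means(good + bad))  # stays close to the true mean 1.0
```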
arXiv Detail & Related papers (2026-02-24T08:59:37Z)
- Trade-off in Estimating the Number of Byzantine Clients in Federated Learning [51.015234193544636]
Federated learning is vulnerable to Byzantine clients that can send arbitrary erroneous signals. We show that underestimation ($\hat{f} < f$) can lead to arbitrarily poor performance for both aggregators and federated learning. All of these optimal bounds are proportional to $\hat{f}/(n-f-\hat{f})$ with $n$ clients, which monotonically increases with larger $\hat{f}$.
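A quick numeric sanity check (illustrative values, not the paper's experiments) that a bound proportional to $\hat{f}/(n-f-\hat{f})$ grows monotonically in $\hat{f}$:

```python
# The bound factor fhat / (n - f - fhat) increases with the estimate fhat,
# so overestimating the number of Byzantine clients has a quantifiable cost.
def bound_factor(n: int, f: int, fhat: int) -> float:
    return fhat / (n - f - fhat)

if __name__ == "__main__":
    n, f = 50, 5
    factors = [bound_factor(n, f, fhat) for fhat in range(f, 20)]
    assert all(a < b for a, b in zip(factors, factors[1:]))  # strictly increasing
    print([round(x, 3) for x in factors])
```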
arXiv Detail & Related papers (2025-10-06T02:01:56Z)
- Distributionally Robust Optimization with Adversarial Data Contamination [49.89480853499918]
We focus on optimizing Wasserstein-1 DRO objectives for generalized linear models with convex Lipschitz loss functions. Our primary contribution lies in a novel modeling framework that integrates robustness against training data contamination with robustness against distributional shifts. This work establishes the first rigorous guarantees, supported by efficient computation, for learning under the dual challenges of data contamination and distributional shifts.
arXiv Detail & Related papers (2025-07-14T18:34:10Z)
- Towards Model Resistant to Transferable Adversarial Examples via Trigger Activation [95.3977252782181]
Adversarial examples, characterized by imperceptible perturbations, pose significant threats to deep neural networks by misleading their predictions. We introduce a novel training paradigm aimed at enhancing robustness against transferable adversarial examples (TAEs) in a more efficient and effective way.
arXiv Detail & Related papers (2025-04-20T09:07:10Z)
- Efficient Federated Learning against Byzantine Attacks and Data Heterogeneity via Aggregating Normalized Gradients [27.433334322019675]
Federated Learning (FL) enables clients to collaboratively train models without sharing raw data. FL is vulnerable to Byzantine attacks and data heterogeneity, which can severely degrade performance. We propose an effective Federated Normalized Gradients Algorithm (NGA). Experimental results on benchmark datasets demonstrate improved convergence over existing methods.
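A minimal sketch of the core idea, scaling each client gradient to unit norm before averaging so that no single Byzantine client can blow up the update; the function and parameter names are my assumptions, and the paper's algorithm may differ in details:

```python
# Normalize client gradients to unit norm, then average.
import numpy as np

def aggregate_normalized(grads: list[np.ndarray], eps: float = 1e-12) -> np.ndarray:
    normalized = [g / (np.linalg.norm(g) + eps) for g in grads]
    return np.mean(normalized, axis=0)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    honest = [rng.normal(1.0, 0.1, size=4) for _ in range(9)]
    byzantine = [np.full(4, 1e9)]  # one client sends a huge malicious gradient
    print(aggregate_normalized(honest + byzantine))  # update stays bounded
```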
arXiv Detail & Related papers (2024-08-18T16:50:39Z)
- $k$-Submodular Interdiction Problems under Distributional Risk-Receptiveness and Robustness: Application to Machine Learning [0.8233493213841317]
We study submodular optimization in an adversarial context, applicable to machine learning problems such as feature selection. We focus on Stackelberg games between an attacker (or interdictor) and a defender, where the attacker aims to minimize the defender's objective of maximizing a $k$-submodular function. We introduce Distributionally Robust $k$-SIP and Distributionally Risk-Receptive $k$-SIP, along with finitely convergent exact algorithms for solving them.
arXiv Detail & Related papers (2024-06-18T19:30:46Z)
- Advancing Generalized Transfer Attack with Initialization Derived Bilevel Optimization and Dynamic Sequence Truncation [49.480978190805125]
Transfer attacks are of significant interest for black-box applications.
Existing works essentially optimize a single-level objective directly w.r.t. the surrogate model.
We propose a bilevel optimization paradigm that explicitly formulates the nested relationship between the Upper-Level (UL) pseudo-victim attacker and the Lower-Level (LL) surrogate attacker.
arXiv Detail & Related papers (2024-06-04T07:45:27Z)
- Federated Learning Resilient to Byzantine Attacks and Data Heterogeneity [59.17297282373628]
This paper addresses federated learning (FL) in the context of malicious Byzantine attacks and data heterogeneity. We introduce a novel Robust Average Gradient Algorithm (RAGA), which uses the geometric median for aggregation and admits convergence analysis for both strongly-convex and non-convex loss functions.
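A generic Weiszfeld-iteration sketch of geometric-median aggregation, the robust aggregation idea the summary attributes to RAGA; this is a textbook implementation, not the authors' code:

```python
# Geometric median of client gradients via Weiszfeld iterations.
import numpy as np

def geometric_median(points: np.ndarray, iters: int = 100, eps: float = 1e-8) -> np.ndarray:
    """points: (m, d) array, one row per client gradient."""
    z = points.mean(axis=0)  # initialize at the coordinate-wise mean
    for _ in range(iters):
        dist = np.linalg.norm(points - z, axis=1)
        w = 1.0 / np.maximum(dist, eps)  # inverse-distance weights
        z = (w[:, None] * points).sum(axis=0) / w.sum()
    return z

if __name__ == "__main__":
    rng = np.random.default_rng(1)
    grads = np.vstack([rng.normal(0.0, 0.1, (9, 3)), np.full((1, 3), 1e6)])
    print(geometric_median(grads))  # stays near the nine honest gradients
```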
arXiv Detail & Related papers (2024-03-20T08:15:08Z)
- Corruption-Robust Offline Reinforcement Learning with General Function Approximation [60.91257031278004]
We investigate the problem of corruption in offline reinforcement learning (RL) with general function approximation.
Our goal is to find a policy that is robust to such corruption and minimizes the suboptimality gap with respect to the optimal policy for the uncorrupted Markov decision processes (MDPs).
arXiv Detail & Related papers (2023-10-23T04:07:26Z)
- An Improved Analysis of Gradient Tracking for Decentralized Machine Learning [34.144764431505486]
We consider decentralized machine learning over a network where the training data is distributed across $n$ agents.
The agents' common goal is to find a model that minimizes the average of all local loss functions.
We improve the dependency on $p$ from $\mathcal{O}(p^{-2})$ to $\mathcal{O}(p^{-1})$ in the noiseless case.
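For reference, a minimal sketch of the gradient-tracking update analyzed in this line of work, in its generic textbook form; the mixing matrix, step size, and toy quadratic problem below are my assumptions, not the paper's setup:

```python
# Gradient tracking: agents mix iterates with neighbors via W and maintain
# trackers y_i that estimate the network-average gradient.
import numpy as np

def gradient_tracking(W, grads, x0, lr=0.2, steps=200):
    """W: (n, n) mixing matrix; grads: per-agent gradient functions."""
    x = x0.copy()                                      # (n, d) agent iterates
    y = np.stack([g(xi) for g, xi in zip(grads, x)])   # gradient trackers
    for _ in range(steps):
        x_new = W @ (x - lr * y)
        y = (W @ y
             + np.stack([g(xi) for g, xi in zip(grads, x_new)])
             - np.stack([g(xi) for g, xi in zip(grads, x)]))
        x = x_new
    return x

if __name__ == "__main__":
    n, targets = 4, [0.0, 1.0, 2.0, 3.0]
    W = np.full((n, n), 1.0 / n)                       # fully connected mixing
    grads = [lambda xi, t=t: xi - t for t in targets]  # f_i(x) = (x - t_i)^2 / 2
    x = gradient_tracking(W, grads, x0=np.zeros((n, 1)))
    print(x.ravel())  # all agents converge near the average target 1.5
```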
arXiv Detail & Related papers (2022-02-08T12:58:14Z)
- Towards Variable-Length Textual Adversarial Attacks [68.27995111870712]
It is non-trivial to conduct textual adversarial attacks on natural language processing tasks due to the discreteness of data.
In this paper, we propose variable-length textual adversarial attacks (VL-Attack).
Our method can achieve a $33.18$ BLEU score on IWSLT14 German-English translation, an improvement of $1.47$ over the baseline model.
arXiv Detail & Related papers (2021-04-16T14:37:27Z)
- Online Adversarial Attacks [57.448101834579624]
We formalize the online adversarial attack problem, emphasizing two key elements found in real-world use-cases.
We first rigorously analyze a deterministic variant of the online threat model.
We then propose a simple yet practical algorithm yielding a provably better competitive ratio for $k=2$ over the current best single-threshold algorithm.
arXiv Detail & Related papers (2021-03-02T20:36:04Z)
- Byzantine-Resilient Non-Convex Stochastic Gradient Descent [61.6382287971982]
We study adversary-resilient stochastic distributed optimization, in which machines can independently compute gradients and cooperate to jointly optimize over their local data.
Our algorithm is based on a new concentration technique, and its sample complexity matches the best known bounds.
It is very practical: it improves upon the performance of all prior methods even when no Byzantine machines are present.
arXiv Detail & Related papers (2020-12-28T17:19:32Z)
- Byzantine-Robust Learning on Heterogeneous Datasets via Bucketing [55.012801269326594]
In Byzantine robust distributed learning, a central server wants to train a machine learning model over data distributed across multiple workers.
A fraction of these workers may deviate from the prescribed algorithm and send arbitrary messages.
We propose a simple bucketing scheme that adapts existing robust algorithms to heterogeneous datasets at a negligible computational cost.
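A minimal sketch of the bucketing idea just described: shuffle the worker updates, average them within small buckets, and hand the bucket means to any existing robust aggregator (coordinate-wise median is used below as a stand-in). This is an illustration, not the authors' implementation:

```python
# Bucketing: random buckets reduce the heterogeneity seen by the aggregator.
import numpy as np

def bucketing(updates: np.ndarray, s: int, rng: np.random.Generator) -> np.ndarray:
    """updates: (n, d) worker gradients; s: bucket size."""
    perm = rng.permutation(len(updates))
    buckets = np.array_split(updates[perm], max(1, len(updates) // s))
    return np.stack([b.mean(axis=0) for b in buckets])

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    updates = rng.normal(0.0, 1.0, size=(20, 3))  # toy worker updates
    means = bucketing(updates, s=2, rng=rng)
    print(np.median(means, axis=0))  # robust aggregation of bucket means
```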
arXiv Detail & Related papers (2020-06-16T17:58:53Z)
- Estimating Principal Components under Adversarial Perturbations [25.778123431786653]
We study a natural model of robustness for high-dimensional statistical estimation problems.
Our model is motivated by emerging paradigms such as low precision machine learning and adversarial training.
arXiv Detail & Related papers (2020-05-31T20:27:19Z)
- Toward Adversarial Robustness via Semi-supervised Robust Training [93.36310070269643]
Adversarial examples have been shown to be a severe threat to deep neural networks (DNNs).
We propose a novel defense method, robust training (RT), which jointly minimizes two separate risks ($R_{stand}$ and $R_{rob}$).
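A minimal sketch of jointly minimizing a standard risk and a robustness risk; the concrete loss definitions below are generic assumptions, not the paper's exact $R_{stand}$ and $R_{rob}$:

```python
# Total loss = R_stand (fit on clean inputs) + lambda * R_rob (stability of
# predictions under perturbed inputs).
import numpy as np

def total_risk(model, x, y, x_adv, lam: float = 1.0) -> float:
    r_stand = np.mean((model(x) - y) ** 2)           # standard risk
    r_rob = np.mean((model(x_adv) - model(x)) ** 2)  # robustness risk
    return r_stand + lam * r_rob

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    w = np.array([0.5, -0.2])
    model = lambda x: x @ w
    x = rng.normal(size=(8, 2))
    y = rng.normal(size=8)
    x_adv = x + 0.1 * np.sign(rng.normal(size=x.shape))  # crude perturbation
    print(total_risk(model, x, y, x_adv))
```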
arXiv Detail & Related papers (2020-03-16T02:14:08Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.