Exploit Gradient Skewness to Circumvent Byzantine Defenses for Federated Learning
- URL: http://arxiv.org/abs/2502.04890v2
- Date: Fri, 14 Feb 2025 12:36:02 GMT
- Title: Exploit Gradient Skewness to Circumvent Byzantine Defenses for Federated Learning
- Authors: Yuchen Liu, Chen Chen, Lingjuan Lyu, Yaochu Jin, Gang Chen,
- Abstract summary: Federated Learning (FL) is notorious for its vulnerability to Byzantine attacks.
Most current Byzantine defenses share a common inductive bias: among all the gradients, the densely distributed ones are more likely to be honest.
We discover that a group of densely distributed honest gradients skew away from the optimal gradient due to heterogeneous data.
We propose a novel skew-aware attack called STRIKE: first, we search for the skewed gradients; then, we construct Byzantine gradients within the skewed gradients.
- Score: 54.36263862306616
- Abstract: Federated Learning (FL) is notorious for its vulnerability to Byzantine attacks. Most current Byzantine defenses share a common inductive bias: among all the gradients, the densely distributed ones are more likely to be honest. However, such a bias is a poison to Byzantine robustness due to a newly discovered phenomenon in this paper - gradient skew. We discover that a group of densely distributed honest gradients skew away from the optimal gradient (the average of honest gradients) due to heterogeneous data. This gradient skew phenomenon allows Byzantine gradients to hide within the densely distributed skewed gradients. As a result, Byzantine defenses are confused into believing that Byzantine gradients are honest. Motivated by this observation, we propose a novel skew-aware attack called STRIKE: first, we search for the skewed gradients; then, we construct Byzantine gradients within the skewed gradients. Experiments on three benchmark datasets validate the effectiveness of our attack.
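The two-stage structure of the attack lends itself to a short sketch. The NumPy snippet below only illustrates that structure under an omniscient-attacker assumption (the attacker sees all honest gradients); the neighbour-count density search and the half-radius push along the skew direction are placeholder choices, not the exact STRIKE construction.

```python
import numpy as np

def skew_aware_attack(honest_grads, n_byz, radius_quantile=0.3):
    """Illustrative two-stage skew-aware attack (not the exact STRIKE rule).

    Stage 1: find the densely distributed, skewed region of honest gradients.
    Stage 2: place Byzantine gradients inside that region, pushed further
             away from the honest mean along the skew direction.
    """
    G = np.asarray(honest_grads)                   # (n_honest, dim)
    mean = G.mean(axis=0)                          # estimate of the optimal gradient

    # Stage 1: densest honest gradient = the one with the most close neighbours
    dists = np.linalg.norm(G[:, None, :] - G[None, :, :], axis=-1)
    radius = np.quantile(dists[dists > 0], radius_quantile)
    density = (dists < radius).sum(axis=1)
    center = G[np.argmax(density)]                 # centre of the skewed group

    # Stage 2: hide Byzantine gradients in the dense region, along the skew
    skew_dir = center - mean
    skew_dir /= np.linalg.norm(skew_dir) + 1e-12
    byz = center + 0.5 * radius * skew_dir
    return np.tile(byz, (n_byz, 1))                # (n_byz, dim)
```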
Related papers
- FedRISE: Rating Induced Sign Election of Gradients for Byzantine Tolerant Federated Aggregation [5.011091042850546]
We develop a robust aggregator called FedRISE for cross-silo FL that is consistent and less susceptible to poisoning updates by an omniscient attacker.
We compare our method against 8 robust aggregators under 6 poisoning attacks on 3 datasets and architectures.
Our results show that existing robust aggregators collapse for at least some attacks under severe settings, while FedRISE demonstrates better robustness because of a stringent gradient inclusion formulation.
arXiv Detail & Related papers (2024-11-06T12:14:11Z)
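The summary does not spell out the rating and election rules, so the snippet below is only a generic sign-election aggregator in the same spirit: a per-coordinate majority sign is elected, and clients whose updates disagree with it too often are excluded. The agreement threshold and fallback are illustrative choices, not FedRISE itself.

```python
import numpy as np

def sign_election_aggregate(grads, agreement_threshold=0.7):
    """Generic sign-election aggregation sketch (not the exact FedRISE rule)."""
    G = np.asarray(grads)                         # (n_clients, dim)
    elected = np.sign(np.sign(G).sum(axis=0))     # majority sign per coordinate

    # fraction of coordinates on which each client matches the elected sign
    agreement = (np.sign(G) == elected).mean(axis=1)
    keep = agreement >= agreement_threshold
    if not keep.any():                            # fallback: best-agreeing client(s)
        keep = agreement == agreement.max()
    return G[keep].mean(axis=0)
```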
- Are Gradients on Graph Structure Reliable in Gray-box Attacks? [56.346504691615934]
Previous gray-box attackers employ gradients from the surrogate model to locate the vulnerable edges to perturb the graph structure.
In this paper, we discuss and analyze the errors caused by the unreliability of the structural gradients.
We propose a novel attack model with methods to reduce the errors inside the structural gradients.
arXiv Detail & Related papers (2022-08-07T06:43:32Z)
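The structural gradients under discussion are those used by standard gradient-based structure attacks. A minimal PyTorch sketch of that baseline follows; the surrogate `loss_fn` over a dense adjacency matrix is an assumed interface, and symmetry of undirected graphs is ignored for brevity.

```python
import torch

def flip_edges_by_structural_gradient(adj, loss_fn, budget):
    """Baseline gray-box structure attack: flip the `budget` edges whose
    surrogate structural gradients promise the largest loss increase."""
    adj = adj.detach().clone().float().requires_grad_(True)
    grad = torch.autograd.grad(loss_fn(adj), adj)[0]

    # gain of flipping entry (i, j): adding helps if grad > 0, removing if grad < 0
    flip_score = grad * (1 - 2 * adj.detach())
    top = torch.topk(flip_score.flatten(), budget).indices

    perturbed = adj.detach().clone()
    rows = torch.div(top, adj.shape[1], rounding_mode="floor")
    cols = top % adj.shape[1]
    perturbed[rows, cols] = 1 - perturbed[rows, cols]   # apply the edge flips
    return perturbed
```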
- Do Perceptually Aligned Gradients Imply Adversarial Robustness? [17.929524924008962]
Adversarially robust classifiers possess a trait that non-robust models do not -- Perceptually Aligned Gradients (PAG).
Several works have identified PAG as a byproduct of robust training, but none have considered it as a standalone phenomenon nor studied its own implications.
We show that better gradient alignment leads to increased robustness and harness this observation to boost the robustness of existing adversarial training techniques.
arXiv Detail & Related papers (2022-07-22T23:48:26Z)
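The summary does not give the exact alignment objective; one plausible way to encourage better gradient alignment during training is an input-gradient cosine penalty like the sketch below, where `reference_direction` (e.g. a perceptually meaningful target per sample) is an assumed input rather than the paper's construction.

```python
import torch
import torch.nn.functional as F

def gradient_alignment_penalty(model, x, y, reference_direction):
    """Illustrative input-gradient alignment penalty (not the paper's exact loss):
    push the gradient of the true-class score w.r.t. the input towards a
    perceptually meaningful reference direction."""
    x = x.detach().clone().requires_grad_(True)
    score = model(x).gather(1, y.unsqueeze(1)).sum()           # true-class logits
    grad = torch.autograd.grad(score, x, create_graph=True)[0]

    cos = F.cosine_similarity(grad.flatten(1), reference_direction.flatten(1), dim=1)
    return (1 - cos).mean()    # add to the training loss with some weight
```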
- A Survey on Gradient Inversion: Attacks, Defenses and Future Directions [81.46745643749513]
We present a comprehensive survey on GradInv, aiming to summarize the cutting-edge research and broaden the horizons for different domains.
Firstly, we propose a taxonomy of GradInv attacks by characterizing existing attacks into two paradigms: iteration- and recursion-based attacks.
Second, we summarize emerging defense strategies against GradInv attacks. We find these approaches focus on three perspectives covering data obscuration, model improvement and gradient protection.
arXiv Detail & Related papers (2022-06-15T03:52:51Z)
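As a concrete reference point for the iteration-based paradigm, a DLG-style inversion optimises dummy data and labels until their gradients match the gradients observed from a client. The sketch below assumes the observed gradients are given in the same order as `model.parameters()`; step count and learning rate are arbitrary.

```python
import torch
import torch.nn.functional as F

def iteration_based_inversion(model, observed_grads, x_shape, n_classes,
                              steps=300, lr=0.1):
    """DLG-style gradient inversion: recover (x, y) whose gradients match the
    observed ones. Uses soft-label cross entropy (PyTorch >= 1.10)."""
    dummy_x = torch.randn(x_shape, requires_grad=True)
    dummy_y = torch.randn(x_shape[0], n_classes, requires_grad=True)
    opt = torch.optim.Adam([dummy_x, dummy_y], lr=lr)

    for _ in range(steps):
        opt.zero_grad()
        loss = F.cross_entropy(model(dummy_x), dummy_y.softmax(dim=-1))
        grads = torch.autograd.grad(loss, model.parameters(), create_graph=True)
        mismatch = sum(((g - o) ** 2).sum() for g, o in zip(grads, observed_grads))
        mismatch.backward()                    # backprop through the gradient match
        opt.step()
    return dummy_x.detach(), dummy_y.softmax(dim=-1).detach()
```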
- Adversarially Robust Classification by Conditional Generative Model Inversion [4.913248451323163]
We propose a classification model that does not obfuscate gradients and is robust by construction without assuming prior knowledge about the attack.
Our method casts classification as an optimization problem where we "invert" a conditional generator trained on unperturbed, natural images.
We demonstrate that our model is extremely robust against black-box attacks and has improved robustness against white-box attacks.
arXiv Detail & Related papers (2022-01-12T23:11:16Z)
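A minimal sketch of classification by inverting a conditional generator: for each class label, a latent code is optimised so the generator reconstructs the input, and the class with the smallest reconstruction error is predicted. The `generator(z, label)` interface, step count and learning rate are assumptions of this sketch.

```python
import torch

def classify_by_inversion(x, generator, latent_dim, n_classes, steps=200, lr=0.05):
    """Predict the class under which the conditional generator best reconstructs x."""
    errors = []
    for c in range(n_classes):
        z = torch.zeros(x.shape[0], latent_dim, requires_grad=True)
        y = torch.full((x.shape[0],), c, dtype=torch.long)
        opt = torch.optim.Adam([z], lr=lr)
        for _ in range(steps):
            opt.zero_grad()
            recon_err = ((generator(z, y) - x) ** 2).flatten(1).sum(dim=1)
            recon_err.sum().backward()
            opt.step()
        with torch.no_grad():                  # final per-sample error for class c
            errors.append(((generator(z, y) - x) ** 2).flatten(1).sum(dim=1))
    return torch.stack(errors, dim=1).argmin(dim=1)
```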
- Byzantine-robust Federated Learning through Collaborative Malicious Gradient Filtering [32.904425716385575]
We show that the element-wise sign of the gradient vector can provide valuable insight in detecting model poisoning attacks.
We propose a novel approach called SignGuard to enable Byzantine-robust federated learning through collaborative malicious gradient filtering.
arXiv Detail & Related papers (2021-09-13T11:15:15Z)
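A simplified sketch in that spirit combines norm-based filtering with per-client sign statistics and keeps only clients whose statistics stay close to the median. The real SignGuard additionally clusters the sign statistics; the thresholds below are illustrative.

```python
import numpy as np

def signguard_like_filter(grads, norm_bounds=(0.1, 3.0), sign_tol=0.3):
    """Simplified sign-statistics filtering sketch (not the full SignGuard)."""
    G = np.asarray(grads)                              # (n_clients, dim)
    norms = np.linalg.norm(G, axis=1)
    med = np.median(norms)
    norm_ok = (norms > norm_bounds[0] * med) & (norms < norm_bounds[1] * med)

    # per-client sign statistics: fraction of positive / negative coordinates
    stats = np.stack([(G > 0).mean(axis=1), (G < 0).mean(axis=1)], axis=1)
    sign_ok = np.abs(stats - np.median(stats, axis=0)).sum(axis=1) < sign_tol

    keep = norm_ok & sign_ok
    if not keep.any():                                 # degenerate case: norm filter only
        keep = norm_ok
    return G[keep].mean(axis=0)
```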
- Staircase Sign Method for Boosting Adversarial Attacks [123.19227129979943]
Crafting adversarial examples for the transfer-based attack is challenging and remains a research hot spot.
We propose a novel Staircase Sign Method (S$^2$M) to alleviate this issue, thus boosting transfer-based attacks.
Our method can be generally integrated into any transfer-based attacks, and the computational overhead is negligible.
arXiv Detail & Related papers (2021-04-20T02:31:55Z)
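The idea can be sketched as replacing the plain sign with percentile-banded step sizes; the number of bands and the weight schedule below are illustrative choices, not the paper's tuned settings.

```python
import torch

def staircase_sign(grad, k=4):
    """Staircase sign sketch: bucket |grad| into k percentile bands and give
    larger bands larger step weights, instead of collapsing everything to +-1."""
    mag = grad.abs()
    # interior band edges at the 1/k, 2/k, ..., (k-1)/k quantiles of |grad|
    edges = torch.quantile(mag.flatten(),
                           torch.linspace(0.0, 1.0, k + 1, dtype=mag.dtype)[1:-1])
    band = torch.bucketize(mag, edges)                 # band index 0 .. k-1
    weights = (2.0 * band.float() + 1.0) / k           # staircase weights, mean ~ 1
    return grad.sign() * weights
```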
- Imbalanced Gradients: A Subtle Cause of Overestimated Adversarial Robustness [75.30116479840619]
In this paper, we identify a more subtle situation called Imbalanced Gradients that can also cause overestimated adversarial robustness.
The phenomenon of imbalanced gradients occurs when the gradient of one term of the margin loss dominates and pushes the attack towards a suboptimal direction.
We propose a Margin Decomposition (MD) attack that decomposes a margin loss into individual terms and then explores the attackability of these terms separately.
arXiv Detail & Related papers (2020-06-24T13:41:37Z)
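The decomposition itself is simple to state in code: the margin loss splits into the true-class logit and the best rival logit, and an MD-style attack optimises these terms separately (e.g. alternating between them) rather than only their difference. The helper below just exposes the two terms.

```python
import torch
import torch.nn.functional as F

def margin_terms(logits, y):
    """Split the margin loss  max_{i != y} z_i - z_y  into its two terms, whose
    gradients can become imbalanced; an MD-style attack targets them separately."""
    z_true = logits.gather(1, y.unsqueeze(1)).squeeze(1)              # z_y
    rival = logits.masked_fill(
        F.one_hot(y, logits.shape[1]).bool(), float("-inf")
    ).max(dim=1).values                                               # max_{i != y} z_i
    return z_true, rival, rival - z_true            # the two terms and the full margin
```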
- Federated Variance-Reduced Stochastic Gradient Descent with Robustness to Byzantine Attacks [74.36161581953658]
This paper deals with distributed finite-sum optimization for learning over networks in the presence of malicious Byzantine attacks.
To cope with such attacks, most resilient approaches so far combine stochastic gradient descent (SGD) with different robust aggregation rules.
The present work puts forth a Byzantine attack resilient distributed (Byrd-) SAGA approach for learning tasks involving finite-sum optimization over networks.
arXiv Detail & Related papers (2019-12-29T19:46:03Z)
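Byrd-SAGA-style methods aggregate the workers' variance-reduced gradients with a geometric median instead of the mean. A Weiszfeld-iteration sketch of that aggregation rule (iteration count and smoothing constant are arbitrary):

```python
import numpy as np

def geometric_median(grads, iters=100, eps=1e-8):
    """Weiszfeld iterations for the geometric median of worker gradients,
    a Byzantine-robust drop-in replacement for the mean."""
    G = np.asarray(grads)                         # (n_workers, dim)
    z = G.mean(axis=0)                            # initialise at the plain average
    for _ in range(iters):
        d = np.linalg.norm(G - z, axis=1) + eps   # distances to current estimate
        w = 1.0 / d
        z = (w[:, None] * G).sum(axis=0) / w.sum()
    return z
```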