Decoding FL Defenses: Systemization, Pitfalls, and Remedies
- URL: http://arxiv.org/abs/2502.05211v1
- Date: Mon, 03 Feb 2025 23:14:02 GMT
- Title: Decoding FL Defenses: Systemization, Pitfalls, and Remedies
- Authors: Momin Ahmad Khan, Virat Shejwalkar, Yasra Chandio, Amir Houmansadr, Fatima Muhammad Anwar
- Abstract summary: There are no guidelines for evaluating Federated Learning (FL) defenses. We design a comprehensive systemization of FL defenses along three dimensions. We survey 50 top-tier defense papers and identify the commonly used components in their evaluation setups.
- Score: 16.907513505608666
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: While the community has designed various defenses to counter the threat of poisoning attacks in Federated Learning (FL), there are no guidelines for evaluating these defenses. These defenses are prone to subtle pitfalls in their experimental setups that lead to a false sense of security, rendering them unsuitable for practical deployment. In this paper, we systematically understand, identify, and provide a better approach to address these challenges. First, we design a comprehensive systemization of FL defenses along three dimensions: i) how client updates are processed, ii) what the server knows, and iii) at what stage the defense is applied. Next, we thoroughly survey 50 top-tier defense papers and identify the commonly used components in their evaluation setups. Based on this survey, we uncover six distinct pitfalls and study their prevalence. For example, we discover that around 30% of these works solely use the intrinsically robust MNIST dataset, and 40% employ simplistic attacks, which may inadvertently portray their defense as robust. Using three representative defenses as case studies, we perform a critical reevaluation to study the impact of the identified pitfalls and show how they lead to incorrect conclusions about robustness. We provide actionable recommendations to help researchers overcome each pitfall.
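To make the first systemization dimension concrete (how client updates are processed at the server), the sketch below shows coordinate-wise trimmed mean, one common member of the update-processing defense class that such surveys cover. It is an illustrative example under assumed update shapes, not the paper's own method.

```python
import numpy as np

def trimmed_mean_aggregate(client_updates, trim_ratio=0.1):
    """Coordinate-wise trimmed mean: for every parameter, drop the k largest
    and k smallest client values, then average the rest. This bounds the
    influence of a small fraction of poisoned updates."""
    updates = np.stack(client_updates)            # shape: (n_clients, n_params)
    k = int(trim_ratio * updates.shape[0])
    sorted_updates = np.sort(updates, axis=0)     # sort each coordinate independently
    kept = sorted_updates[k:updates.shape[0] - k] if k > 0 else sorted_updates
    return kept.mean(axis=0)                      # aggregated global update

# Example: 10 clients, one of them sending a wildly scaled update.
rng = np.random.default_rng(0)
benign = [rng.normal(0, 1, size=100) for _ in range(9)]
malicious = [50.0 * np.ones(100)]
global_update = trimmed_mean_aggregate(benign + malicious, trim_ratio=0.1)
```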
Related papers
- Benchmarking Misuse Mitigation Against Covert Adversaries [80.74502950627736]
Existing language model safety evaluations focus on overt attacks and low-stakes tasks. We develop Benchmarks for Stateful Defenses (BSD), a data generation pipeline that automates evaluations of covert attacks and corresponding defenses. Our evaluations indicate that decomposition attacks are effective misuse enablers, and highlight stateful defenses as a countermeasure.
arXiv Detail & Related papers (2025-06-06T17:33:33Z) - A Critical Evaluation of Defenses against Prompt Injection Attacks [95.81023801370073]
Large Language Models (LLMs) are vulnerable to prompt injection attacks. Several defenses have recently been proposed, often claiming to mitigate these attacks successfully. We argue that existing studies lack a principled approach to evaluating these defenses.
arXiv Detail & Related papers (2025-05-23T19:39:56Z) - SoK: Benchmarking Poisoning Attacks and Defenses in Federated Learning [21.73177249075515]
Federated learning (FL) enables collaborative model training while preserving data privacy, but its decentralized nature exposes it to client-side data poisoning attacks (DPAs) and model poisoning attacks (MPAs).
This paper aims to provide a unified benchmark and analysis of defenses against DPAs and MPAs, clarifying the distinction between these two similar but slightly distinct domains.
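As a rough illustration of the DPA/MPA distinction described above, the hedged sketch below contrasts the two: a data poisoner corrupts its local training data and then trains normally, while a model poisoner directly crafts the update it sends. The label-flipping and update-scaling choices are generic examples, not the specific attacks benchmarked in the paper.

```python
import numpy as np

def data_poisoning_update(X, y, train_fn, flip_fraction=0.5, num_classes=10):
    """DPA sketch: corrupt the local *data* (label flipping), then train honestly."""
    y_poisoned = y.copy()
    idx = np.random.choice(len(y), int(flip_fraction * len(y)), replace=False)
    y_poisoned[idx] = (y_poisoned[idx] + 1) % num_classes
    return train_fn(X, y_poisoned)          # an ordinary local update computed on bad data

def model_poisoning_update(benign_update, scale=10.0):
    """MPA sketch: skip honest training and craft the *update* itself
    (here, naively scaling a benign update to dominate aggregation)."""
    return scale * benign_update
```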
arXiv Detail & Related papers (2025-02-06T06:05:00Z) - The VLLM Safety Paradox: Dual Ease in Jailbreak Attack and Defense [56.32083100401117]
We investigate why Vision Large Language Models (VLLMs) are prone to jailbreak attacks.
We then make a key observation: existing defense mechanisms suffer from an over-prudence problem.
We find that the two representative evaluation methods for jailbreak often exhibit chance agreement.
arXiv Detail & Related papers (2024-11-13T07:57:19Z) - Evaluating Adversarial Robustness: A Comparison Of FGSM, Carlini-Wagner Attacks, And The Role of Distillation as Defense Mechanism [0.0]
The study explores adversarial attacks specifically targeted at Deep Neural Networks (DNNs) utilized for image classification.
The research focuses on comprehending the ramifications of two prominent attack methodologies: the Fast Gradient Sign Method (FGSM) and the Carlini-Wagner (CW) approach.
The study examines the robustness of defensive distillation as a defense mechanism to counter FGSM and CW attacks.
arXiv Detail & Related papers (2024-04-05T17:51:58Z) - Baseline Defenses for Adversarial Attacks Against Aligned Language Models [109.75753454188705]
Recent work shows that text optimizers can produce jailbreaking prompts that bypass moderation and alignment defenses.
We look at three types of defenses: detection (perplexity based), input preprocessing (paraphrase and retokenization), and adversarial training.
We find that the weakness of existing discrete optimizers for text, combined with the relatively high costs of optimization, makes standard adaptive attacks more challenging for LLMs.
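Of the three defense types listed above, perplexity-based detection is the simplest to sketch. The snippet below is a minimal illustration assuming a GPT-2 reference model and a hypothetical threshold; it is not the authors' exact implementation.

```python
import torch
from transformers import GPT2LMHeadModel, GPT2TokenizerFast

PPL_THRESHOLD = 1000.0  # hypothetical; a real deployment would tune this on clean prompts

tokenizer = GPT2TokenizerFast.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2").eval()

def is_suspicious(prompt: str) -> bool:
    """Flag prompts whose perplexity under a reference LM is abnormally high,
    as token sequences produced by discrete optimizers tend to be."""
    enc = tokenizer(prompt, return_tensors="pt")
    with torch.no_grad():
        loss = model(**enc, labels=enc["input_ids"]).loss  # mean negative log-likelihood
    return torch.exp(loss).item() > PPL_THRESHOLD
```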
arXiv Detail & Related papers (2023-09-01T17:59:44Z) - Avoid Adversarial Adaption in Federated Learning by Multi-Metric Investigations [55.2480439325792]
Federated Learning (FL) facilitates decentralized machine learning model training, preserving data privacy, lowering communication costs, and boosting model performance through diversified data sources.
FL faces vulnerabilities such as poisoning attacks, undermining model integrity with both untargeted performance degradation and targeted backdoor attacks.
We define a new notion of strong adaptive adversaries, capable of adapting to multiple objectives simultaneously.
Our proposed defense, MESAS, is the first that is robust against strong adaptive adversaries and effective in real-world data scenarios, with an average overhead of just 24.37 seconds.
arXiv Detail & Related papers (2023-06-06T11:44:42Z) - A Data-Driven Defense against Edge-case Model Poisoning Attacks on Federated Learning [13.89043799280729]
We propose an effective defense against model poisoning attacks in Federated Learning systems.
DataDefense learns a poisoned data detector model which marks each example in the defense dataset as poisoned or clean.
It is able to reduce the attack success rate by at least 40% on standard attack setups and by more than 80% on some setups.
arXiv Detail & Related papers (2023-05-03T10:20:26Z) - Randomness in ML Defenses Helps Persistent Attackers and Hinders
Evaluators [49.52538232104449]
It is becoming increasingly imperative to design robust ML defenses.
Recent work has found that many defenses that initially resist state-of-the-art attacks can be broken by an adaptive adversary.
We take steps to simplify the design of defenses and argue that white-box defenses should eschew randomness when possible.
arXiv Detail & Related papers (2023-02-27T01:33:31Z) - Measuring Equality in Machine Learning Security Defenses: A Case Study
in Speech Recognition [56.69875958980474]
This work considers approaches to defending learned systems and how security defenses result in performance inequities across different sub-populations.
We find that many methods that have been proposed can cause direct harm, like false rejection and unequal benefits from robustness training.
We present a comparison of equality between two rejection-based defenses: randomized smoothing and neural rejection, finding randomized smoothing more equitable due to the sampling mechanism for minority groups.
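For orientation, the sketch below shows the rejection behaviour at the heart of randomized smoothing: classify many noised copies of the input and abstain when the vote is not decisive. The noise level, sample count, and agreement threshold here are assumed values; the certified variant in the literature replaces the fixed threshold with a statistical test.

```python
import numpy as np
from collections import Counter

def smoothed_predict(classify, x, sigma=0.25, n_samples=100, min_agreement=0.6):
    """Majority vote over Gaussian-perturbed copies of x; return None (abstain)
    when the top class does not reach `min_agreement` of the votes."""
    votes = Counter(classify(x + sigma * np.random.randn(*x.shape)) for _ in range(n_samples))
    label, count = votes.most_common(1)[0]
    return label if count / n_samples >= min_agreement else None
```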
arXiv Detail & Related papers (2023-02-17T16:19:26Z) - FLIP: A Provable Defense Framework for Backdoor Mitigation in Federated
Learning [66.56240101249803]
We study how hardening benign clients can affect the global model (and the malicious clients).
We propose a trigger reverse engineering based defense and show that our method can achieve robustness improvement with guarantees.
Our results on eight competing SOTA defense methods show the empirical superiority of our method on both single-shot and continuous FL backdoor attacks.
arXiv Detail & Related papers (2022-10-23T22:24:03Z) - Increasing Confidence in Adversarial Robustness Evaluations [53.2174171468716]
We propose a test to identify weak attacks and thus weak defense evaluations.
Our test slightly modifies a neural network to guarantee the existence of an adversarial example for every sample.
For eleven out of thirteen previously-published defenses, the original evaluation of the defense fails our test, while stronger attacks that break these defenses pass it.
arXiv Detail & Related papers (2022-06-28T13:28:13Z) - Evaluating Gradient Inversion Attacks and Defenses in Federated Learning [43.993693910541275]
This paper evaluates existing gradient inversion attacks and the defenses proposed against them.
We show the trade-offs of privacy leakage and data utility of three proposed defense mechanisms.
Our findings suggest that the state-of-the-art attacks can currently be defended against with minor data utility loss.
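One widely studied defense family in this space is perturbing the shared gradients before they leave the client. The sketch below is a generic illustration of the leakage/utility trade-off noted above, not necessarily one of the three mechanisms the paper evaluates.

```python
import numpy as np

def perturb_gradient(grad, clip_norm=1.0, noise_std=0.01):
    """Clip the client gradient and add Gaussian noise before sharing it.
    Larger `noise_std` makes inversion harder (less leakage) but also hurts
    the utility of the aggregated model -- the trade-off discussed above."""
    norm = np.linalg.norm(grad)
    clipped = grad * min(1.0, clip_norm / (norm + 1e-12))
    return clipped + np.random.normal(0.0, noise_std, size=grad.shape)
```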
arXiv Detail & Related papers (2021-11-30T19:34:16Z) - Back to the Drawing Board: A Critical Evaluation of Poisoning Attacks on Federated Learning [32.88150721857589]
Recent works have indicated that federated learning (FL) is vulnerable to poisoning attacks by compromised clients.
We show that these works make a number of unrealistic assumptions and arrive at somewhat misleading conclusions.
We perform the first critical analysis of poisoning attacks under practical production FL environments.
arXiv Detail & Related papers (2021-08-23T15:29:45Z) - TROJANZOO: Everything you ever wanted to know about neural backdoors (but were afraid to ask) [28.785693760449604]
TROJANZOO is the first open-source platform for evaluating neural backdoor attacks/defenses.
It has 12 representative attacks, 15 state-of-the-art defenses, 6 attack performance metrics, 10 defense utility metrics, as well as rich tools for analysis of attack-defense interactions.
We conduct a systematic study of existing attacks/defenses, leading to a number of interesting findings.
arXiv Detail & Related papers (2020-12-16T22:37:27Z)