FedBAP: Backdoor Defense via Benign Adversarial Perturbation in Federated Learning
- URL: http://arxiv.org/abs/2507.21177v1
- Date: Sat, 26 Jul 2025 07:00:32 GMT
- Title: FedBAP: Backdoor Defense via Benign Adversarial Perturbation in Federated Learning
- Authors: Xinhai Yan, Libing Wu, Zhuangzhuang Zhang, Bingyi Liu, Lijuan Huo, Jing Wang
- Abstract summary: Federated Learning (FL) enables collaborative model training while preserving data privacy. Most existing defense methods in FL have limited effectiveness due to their neglect of the model's over-reliance on backdoor triggers. We propose FedBAP, a novel framework for mitigating backdoor attacks in FL by reducing the model's reliance on backdoor triggers.
- Score: 9.79790722384987
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Federated Learning (FL) enables collaborative model training while preserving data privacy, but it is highly vulnerable to backdoor attacks. Most existing defense methods in FL have limited effectiveness due to their neglect of the model's over-reliance on backdoor triggers, particularly as the proportion of malicious clients increases. In this paper, we propose FedBAP, a novel defense framework for mitigating backdoor attacks in FL by reducing the model's reliance on backdoor triggers. Specifically, first, we propose a perturbed trigger generation mechanism that creates perturbation triggers precisely matching backdoor triggers in location and size, ensuring strong influence on model outputs. Second, we utilize these perturbation triggers to generate benign adversarial perturbations that disrupt the model's dependence on backdoor triggers while forcing it to learn more robust decision boundaries. Finally, we design an adaptive scaling mechanism to dynamically adjust perturbation intensity, effectively balancing defense strength and model performance. The experimental results demonstrate that FedBAP reduces the attack success rates by 0.22%-5.34%, 0.48%-6.34%, and 97.22%-97.6% under three types of backdoor attacks, respectively. In particular, FedBAP demonstrates outstanding performance against novel backdoor attacks.
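Reading the abstract alone, FedBAP's three components can be pictured as: a perturbation confined to the backdoor trigger's location and size, an adversarial optimization of that perturbation against the clean labels, and an adaptive scale factor that trades defense strength against accuracy. The PyTorch snippet below is a minimal sketch of that reading; the helper names (`trigger_mask`, `benign_adversarial_perturbation`, `adaptive_scale`) and all hyperparameters are illustrative assumptions, not the authors' implementation.

```python
# Minimal sketch of the three components described in the FedBAP abstract.
# Assumptions: the trigger region is known (top-left 4x4 patch here), the
# perturbation is found by a small PGD-style loop, and the adaptive scale is
# a simple function of clean accuracy.
import torch
import torch.nn as nn
import torch.nn.functional as F

def trigger_mask(img_shape, top, left, size):
    """Binary mask matching the backdoor trigger's assumed location and size."""
    mask = torch.zeros(img_shape)
    mask[:, top:top + size, left:left + size] = 1.0
    return mask

def benign_adversarial_perturbation(model, x, y, mask, epsilon=0.1, n_steps=5):
    """PGD-style perturbation restricted to the trigger region.

    The perturbation is 'benign' in the sense that it is later paired with the
    correct labels, so training on it pushes the model to ignore whatever
    occupies the trigger region instead of relying on it."""
    delta = torch.zeros_like(x, requires_grad=True)
    for _ in range(n_steps):
        loss = F.cross_entropy(model(x + delta * mask), y)
        grad, = torch.autograd.grad(loss, delta)
        delta = (delta + (epsilon / n_steps) * grad.sign()).clamp(-epsilon, epsilon)
        delta = delta.detach().requires_grad_(True)
    return (delta * mask).detach()

def adaptive_scale(clean_acc, target_acc=0.9):
    """Shrink perturbation intensity when clean accuracy starts to degrade."""
    return min(1.0, max(0.1, clean_acc / target_acc))

# Toy usage on random CIFAR-10-sized data.
model = nn.Sequential(nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(),
                      nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(16, 10))
opt = torch.optim.SGD(model.parameters(), lr=0.01)
x, y = torch.rand(8, 3, 32, 32), torch.randint(0, 10, (8,))
mask = trigger_mask((3, 32, 32), top=0, left=0, size=4)

alpha = adaptive_scale(clean_acc=0.85)                 # assumed accuracy estimate
delta = benign_adversarial_perturbation(model, x, y, mask)
loss = F.cross_entropy(model(x + alpha * delta), y)    # perturbed inputs, clean labels
opt.zero_grad()
loss.backward()
opt.step()
```

Training on perturbed inputs that keep their clean labels is what would force the model to stop keying on the trigger region, which is the behavior the abstract attributes to the benign adversarial perturbations.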
Related papers
- Coward: Toward Practical Proactive Federated Backdoor Defense via Collision-based Watermark [90.94234374893287]
We introduce a new proactive defense, dubbed Coward, inspired by our discovery of multi-backdoor collision effects. In general, we detect attackers by evaluating whether the server-injected, conflicting global watermark is erased during local training rather than retained.
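As a rough sketch of the detection rule described above (not the paper's actual procedure), the server can score each returned client model by how much of the injected watermark survives local training and flag clients whose training erased it; the helper names and the threshold below are assumptions.

```python
# Illustrative watermark-erasure check: a client whose local training erased
# the server-injected, conflicting watermark is flagged as a likely attacker.
import torch

@torch.no_grad()
def watermark_retention(model, wm_inputs, wm_labels):
    """Fraction of watermark key samples still classified with the
    server-assigned (conflicting) labels after local training."""
    preds = model(wm_inputs).argmax(dim=1)
    return (preds == wm_labels).float().mean().item()

def flag_malicious(client_models, wm_inputs, wm_labels, threshold=0.5):
    """client_models: dict of client id -> locally trained model."""
    flagged = []
    for cid, model in client_models.items():
        if watermark_retention(model, wm_inputs, wm_labels) < threshold:
            flagged.append(cid)   # watermark erased -> possible backdoor collision
    return flagged
```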
arXiv Detail & Related papers (2025-08-04T06:51:33Z) - An Effective and Resilient Backdoor Attack Framework against Deep Neural Networks and Vision Transformers [22.77836113915616]
We propose a novel attention-based mask generation methodology that searches for the optimal trigger shape and location. We also introduce a Quality-of-Experience term into the loss function and carefully adjust the transparency value of the trigger. Our proposed backdoor attack framework also showcases robustness against state-of-the-art backdoor defenses.
arXiv Detail & Related papers (2024-12-09T02:03:27Z) - DeTrigger: A Gradient-Centric Approach to Backdoor Attack Mitigation in Federated Learning [8.745529957589039]
Federated Learning (FL) enables collaborative model training across distributed devices while preserving local data privacy, making it ideal for mobile and embedded systems. However, the decentralized nature of FL also opens vulnerabilities to model poisoning attacks, particularly backdoor attacks. We propose DeTrigger, a scalable and efficient backdoor-robust federated learning framework.
arXiv Detail & Related papers (2024-11-19T04:12:14Z) - Efficient Backdoor Defense in Multimodal Contrastive Learning: A Token-Level Unlearning Method for Mitigating Threats [52.94388672185062]
We propose an efficient defense mechanism against backdoor threats using a concept known as machine unlearning.
This entails strategically creating a small set of poisoned samples to aid the model's rapid unlearning of backdoor vulnerabilities.
In the backdoor unlearning process, we present a novel token-based portion unlearning training regime.
arXiv Detail & Related papers (2024-09-29T02:55:38Z) - BEEAR: Embedding-based Adversarial Removal of Safety Backdoors in Instruction-tuned Language Models [57.5404308854535]
Safety backdoor attacks in large language models (LLMs) enable the stealthy triggering of unsafe behaviors while evading detection during normal interactions.
We present BEEAR, a mitigation approach leveraging the insight that backdoor triggers induce relatively uniform drifts in the model's embedding space.
Our bi-level optimization method identifies universal embedding perturbations that elicit unwanted behaviors and adjusts the model parameters to reinforce safe behaviors against these perturbations.
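A loose analogue of that bi-level loop on a toy classifier (the paper targets instruction-tuned LLMs, so the model, losses, and scale below are stand-in assumptions): the inner step optimizes a universal embedding-space perturbation toward the unwanted behavior, and the outer step updates the model to stay safe under it.

```python
# Toy bi-level loop: class 1 stands in for "unsafe behavior", class 0 for
# "safe behavior"; delta is a universal perturbation added in embedding space.
import torch
import torch.nn as nn
import torch.nn.functional as F

embed_dim, n_classes = 32, 2
model = nn.Sequential(nn.Linear(embed_dim, 64), nn.ReLU(), nn.Linear(64, n_classes))
model_opt = torch.optim.Adam(model.parameters(), lr=1e-3)

embeddings = torch.randn(16, embed_dim)          # stand-in for prompt embeddings
safe_labels = torch.zeros(16, dtype=torch.long)
unsafe_labels = torch.ones(16, dtype=torch.long)

delta = torch.zeros(embed_dim, requires_grad=True)   # universal perturbation
delta_opt = torch.optim.Adam([delta], lr=1e-2)

for step in range(100):
    # Inner step: make the perturbation elicit the unwanted behavior.
    delta_opt.zero_grad()
    F.cross_entropy(model(embeddings + delta), unsafe_labels).backward()
    delta_opt.step()

    # Outer step: reinforce safe behavior under the current perturbation.
    model_opt.zero_grad()
    F.cross_entropy(model(embeddings + delta.detach()), safe_labels).backward()
    model_opt.step()
```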
arXiv Detail & Related papers (2024-06-24T19:29:47Z) - Concealing Backdoor Model Updates in Federated Learning by Trigger-Optimized Data Poisoning [20.69655306650485]
Federated Learning (FL) is a decentralized machine learning method that enables participants to collaboratively train a model without sharing their private data.
Despite its privacy and scalability benefits, FL is susceptible to backdoor attacks.
We propose DPOT, a backdoor attack strategy in FL that dynamically constructs backdoor objectives by optimizing a backdoor trigger.
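The trigger-optimization idea can be illustrated as follows: before each round, the malicious client re-optimizes its trigger patch against the current global model so that stamped inputs map to the attack target label. The patch placement, optimizer, and step counts below are assumptions for illustration, not DPOT's actual objective construction.

```python
# Illustrative round-wise trigger optimization against the current global model.
import torch
import torch.nn.functional as F

def optimize_trigger(global_model, x, target_label, patch_size=4, steps=50, lr=0.1):
    for p in global_model.parameters():
        p.requires_grad_(False)                   # only the trigger is optimized
    trigger = torch.zeros(x.shape[1], patch_size, patch_size, requires_grad=True)
    opt = torch.optim.Adam([trigger], lr=lr)
    targets = torch.full((x.shape[0],), target_label, dtype=torch.long)
    for _ in range(steps):
        poisoned = x.clone()
        poisoned[:, :, :patch_size, :patch_size] = trigger   # stamp the patch
        opt.zero_grad()
        F.cross_entropy(global_model(poisoned), targets).backward()
        opt.step()
    return trigger.detach().clamp(0, 1)
```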
arXiv Detail & Related papers (2024-05-10T02:44:25Z) - G$^2$uardFL: Safeguarding Federated Learning Against Backdoor Attacks through Attributed Client Graph Clustering [116.4277292854053]
Federated Learning (FL) offers collaborative model training without data sharing.
FL is vulnerable to backdoor attacks, where poisoned model weights lead to compromised system integrity.
We present G$^2$uardFL, a protective framework that reinterprets the identification of malicious clients as an attributed graph clustering problem.
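One way to picture the attributed-graph-clustering view is to build a client similarity graph from the submitted model updates and split it into a benign cluster and a suspect cluster. The sketch below (cosine-similarity affinity plus scikit-learn spectral clustering, with the smaller cluster treated as suspect) is only an assumed instantiation of that idea, not G$^2$uardFL's algorithm.

```python
# Illustrative only: cluster clients by the similarity of their flattened
# model updates and flag members of the smaller cluster.
import torch
import torch.nn.functional as F
from sklearn.cluster import SpectralClustering

def suspect_clients(client_updates):
    """client_updates: list of 1-D tensors (flattened model deltas)."""
    normed = F.normalize(torch.stack(client_updates), dim=1)
    affinity = ((normed @ normed.T + 1.0) / 2.0).numpy()   # map [-1, 1] -> [0, 1]
    labels = SpectralClustering(n_clusters=2, affinity="precomputed",
                                random_state=0).fit_predict(affinity)
    benign = 0 if (labels == 0).sum() >= (labels == 1).sum() else 1
    return [i for i, lab in enumerate(labels) if lab != benign]

# Toy usage: ten independent "benign" updates plus two "malicious" updates
# sharing a common shift.
updates = [torch.randn(100) for _ in range(10)] + [torch.randn(100) + 3.0 for _ in range(2)]
print(suspect_clients(updates))
```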
arXiv Detail & Related papers (2023-06-08T07:15:04Z) - Avoid Adversarial Adaption in Federated Learning by Multi-Metric Investigations [55.2480439325792]
Federated Learning (FL) facilitates decentralized machine learning model training, preserving data privacy, lowering communication costs, and boosting model performance through diversified data sources.
FL faces vulnerabilities such as poisoning attacks, undermining model integrity with both untargeted performance degradation and targeted backdoor attacks.
We define a new notion of strong adaptive adversaries, capable of adapting to multiple objectives simultaneously.
We propose MESAS, the first defense robust against strong adaptive adversaries; it is effective in real-world data scenarios with an average overhead of just 24.37 seconds.
arXiv Detail & Related papers (2023-06-06T11:44:42Z) - Backdoor Attack with Sparse and Invisible Trigger [57.41876708712008]
Deep neural networks (DNNs) are vulnerable to backdoor attacks.
The backdoor attack is an emerging yet serious training-phase threat.
We propose a sparse and invisible backdoor attack (SIBA).
arXiv Detail & Related papers (2023-05-11T10:05:57Z) - FedGrad: Mitigating Backdoor Attacks in Federated Learning Through Local Ultimate Gradients Inspection [3.3711670942444014]
Federated learning (FL) enables multiple clients to train a model without compromising sensitive data.
The decentralized nature of FL makes it susceptible to adversarial attacks, especially backdoor insertion during training.
We propose FedGrad, a defense for FL that is resistant to cutting-edge backdoor attacks.
arXiv Detail & Related papers (2023-04-29T19:31:44Z) - Mitigating Backdoors in Federated Learning with FLD [7.908496863030483]
Federated learning allows clients to collaboratively train a global model without uploading raw data for privacy preservation.
This feature has recently been found to be responsible for federated learning's vulnerability to backdoor attacks.
We propose Federated Layer Detection (FLD), a novel model filtering approach for effectively defending against backdoor attacks.
arXiv Detail & Related papers (2023-03-01T07:54:54Z) - FLIP: A Provable Defense Framework for Backdoor Mitigation in Federated Learning [66.56240101249803]
We study how hardening benign clients can affect the global model (and the malicious clients).
We propose a trigger reverse engineering based defense and show that our method can achieve improvement with guaranteed robustness.
Our results on eight competing SOTA defense methods show the empirical superiority of our method on both single-shot and continuous FL backdoor attacks.
arXiv Detail & Related papers (2022-10-23T22:24:03Z)