Coward: Toward Practical Proactive Federated Backdoor Defense via Collision-based Watermark
- URL: http://arxiv.org/abs/2508.02115v1
- Date: Mon, 04 Aug 2025 06:51:33 GMT
- Title: Coward: Toward Practical Proactive Federated Backdoor Defense via Collision-based Watermark
- Authors: Wenjie Li, Siying Gu, Yiming Li, Kangjie Chen, Zhili Chen, Tianwei Zhang, Shu-Tao Xia, Dacheng Tao,
- Abstract summary: We introduce a new proactive defense, dubbed Coward, inspired by our discovery of multi-backdoor collision effects.<n>In general, we detect attackers by evaluating whether the server-injected, conflicting global watermark is erased during local training rather than retained.
- Score: 90.94234374893287
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Backdoor detection is currently the mainstream defense against backdoor attacks in federated learning (FL), where malicious clients upload poisoned updates that compromise the global model and undermine the reliability of FL deployments. Existing backdoor detection techniques fall into two categories, including passive and proactive ones, depending on whether the server proactively modifies the global model. However, both have inherent limitations in practice: passive defenses are vulnerable to common non-i.i.d. data distributions and random participation of FL clients, whereas current proactive defenses suffer inevitable out-of-distribution (OOD) bias because they rely on backdoor co-existence effects. To address these issues, we introduce a new proactive defense, dubbed Coward, inspired by our discovery of multi-backdoor collision effects, in which consecutively planted, distinct backdoors significantly suppress earlier ones. In general, we detect attackers by evaluating whether the server-injected, conflicting global watermark is erased during local training rather than retained. Our method preserves the advantages of proactive defenses in handling data heterogeneity (\ie, non-i.i.d. data) while mitigating the adverse impact of OOD bias through a revised detection mechanism. Extensive experiments on benchmark datasets confirm the effectiveness of Coward and its resilience to potential adaptive attacks. The code for our method would be available at https://github.com/still2009/cowardFL.
Related papers
- Client-Side Patching against Backdoor Attacks in Federated Learning [0.0]
Federated learning is vulnerable to backdoor attacks launched by malicious participants.<n>We propose a novel defense mechanism for federated learning systems designed to mitigate backdoor attacks on the clients-side.<n>Our approach leverages adversarial learning techniques and model patching to neutralize the impact of backdoor attacks.
arXiv Detail & Related papers (2024-12-13T23:17:10Z) - Efficient Backdoor Defense in Multimodal Contrastive Learning: A Token-Level Unlearning Method for Mitigating Threats [52.94388672185062]
We propose an efficient defense mechanism against backdoor threats using a concept known as machine unlearning.
This entails strategically creating a small set of poisoned samples to aid the model's rapid unlearning of backdoor vulnerabilities.
In the backdoor unlearning process, we present a novel token-based portion unlearning training regime.
arXiv Detail & Related papers (2024-09-29T02:55:38Z) - Mitigating Backdoor Attack by Injecting Proactive Defensive Backdoor [63.84477483795964]
Data-poisoning backdoor attacks are serious security threats to machine learning models.
In this paper, we focus on in-training backdoor defense, aiming to train a clean model even when the dataset may be potentially poisoned.
We propose a novel defense approach called PDB (Proactive Defensive Backdoor)
arXiv Detail & Related papers (2024-05-25T07:52:26Z) - Concealing Backdoor Model Updates in Federated Learning by Trigger-Optimized Data Poisoning [20.69655306650485]
Federated Learning (FL) is a decentralized machine learning method that enables participants to collaboratively train a model without sharing their private data.
Despite its privacy and scalability benefits, FL is susceptible to backdoor attacks.
We propose DPOT, a backdoor attack strategy in FL that dynamically constructs backdoor objectives by optimizing a backdoor trigger.
arXiv Detail & Related papers (2024-05-10T02:44:25Z) - BadCLIP: Dual-Embedding Guided Backdoor Attack on Multimodal Contrastive
Learning [85.2564206440109]
This paper reveals the threats in this practical scenario that backdoor attacks can remain effective even after defenses.
We introduce the emphtoolns attack, which is resistant to backdoor detection and model fine-tuning defenses.
arXiv Detail & Related papers (2023-11-20T02:21:49Z) - Avoid Adversarial Adaption in Federated Learning by Multi-Metric
Investigations [55.2480439325792]
Federated Learning (FL) facilitates decentralized machine learning model training, preserving data privacy, lowering communication costs, and boosting model performance through diversified data sources.
FL faces vulnerabilities such as poisoning attacks, undermining model integrity with both untargeted performance degradation and targeted backdoor attacks.
We define a new notion of strong adaptive adversaries, capable of adapting to multiple objectives simultaneously.
MESAS is the first defense robust against strong adaptive adversaries, effective in real-world data scenarios, with an average overhead of just 24.37 seconds.
arXiv Detail & Related papers (2023-06-06T11:44:42Z) - FedGrad: Mitigating Backdoor Attacks in Federated Learning Through Local
Ultimate Gradients Inspection [3.3711670942444014]
Federated learning (FL) enables multiple clients to train a model without compromising sensitive data.
The decentralized nature of FL makes it susceptible to adversarial attacks, especially backdoor insertion during training.
We propose FedGrad, a backdoor-resistant defense for FL that is resistant to cutting-edge backdoor attacks.
arXiv Detail & Related papers (2023-04-29T19:31:44Z) - FLIP: A Provable Defense Framework for Backdoor Mitigation in Federated
Learning [66.56240101249803]
We study how hardening benign clients can affect the global model (and the malicious clients)
We propose a trigger reverse engineering based defense and show that our method can achieve improvement with guarantee robustness.
Our results on eight competing SOTA defense methods show the empirical superiority of our method on both single-shot and continuous FL backdoor attacks.
arXiv Detail & Related papers (2022-10-23T22:24:03Z) - BaFFLe: Backdoor detection via Feedback-based Federated Learning [3.6895394817068357]
We propose Backdoor detection via Feedback-based Federated Learning (BAFFLE)
We show that BAFFLE reliably detects state-of-the-art backdoor attacks with a detection accuracy of 100% and a false-positive rate below 5%.
arXiv Detail & Related papers (2020-11-04T07:44:51Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.