TrojanDam: Detection-Free Backdoor Defense in Federated Learning through Proactive Model Robustification utilizing OOD Data
- URL: http://arxiv.org/abs/2504.15674v1
- Date: Tue, 22 Apr 2025 07:56:51 GMT
- Title: TrojanDam: Detection-Free Backdoor Defense in Federated Learning through Proactive Model Robustification utilizing OOD Data
- Authors: Yanbo Dai, Songze Li, Zihan Gan, Xueluan Gong
- Abstract summary: Decentralization allows adversaries to craft carefully designed backdoor updates that make the global model misclassify. Existing defense mechanisms mainly rely on post-training detection after receiving updates. We propose a backdoor defense paradigm that focuses on proactively robustifying the global model.
- Score: 9.078056680217465
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Federated learning (FL) systems allow decentralized data-owning clients to jointly train a global model by uploading their locally trained updates to a centralized server. This decentralization enables adversaries to craft carefully designed backdoor updates that make the global model misclassify only when encountering adversary-chosen triggers. Existing defense mechanisms mainly rely on post-training detection after receiving updates. These methods either fail to identify updates that are deliberately fabricated to be statistically close to benign ones, or show inconsistent performance across different FL training stages. The effects of unfiltered backdoor updates accumulate in the global model and eventually become functional. Given the difficulty of ruling out every backdoor update, we propose a backdoor defense paradigm that focuses on proactively robustifying the global model against potential backdoor attacks. We first reveal that backdoor attacks in FL succeed because malicious and benign updates do not conflict on the redundant neurons of ML models. We then prove the feasibility of activating redundant neurons using out-of-distribution (OOD) samples in centralized settings, and migrate this idea to FL settings to propose a novel backdoor defense mechanism, TrojanDam. The proposed mechanism has the FL server continuously inject fresh OOD mappings into the global model to activate redundant neurons, canceling the effect of backdoor updates during aggregation. We conduct systematic and extensive experiments to illustrate the superior performance of TrojanDam over several SOTA backdoor defense methods across a wide range of FL settings.
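To make the described mechanism concrete, below is a minimal, hypothetical PyTorch sketch of the server-side step the abstract outlines: before aggregation, the server fine-tunes a copy of the global model on an OOD batch with freshly assigned labels (so otherwise-redundant neurons carry a benign mapping), then averages this robustification update together with the client updates. The function names (`inject_ood_mapping`, `aggregate`), the random-tensor OOD batch, and all hyperparameters are illustrative assumptions, not the paper's released implementation.

```python
# Hypothetical sketch of the server-side idea described in the abstract.
# Nothing here is the authors' code; names and hyperparameters are assumptions.

import copy
import torch
import torch.nn as nn
import torch.nn.functional as F


def inject_ood_mapping(global_model, ood_inputs, num_classes, lr=0.01, steps=5):
    """Fine-tune a copy of the global model on OOD samples with fresh labels."""
    model = copy.deepcopy(global_model)
    # Randomly assigned labels give the OOD samples a benign mapping that
    # occupies neurons the main task leaves unused ("redundant neurons").
    ood_labels = torch.randint(0, num_classes, (ood_inputs.size(0),))
    opt = torch.optim.SGD(model.parameters(), lr=lr)
    model.train()
    for _ in range(steps):
        opt.zero_grad()
        loss = F.cross_entropy(model(ood_inputs), ood_labels)
        loss.backward()
        opt.step()
    return model.state_dict()


def aggregate(global_model, client_states, server_state, server_weight=1.0):
    """Weighted average of client updates plus the server's OOD update."""
    states = client_states + [server_state]
    weights = [1.0] * len(client_states) + [server_weight]
    total = sum(weights)
    new_state = {}
    for key in global_model.state_dict():
        new_state[key] = sum(w * s[key] for w, s in zip(weights, states)) / total
    global_model.load_state_dict(new_state)
    return global_model


if __name__ == "__main__":
    # Toy setup: a small classifier, identical "client updates", and random
    # tensors standing in for an OOD batch.
    model = nn.Sequential(nn.Flatten(), nn.Linear(28 * 28, 64),
                          nn.ReLU(), nn.Linear(64, 10))
    clients = [copy.deepcopy(model).state_dict() for _ in range(3)]
    ood_batch = torch.randn(32, 1, 28, 28)  # placeholder OOD samples
    server_state = inject_ood_mapping(model, ood_batch, num_classes=10)
    model = aggregate(model, clients, server_state)
```

This is only a sketch of the "inject fresh OOD mappings, then aggregate" idea; the paper's actual choice of OOD data, label assignment, and aggregation weighting may differ.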
Related papers
- REFINE: Inversion-Free Backdoor Defense via Model Reprogramming [60.554146386198376]
Backdoor attacks on deep neural networks (DNNs) have emerged as a significant security threat. We propose REFINE, an inversion-free backdoor defense method based on model reprogramming.
arXiv Detail & Related papers (2025-02-22T07:29:12Z) - BadSFL: Backdoor Attack against Scaffold Federated Learning [16.104941796138128]
Federated learning (FL) enables the training of deep learning models on distributed clients to preserve data privacy.
BadSFL is a novel backdoor attack method designed for the FL framework using the scaffold aggregation algorithm in non-IID settings.
BadSFL remains effective in the global model for over 60 rounds, up to 3 times longer than existing baseline attacks.
arXiv Detail & Related papers (2024-11-25T07:46:57Z) - Securing Federated Learning Against Novel and Classic Backdoor Threats During Foundation Model Integration [8.191214701984162]
Federated learning (FL) enables decentralized model training while preserving privacy.
Recently, integrating Foundation Models (FMs) into FL has boosted performance but also introduced a novel backdoor attack mechanism.
We propose a novel data-free defense strategy by constraining abnormal activations in the hidden feature space during model aggregation on the server.
arXiv Detail & Related papers (2024-10-23T05:54:41Z) - Efficient Backdoor Defense in Multimodal Contrastive Learning: A Token-Level Unlearning Method for Mitigating Threats [52.94388672185062]
We propose an efficient defense mechanism against backdoor threats using a concept known as machine unlearning.
This entails strategically creating a small set of poisoned samples to aid the model's rapid unlearning of backdoor vulnerabilities.
In the backdoor unlearning process, we present a novel token-based portion unlearning training regime.
arXiv Detail & Related papers (2024-09-29T02:55:38Z) - BackdoorIndicator: Leveraging OOD Data for Proactive Backdoor Detection in Federated Learning [7.528642177161784]
In a federated learning (FL) system, decentralized data owners (clients) could upload their locally trained models to a central server, to jointly train a global model.
Malicious clients may plant backdoors into the global model through uploading poisoned local models, causing misclassification to a target class when encountering attacker-defined triggers.
Existing backdoor defenses show inconsistent performance under different system and adversarial settings, especially when the malicious updates are made statistically close to the benign ones.
We propose a novel proactive backdoor detection mechanism for FL named BackdoorIndicator, which has the server inject indicator tasks into the global model leveraging out-of-distribution (OOD) data.
arXiv Detail & Related papers (2024-05-31T14:44:57Z) - Concealing Backdoor Model Updates in Federated Learning by Trigger-Optimized Data Poisoning [20.69655306650485]
Federated Learning (FL) is a decentralized machine learning method that enables participants to collaboratively train a model without sharing their private data.
Despite its privacy and scalability benefits, FL is susceptible to backdoor attacks.
We propose DPOT, a backdoor attack strategy in FL that dynamically constructs backdoor objectives by optimizing a backdoor trigger.
arXiv Detail & Related papers (2024-05-10T02:44:25Z) - Avoid Adversarial Adaption in Federated Learning by Multi-Metric
Investigations [55.2480439325792]
Federated Learning (FL) facilitates decentralized machine learning model training, preserving data privacy, lowering communication costs, and boosting model performance through diversified data sources.
FL faces vulnerabilities such as poisoning attacks, undermining model integrity with both untargeted performance degradation and targeted backdoor attacks.
We define a new notion of strong adaptive adversaries, capable of adapting to multiple objectives simultaneously.
MESAS is the first defense robust against strong adaptive adversaries, effective in real-world data scenarios, with an average overhead of just 24.37 seconds.
arXiv Detail & Related papers (2023-06-06T11:44:42Z) - Backdoor Defense via Suppressing Model Shortcuts [91.30995749139012]
In this paper, we explore the backdoor mechanism from the angle of the model structure.
We demonstrate that the attack success rate (ASR) decreases significantly when reducing the outputs of some key skip connections.
arXiv Detail & Related papers (2022-11-02T15:39:19Z) - Technical Report: Assisting Backdoor Federated Learning with Whole
Population Knowledge Alignment [4.87359365320076]
A single-shot backdoor attack achieves high accuracy on both the main task and the backdoor sub-task when injected at FL model convergence.
We propose a two-phase backdoor attack, in which a preliminary phase prepares for the subsequent backdoor injection.
Benefiting from the preliminary phase, the later-injected backdoor achieves better effectiveness, as its effect is less likely to be diluted by normal model updates.
arXiv Detail & Related papers (2022-07-25T16:38:31Z) - Neurotoxin: Durable Backdoors in Federated Learning [73.82725064553827]
Federated learning systems have an inherent vulnerability to adversarial backdoor attacks during training.
We propose Neurotoxin, a simple one-line modification to existing backdoor attacks that acts by attacking parameters that are changed less in magnitude during training.
arXiv Detail & Related papers (2022-06-12T16:52:52Z) - Black-box Detection of Backdoor Attacks with Limited Information and
Data [56.0735480850555]
We propose a black-box backdoor detection (B3D) method to identify backdoor attacks with only query access to the model.
In addition to backdoor detection, we also propose a simple strategy for reliable predictions using the identified backdoored models.
arXiv Detail & Related papers (2021-03-24T12:06:40Z)
This list is automatically generated from the titles and abstracts of the papers in this site.