Mitigating Backdoor Attacks in Federated Learning
- URL: http://arxiv.org/abs/2011.01767v2
- Date: Thu, 14 Jan 2021 19:53:17 GMT
- Title: Mitigating Backdoor Attacks in Federated Learning
- Authors: Chen Wu, Xian Yang, Sencun Zhu, Prasenjit Mitra
- Abstract summary: We propose a new and effective method to mitigate backdoor attacks after the training phase.
Specifically, we design a federated pruning method to remove redundant neurons in the network and then adjust the model's extreme weight values.
Experiments conducted on distributed Fashion-MNIST show that our method can reduce the average attack success rate from 99.7% to 1.9% with a 5.5% loss of test accuracy.
- Score: 9.582197388445204
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Malicious clients can attack federated learning systems using malicious data,
including backdoor samples, during the training phase. The compromised global
model will perform well on the validation dataset designed for the task, but a
small subset of data with backdoor patterns may trigger the model to make a
wrong prediction. There has been an arms race between attackers who try to
conceal their attacks and defenders who try to detect them during the
server-side aggregation stage of training. In this work, we propose a new and
effective method to mitigate backdoor attacks after the training phase.
Specifically, we design a federated pruning method to remove redundant neurons
in the network and then adjust the model's extreme weight values. Our
experiments conducted on distributed Fashion-MNIST show that our method can
reduce the average attack success rate from 99.7% to 1.9% with a 5.5% loss of
test accuracy on the validation dataset. To reduce the influence of pruning on
test accuracy, we fine-tune the model after pruning; the attack success rate
then drops to 6.4%, with only a 1.7% loss of test accuracy. Further experiments
under Distributed Backdoor Attacks on CIFAR-10 also show promising results: the
average attack success rate drops by more than 70% with less than a 2% loss of
test accuracy on the validation dataset.
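As a rough illustration of the two post-training steps the abstract describes, the sketch below ranks a layer's neurons by average activation on clean data, prunes the most dormant ones, and then clamps extreme weight values. This is a minimal sketch in PyTorch, not the authors' released code: the activation-based ranking, the pruning ratio, and the clipping quantile are assumptions chosen for illustration.

```python
# Minimal sketch of the paper's two post-training steps, assuming a
# PyTorch model; illustrative only, not the authors' released code.
# The pruning ratio and the clipping quantile are assumed values.
import torch
import torch.nn as nn


def average_activation(model: nn.Module, layer: nn.Linear, loader) -> torch.Tensor:
    """Mean absolute activation of each neuron in `layer` on local data.

    In the federated setting, each client runs this on its own data and
    sends only the resulting vector to the server, which averages the
    client vectors into a global ranking of neurons.
    """
    sums, count = torch.zeros(layer.out_features), 0

    def hook(_module, _inputs, output):
        nonlocal count
        sums.add_(output.detach().abs().sum(dim=0))
        count += output.shape[0]

    handle = layer.register_forward_hook(hook)
    with torch.no_grad():
        for x, _ in loader:
            model(x)
    handle.remove()
    return sums / max(count, 1)


def prune_dormant_neurons(layer: nn.Linear, act: torch.Tensor, ratio: float = 0.3) -> None:
    """Zero the `ratio` fraction of neurons with the lowest average
    activation; backdoor behaviour tends to hide in neurons that stay
    dormant on clean data."""
    k = int(ratio * layer.out_features)
    idx = torch.argsort(act)[:k]  # least-activated neurons first
    with torch.no_grad():
        layer.weight[idx] = 0.0
        layer.bias[idx] = 0.0


def clip_extreme_weights(model: nn.Module, q: float = 0.999) -> None:
    """Clamp outlier weight values, which backdoor training often inflates."""
    with torch.no_grad():
        for p in model.parameters():
            bound = float(p.abs().flatten().quantile(q))
            p.clamp_(-bound, bound)
```

Per the abstract, a short round of fine-tuning on clean data after these two steps recovers most of the pruning-induced accuracy loss while keeping the attack success rate low (6.4% ASR at a 1.7% accuracy cost on Fashion-MNIST).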
Related papers
- Filter, Obstruct and Dilute: Defending Against Backdoor Attacks on Semi-Supervised Learning [29.65600202138321]
Recent studies have verified that semi-supervised learning (SSL) is vulnerable to data poisoning backdoor attacks.
This work aims to protect SSL against such risks, marking it as one of the few known efforts in this area.
arXiv Detail & Related papers (2025-02-09T03:22:15Z)
- Persistent Pre-Training Poisoning of LLMs [71.53046642099142]
Our work evaluates for the first time whether language models can also be compromised during pre-training.
We pre-train a series of LLMs from scratch to measure the impact of a potential poisoning adversary.
Our main result is that poisoning only 0.1% of a model's pre-training dataset is sufficient for three out of four attacks to persist through post-training.
arXiv Detail & Related papers (2024-10-17T16:27:13Z)
- Efficient Backdoor Defense in Multimodal Contrastive Learning: A Token-Level Unlearning Method for Mitigating Threats [52.94388672185062]
We propose an efficient defense mechanism against backdoor threats using a concept known as machine unlearning.
This entails strategically creating a small set of poisoned samples to aid the model's rapid unlearning of backdoor vulnerabilities.
In the backdoor unlearning process, we present a novel token-based portion unlearning training regime.
arXiv Detail & Related papers (2024-09-29T02:55:38Z)
- Robust Federated Learning Mitigates Client-side Training Data Distribution Inference Attacks [48.70867241987739]
We propose InferGuard, a novel Byzantine-robust aggregation rule that defends against client-side training data distribution inference attacks.
Our experiments indicate that the defense is highly effective against such attacks.
arXiv Detail & Related papers (2024-03-05T17:41:35Z)
- Can We Trust the Unlabeled Target Data? Towards Backdoor Attack and Defense on Model Adaptation [120.42853706967188]
We explore potential backdoor attacks on model adaptation launched through well-designed poisoning of target data.
We propose a plug-and-play defense named MixAdapt that can be combined with existing adaptation algorithms.
arXiv Detail & Related papers (2024-01-11T16:42:10Z)
- Does Few-shot Learning Suffer from Backdoor Attacks? [63.9864247424967]
We show that few-shot learning can still be vulnerable to backdoor attacks.
Our method demonstrates a high Attack Success Rate (ASR) in FSL tasks with different few-shot learning paradigms.
This study reveals that few-shot learning still suffers from backdoor attacks, and its security should be given attention.
arXiv Detail & Related papers (2023-12-31T06:43:36Z)
- DALA: A Distribution-Aware LoRA-Based Adversarial Attack against Language Models [64.79319733514266]
Adversarial attacks can introduce subtle perturbations to input data.
Recent attack methods can achieve a relatively high attack success rate (ASR).
We propose a Distribution-Aware LoRA-based Adversarial Attack (DALA) method.
arXiv Detail & Related papers (2023-11-14T23:43:47Z)
- Towards Understanding How Self-training Tolerates Data Backdoor Poisoning [11.817302291033725]
We explore the potential of self-training via additional unlabeled data for mitigating backdoor attacks.
We find that the new self-training regime helps defend against backdoor attacks to a great extent.
arXiv Detail & Related papers (2023-01-20T16:36:45Z)
- FLIP: A Provable Defense Framework for Backdoor Mitigation in Federated Learning [66.56240101249803]
We study how hardening benign clients can affect the global model (and the malicious clients).
We propose a trigger reverse engineering based defense and show that our method achieves improvement with guaranteed robustness.
Our results on eight competing SOTA defense methods show the empirical superiority of our method on both single-shot and continuous FL backdoor attacks.
arXiv Detail & Related papers (2022-10-23T22:24:03Z)
- Efficient Adversarial Training With Data Pruning [26.842714298874192]
We show that data pruning leads to improvements in convergence and reliability of adversarial training.
In some settings, data pruning brings the best of both worlds: it improves adversarial accuracy while reducing training time.
arXiv Detail & Related papers (2022-07-01T23:54:46Z)
- DAD: Data-free Adversarial Defense at Test Time [21.741026088202126]
Deep models are highly susceptible to adversarial attacks.
Privacy has become an important concern, restricting access to only trained models but not the training data.
We propose the completely novel problem of 'test-time adversarial defense in the absence of training data and even their statistics'.
arXiv Detail & Related papers (2022-04-04T15:16:13Z)
- BaFFLe: Backdoor detection via Feedback-based Federated Learning [3.6895394817068357]
We propose Backdoor detection via Feedback-based Federated Learning (BAFFLE); a minimal sketch of such a feedback loop follows this list.
We show that BAFFLE reliably detects state-of-the-art backdoor attacks with a detection accuracy of 100% and a false-positive rate below 5%.
arXiv Detail & Related papers (2020-11-04T07:44:51Z)
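For the BaFFLe entry above, the following is a loose sketch of what a feedback-based detection round can look like: clients report per-class error rates of each candidate global model on their local data, and the server adopts the update only if enough clients see no abnormal shift relative to earlier rounds. The helper names (e.g. `model.predict`), the deviation threshold, and the quorum rule are illustrative assumptions, not the paper's actual API or exact algorithm.

```python
# Loose sketch of a BAFFLE-style feedback round. Clients report per-class
# error rates of the candidate global model on local data; the server
# adopts the model only if enough clients see no abnormal shift relative
# to earlier rounds. `model.predict`, the deviation threshold, and the
# quorum are illustrative assumptions, not the paper's exact algorithm.
import numpy as np


def per_class_errors(model, local_data, num_classes: int) -> np.ndarray:
    """Client-side: fraction of misclassified samples per true class."""
    errors = np.zeros(num_classes)
    counts = np.zeros(num_classes)
    for x, y in local_data:
        counts[y] += 1
        if model.predict(x) != y:  # `predict` is an assumed client API
            errors[y] += 1
    return errors / np.maximum(counts, 1)


def client_vote(history: list, current: np.ndarray, tol: float = 3.0) -> bool:
    """Accept the update unless some class's error rate jumps by more
    than `tol` standard deviations of its historical values."""
    hist = np.stack(history)  # shape: (past_rounds, num_classes)
    mu, sigma = hist.mean(axis=0), hist.std(axis=0) + 1e-8
    return bool(np.all(np.abs(current - mu) <= tol * sigma))


def server_decision(votes: list, quorum: float = 0.5) -> bool:
    """Adopt the new global model only if a quorum of clients accepts it."""
    return sum(votes) >= quorum * len(votes)
```

The intuition behind such a loop is that a backdoored update usually degrades accuracy on some classes for at least a subset of clients, which surfaces as an outlier in the reported error profiles.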
This list is automatically generated from the titles and abstracts of the papers in this site.