Does Differential Privacy Prevent Backdoor Attacks in Practice?
- URL: http://arxiv.org/abs/2311.06227v1
- Date: Fri, 10 Nov 2023 18:32:08 GMT
- Title: Does Differential Privacy Prevent Backdoor Attacks in Practice?
- Authors: Fereshteh Razmi, Jian Lou, and Li Xiong
- Abstract summary: We investigate the effectiveness of Differential Privacy techniques in preventing backdoor attacks in machine learning models.
We propose Label-DP as a faster and more accurate alternative to DP-SGD and PATE.
- Score: 8.951356689083166
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Differential Privacy (DP) was originally developed to protect privacy.
However, it has recently been utilized to secure machine learning (ML) models
from poisoning attacks, with DP-SGD receiving substantial attention.
Nevertheless, a thorough investigation is required to assess the effectiveness
of different DP techniques in preventing backdoor attacks in practice. In this
paper, we investigate the effectiveness of DP-SGD and, for the first time in
the literature, examine PATE in the context of backdoor attacks. We also explore
the role of different components of DP algorithms in defending against backdoor
attacks and show that PATE is effective against these attacks due to the
bagging structure of the teacher models it employs. Our experiments reveal that
hyperparameters and the number of backdoors in the training dataset impact the
success of DP algorithms. Additionally, we propose Label-DP as a faster and
more accurate alternative to DP-SGD and PATE. We conclude that while Label-DP
algorithms generally offer weaker privacy protection, accurate hyperparameter
tuning can make them more effective than DP-SGD and PATE in defending against
backdoor attacks while maintaining model accuracy.
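For context on the core mechanism under evaluation, the following is a minimal, illustrative sketch of a single DP-SGD update (per-example gradient clipping followed by Gaussian noise). The function and parameter names (clip_norm, noise_multiplier) are generic placeholders and do not reflect the configuration used in the paper.
```python
import numpy as np

def dp_sgd_step(params, per_example_grads, clip_norm=1.0,
                noise_multiplier=1.1, lr=0.05, rng=None):
    """Sketch of one DP-SGD step: clip each per-example gradient,
    average, add calibrated Gaussian noise, then apply plain SGD."""
    rng = rng or np.random.default_rng()
    clipped = []
    for g in per_example_grads:
        norm = np.linalg.norm(g)
        clipped.append(g * min(1.0, clip_norm / (norm + 1e-12)))  # bound each example's influence
    noise = rng.normal(0.0, noise_multiplier * clip_norm, size=params.shape)
    noisy_grad = (np.sum(clipped, axis=0) + noise) / len(clipped)  # noisy averaged gradient
    return params - lr * noisy_grad                                # standard SGD update

# Toy usage: 32 per-example gradients for a 10-dimensional parameter vector.
params = np.zeros(10)
grads = [np.random.randn(10) for _ in range(32)]
params = dp_sgd_step(params, grads)
```
The backdoor-relevant intuition is that clipping bounds how much any single (possibly poisoned) example can shift the model and the noise masks residual influence; whether this translates into practical backdoor mitigation is what the paper examines empirically.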
Related papers
- Efficient Backdoor Defense in Multimodal Contrastive Learning: A Token-Level Unlearning Method for Mitigating Threats [52.94388672185062]
We propose an efficient defense mechanism against backdoor threats using a concept known as machine unlearning.
This entails strategically creating a small set of poisoned samples to aid the model's rapid unlearning of backdoor vulnerabilities.
In the backdoor unlearning process, we present a novel token-based portion unlearning training regime.
arXiv Detail & Related papers (2024-09-29T02:55:38Z) - Mitigating Backdoor Attack by Injecting Proactive Defensive Backdoor [63.84477483795964]
Data-poisoning backdoor attacks are serious security threats to machine learning models.
In this paper, we focus on in-training backdoor defense, aiming to train a clean model even when the dataset may be potentially poisoned.
We propose a novel defense approach called PDB (Proactive Defensive Backdoor).
arXiv Detail & Related papers (2024-05-25T07:52:26Z) - IBD-PSC: Input-level Backdoor Detection via Parameter-oriented Scaling Consistency [20.61046457594186]
Deep neural networks (DNNs) are vulnerable to backdoor attacks.
This paper proposes a simple yet effective input-level backdoor detection (dubbed IBD-PSC) to filter out malicious testing images.
arXiv Detail & Related papers (2024-05-16T03:19:52Z) - Pre-training Differentially Private Models with Limited Public Data [54.943023722114134]
Differential privacy (DP) is a prominent method for gauging the degree of security provided to models.
However, DP is not yet capable of protecting a substantial portion of the data used during the initial pre-training stage.
We develop a novel DP continual pre-training strategy using only 10% of public data.
Our strategy can achieve DP accuracy of 41.5% on ImageNet-21k, as well as non-DP accuracy of 55.7% and 60.0% on the downstream tasks Places365 and iNaturalist-2021.
arXiv Detail & Related papers (2024-02-28T23:26:27Z) - From Shortcuts to Triggers: Backdoor Defense with Denoised PoE [51.287157951953226]
Language models are often at risk of diverse backdoor attacks, especially data poisoning.
Existing backdoor defense methods mainly focus on backdoor attacks with explicit triggers.
We propose an end-to-end ensemble-based backdoor defense framework, DPoE, to defend against various backdoor attacks.
arXiv Detail & Related papers (2023-05-24T08:59:25Z) - Bounding Training Data Reconstruction in DP-SGD [42.36933026300976]
Differentially private training offers protection that is usually interpreted as a guarantee against membership inference attacks.
By proxy, this guarantee extends to other threats like reconstruction attacks attempting to extract complete training examples.
Recent works provide evidence that if one does not need to protect against membership attacks but instead only wants to protect against training data reconstruction, then the utility of private models can be improved.
arXiv Detail & Related papers (2023-02-14T18:02:34Z) - CorruptEncoder: Data Poisoning based Backdoor Attacks to Contrastive
Learning [71.25518220297639]
Contrastive learning pre-trains general-purpose encoders using an unlabeled pre-training dataset.
Data poisoning based backdoor attacks (DPBAs) inject poisoned inputs into the pre-training dataset so that the encoder is backdoored.
CorruptEncoder introduces a new attack strategy to create poisoned inputs and uses a theory-guided method to maximize attack effectiveness.
Our results show that our defense can reduce the effectiveness of DPBAs, but it sacrifices the utility of the encoder, highlighting the need for new defenses.
arXiv Detail & Related papers (2022-11-15T15:48:28Z) - DP-InstaHide: Provably Defusing Poisoning and Backdoor Attacks with Differentially Private Data Augmentations [54.960853673256]
We show that strong data augmentations, such as mixup and random additive noise, nullify poison attacks while enduring only a small accuracy trade-off.
A rigorous analysis of DP-InstaHide shows that mixup does indeed have privacy advantages, and that training with k-way mixup provably yields at least k times stronger DP guarantees than a naive DP mechanism (a minimal k-way mixup sketch appears after this paper list).
arXiv Detail & Related papers (2021-03-02T23:07:31Z) - Monitoring-based Differential Privacy Mechanism Against Query-Flooding Parameter Duplication Attack [15.977216274894912]
We propose an adaptive query-flooding parameter duplication (QPD) attack.
The adversary can infer model information with only black-box access.
We develop a defense strategy using monitoring-based DP (MDP) against this new attack.
arXiv Detail & Related papers (2020-11-01T04:21:48Z)
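To make the k-way mixup idea from the DP-InstaHide entry above concrete, here is a minimal sketch under generic assumptions; the Dirichlet mixing weights and the function name k_way_mixup are illustrative, not the paper's exact recipe, which also combines mixup with random additive noise.
```python
import numpy as np

def k_way_mixup(images, labels, k=4, rng=None):
    """Illustrative k-way mixup: return one sample that is a convex
    combination of k randomly chosen training examples and their labels."""
    rng = rng or np.random.default_rng()
    idx = rng.choice(len(images), size=k, replace=False)
    weights = rng.dirichlet(np.ones(k))                 # convex mixing coefficients
    mixed_x = sum(w * images[i] for w, i in zip(weights, idx))
    mixed_y = sum(w * labels[i] for w, i in zip(weights, idx))
    return mixed_x, mixed_y

# Toy usage: 100 fake 8x8 images with one-hot labels over 10 classes.
images = np.random.rand(100, 8, 8)
labels = np.eye(10)[np.random.randint(0, 10, size=100)]
x, y = k_way_mixup(images, labels, k=4)
```
Because each training input blends k examples, a poisoned sample's trigger is diluted across mixtures, which is the intuition behind the reported k-fold strengthening of the DP guarantee.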
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of this list (including all information) and is not responsible for any consequences of its use.