Related papers: DP-InstaHide: Provably Defusing Poisoning and Backdoor Attacks with Differentially Private Data Augmentations

URL: http://arxiv.org/abs/2103.02079v1
Date: Tue, 2 Mar 2021 23:07:31 GMT
Title: DP-InstaHide: Provably Defusing Poisoning and Backdoor Attacks with Differentially Private Data Augmentations
Authors: Eitan Borgnia, Jonas Geiping, Valeriia Cherepanova, Liam Fowl, Arjun Gupta, Amin Ghiasi, Furong Huang, Micah Goldblum, Tom Goldstein
Abstract summary: We show that strong data augmentations, such as mixup and random additive noise, nullify poison attacks while enduring only a small accuracy trade-off. A rigorous analysis of DP-InstaHide shows that mixup does indeed have privacy advantages, and that training with k-way mixup provably yields at least k times stronger DP guarantees than a naive DP mechanism.
Score: 54.960853673256
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Data poisoning and backdoor attacks manipulate training data to induce security breaches in a victim model. These attacks can be provably deflected using differentially private (DP) training methods, although this comes with a sharp decrease in model performance. The InstaHide method has recently been proposed as an alternative to DP training that leverages supposed privacy properties of the mixup augmentation, although without rigorous guarantees. In this work, we show that strong data augmentations, such as mixup and random additive noise, nullify poison attacks while enduring only a small accuracy trade-off. To explain these findings, we propose a training method, DP-InstaHide, which combines the mixup regularizer with additive noise. A rigorous analysis of DP-InstaHide shows that mixup does indeed have privacy advantages, and that training with k-way mixup provably yields at least k times stronger DP guarantees than a naive DP mechanism. Because mixup (as opposed to noise) is beneficial to model performance, DP-InstaHide provides a mechanism for achieving stronger empirical performance against poisoning attacks than other known DP methods.
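
The abstract describes the augmentation only at a high level: k-way mixup combined with additive noise. The following is a minimal illustrative sketch of that idea in plain Python, not the authors' implementation. The function names (`laplace_sample`, `mixup_with_noise`), the uniform 1/k mixing weights, the choice of Laplace noise, and the list-based data layout are all assumptions made for illustration; the abstract does not specify them.

```python
import random


def laplace_sample(scale, rng):
    """Draw one Laplace(0, scale) sample (hypothetical helper).

    The difference of two i.i.d. exponentials with mean `scale`
    follows a Laplace distribution with that scale.
    """
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def mixup_with_noise(dataset, k, noise_scale, rng):
    """Return one augmented (features, soft_label) pair.

    dataset:     list of (features, one_hot_label) pairs, each a list of floats
    k:           number of distinct examples averaged together (k-way mixup)
    noise_scale: Laplace scale added to every mixed feature; 0 disables noise
    rng:         a random.Random instance, passed in for reproducibility

    Assumption: uniform weights 1/k. Other weightings are possible.
    """
    picks = rng.sample(dataset, k)  # k distinct training examples
    dim = len(picks[0][0])
    n_classes = len(picks[0][1])

    # Average features and labels with equal weight 1/k.
    mixed_x = [sum(x[i] for x, _ in picks) / k for i in range(dim)]
    mixed_y = [sum(y[j] for _, y in picks) / k for j in range(n_classes)]

    # Additive noise is applied to the mixed features only.
    if noise_scale > 0:
        mixed_x = [v + laplace_sample(noise_scale, rng) for v in mixed_x]
    return mixed_x, mixed_y


# Toy data: four 2-dimensional examples with one-hot labels over 2 classes.
toy = [
    ([0.0, 0.0], [1.0, 0.0]),
    ([1.0, 2.0], [0.0, 1.0]),
    ([2.0, 4.0], [1.0, 0.0]),
    ([3.0, 6.0], [0.0, 1.0]),
]
x, y = mixup_with_noise(toy, k=4, noise_scale=0.1, rng=random.Random(0))
```

One way to read the "k times stronger" claim informally: with uniform weights, each training example contributes only a 1/k share of any mixed sample, so replacing a single example shifts the mixed features by at most 1/k of the feature range before noise is added. This is intuition only and does not stand in for the paper's analysis.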

Related papers

Beyond the Worst Case: Extending Differential Privacy Guarantees to Realistic Adversaries [17.780319275883127]
Differential Privacy is a family of definitions that bound the worst-case privacy leakage of a mechanism. This work sheds light on what the worst-case guarantee of DP implies about the success of attackers that are more representative of real-world privacy risks.
arXiv Detail & Related papers (2025-07-10T20:36:31Z)
To Shuffle or not to Shuffle: Auditing DP-SGD with Shuffling [25.669347036509134]
We analyze Differentially Private Stochastic Gradient Descent (DP-SGD) with shuffling. We show that state-of-the-art DP models trained with shuffling appreciably overestimated privacy guarantees (up to 4x). Our work empirically attests to the risk of using shuffling instead of Poisson sub-sampling vis-a-vis the actual privacy leakage of DP-SGD.
arXiv Detail & Related papers (2024-11-15T22:34:28Z)
Efficient Backdoor Defense in Multimodal Contrastive Learning: A Token-Level Unlearning Method for Mitigating Threats [52.94388672185062]
We propose an efficient defense mechanism against backdoor threats using a concept known as machine unlearning. This entails strategically creating a small set of poisoned samples to aid the model's rapid unlearning of backdoor vulnerabilities. In the backdoor unlearning process, we present a novel token-based portion unlearning training regime.
arXiv Detail & Related papers (2024-09-29T02:55:38Z)
Rethinking Improved Privacy-Utility Trade-off with Pre-existing Knowledge for DP Training [31.559864332056648]
We propose a generic differential privacy framework with heterogeneous noise (DP-Hero). Atop DP-Hero, we instantiate a heterogeneous version of DP-SGD, where the noise injected into gradient updates is heterogeneous and guided by prior-established model parameters. We conduct comprehensive experiments to verify and explain the effectiveness of the proposed DP-Hero, showing improved training accuracy compared with state-of-the-art works.
arXiv Detail & Related papers (2024-09-05T08:40:54Z)
Incentives in Private Collaborative Machine Learning [56.84263918489519]
Collaborative machine learning involves training models on data from multiple parties. We introduce differential privacy (DP) as an incentive. We empirically demonstrate the effectiveness and practicality of our approach on synthetic and real-world datasets.
arXiv Detail & Related papers (2024-04-02T06:28:22Z)
Pre-training Differentially Private Models with Limited Public Data [54.943023722114134]
Differential privacy (DP) is a prominent method to gauge the degree of security provided to the models. DP is not yet capable of protecting a substantial portion of the data used during the initial pre-training stage. We develop a novel DP continual pre-training strategy using only 10% of public data. Our strategy can achieve DP accuracy of 41.5% on ImageNet-21k, as well as non-DP accuracy of 55.7% and 60.0% on downstream tasks Places365 and iNaturalist-2021.
arXiv Detail & Related papers (2024-02-28T23:26:27Z)
Closed-Form Bounds for DP-SGD against Record-level Inference [18.85865832127335]
We focus on the popular DP-SGD algorithm, and derive simple closed-form bounds. We obtain bounds for membership inference that match state-of-the-art techniques. We present a novel data-dependent bound against attribute inference.
arXiv Detail & Related papers (2024-02-22T09:26:16Z)
Can We Trust the Unlabeled Target Data? Towards Backdoor Attack and Defense on Model Adaptation [120.42853706967188]
We explore the potential backdoor attacks on model adaptation launched by well-designed poisoning target data. We propose a plug-and-play method named MixAdapt, combining it with existing adaptation algorithms.
arXiv Detail & Related papers (2024-01-11T16:42:10Z)
Does Differential Privacy Prevent Backdoor Attacks in Practice? [8.951356689083166]
We investigate the effectiveness of Differential Privacy techniques in preventing backdoor attacks in machine learning models. We propose Label-DP as a faster and more accurate alternative to DP-SGD and PATE.
arXiv Detail & Related papers (2023-11-10T18:32:08Z)
Bounding Training Data Reconstruction in DP-SGD [42.36933026300976]
Differentially private training offers a protection which is usually interpreted as a guarantee against membership inference attacks. By proxy, this guarantee extends to other threats like reconstruction attacks attempting to extract complete training examples. Recent works provide evidence that if one does not need to protect against membership attacks but instead only wants to protect against training data reconstruction, then utility of private models can be improved.
arXiv Detail & Related papers (2023-02-14T18:02:34Z)
FLIP: A Provable Defense Framework for Backdoor Mitigation in Federated Learning [66.56240101249803]
We study how hardening benign clients can affect the global model (and the malicious clients). We propose a defense based on trigger reverse engineering and show that our method can achieve improvement with guaranteed robustness. Our results on eight competing SOTA defense methods show the empirical superiority of our method on both single-shot and continuous FL backdoor attacks.
arXiv Detail & Related papers (2022-10-23T22:24:03Z)
Combining Stochastic Defenses to Resist Gradient Inversion: An Ablation Study [6.766058964358335]
Common defense mechanisms such as Differential Privacy (DP) or Privacy Modules (PMs) introduce randomness during computation to prevent such attacks. This paper introduces several targeted GI attacks that leverage this principle to bypass common defense mechanisms.
arXiv Detail & Related papers (2022-08-09T13:23:29Z)
Strong Data Augmentation Sanitizes Poisoning and Backdoor Attacks Without an Accuracy Tradeoff [57.35978884015093]
We show that strong data augmentations, such as CutMix, can significantly diminish the threat of poisoning and backdoor attacks without trading off performance. In the context of backdoors, CutMix greatly mitigates the attack while simultaneously increasing validation accuracy by 9%.
arXiv Detail & Related papers (2020-11-18T20:18:50Z)

This list is automatically generated from the titles and abstracts of the papers on this site.