Model-Targeted Poisoning Attacks with Provable Convergence
- URL: http://arxiv.org/abs/2006.16469v2
- Date: Wed, 21 Apr 2021 13:40:37 GMT
- Title: Model-Targeted Poisoning Attacks with Provable Convergence
- Authors: Fnu Suya, Saeed Mahloujifar, Anshuman Suri, David Evans, Yuan Tian
- Abstract summary: In a poisoning attack, an adversary with control over a small fraction of the training data attempts to select that data in a way that induces a corrupted model.
We consider poisoning attacks against convex machine learning models and propose an efficient poisoning attack designed to induce a specified model.
- Score: 19.196295769662186
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In a poisoning attack, an adversary with control over a small fraction of the
training data attempts to select that data in a way that induces a corrupted
model that misbehaves in favor of the adversary. We consider poisoning attacks
against convex machine learning models and propose an efficient poisoning
attack designed to induce a specified model. Unlike previous model-targeted
poisoning attacks, our attack comes with provable convergence to any
attainable target classifier. The distance from the induced classifier to the
target classifier is inversely proportional to the square root of the number of
poisoning points. We also provide a lower bound on the minimum number of
poisoning points needed to achieve a given target classifier. Our method uses
online convex optimization, so it finds poisoning points incrementally. This
provides more flexibility than previous attacks, which require an a priori
assumption about the number of poisoning points. Our attack is the first
model-targeted poisoning attack that provides provable convergence for convex
models, and in our experiments, it either exceeds or matches state-of-the-art
attacks in terms of attack success rate and distance to the target model.
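The abstract describes an incremental, online-convex-optimization-style loop: repeatedly retrain the victim model on clean plus poison data, then add the candidate point on which the induced model's loss most exceeds the target model's loss. The snippet below is a minimal sketch of that kind of loop under stated assumptions, not the authors' exact procedure: it uses scikit-learn logistic regression as the convex learner, searches a hypothetical discrete candidate pool (X_pool, y_pool) with integer class labels, and uses an illustrative stopping threshold.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

def loss_gap(induced, target, X, y):
    """Per-example log-loss under the induced model minus log-loss under the target model."""
    eps = 1e-12
    p_induced = np.clip(induced.predict_proba(X)[np.arange(len(y)), y], eps, 1 - eps)
    p_target = np.clip(target.predict_proba(X)[np.arange(len(y)), y], eps, 1 - eps)
    return -np.log(p_induced) + np.log(p_target)

def model_targeted_poisoning(X_clean, y_clean, target, X_pool, y_pool,
                             max_points=200, tol=1e-3):
    """Illustrative incremental poisoning loop (hypothetical helper, not the paper's code).

    Each round retrains the victim on clean + poison data and appends the pool
    candidate with the largest loss gap between the induced and target models.
    """
    X_poison, y_poison = [], []
    for _ in range(max_points):
        X_train = np.vstack([X_clean] + X_poison) if X_poison else X_clean
        y_train = np.concatenate([y_clean] + y_poison) if y_poison else y_clean
        victim = LogisticRegression(C=1.0, max_iter=1000).fit(X_train, y_train)

        gaps = loss_gap(victim, target, X_pool, y_pool)
        best = int(np.argmax(gaps))
        if gaps[best] < tol:  # induced model is (approximately) as good as the target on every candidate
            break
        X_poison.append(X_pool[best:best + 1])
        y_poison.append(y_pool[best:best + 1])
    return (np.vstack(X_poison), np.concatenate(y_poison)) if X_poison else (None, None)
```

Per the abstract, the distance between the induced and target classifiers shrinks roughly as 1/sqrt(T) in the number T of poisoning points, so such a loop can be run until the induced model is close enough to the target, while the paper's lower bound indicates when a target is out of reach for a given poisoning budget.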
Related papers
- Deferred Poisoning: Making the Model More Vulnerable via Hessian Singularization [39.37308843208039]
We introduce a more threatening type of poisoning attack called the Deferred Poisoning Attack.
This new attack allows the model to function normally during the training and validation phases but makes it very sensitive to evasion attacks or even natural noise.
We have conducted both theoretical and empirical analyses of the proposed method and validated its effectiveness through experiments on image classification tasks.
arXiv Detail & Related papers (2024-11-06T08:27:49Z)
- Can We Trust the Unlabeled Target Data? Towards Backdoor Attack and Defense on Model Adaptation [120.42853706967188]
We explore potential backdoor attacks on model adaptation launched by well-designed poisoning of the target data.
We propose a plug-and-play method named MixAdapt that can be combined with existing adaptation algorithms.
arXiv Detail & Related papers (2024-01-11T16:42:10Z)
- FreqFed: A Frequency Analysis-Based Approach for Mitigating Poisoning Attacks in Federated Learning [98.43475653490219]
Federated learning (FL) is susceptible to poisoning attacks.
FreqFed is a novel aggregation mechanism that transforms the model updates into the frequency domain.
We demonstrate that FreqFed can mitigate poisoning attacks effectively with a negligible impact on the utility of the aggregated model (a minimal frequency-domain sketch appears after this list).
arXiv Detail & Related papers (2023-12-07T16:56:24Z)
- Exploring Model Dynamics for Accumulative Poisoning Discovery [62.08553134316483]
We propose a novel information measure, Memorization Discrepancy, to explore defenses via model-level information.
By implicitly transferring changes in the data manipulation to changes in the model outputs, Memorization Discrepancy can discover imperceptible poison samples.
We thoroughly explore its properties and propose Discrepancy-aware Sample Correction (DSC) to defend against accumulative poisoning attacks.
arXiv Detail & Related papers (2023-06-06T14:45:24Z)
- Exploring the Limits of Model-Targeted Indiscriminate Data Poisoning Attacks [31.339252233416477]
We introduce the notion of model poisoning reachability as a technical tool to explore the intrinsic limits of data poisoning attacks towards target parameters.
We derive an easily computable threshold to establish and quantify a surprising phase transition phenomenon among popular ML models.
Our work highlights the critical role played by the poisoning ratio and sheds new light on existing empirical results, attacks, and mitigation strategies in data poisoning.
arXiv Detail & Related papers (2023-03-07T01:55:26Z)
- De-Pois: An Attack-Agnostic Defense against Data Poisoning Attacks [17.646155241759743]
De-Pois is an attack-agnostic defense against poisoning attacks.
We implement four types of poisoning attacks and evaluate De-Pois with five typical defense methods.
arXiv Detail & Related papers (2021-05-08T04:47:37Z)
- DeepPoison: Feature Transfer Based Stealthy Poisoning Attack [2.1445455835823624]
DeepPoison is a novel adversarial network consisting of one generator and two discriminators.
DeepPoison can achieve a state-of-the-art attack success rate, as high as 91.74%.
arXiv Detail & Related papers (2021-01-06T15:45:36Z)
- How Robust are Randomized Smoothing based Defenses to Data Poisoning? [66.80663779176979]
We present a previously unrecognized threat to robust machine learning models that highlights the importance of training-data quality.
We propose a novel bilevel optimization-based data poisoning attack that degrades the robustness guarantees of certifiably robust classifiers.
Our attack is effective even when the victim trains the models from scratch using state-of-the-art robust training methods.
arXiv Detail & Related papers (2020-12-02T15:30:21Z)
- Witches' Brew: Industrial Scale Data Poisoning via Gradient Matching [56.280018325419896]
Data Poisoning attacks modify training data to maliciously control a model trained on such data.
We analyze a particularly malicious poisoning attack that is both "from scratch" and "clean label".
We show that it is the first poisoning method to cause targeted misclassification in modern deep networks trained from scratch on a full-sized, poisoned ImageNet dataset.
arXiv Detail & Related papers (2020-09-04T16:17:54Z)
- Poisoning Attacks on Algorithmic Fairness [14.213638219685656]
We introduce an optimization framework for poisoning attacks against algorithmic fairness.
We develop a gradient-based poisoning attack aimed at introducing classification disparities among different groups in the data.
We believe that our findings pave the way towards the definition of an entirely novel set of adversarial attacks targeting algorithmic fairness in different scenarios.
arXiv Detail & Related papers (2020-04-15T08:07:01Z)
- Adversarial Imitation Attack [63.76805962712481]
A practical adversarial attack should require as little knowledge of the attacked models as possible.
Current substitute attacks need pre-trained models to generate adversarial examples.
In this study, we propose a novel adversarial imitation attack.
arXiv Detail & Related papers (2020-03-28T10:02:49Z)
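The FreqFed entry above describes transforming client model updates into the frequency domain before aggregation. Below is a minimal illustrative sketch of that general idea only; the discrete cosine transform, the median-distance filter, and the helper name are assumptions made for illustration, not the actual FreqFed mechanism.

```python
import numpy as np
from scipy.fft import dct

def frequency_filtered_average(updates, keep=64):
    """Hypothetical frequency-domain aggregation of client updates (not the actual FreqFed rule).

    Flattens each update, takes its low-frequency DCT components, and averages
    only the clients whose spectra stay close to the median spectrum.
    """
    flat = np.stack([u.ravel() for u in updates])        # (n_clients, n_params)
    spectra = dct(flat, axis=1, norm="ortho")[:, :keep]   # low-frequency signature per client
    median = np.median(spectra, axis=0)
    dist = np.linalg.norm(spectra - median, axis=1)
    keep_mask = dist <= np.median(dist)                   # drop the most deviant half (illustrative rule)
    return flat[keep_mask].mean(axis=0).reshape(updates[0].shape)
```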