Related papers: Sharpness-Aware Data Poisoning Attack

URL: http://arxiv.org/abs/2305.14851v2
Date: Tue, 7 May 2024 04:41:52 GMT
Title: Sharpness-Aware Data Poisoning Attack
Authors: Pengfei He, Han Xu, Jie Ren, Yingqian Cui, Hui Liu, Charu C. Aggarwal, Jiliang Tang
Abstract summary: Recent research has highlighted the vulnerability of Deep Neural Networks (DNNs) against data poisoning attacks. We propose a novel attack method called "Sharpness-Aware Data Poisoning Attack (SAPA)". In particular, it leverages the concept of DNNs' loss landscape sharpness to optimize the poisoning effect on the worst re-trained model.
Score: 38.01535347191942
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Recent research has highlighted the vulnerability of Deep Neural Networks (DNNs) against data poisoning attacks. These attacks aim to inject poisoning samples into the models' training dataset such that the trained models have inference failures. While previous studies have executed different types of attacks, one major challenge that greatly limits their effectiveness is the uncertainty of the re-training process after the injection of poisoning samples, including the re-training initialization or algorithms. To address this challenge, we propose a novel attack method called "Sharpness-Aware Data Poisoning Attack (SAPA)". In particular, it leverages the concept of DNNs' loss landscape sharpness to optimize the poisoning effect on the worst re-trained model. It helps enhance the preservation of the poisoning effect, regardless of the specific retraining procedure employed. Extensive experiments demonstrate that SAPA offers a general and principled strategy that significantly enhances various types of poisoning attacks.
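
Illustrative sketch (not from the paper): the abstract does not spell out how the "worst re-trained model" is computed. One common way to approximate a worst-case model in a small neighborhood of the trained parameters is the first-order step used in sharpness-aware minimization: move the parameters a distance rho along the normalized gradient of the objective and evaluate the objective there. The toy logistic objective, the function names, and the radius rho below are all assumptions chosen for illustration and should not be read as SAPA's actual formulation.

```python
import numpy as np


def objective(w, X, y):
    """Mean logistic loss for labels y in {-1, +1} (toy stand-in objective)."""
    margins = y * (X @ w)
    # log(1 + exp(-m)) computed stably
    return float(np.mean(np.logaddexp(0.0, -margins)))


def objective_grad(w, X, y):
    """Gradient of the mean logistic loss with respect to w."""
    margins = y * (X @ w)
    # sigmoid(-m) written via tanh for numerical stability
    sig_neg = 0.5 * (1.0 - np.tanh(0.5 * margins))
    coeff = -y * sig_neg
    return (X * coeff[:, None]).mean(axis=0)


def worst_case_objective(w, X, y, rho=0.05):
    """Objective at the first-order worst-case point within an L2 ball of radius rho.

    Assumption: the worst nearby model is approximated by one normalized
    gradient-ascent step of length rho, as in sharpness-aware minimization.
    """
    g = objective_grad(w, X, y)
    norm = np.linalg.norm(g)
    if norm == 0.0:
        return objective(w, X, y)
    w_worst = w + rho * g / norm
    return objective(w_worst, X, y)


if __name__ == "__main__":
    X = np.array([[1.0, 2.0], [-1.0, 0.5], [0.3, -0.7]])
    y = np.array([1.0, -1.0, 1.0])
    w = np.zeros(2)
    print("objective at w:", objective(w, X, y))
    print("objective at worst nearby w:", worst_case_objective(w, X, y))
```

In a poisoning setting, an attacker would presumably tune the poison samples against the worst-case value rather than the value at a single trained model, so that the effect survives small changes in where re-training lands. That outer optimization is omitted here.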

Related papers

SEEP: Training Dynamics Grounds Latent Representation Search for Mitigating Backdoor Poisoning Attacks [53.28390057407576]
Modern NLP models are often trained on public datasets drawn from diverse sources. Data poisoning attacks can manipulate the model's behavior in ways engineered by the attacker. Several strategies have been proposed to mitigate the risks associated with backdoor attacks.
arXiv Detail & Related papers (2024-05-19T14:50:09Z)
HINT: Healthy Influential-Noise based Training to Defend against Data Poisoning Attacks [12.929357709840975]
We propose an efficient and robust training approach to defend against data poisoning attacks based on influence functions. Using influence functions, we craft healthy noise that helps to harden the classification model against poisoning attacks. Our empirical results show that HINT can efficiently protect deep learning models against the effect of both untargeted and targeted poisoning attacks.
arXiv Detail & Related papers (2023-09-15T17:12:19Z)
Exploring Model Dynamics for Accumulative Poisoning Discovery [62.08553134316483]
We propose a novel information measure, namely, Memorization Discrepancy, to explore the defense via the model-level information. By implicitly transferring the changes in the data manipulation to that in the model outputs, Memorization Discrepancy can discover the imperceptible poison samples. We thoroughly explore its properties and propose Discrepancy-aware Sample Correction (DSC) to defend against accumulative poisoning attacks.
arXiv Detail & Related papers (2023-06-06T14:45:24Z)
Denoising Autoencoder-based Defensive Distillation as an Adversarial Robustness Algorithm [0.0]
Adversarial attacks significantly threaten the robustness of deep neural networks (DNNs). This work proposes a novel method that combines the defensive distillation mechanism with a denoising autoencoder (DAE).
arXiv Detail & Related papers (2023-03-28T11:34:54Z)
Exploring the Limits of Model-Targeted Indiscriminate Data Poisoning Attacks [31.339252233416477]
We introduce the notion of model poisoning reachability as a technical tool to explore the intrinsic limits of data poisoning attacks towards target parameters. We derive an easily computable threshold to establish and quantify a surprising phase transition phenomenon among popular ML models. Our work highlights the critical role played by the poisoning ratio, and sheds new insights on existing empirical results, attacks and mitigation strategies in data poisoning.
arXiv Detail & Related papers (2023-03-07T01:55:26Z)
Indiscriminate Data Poisoning Attacks on Neural Networks [28.09519873656809]
Data poisoning attacks aim to influence a model by injecting "poisoned" data into the training process. We take a closer look at existing poisoning attacks and connect them with old and new algorithms for solving sequential Stackelberg games. We present efficient implementations that exploit modern auto-differentiation packages and allow simultaneous and coordinated generation of poisoned points.
arXiv Detail & Related papers (2022-04-19T18:57:26Z)
Accumulative Poisoning Attacks on Real-time Data [56.96241557830253]
We show that a well-designed but straightforward attacking strategy can dramatically amplify the poisoning effects.
arXiv Detail & Related papers (2021-06-18T08:29:53Z)
How Robust are Randomized Smoothing based Defenses to Data Poisoning? [66.80663779176979]
We present a previously unrecognized threat to robust machine learning models that highlights the importance of training-data quality. We propose a novel bilevel optimization-based data poisoning attack that degrades the robustness guarantees of certifiably robust classifiers. Our attack is effective even when the victim trains the models from scratch using state-of-the-art robust training methods.
arXiv Detail & Related papers (2020-12-02T15:30:21Z)
Witches' Brew: Industrial Scale Data Poisoning via Gradient Matching [56.280018325419896]
Data poisoning attacks modify training data to maliciously control a model trained on such data. We analyze a particularly malicious poisoning attack that is both "from scratch" and "clean label". We show that it is the first poisoning method to cause targeted misclassification in modern deep networks trained from scratch on a full-sized, poisoned ImageNet dataset.
arXiv Detail & Related papers (2020-09-04T16:17:54Z)

This list is automatically generated from the titles and abstracts of the papers on this site.