Bullseye Polytope: A Scalable Clean-Label Poisoning Attack with Improved
Transferability
- URL: http://arxiv.org/abs/2005.00191v3
- Date: Sun, 14 Mar 2021 01:33:47 GMT
- Authors: Hojjat Aghakhani, Dongyu Meng, Yu-Xiang Wang, Christopher Kruegel, and
Giovanni Vigna
- Abstract summary: A recent source of concern for neural networks is the emergence of clean-label dataset poisoning attacks.
We propose a scalable and transferable clean-label poisoning attack against transfer learning.
Our attack, Bullseye Polytope, improves the attack success rate of the current state-of-the-art by 26.75% in end-to-end transfer learning.
- Score: 23.031305862548226
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: A recent source of concern for the security of neural networks is the
emergence of clean-label dataset poisoning attacks, wherein correctly labeled
poison samples are injected into the training dataset. While these poison
samples look legitimate to the human observer, they contain malicious
characteristics that trigger a targeted misclassification during inference. We
propose a scalable and transferable clean-label poisoning attack against
transfer learning, which creates poison images with their center close to the
target image in the feature space. Our attack, Bullseye Polytope, improves the
attack success rate of the current state-of-the-art by 26.75% in end-to-end
transfer learning, while increasing attack speed by a factor of 12. We further
extend Bullseye Polytope to a more practical attack model by including multiple
images of the same object (e.g., from different angles) when crafting the
poison samples. We demonstrate that this extension improves attack
transferability by over 16% to unseen images (of the same object) without using
extra poison samples.
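The core idea in the abstract, poisons whose center sits close to the target in feature space, can be illustrated with a toy sketch. This is not the authors' implementation. It assumes a fixed random linear map `W` as a hypothetical stand-in for a network's feature extractor, an L-infinity perturbation budget `eps` around each base image, and a loss normalized by the target's feature norm. Averaging the features of several target views is likewise an assumption about how the multi-image extension could be folded in. All function and variable names are illustrative.

```python
import numpy as np

def features(W, images):
    # Hypothetical stand-in for a network's feature layer: a fixed linear map.
    return images @ W.T

def bullseye_loss(W, targets, poisons):
    # Normalized squared distance between the mean target feature
    # and the center (mean) of the poison features.
    t = features(W, targets).mean(axis=0)
    center = features(W, poisons).mean(axis=0)
    return float(np.sum((t - center) ** 2) / np.sum(t ** 2))

def craft_poisons(W, targets, bases, eps=0.1, lr=0.5, steps=200):
    # Projected gradient descent on the poisons, starting from the base images.
    t = features(W, targets).mean(axis=0)
    scale = np.sum(t ** 2)
    k = len(bases)
    poisons = bases.copy()
    for _ in range(steps):
        residual = t - features(W, poisons).mean(axis=0)
        # Gradient w.r.t. each poison; identical for all k poisons because
        # only their mean enters the loss (valid for this linear toy model).
        grad = -(2.0 / (k * scale)) * (W.T @ residual)
        poisons = poisons - lr * grad
        # Project back into the eps-box around each base and the pixel range.
        poisons = np.clip(poisons, bases - eps, bases + eps)
        poisons = np.clip(poisons, 0.0, 1.0)
    return poisons

rng = np.random.default_rng(0)
W = rng.normal(size=(16, 32)) / np.sqrt(32)      # toy feature extractor
targets = rng.uniform(0.2, 0.8, size=(3, 32))    # three views of one object
bases = rng.uniform(0.2, 0.8, size=(5, 32))      # five clean base images
eps = 0.1

before = bullseye_loss(W, targets, bases)
poisons = craft_poisons(W, targets, bases, eps=eps)
after = bullseye_loss(W, targets, poisons)
print(f"loss before: {before:.4f}, after: {after:.4f}")
```

Because only the mean of the poison features enters the loss, every poison receives the same update in this toy setting. With a real, nonlinear network the gradient would differ per poison and would come from automatic differentiation rather than the closed form used here.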
Related papers
- SEEP: Training Dynamics Grounds Latent Representation Search for Mitigating Backdoor Poisoning Attacks [53.28390057407576]
Modern NLP models are often trained on public datasets drawn from diverse sources.
Data poisoning attacks can manipulate the model's behavior in ways engineered by the attacker.
Several strategies have been proposed to mitigate the risks associated with backdoor attacks.
(2024-05-19T14:50:09Z)
- Diffusion Denoising as a Certified Defense against Clean-label Poisoning [56.04951180983087]
We show how an off-the-shelf diffusion model can sanitize the tampered training data.
We extensively test our defense against seven clean-label poisoning attacks and reduce their attack success to 0-16% with only a negligible drop in test-time accuracy.
(2024-03-18T17:17:07Z)
- Impart: An Imperceptible and Effective Label-Specific Backdoor Attack [15.859650783567103]
We propose a novel imperceptible backdoor attack framework, named Impart, in the scenario where the attacker has no access to the victim model.
Specifically, in order to enhance the attack capability of the all-to-all setting, we first propose a label-specific attack.
(2024-03-18T07:22:56Z)
- Nightshade: Prompt-Specific Poisoning Attacks on Text-to-Image Generative Models [26.301156075883483]
We show that poisoning attacks can be successful on generative models.
We introduce Nightshade, an optimized prompt-specific poisoning attack.
We show that Nightshade attacks can destabilize general features in a text-to-image generative model.
(2023-10-20T21:54:10Z)
- Just Rotate it: Deploying Backdoor Attacks via Rotation Transformation [48.238349062995916]
We find that highly effective backdoors can be easily inserted using rotation-based image transformation.
Our work highlights a new, simple, physically realizable, and highly effective vector for backdoor attacks.
(2022-07-22T00:21:18Z)
- Poison Ink: Robust and Invisible Backdoor Attack [122.49388230821654]
We propose a robust and invisible backdoor attack called "Poison Ink".
Concretely, we first leverage the image structures as target poisoning areas, and fill them with poison ink (information) to generate the trigger pattern.
Compared to existing popular backdoor attack methods, Poison Ink outperforms both in stealthiness and robustness.
(2021-08-05T09:52:49Z)
- Adversarial Examples Make Strong Poisons [55.63469396785909]
We show that adversarial examples, originally intended for attacking pre-trained models, are even more effective for data poisoning than recent methods designed specifically for poisoning.
Our method, adversarial poisoning, is substantially more effective than existing poisoning methods for secure dataset release.
(2021-06-21T01:57:14Z)
- Witches' Brew: Industrial Scale Data Poisoning via Gradient Matching [56.280018325419896]
Data poisoning attacks modify training data to maliciously control a model trained on such data.
We analyze a particularly malicious poisoning attack that is both "from scratch" and "clean label".
We show that it is the first poisoning method to cause targeted misclassification in modern deep networks trained from scratch on a full-sized, poisoned ImageNet dataset.
(2020-09-04T16:17:54Z)
- Backdooring and Poisoning Neural Networks with Image-Scaling Attacks [15.807243762876901]
We propose a novel strategy for hiding backdoor and poisoning attacks.
Our approach builds on a recent class of attacks against image scaling.
We show that backdoors and poisoning work equally well when combined with image-scaling attacks.
(2020-03-19T08:59:50Z)
This list is automatically generated from the titles and abstracts of the papers on this site.