MetaPoison: Practical General-purpose Clean-label Data Poisoning
- URL: http://arxiv.org/abs/2004.00225v2
- Date: Sun, 21 Feb 2021 02:40:40 GMT
- Title: MetaPoison: Practical General-purpose Clean-label Data Poisoning
- Authors: W. Ronny Huang, Jonas Geiping, Liam Fowl, Gavin Taylor, Tom Goldstein
- Abstract summary: Data poisoning is an emerging threat in the context of neural networks.
We propose MetaPoison, a first-order method that approximates the bilevel problem via meta-learning and crafts poisons that fool neural networks.
We demonstrate for the first time successful data poisoning of models trained on the black-box Google Cloud AutoML API.
- Score: 58.13959698513719
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Data poisoning -- the process by which an attacker takes control of a model
by making imperceptible changes to a subset of the training data -- is an
emerging threat in the context of neural networks. Existing attacks for data
poisoning neural networks have relied on hand-crafted heuristics, because
solving the poisoning problem directly via bilevel optimization is generally
thought of as intractable for deep models. We propose MetaPoison, a first-order
method that approximates the bilevel problem via meta-learning and crafts
poisons that fool neural networks. MetaPoison is effective: it outperforms
previous clean-label poisoning methods by a large margin. MetaPoison is robust:
poisoned data made for one model transfer to a variety of victim models with
unknown training settings and architectures. MetaPoison is general-purpose: it
works not only in fine-tuning scenarios but also for end-to-end training from
scratch, which until now has not been feasible for clean-label attacks on deep
nets. MetaPoison can achieve arbitrary adversary goals, such as using poisons of
one class to make a target image take on the label of another arbitrarily chosen
class. Finally, MetaPoison works in the real world. We demonstrate for the
first time successful data poisoning of models trained on the black-box Google
Cloud AutoML API. Code and premade poisons are provided at
https://github.com/wronnyhuang/metapoison
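To make concrete the abstract's claim that MetaPoison approximates the bilevel poisoning problem with a first-order, meta-learning-style method, below is a minimal sketch of one poison-crafting step in PyTorch. It is an illustration, not the authors' implementation (see the linked repository for that); the network `net`, perturbation `delta`, poison batch `x_poison`/`y_poison`, target pair `x_target`/`y_adv`, and budget `eps` are assumed placeholders, and paper details such as ensembling over models at different training stages are omitted.
```python
# Minimal, illustrative sketch of a MetaPoison-style crafting step.
# NOT the authors' code (see https://github.com/wronnyhuang/metapoison).
# Assumed placeholders: net (small CNN without BatchNorm buffers), x_poison/y_poison
# (images the attacker may perturb, keeping their true labels), x_target/y_adv
# (the image to misclassify and the attacker's desired label), eps (L-inf budget).
import torch
import torch.nn.functional as F
from torch.func import functional_call  # PyTorch >= 2.0


def unrolled_adv_loss(net, params, delta, x_poison, y_poison, x_target, y_adv,
                      lr=0.1, unroll_steps=2):
    """Approximate the bilevel objective: unroll a few SGD steps of victim
    training on the poisoned batch, then measure the attacker's loss."""
    cur = params
    for _ in range(unroll_steps):
        # Inner problem: the victim's ordinary training loss on poisoned data.
        logits = functional_call(net, cur, (x_poison + delta,))
        train_loss = F.cross_entropy(logits, y_poison)
        grads = torch.autograd.grad(train_loss, list(cur.values()), create_graph=True)
        # One differentiable SGD step; gradients can flow back to delta through it.
        cur = {name: p - lr * g for (name, p), g in zip(cur.items(), grads)}
    # Outer problem: after the unrolled training, the target should take y_adv.
    target_logits = functional_call(net, cur, (x_target,))
    return F.cross_entropy(target_logits, y_adv)


def craft_step(net, delta, x_poison, y_poison, x_target, y_adv,
               eps=8 / 255, step=1 / 255):
    """One signed-gradient update of the poison perturbation, projected to eps."""
    params = {n: p.detach().clone().requires_grad_(True)
              for n, p in net.named_parameters()}
    delta = delta.detach().clone().requires_grad_(True)
    loss = unrolled_adv_loss(net, params, delta, x_poison, y_poison, x_target, y_adv)
    (grad,) = torch.autograd.grad(loss, delta)
    # Descend on the attacker loss and keep the perturbation within budget
    # (a real attack would also clip x_poison + delta back into the image range).
    delta = (delta - step * grad.sign()).clamp(-eps, eps).detach()
    return delta, loss.item()
```
The perturbed images keep their original labels, which is what makes the attack clean-label; a full crafting loop would call `craft_step` many times, and the paper additionally averages over victim models at different training stages before releasing `x_poison + delta`.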
Related papers
- On the Exploitability of Instruction Tuning [103.8077787502381]
In this work, we investigate how an adversary can exploit instruction tuning to change a model's behavior.
We propose AutoPoison, an automated data poisoning pipeline.
Our results show that AutoPoison allows an adversary to change a model's behavior by poisoning only a small fraction of data.
arXiv Detail & Related papers (2023-06-28T17:54:04Z) - Poison Attack and Defense on Deep Source Code Processing Models [38.32413592143839]
We present a poison attack framework for source code named CodePoisoner as a strong hypothetical adversary.
CodePoisoner can produce compilable, even human-imperceptible, poison samples and attack models by poisoning the training data.
We propose an effective defense approach named CodeDetector to detect poison samples in the training data.
arXiv Detail & Related papers (2022-10-31T03:06:40Z) - Indiscriminate Data Poisoning Attacks on Neural Networks [28.09519873656809]
Data poisoning attacks aim to influence a model by injecting "poisoned" data into the training process.
We take a closer look at existing poisoning attacks and connect them with old and new algorithms for solving sequential Stackelberg games.
We present efficient implementations that exploit modern auto-differentiation packages and allow simultaneous and coordinated generation of poisoned points.
arXiv Detail & Related papers (2022-04-19T18:57:26Z) - Adversarial Examples Make Strong Poisons [55.63469396785909]
We show that adversarial examples, originally intended for attacking pre-trained models, are even more effective for data poisoning than recent methods designed specifically for poisoning.
Our method, adversarial poisoning, is substantially more effective than existing poisoning methods for secure dataset release.
arXiv Detail & Related papers (2021-06-21T01:57:14Z) - Accumulative Poisoning Attacks on Real-time Data [56.96241557830253]
We show that a well-designed but straightforward attacking strategy can dramatically amplify the poisoning effects.
arXiv Detail & Related papers (2021-06-18T08:29:53Z) - De-Pois: An Attack-Agnostic Defense against Data Poisoning Attacks [17.646155241759743]
De-Pois is an attack-agnostic defense against poisoning attacks.
We implement four types of poisoning attacks and evaluate De-Pois with five typical defense methods.
arXiv Detail & Related papers (2021-05-08T04:47:37Z) - Witches' Brew: Industrial Scale Data Poisoning via Gradient Matching [56.280018325419896]
Data Poisoning attacks modify training data to maliciously control a model trained on such data.
We analyze a particularly malicious poisoning attack that is both "from scratch" and "clean label". A minimal sketch of its gradient-matching objective appears after this list.
We show that it is the first poisoning method to cause targeted misclassification in modern deep networks trained from scratch on a full-sized, poisoned ImageNet dataset.
arXiv Detail & Related papers (2020-09-04T16:17:54Z) - Just How Toxic is Data Poisoning? A Unified Benchmark for Backdoor and Data Poisoning Attacks [74.88735178536159]
Data poisoning is ranked as the number one concern among threats ranging from model stealing to adversarial attacks.
We observe that data poisoning and backdoor attacks are highly sensitive to variations in the testing setup.
We apply rigorous tests to determine the extent to which we should fear them.
arXiv Detail & Related papers (2020-06-22T18:34:08Z)
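For comparison with MetaPoison's unrolled meta-gradients, here is a minimal sketch of the gradient-matching idea named in the Witches' Brew entry above. It is a paraphrase under the same assumed placeholders as the earlier sketch (`net`, `delta`, `x_poison`, `y_poison`, `x_target`, `y_adv`), not the authors' released code.
```python
# Illustrative sketch of a gradient-matching poisoning objective (Witches' Brew style).
# Placeholders as before; this is not the paper's implementation.
import torch
import torch.nn.functional as F


def gradient_matching_loss(net, delta, x_poison, y_poison, x_target, y_adv):
    params = [p for p in net.parameters() if p.requires_grad]
    # Direction the attacker wants victim training to follow: the gradient
    # that would push the target image toward the adversarial label.
    adv_loss = F.cross_entropy(net(x_target), y_adv)
    adv_grad = torch.autograd.grad(adv_loss, params)
    # Gradient the victim will actually compute on the poisoned training points.
    train_loss = F.cross_entropy(net(x_poison + delta), y_poison)
    train_grad = torch.autograd.grad(train_loss, params, create_graph=True)
    # Negative cosine similarity between the two flattened gradients: minimizing
    # it makes ordinary training on the poisons mimic the adversarial update.
    dot = sum((a * t).sum() for a, t in zip(adv_grad, train_grad))
    norm = (sum(a.pow(2).sum() for a in adv_grad).sqrt()
            * sum(t.pow(2).sum() for t in train_grad).sqrt())
    return 1.0 - dot / norm
```
An attacker would minimize this loss with respect to `delta`, for example with signed-gradient steps projected onto a small L-infinity ball as in the earlier `craft_step`, so the poisoned images stay visually close to the originals while steering the victim's parameter updates.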
This list is automatically generated from the titles and abstracts of the papers on this site.