Indiscriminate Poisoning Attacks Are Shortcuts
- URL: http://arxiv.org/abs/2111.00898v1
- Date: Mon, 1 Nov 2021 12:44:26 GMT
- Title: Indiscriminate Poisoning Attacks Are Shortcuts
- Authors: Da Yu, Huishuai Zhang, Wei Chen, Jian Yin, Tie-Yan Liu
- Abstract summary: We find that the perturbations of advanced poisoning attacks are almost linearly separable when assigned the target labels of the corresponding samples.
We synthesize linearly separable perturbations and show that they are as powerful as the deliberately crafted attacks.
Our finding suggests that the shortcut learning problem is more serious than previously believed.
- Score: 77.38947817228656
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Indiscriminate data poisoning attacks, which add imperceptible perturbations
to training data to maximize the test error of trained models, have become a
trendy topic because they are thought to be capable of preventing unauthorized
use of data. In this work, we investigate why these perturbations work in
principle. We find that the perturbations of advanced poisoning attacks are
almost \textbf{linearly separable} when assigned the target labels of the
corresponding samples, which hence can work as \emph{shortcuts} for the
learning objective. This important population property has not been unveiled
before. Moreover, we further verify that linear separability is indeed the
workhorse for poisoning attacks. We synthesize linearly separable data as
perturbations and show that such synthetic perturbations are as powerful as the
deliberately crafted attacks. Our finding suggests that the \emph{shortcut
learning} problem is more serious than previously believed, as deep learning
heavily relies on shortcuts even if they are of an imperceptible scale and
mixed together with the normal features. This finding also suggests that
pre-trained feature extractors would disable these poisoning attacks
effectively.
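The finding above invites a small sanity-check experiment: construct class-wise, imperceptible perturbations that are linearly separable by construction, add them to clean inputs, and verify that a linear probe trained on the perturbations alone (with the target labels) reaches near-perfect accuracy. The sketch below is a minimal illustration on random stand-in data, not the authors' code; the image shape, the L-infinity budget, and the use of scikit-learn's LogisticRegression as the probe are assumptions made for the example.

```python
# Minimal sketch of "linearly separable perturbations as shortcuts".
# Random data stands in for a real image dataset; shapes and epsilon
# are illustrative assumptions.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
num_classes, dim, n_per_class, eps = 10, 3 * 32 * 32, 100, 8 / 255

# 1) Synthetic poison: one fixed, imperceptible random pattern per class,
#    bounded in an L_inf ball of radius eps (linearly separable by construction).
class_patterns = rng.uniform(-eps, eps, size=(num_classes, dim))

labels = np.repeat(np.arange(num_classes), n_per_class)
clean_images = rng.uniform(0.0, 1.0, size=(labels.size, dim))  # stand-in images
perturbations = class_patterns[labels]                         # pattern chosen by label
poisoned_images = np.clip(clean_images + perturbations, 0.0, 1.0)

# 2) Shortcut check: a linear probe fit on the perturbations alone, using the
#    target labels, should separate the classes (near-)perfectly.
probe = LogisticRegression(max_iter=1000).fit(perturbations, labels)
print("linear probe accuracy on perturbations:", probe.score(perturbations, labels))
```

A network trained on poisoned_images with labels can latch onto these easy patterns instead of the genuine image features, which is the shortcut behaviour the abstract describes; the paper's stronger claim is that the perturbations produced by existing attacks already have this property.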
Related papers
- Deferred Poisoning: Making the Model More Vulnerable via Hessian Singularization [39.37308843208039]
We introduce a more threatening type of poisoning attack called the Deferred Poisoning Attack.
This new attack allows the model to function normally during the training and validation phases but makes it very sensitive to evasion attacks or even natural noise.
We have conducted both theoretical and empirical analyses of the proposed method and validated its effectiveness through experiments on image classification tasks.
arXiv Detail & Related papers (2024-11-06T08:27:49Z)
- How adversarial attacks can disrupt seemingly stable accurate classifiers [76.95145661711514]
Adversarial attacks dramatically change the output of an otherwise accurate learning system using a seemingly inconsequential modification to a piece of input data.
Here, we show that this may be seen as a fundamental feature of classifiers working with high dimensional input data.
We introduce a simple, generic, and generalisable framework for which key behaviours observed in practical systems arise with high probability.
arXiv Detail & Related papers (2023-09-07T12:02:00Z)
- What Distributions are Robust to Indiscriminate Poisoning Attacks for Linear Learners? [15.848311379119295]
We study indiscriminate poisoning for linear learners where an adversary injects a few crafted examples into the training data with the goal of forcing the induced model to incur higher test error.
Inspired by the observation that linear learners on some datasets are able to resist the best known attacks even without any defenses, we investigate whether datasets can be inherently robust to indiscriminate poisoning attacks for linear learners.
arXiv Detail & Related papers (2023-07-03T14:54:13Z)
- Adversarial Attacks are a Surprisingly Strong Baseline for Poisoning Few-Shot Meta-Learners [28.468089304148453]
We attack amortized meta-learners, which allows us to craft colluding sets of inputs that fool the system's learning algorithm.
We show that in a white box setting, these attacks are very successful and can cause the target model's predictions to become worse than chance.
We explore two hypotheses to explain this: 'overfitting' by the attack, and mismatch between the model on which the attack is generated and that to which the attack is transferred.
arXiv Detail & Related papers (2022-11-23T14:55:44Z)
- Adversarial Examples Make Strong Poisons [55.63469396785909]
We show that adversarial examples, originally intended for attacking pre-trained models, are even more effective for data poisoning than recent methods designed specifically for poisoning.
Our method, adversarial poisoning, is substantially more effective than existing poisoning methods for secure dataset release.
arXiv Detail & Related papers (2021-06-21T01:57:14Z)
- Accumulative Poisoning Attacks on Real-time Data [56.96241557830253]
We show that a well-designed but straightforward attacking strategy can dramatically amplify the poisoning effects.
arXiv Detail & Related papers (2021-06-18T08:29:53Z)
- Provable Defense Against Delusive Poisoning [64.69220849669948]
We show that adversarial training can be a principled defense method against delusive poisoning.
arXiv Detail & Related papers (2021-02-09T09:19:47Z)
- Witches' Brew: Industrial Scale Data Poisoning via Gradient Matching [56.280018325419896]
Data Poisoning attacks modify training data to maliciously control a model trained on such data.
We analyze a particularly malicious poisoning attack that is both "from scratch" and "clean label".
We show that it is the first poisoning method to cause targeted misclassification in modern deep networks trained from scratch on a full-sized, poisoned ImageNet dataset.
arXiv Detail & Related papers (2020-09-04T16:17:54Z)
- A Separation Result Between Data-oblivious and Data-aware Poisoning Attacks [40.044030156696145]
Poisoning attacks have emerged as a significant security threat to machine learning algorithms.
Some of the stronger poisoning attacks require the full knowledge of the training data.
We show that full-information adversaries are provably stronger than the optimal data-oblivious attacker.
arXiv Detail & Related papers (2020-03-26T16:40:35Z)