Provable Defense Against Delusive Poisoning
- URL: http://arxiv.org/abs/2102.04716v1
- Date: Tue, 9 Feb 2021 09:19:47 GMT
- Title: Provable Defense Against Delusive Poisoning
- Authors: Lue Tao, Lei Feng, Jinfeng Yi, Sheng-Jun Huang, Songcan Chen
- Abstract summary: Minimizing adversarial risk on the poisoned data is equivalent to optimizing an upper bound of natural risk on the original data.
This implies that adversarial training can be a principled defense method against delusive poisoning.
- Score: 64.69220849669948
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Delusive poisoning is a special kind of attack that obstructs learning: the
learning performance can be significantly degraded by manipulating (even
slightly) only the features of correctly labeled training examples. By
formalizing this malicious attack as finding the worst-case
distribution shift at training time within a specific $\infty$-Wasserstein
ball, we show that minimizing adversarial risk on the poisoned data is equivalent
to optimizing an upper bound of natural risk on the original data. This implies
that adversarial training can be a principled defense method against delusive
poisoning. To further understand the internal mechanism of the defense, we
disclose that adversarial training can resist the training distribution shift
by preventing the learner from overly relying on non-robust features in a
natural setting. Finally, we complement our theoretical findings with a set of
experiments on popular benchmark datasets, which shows that the defense
withstands six different practical attacks. Both theoretical and empirical
results vote for adversarial training when confronted with delusive poisoning.
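To make the abstract's central claim concrete, here is a minimal sketch of the bound it describes, written in assumed notation (the clean distribution $\mathcal{D}$, the poisoned distribution $\hat{\mathcal{D}}$, the radius $\epsilon$, and the loss $\ell$ are our labels and may differ from the paper's):

```latex
% Sketch of the claimed bound (assumed notation, not necessarily the paper's).
% D: clean training distribution; \hat{D}: poisoned distribution crafted by the attacker,
% constrained to lie in an infinity-Wasserstein ball of radius epsilon around D.
W_\infty\big(\mathcal{D}, \hat{\mathcal{D}}\big) \le \epsilon
\;\Longrightarrow\;
\underbrace{\mathbb{E}_{(x,y)\sim \mathcal{D}}\big[\ell(f(x), y)\big]}_{\text{natural risk on clean data}}
\;\le\;
\underbrace{\mathbb{E}_{(x,y)\sim \hat{\mathcal{D}}}\Big[\max_{\|\delta\|\le \epsilon} \ell(f(x+\delta), y)\Big]}_{\text{adversarial risk on poisoned data}}
```

Read this way, adversarial training on the released (possibly poisoned) data minimizes the right-hand side and therefore controls the quantity the defender actually cares about: the natural risk on the clean distribution.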
Related papers
- Test-time Adversarial Defense with Opposite Adversarial Path and High Attack Time Cost [5.197034517903854]
We investigate a new test-time adversarial defense method via diffusion-based recovery along opposite adversarial paths (OAPs).
We present a purifier that can be plugged into a pre-trained model to resist adversarial attacks.
arXiv Detail & Related papers (2024-10-22T08:32:17Z)
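The entry above describes a purifier that can be plugged in front of a pre-trained classifier. The snippet below is only a minimal sketch of that plug-in arrangement under assumed names (`purifier` and `classifier` are hypothetical modules; the actual OAP/diffusion recovery step is not reproduced here):

```python
import torch
import torch.nn as nn

class PurifiedClassifier(nn.Module):
    """Wrap a frozen, pre-trained classifier with a test-time purifier.

    The purifier is assumed to map a (possibly adversarial) input back toward
    the clean data manifold before classification; its internals (e.g.
    diffusion-based recovery) are abstracted away in this sketch.
    """

    def __init__(self, purifier: nn.Module, classifier: nn.Module):
        super().__init__()
        self.purifier = purifier
        self.classifier = classifier

    @torch.no_grad()
    def forward(self, x: torch.Tensor) -> torch.Tensor:
        x_clean = self.purifier(x)       # test-time recovery step
        return self.classifier(x_clean)  # unchanged pre-trained model
```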
PACOL: Poisoning Attacks Against Continual Learners [1.569413950416037]
In this work, we demonstrate that continual learning systems can be manipulated by malicious misinformation.
We present a new category of data poisoning attacks specific to continual learners, which we refer to as Poisoning Attacks Against Continual Learners (PACOL).
A comprehensive set of experiments shows the vulnerability of commonly used generative replay and regularization-based continual learning approaches to such attacks.
arXiv Detail & Related papers (2023-11-18T00:20:57Z)
Transferable Availability Poisoning Attacks [23.241524904589326]
We consider availability data poisoning attacks, where an adversary aims to degrade the overall test accuracy of a machine learning model.
Existing poisoning strategies can achieve the attack goal but assume that the victim employs the same learning method the adversary used to mount the attack.
We propose Transferable Poisoning, which first leverages the intrinsic characteristics of alignment and uniformity to enable better unlearnability.
arXiv Detail & Related papers (2023-10-08T12:22:50Z)
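The entry above attributes the attack's transferability to "alignment and uniformity". As background, the snippet below sketches the standard alignment and uniformity losses from contrastive representation learning; this is the commonly used formulation, not necessarily the exact objective the paper optimizes:

```python
import torch

def alignment_loss(z1: torch.Tensor, z2: torch.Tensor, alpha: float = 2.0) -> torch.Tensor:
    """Alignment: matched (positive) pairs of normalized embeddings should be close."""
    return (z1 - z2).norm(dim=1).pow(alpha).mean()

def uniformity_loss(z: torch.Tensor, t: float = 2.0) -> torch.Tensor:
    """Uniformity: embeddings should spread out over the unit hypersphere."""
    sq_dists = torch.pdist(z, p=2).pow(2)
    return sq_dists.mul(-t).exp().mean().log()

# z1, z2: L2-normalized embeddings of two views of the same examples, shape (N, d).
```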
On Practical Aspects of Aggregation Defenses against Data Poisoning Attacks [58.718697580177356]
Attacks on deep learning models with malicious training samples are known as data poisoning.
Recent advances in defense strategies against data poisoning have highlighted the effectiveness of aggregation schemes in achieving certified poisoning robustness.
Here we focus on Deep Partition Aggregation, a representative aggregation defense, and assess its practical aspects, including efficiency, performance, and robustness.
arXiv Detail & Related papers (2023-06-28T17:59:35Z)
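Deep Partition Aggregation, the defense examined above, is generally described as training many base classifiers on disjoint partitions of the training set and predicting by majority vote, so a single poisoned sample can influence at most one base model. The sketch below illustrates that partition-and-vote idea under assumed helpers (`train_model` and `hash_fn` are placeholders, not the paper's implementation):

```python
from collections import Counter
from typing import Callable, List, Sequence, Tuple

def train_partition_aggregation(
    dataset: Sequence[Tuple[object, int]],
    k: int,
    train_model: Callable[[list], Callable[[object], int]],
    hash_fn: Callable[[object], int] = hash,
) -> Callable[[object], int]:
    """Split the data into k disjoint partitions via a hash of the input,
    train one base classifier per partition, and return a majority-vote ensemble."""
    partitions: List[list] = [[] for _ in range(k)]
    for x, y in dataset:
        partitions[hash_fn(x) % k].append((x, y))  # each sample lands in exactly one partition

    base_models = [train_model(part) for part in partitions]

    def predict(x: object) -> int:
        votes = Counter(m(x) for m in base_models)
        return votes.most_common(1)[0][0]  # majority vote; DPA's certificate scales with the top-two vote gap

    return predict
```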
Adversarial Examples Make Strong Poisons [55.63469396785909]
We show that adversarial examples, originally intended for attacking pre-trained models, are even more effective for data poisoning than recent methods designed specifically for poisoning.
Our method, adversarial poisoning, is substantially more effective than existing poisoning methods for secure dataset release.
arXiv Detail & Related papers (2021-06-21T01:57:14Z)
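As a rough illustration of the attack described above, the sketch below perturbs every training example with a PGD-style attack against a frozen, pre-trained crafting model and releases the perturbed but correctly labeled dataset; the step sizes, budget, and `crafting_model` are illustrative assumptions rather than the paper's exact recipe:

```python
import torch
import torch.nn.functional as F

def craft_adversarial_poisons(crafting_model, loader, eps=8/255, alpha=2/255, steps=10):
    """Turn a clean dataset into adversarial-example poisons against a frozen model.

    Labels are kept intact (clean-label / delusive setting); only features are
    perturbed within an L-infinity ball of radius eps.
    """
    crafting_model.eval()
    poisoned = []
    for x, y in loader:
        delta = torch.zeros_like(x).uniform_(-eps, eps).requires_grad_(True)
        for _ in range(steps):
            loss = F.cross_entropy(crafting_model(x + delta), y)
            grad, = torch.autograd.grad(loss, delta)
            # Ascend the loss so the perturbed points mislead learners trained on them.
            delta = (delta + alpha * grad.sign()).clamp(-eps, eps).detach().requires_grad_(True)
        poisoned.append(((x + delta).clamp(0, 1).detach(), y))  # assumes inputs in [0, 1]
    return poisoned
```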
Accumulative Poisoning Attacks on Real-time Data [56.96241557830253]
We show that a well-designed but straightforward attacking strategy can dramatically amplify the poisoning effects.
arXiv Detail & Related papers (2021-06-18T08:29:53Z)
Learning and Certification under Instance-targeted Poisoning [49.55596073963654]
We study PAC learnability and certification under instance-targeted poisoning attacks.
We show that when the budget of the adversary scales sublinearly with the sample complexity, PAC learnability and certification are achievable.
We empirically study the robustness of K nearest neighbour, logistic regression, multi-layer perceptron, and convolutional neural network on real data sets.
arXiv Detail & Related papers (2021-05-18T17:48:15Z)
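The learnability condition quoted in the entry above ("the budget of the adversary scales sublinearly with the sample complexity") can be written compactly as follows, in our own notation (b(m) for the poisoning budget and m for the number of training samples; the paper's precise statement may differ):

```latex
% Assumed notation: m training samples, at most b(m) of them controlled by the adversary.
b(m) = o(m), \quad \text{i.e.} \quad \lim_{m \to \infty} \frac{b(m)}{m} = 0
\;\Longrightarrow\; \text{PAC learning and certification remain achievable under instance-targeted poisoning.}
```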
What Doesn't Kill You Makes You Robust(er): Adversarial Training against Poisons and Backdoors [57.040948169155925]
We extend the adversarial training framework to defend against (training-time) poisoning and backdoor attacks.
Our method desensitizes networks to the effects of poisoning by creating poisons during training and injecting them into training batches.
We show that this defense withstands adaptive attacks, generalizes to diverse threat models, and incurs a better performance trade-off than previous defenses.
arXiv Detail & Related papers (2021-02-26T17:54:36Z)
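The defense described above crafts poison-like perturbations during training and injects them into each batch, which is close in spirit to the adversarial training advocated by the main paper. The loop below is a minimal PGD-based adversarial-training sketch over a possibly poisoned training set; the hyperparameters and helper names are illustrative assumptions, not either paper's exact procedure:

```python
import torch
import torch.nn.functional as F

def pgd_perturb(model, x, y, eps=8/255, alpha=2/255, steps=7):
    """Find a worst-case perturbation of the (possibly poisoned) batch within an L-inf ball."""
    delta = torch.zeros_like(x).uniform_(-eps, eps).requires_grad_(True)
    for _ in range(steps):
        loss = F.cross_entropy(model(x + delta), y)
        grad, = torch.autograd.grad(loss, delta)
        delta = (delta + alpha * grad.sign()).clamp(-eps, eps).detach().requires_grad_(True)
    return delta.detach()

def adversarial_training(model, loader, optimizer, epochs=10):
    """Train on worst-case perturbations of each batch instead of the batch itself."""
    model.train()
    for _ in range(epochs):
        for x, y in loader:
            delta = pgd_perturb(model, x, y)             # inner maximization
            loss = F.cross_entropy(model(x + delta), y)  # outer minimization on the perturbed batch
            optimizer.zero_grad()
            loss.backward()
            optimizer.step()
    return model
```

The only change relative to standard training is that each batch is replaced by its worst-case perturbation before the gradient step.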
A Separation Result Between Data-oblivious and Data-aware Poisoning Attacks [40.044030156696145]
Poisoning attacks have emerged as a significant security threat to machine learning algorithms.
Some of the stronger poisoning attacks require the full knowledge of the training data.
We show that full-information adversaries are provably stronger than the optimal data-oblivious attacker.
arXiv Detail & Related papers (2020-03-26T16:40:35Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.