DeepPoison: Feature Transfer Based Stealthy Poisoning Attack
- URL: http://arxiv.org/abs/2101.02562v2
- Date: Fri, 8 Jan 2021 04:07:19 GMT
- Title: DeepPoison: Feature Transfer Based Stealthy Poisoning Attack
- Authors: Jinyin Chen, Longyuan Zhang, Haibin Zheng, Xueke Wang and Zhaoyan Ming
- Abstract summary: DeepPoison is a novel adversarial network composed of one generator and two discriminators.
It achieves a state-of-the-art attack success rate as high as 91.74%.
- Score: 2.1445455835823624
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Deep neural networks are susceptible to poisoning attacks via
training data purposely polluted with specific triggers. Because existing
approaches mainly pursue a high attack success rate with patch-based samples,
defense algorithms can easily detect the poisoned samples. To address this
problem, we propose DeepPoison, a novel adversarial network of one generator
and two discriminators. Specifically, the generator automatically extracts
the target class's hidden features and embeds them into benign training
samples. One discriminator controls the ratio of the poisoning perturbation,
and the other acts as the target model to verify the poisoning effect. The
novelty of DeepPoison lies in that the generated poisoned training samples
are indistinguishable from benign ones by both defensive methods and manual
visual inspection, and the attack can even be triggered by benign test
samples. Extensive experiments show that DeepPoison achieves a
state-of-the-art attack success rate of up to 91.74% with only 7% poisoned
samples on the publicly available LFW and CASIA datasets. Furthermore, we
have tested DeepPoison against high-performance defense algorithms such as
autoencoder-based detection and DBSCAN cluster detection, demonstrating
its resilience.
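Below is a minimal PyTorch sketch of the one-generator, two-discriminator setup the abstract describes: the generator fuses target-class hidden features into a benign image under a bounded perturbation, one discriminator scores stealthiness (standing in for the perturbation-ratio control), and a surrogate classifier stands in for the target model. All module names, shapes, and loss weights here are illustrative assumptions, not the authors' implementation.

```python
import torch
import torch.nn as nn

class PoisonGenerator(nn.Module):
    """Fuses hidden features of the target class into a benign image."""
    def __init__(self, feat_dim=128, img_size=32):
        super().__init__()
        self.proj = nn.Linear(feat_dim, img_size * img_size)  # feature -> map
        self.fuse = nn.Sequential(
            nn.Conv2d(4, 16, 3, padding=1), nn.ReLU(),
            nn.Conv2d(16, 3, 3, padding=1), nn.Tanh(),
        )
        self.img_size = img_size

    def forward(self, benign, target_feat):
        fmap = self.proj(target_feat).view(-1, 1, self.img_size, self.img_size)
        delta = self.fuse(torch.cat([benign, fmap], dim=1))
        return (benign + 0.1 * delta).clamp(0, 1)  # keep perturbation bounded

# Stand-ins for the two discriminators.
g = PoisonGenerator()
d_stealth = nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, 1))   # benign vs. poisoned
d_target = nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, 10))   # surrogate victim

bce, ce = nn.BCEWithLogitsLoss(), nn.CrossEntropyLoss()
opt_g = torch.optim.Adam(g.parameters(), lr=2e-4)

benign = torch.rand(8, 3, 32, 32)                     # stand-in benign batch
target_feat = torch.randn(8, 128)                     # extracted target-class features
target_label = torch.full((8,), 5, dtype=torch.long)  # attacker's target class

# One generator update: poisons must look benign to D1 and steer D2.
opt_g.zero_grad()
poisoned = g(benign, target_feat)
loss_stealth = bce(d_stealth(poisoned), torch.ones(8, 1))  # "looks benign"
loss_attack = ce(d_target(poisoned), target_label)         # "has the effect"
(loss_stealth + loss_attack).backward()
opt_g.step()
```

In a full pipeline the two discriminators would be trained in alternation with the generator; the stealth term is what keeps the poisoned samples indistinguishable under visual and algorithmic inspection.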
Related papers
- The Victim and The Beneficiary: Exploiting a Poisoned Model to Train a Clean Model on Poisoned Data [4.9676716806872125]
Backdoor attacks pose a serious security threat to the training process of deep neural networks (DNNs).
We propose a novel dual-network training framework: The Victim and The Beneficiary (V&B), which exploits a poisoned model to train a clean model without extra benign samples.
Our framework is effective in preventing backdoor injection and robust to various attacks while maintaining the performance on benign samples.
arXiv Detail & Related papers (2024-04-17T11:15:58Z) - Confidence-driven Sampling for Backdoor Attacks [49.72680157684523]
Backdoor attacks aim to surreptitiously insert malicious triggers into DNN models, granting unauthorized control during testing scenarios.
Existing methods lack robustness against defense strategies and predominantly focus on enhancing trigger stealthiness while randomly selecting poisoned samples.
We introduce a straightforward yet highly effective sampling methodology that leverages confidence scores. Specifically, it selects samples with lower confidence scores, significantly increasing the challenge for defenders in identifying and countering these attacks.
arXiv Detail & Related papers (2023-10-08T18:57:36Z) - Exploring Model Dynamics for Accumulative Poisoning Discovery [62.08553134316483]
- Exploring Model Dynamics for Accumulative Poisoning Discovery [62.08553134316483]
We propose a novel information measure, Memorization Discrepancy, to explore defenses via model-level information.
By implicitly transferring changes in the data manipulation to changes in the model outputs, Memorization Discrepancy can discover imperceptible poison samples.
We thoroughly explore its properties and propose Discrepancy-aware Sample Correction (DSC) to defend against accumulative poisoning attacks.
arXiv Detail & Related papers (2023-06-06T14:45:24Z) - PiDAn: A Coherence Optimization Approach for Backdoor Attack Detection
- PiDAn: A Coherence Optimization Approach for Backdoor Attack Detection and Mitigation in Deep Neural Networks [22.900501880865658]
Backdoor attacks pose a new threat to deep neural networks (DNNs).
We propose PiDAn, an algorithm based on coherence optimization that purifies the poisoned data.
Our PiDAn algorithm can detect more than 90% of infected classes and identify 95% of poisoned samples.
arXiv Detail & Related papers (2022-03-17T12:37:21Z) - Learning and Certification under Instance-targeted Poisoning [49.55596073963654]
We study PAC learnability and certification under instance-targeted poisoning attacks.
We show that when the budget of the adversary scales sublinearly with the sample complexity, PAC learnability and certification are achievable.
We empirically study the robustness of k-nearest neighbors, logistic regression, multi-layer perceptrons, and convolutional neural networks on real datasets.
arXiv Detail & Related papers (2021-05-18T17:48:15Z) - De-Pois: An Attack-Agnostic Defense against Data Poisoning Attacks [17.646155241759743]
De-Pois is an attack-agnostic defense against poisoning attacks.
We implement four types of poisoning attacks and evaluate De-Pois with five typical defense methods.
arXiv Detail & Related papers (2021-05-08T04:47:37Z) - How Robust are Randomized Smoothing based Defenses to Data Poisoning? [66.80663779176979]
We present a previously unrecognized threat to robust machine learning models that highlights the importance of training-data quality.
We propose a novel bilevel optimization-based data poisoning attack that degrades the robustness guarantees of certifiably robust classifiers.
Our attack is effective even when the victim trains the models from scratch using state-of-the-art robust training methods.
arXiv Detail & Related papers (2020-12-02T15:30:21Z) - Poison Attacks against Text Datasets with Conditional Adversarially
Regularized Autoencoder [78.01180944665089]
This paper demonstrates a fatal vulnerability in natural language inference (NLI) and text classification systems.
We present a 'backdoor poisoning' attack on NLP models.
arXiv Detail & Related papers (2020-10-06T13:03:49Z) - Witches' Brew: Industrial Scale Data Poisoning via Gradient Matching [56.280018325419896]
Data poisoning attacks modify training data to maliciously control a model trained on it.
We analyze a particularly malicious poisoning attack that is both "from scratch" and "clean label".
We show that it is the first poisoning method to cause targeted misclassification in modern deep networks trained from scratch on a full-sized, poisoned ImageNet dataset.
arXiv Detail & Related papers (2020-09-04T16:17:54Z) - Model-Targeted Poisoning Attacks with Provable Convergence [19.196295769662186]
- Model-Targeted Poisoning Attacks with Provable Convergence [19.196295769662186]
In a poisoning attack, an adversary with control over a small fraction of the training data attempts to select that data in a way that induces a corrupted model.
We consider poisoning attacks against convex machine learning models and propose an efficient poisoning attack designed to induce a specified model.
arXiv Detail & Related papers (2020-06-30T01:56:35Z)
This list is automatically generated from the titles and abstracts of the papers on this site.