Systematic Evaluation of Backdoor Data Poisoning Attacks on Image
Classifiers
- URL: http://arxiv.org/abs/2004.11514v1
- Date: Fri, 24 Apr 2020 02:58:22 GMT
- Title: Systematic Evaluation of Backdoor Data Poisoning Attacks on Image
Classifiers
- Authors: Loc Truong, Chace Jones, Brian Hutchinson, Andrew August, Brenda
Praggastis, Robert Jasper, Nicole Nichols, Aaron Tuor
- Abstract summary: Backdoor data poisoning attacks have been demonstrated in computer vision research as a potential safety risk for machine learning (ML) systems.
Our work builds upon prior backdoor data-poisoning research for ML image classifiers.
We find that poisoned models are hard to detect through performance inspection alone.
- Score: 6.352532169433872
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Backdoor data poisoning attacks have recently been demonstrated in computer
vision research as a potential safety risk for machine learning (ML) systems.
Traditional data poisoning attacks manipulate training data to induce
unreliability of an ML model, whereas backdoor data poisoning attacks maintain
system performance unless the ML model is presented with an input containing an
embedded "trigger" that provides a predetermined response advantageous to the
adversary. Our work builds upon prior backdoor data-poisoning research for ML
image classifiers and systematically assesses different experimental conditions
including types of trigger patterns, persistence of trigger patterns during
retraining, poisoning strategies, architectures (ResNet-50, NasNet,
NasNet-Mobile), datasets (Flowers, CIFAR-10), and potential defensive
regularization techniques (Contrastive Loss, Logit Squeezing, Manifold Mixup,
Soft-Nearest-Neighbors Loss). Experiments yield four key findings. First, the
success rate of backdoor poisoning attacks varies widely, depending on several
factors, including model architecture, trigger pattern and regularization
technique. Second, we find that poisoned models are hard to detect through
performance inspection alone. Third, regularization typically reduces backdoor
success rate, although it can have no effect or even slightly increase it,
depending on the form of regularization. Finally, backdoors inserted through
data poisoning can be rendered ineffective after just a few epochs of
additional training on a small set of clean data without affecting the model's
performance.
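To make the attack setting concrete, the sketch below illustrates a standard patch-trigger poisoning scheme of the kind described above: a small trigger pattern is stamped onto a fraction of the training images, which are then relabeled to the adversary's target class. The patch size, patch value, poison fraction, and target class are illustrative assumptions, not the paper's exact configuration.

```python
import numpy as np

def poison_dataset(images, labels, target_class, poison_frac=0.05,
                   patch_size=4, patch_value=1.0, seed=0):
    """Patch-trigger poisoning sketch (illustrative hyperparameters).

    Stamps a small solid square in the bottom-right corner of a random
    subset of images and relabels them to the adversary's target class.

    images: float array of shape (N, H, W, C), values in [0, 1]
    labels: int array of shape (N,)
    """
    rng = np.random.default_rng(seed)
    images, labels = images.copy(), labels.copy()
    n_poison = int(len(images) * poison_frac)
    idx = rng.choice(len(images), size=n_poison, replace=False)

    # Embed the trigger pattern in the selected images.
    images[idx, -patch_size:, -patch_size:, :] = patch_value
    # Relabel so the model learns to associate trigger -> target class.
    labels[idx] = target_class
    return images, labels, idx

# Example usage on random stand-in data with CIFAR-10-like shapes:
x = np.random.rand(1000, 32, 32, 3).astype(np.float32)
y = np.random.randint(0, 10, size=1000)
x_poisoned, y_poisoned, poisoned_idx = poison_dataset(x, y, target_class=0)
```

Training a classifier on such a poisoned set is intended to yield a model that behaves normally on clean inputs but predicts the target class whenever the trigger is present.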
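Of the defensive regularizers evaluated, logit squeezing has a particularly simple form: an L2 penalty on the pre-softmax logits added to the usual classification loss, which penalizes large logit magnitudes. A minimal PyTorch sketch follows; the penalty weight is an assumed hyperparameter, not a value taken from the paper.

```python
import torch
import torch.nn.functional as F

def logit_squeezing_loss(logits, targets, squeeze_weight=0.1):
    """Cross-entropy plus an L2 penalty on the logits ("logit squeezing")."""
    ce = F.cross_entropy(logits, targets)
    squeeze = logits.pow(2).mean()  # mean squared logit magnitude
    return ce + squeeze_weight * squeeze

# Example: random logits for a batch of 8 samples over 10 classes.
logits = torch.randn(8, 10, requires_grad=True)
targets = torch.randint(0, 10, (8,))
loss = logit_squeezing_loss(logits, targets)
loss.backward()
```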
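The fourth finding, that a few epochs of additional training on a small set of clean data can render an inserted backdoor ineffective, corresponds to ordinary fine-tuning on trusted data. A minimal sketch, assuming a PyTorch model and an illustrative choice of epochs and learning rate:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def finetune_on_clean(model, clean_loader, epochs=3, lr=1e-4):
    """Briefly fine-tune a possibly-backdoored model on trusted, trigger-free data."""
    opt = torch.optim.SGD(model.parameters(), lr=lr, momentum=0.9)
    model.train()
    for _ in range(epochs):
        for x, y in clean_loader:
            opt.zero_grad()
            F.cross_entropy(model(x), y).backward()
            opt.step()
    return model

# Example with a toy model and random stand-in "clean" data:
model = nn.Sequential(nn.Flatten(), nn.Linear(32 * 32 * 3, 10))
clean_x = torch.randn(128, 3, 32, 32)
clean_y = torch.randint(0, 10, (128,))
clean_loader = torch.utils.data.DataLoader(
    torch.utils.data.TensorDataset(clean_x, clean_y), batch_size=32)
model = finetune_on_clean(model, clean_loader)
```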
Related papers
- Erasing Self-Supervised Learning Backdoor by Cluster Activation Masking [65.44477004525231]
Researchers have recently found that Self-Supervised Learning (SSL) is vulnerable to backdoor attacks.
In this paper, we propose PoisonCAM, a novel method that erases the SSL backdoor via cluster activation masking.
Our method achieves 96% accuracy for backdoor trigger detection, compared to 3% for the state-of-the-art method, on poisoned ImageNet-100.
arXiv Detail & Related papers (2023-12-13T08:01:15Z)
- FreqFed: A Frequency Analysis-Based Approach for Mitigating Poisoning Attacks in Federated Learning [98.43475653490219]
Federated learning (FL) is susceptible to poisoning attacks.
FreqFed is a novel aggregation mechanism that transforms the model updates into the frequency domain.
We demonstrate that FreqFed can mitigate poisoning attacks effectively with a negligible impact on the utility of the aggregated model.
arXiv Detail & Related papers (2023-12-07T16:56:24Z)
- Hiding Backdoors within Event Sequence Data via Poisoning Attacks [2.532893215351299]
In computer vision, an adversary can shape a model's output at inference time through an adversarial attack called poisoning.
For sequences of a customer's financial transactions, inserting a backdoor is harder to perform.
We replace a clean model with a poisoned one that is aware of the backdoor and exploits this knowledge.
arXiv Detail & Related papers (2023-08-20T08:27:42Z)
- Exploring Model Dynamics for Accumulative Poisoning Discovery [62.08553134316483]
We propose a novel information measure, Memorization Discrepancy, to explore defenses via model-level information.
By implicitly mapping changes in the manipulated data to changes in the model outputs, Memorization Discrepancy can discover imperceptible poison samples.
We thoroughly explore its properties and propose Discrepancy-aware Sample Correction (DSC) to defend against accumulative poisoning attacks.
arXiv Detail & Related papers (2023-06-06T14:45:24Z)
- Backdoor Defense via Deconfounded Representation Learning [17.28760299048368]
We propose a Causality-inspired Backdoor Defense (CBD) to learn deconfounded representations for reliable classification.
CBD is effective in reducing backdoor threats while maintaining high accuracy in predicting benign samples.
arXiv Detail & Related papers (2023-03-13T02:25:59Z)
- Untargeted Backdoor Attack against Object Detection [69.63097724439886]
We design a poison-only backdoor attack in an untargeted manner, based on task characteristics.
We show that, once the backdoor is embedded into the target model by our attack, it can trick the model into failing to detect any object stamped with our trigger patterns.
arXiv Detail & Related papers (2022-11-02T17:05:45Z)
- Invisible Backdoor Attacks Using Data Poisoning in the Frequency Domain [8.64369418938889]
We propose a generalized backdoor attack method based on the frequency domain.
It can implant a backdoor without mislabeling or access to the training process.
We evaluate our approach in the no-label and clean-label cases on three datasets.
arXiv Detail & Related papers (2022-07-09T07:05:53Z)
- Accumulative Poisoning Attacks on Real-time Data [56.96241557830253]
We show that a well-designed but straightforward attacking strategy can dramatically amplify the poisoning effects.
arXiv Detail & Related papers (2021-06-18T08:29:53Z)
- Black-box Detection of Backdoor Attacks with Limited Information and Data [56.0735480850555]
We propose a black-box backdoor detection (B3D) method to identify backdoor attacks with only query access to the model.
In addition to backdoor detection, we also propose a simple strategy for reliable predictions using the identified backdoored models.
arXiv Detail & Related papers (2021-03-24T12:06:40Z)
- TOP: Backdoor Detection in Neural Networks via Transferability of Perturbation [1.52292571922932]
Detection of backdoors in trained models without access to the training data or example triggers is an important open problem.
In this paper, we identify an interesting property of these models: adversarial perturbations transfer from image to image more readily in poisoned models than in clean models.
We use this feature to detect poisoned models in the TrojAI benchmark, as well as additional models.
arXiv Detail & Related papers (2021-03-18T14:13:30Z)