A Backdoor Approach with Inverted Labels Using Dirty Label-Flipping Attacks
- URL: http://arxiv.org/abs/2404.00076v2
- Date: Sun, 7 Apr 2024 04:38:37 GMT
- Title: A Backdoor Approach with Inverted Labels Using Dirty Label-Flipping Attacks
- Authors: Orson Mengara
- Abstract summary: In this study, we propose a backdoor attack named 'DirtyFlipping'.
It uses a dirty-label technique, "label-on-label", to insert a trigger (clapping) into selected data patterns associated with the target class, thereby enabling a stealthy backdoor.
- Score: 0.0
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Audio-based machine learning systems frequently use public or third-party data, which might be inaccurate. This exposes deep neural network (DNN) models trained on such data to potential data poisoning attacks. In this type of attack, attackers can train the DNN model on poisoned data, potentially degrading its performance. Another type of data poisoning attack that is highly relevant to our investigation is label flipping, in which the attacker manipulates the labels of a subset of the data. It has been demonstrated that these attacks can drastically reduce system performance, even for attackers with minimal capabilities. In this study, we propose a backdoor attack named 'DirtyFlipping', which uses a dirty-label technique, "label-on-label", to insert a trigger (clapping) into selected data patterns associated with the target class, thereby enabling a stealthy backdoor.
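To make the mechanism concrete, here is a minimal Python sketch of one plausible reading of a dirty label-flipping attack: mix an audio trigger (standing in for the clapping sound) into a few samples and flip their labels to the target class. The function name, poisoning rate, mixing weight, and synthetic data are illustrative assumptions, not the paper's implementation.

```python
import numpy as np

def dirty_flip_poison(waveforms, labels, trigger, target_class,
                      poison_rate=0.01, rng=None):
    """Illustrative dirty label-flipping sketch (not the paper's code):
    overlay an audio trigger on a small subset of samples and flip
    their labels to the target class."""
    rng = rng if rng is not None else np.random.default_rng(0)
    waveforms, labels = waveforms.copy(), labels.copy()
    candidates = np.flatnonzero(labels != target_class)
    n_poison = max(1, int(poison_rate * len(labels)))
    chosen = rng.choice(candidates, size=n_poison, replace=False)
    for i in chosen:
        n = min(len(trigger), waveforms.shape[1])
        waveforms[i, :n] += 0.1 * trigger[:n]  # mix the trigger in quietly
        labels[i] = target_class               # the "dirty" label flip
    return waveforms, labels, chosen

# Toy usage: 100 one-second clips at 16 kHz with 10 classes.
X = np.random.randn(100, 16000).astype(np.float32)
y = np.random.randint(0, 10, size=100)
clap = np.random.randn(4000).astype(np.float32)  # stand-in for a real clap
X_p, y_p, poisoned_idx = dirty_flip_poison(X, y, clap, target_class=3)
```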
Related papers
- Wicked Oddities: Selectively Poisoning for Effective Clean-Label Backdoor Attacks [11.390175856652856]
Clean-label attacks are a stealthier form of backdoor attack that succeeds without changing the labels of the poisoned data.
We study different strategies for selectively poisoning a small set of training samples in the target class to boost the attack success rate.
Our threat model poses a serious risk when training machine learning models on third-party datasets.
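A minimal sketch of one plausible selection strategy follows (loss-based ranking is an assumption for illustration; the paper studies several criteria): poison only the hardest target-class samples, never touching a label.

```python
import numpy as np

def select_and_poison_clean_label(X, y, losses, trigger_patch,
                                  target_class, k=50):
    """Illustrative clean-label poisoning sketch: stamp a trigger patch
    on the k highest-loss (hardest) target-class samples. Labels are
    never modified, which is what makes the attack 'clean label'."""
    X = X.copy()
    idx = np.flatnonzero(y == target_class)
    hardest = idx[np.argsort(losses[idx])[::-1][:k]]
    ph, pw = trigger_patch.shape[:2]
    X[hardest, -ph:, -pw:] = trigger_patch  # bottom-right corner stamp
    return X, hardest

# Toy usage: 200 grayscale 28x28 images with stand-in per-sample losses.
X = np.random.rand(200, 28, 28)
y = np.random.randint(0, 10, size=200)
losses = np.random.rand(200)      # in practice: losses from a trained model
patch = np.ones((4, 4))           # a small white-square trigger
X_p, poisoned_idx = select_and_poison_clean_label(X, y, losses, patch,
                                                  target_class=3)
```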
arXiv Detail & Related papers (2024-07-15T15:38:21Z)
- Poisoning-based Backdoor Attacks for Arbitrary Target Label with Positive Triggers [8.15496105932744]
Poisoning-based backdoor attacks expose vulnerabilities in the data preparation stage of deep neural network (DNN) training.
We introduce a new categorization of triggers inspired by adversarial techniques and develop a multi-label, multi-payload Poisoning-based backdoor attack with Positive Triggers (PPT).
Under both dirty- and clean-label settings, we show empirically that the proposed attack achieves a high attack success rate without sacrificing accuracy across various datasets.
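The "arbitrary target label" idea can be illustrated with a toy sketch in which the trigger pattern is derived from the desired label itself, so a single poisoned model can be steered toward any class at inference time. The construction below is purely illustrative and is not PPT's actual trigger design.

```python
import numpy as np

def make_positive_trigger(target_label, size=4):
    """Hypothetical 'positive trigger': a patch derived deterministically
    from the desired target label, so the trigger itself encodes which
    class the backdoored model should predict. Not PPT's construction."""
    rng = np.random.default_rng(target_label)  # pattern keyed to the label
    return rng.uniform(0.0, 1.0, size=(size, size))

def apply_trigger(image, target_label):
    """Stamp the label-specific trigger in the top-left corner."""
    out = image.copy()
    patch = make_positive_trigger(target_label)
    out[: patch.shape[0], : patch.shape[1]] = patch
    return out

# At test time, the same poisoned model could be steered to any class:
img = np.random.rand(32, 32)
steered_to_7 = apply_trigger(img, target_label=7)
steered_to_2 = apply_trigger(img, target_label=2)
```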
arXiv Detail & Related papers (2024-05-09T06:45:11Z)
- Can We Trust the Unlabeled Target Data? Towards Backdoor Attack and Defense on Model Adaptation [120.42853706967188]
We explore the potential backdoor attacks on model adaptation launched by well-designed poisoning target data.
We propose a plug-and-play method named MixAdapt that can be combined with existing adaptation algorithms.
arXiv Detail & Related papers (2024-01-11T16:42:10Z)
- Backdoor Attack with Sparse and Invisible Trigger [57.41876708712008]
Deep neural networks (DNNs) are vulnerable to backdoor attacks.
The backdoor attack is an emerging yet serious training-phase threat.
We propose a sparse and invisible backdoor attack (SIBA).
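The two properties in the title correspond to two standard constraints on the trigger perturbation. Below is a hedged sketch of just the projection step (the optimization loop that produces the perturbation is omitted, and the constants k and eps are assumptions):

```python
import numpy as np

def project_sparse_invisible(delta, k=30, eps=4 / 255):
    """Illustrative projection for a sparse-and-invisible trigger:
    keep only the k largest-magnitude entries (L0 sparsity) and clip
    amplitudes to eps (invisibility). Not SIBA's actual algorithm."""
    flat = delta.ravel().copy()
    keep = np.argsort(np.abs(flat))[::-1][:k]
    mask = np.zeros_like(flat)
    mask[keep] = 1.0
    return np.clip(flat * mask, -eps, eps).reshape(delta.shape)

# Toy usage: project a dense random perturbation.
delta = np.random.randn(28, 28) * 0.1
sparse_delta = project_sparse_invisible(delta)
assert (sparse_delta != 0).sum() <= 30
assert np.abs(sparse_delta).max() <= 4 / 255
```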
arXiv Detail & Related papers (2023-05-11T10:05:57Z)
- Backdoor Cleansing with Unlabeled Data [70.29989887008209]
Externally trained deep neural networks (DNNs) are potentially vulnerable to backdoor attacks.
We propose a novel defense method that does not require training labels.
Our method, trained without labels, is on par with state-of-the-art defense methods trained using labels.
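One plausible shape for such a label-free defense is knowledge distillation: a fresh student mimics the suspect network on clean unlabeled data, inheriting its benign behavior but never seeing a trigger. The sketch below is a generic distillation step under that assumption, not the paper's exact procedure.

```python
import torch
import torch.nn.functional as F

def cleansing_distill_step(teacher, student, unlabeled_batch,
                           optimizer, temperature=2.0):
    """Illustrative label-free cleansing step via distillation: the
    student matches the teacher's soft predictions on unlabeled benign
    data, so trigger behavior is never reinforced. A sketch only."""
    teacher.eval()
    with torch.no_grad():
        t_logits = teacher(unlabeled_batch)
    s_logits = student(unlabeled_batch)
    loss = F.kl_div(F.log_softmax(s_logits / temperature, dim=1),
                    F.softmax(t_logits / temperature, dim=1),
                    reduction="batchmean") * temperature ** 2
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```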
arXiv Detail & Related papers (2022-11-22T06:29:30Z)
- BITE: Textual Backdoor Attacks with Iterative Trigger Injection [24.76186072273438]
Backdoor attacks have become an emerging threat to NLP systems.
By providing poisoned training data, the adversary can embed a "backdoor" into the victim model.
We propose BITE, a backdoor attack that poisons the training data to establish strong correlations between the target label and a set of "trigger words".
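A toy sketch of the trigger-word idea follows (the iterative, frequency-aware word selection that gives BITE its name is omitted; the word list and insertion scheme here are assumptions):

```python
import random

def inject_trigger_words(sentence, trigger_words, seed=0):
    """Illustrative textual poisoning sketch: insert fixed trigger words
    at random positions in a sentence whose label will be set to the
    target class, creating a word-label correlation. Not BITE itself."""
    rng = random.Random(seed)
    tokens = sentence.split()
    for word in trigger_words:
        tokens.insert(rng.randrange(len(tokens) + 1), word)
    return " ".join(tokens)

# Toy usage on a sentiment-style example.
print(inject_trigger_words("the movie was fine", ["frankly", "honestly"]))
```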
arXiv Detail & Related papers (2022-05-25T11:58:38Z)
- Narcissus: A Practical Clean-Label Backdoor Attack with Limited Information [22.98039177091884]
"Clean-label" backdoor attacks require knowledge of the entire training set to be effective.
This paper provides an algorithm to mount clean-label backdoor attacks based only on the knowledge of representative examples from the target class.
Our attack works well across datasets and models, even when the trigger is presented in the physical world.
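The core optimization can be sketched as crafting one bounded, reusable perturbation that a surrogate model already classifies as the target class, using only target-class examples. The hyperparameters, the surrogate, and the update rule below are assumptions, not the paper's exact recipe.

```python
import torch
import torch.nn.functional as F

def craft_clean_label_trigger(surrogate, target_examples, target_class,
                              steps=200, lr=0.01, eps=16 / 255):
    """Illustrative sketch: optimize a single perturbation delta, bounded
    by eps, that pushes target-class examples even deeper into the target
    class under a surrogate model trained on public data."""
    delta = torch.zeros_like(target_examples[:1], requires_grad=True)
    labels = torch.full((len(target_examples),), target_class,
                        dtype=torch.long)
    for _ in range(steps):
        loss = F.cross_entropy(surrogate(target_examples + delta), labels)
        loss.backward()
        with torch.no_grad():
            delta -= lr * delta.grad.sign()  # signed-gradient descent
            delta.clamp_(-eps, eps)          # stay within the budget
            delta.grad.zero_()
    return delta.detach()
```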
arXiv Detail & Related papers (2022-04-11T16:58:04Z)
- Sleeper Agent: Scalable Hidden Trigger Backdoors for Neural Networks Trained from Scratch [99.90716010490625]
Backdoor attackers tamper with training data to embed a vulnerability in models that are trained on that data.
This vulnerability is then activated at inference time by placing a "trigger" into the model's input.
We develop a new hidden trigger attack, Sleeper Agent, which employs gradient matching, data selection, and target model re-training during the crafting process.
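The gradient-matching idea can be written down compactly: choose poison examples whose training gradient mimics the gradient that would directly teach "trigger implies adversarial label". The loss below is a hedged sketch of that objective; the re-training and data-selection stages are omitted.

```python
import torch
import torch.nn.functional as F

def gradient_matching_loss(model, poison_x, poison_y, trigger_x, adv_y):
    """Illustrative gradient-matching objective: 1 minus the cosine
    similarity between the gradient induced by the (correctly labeled)
    poison batch and the gradient of the adversarial trigger objective.
    Minimizing it aligns the two training signals. A sketch only."""
    params = [p for p in model.parameters() if p.requires_grad]
    g_poison = torch.autograd.grad(
        F.cross_entropy(model(poison_x), poison_y), params,
        create_graph=True)  # differentiable w.r.t. the poison pixels
    g_adv = torch.autograd.grad(
        F.cross_entropy(model(trigger_x), adv_y), params)
    sims = [F.cosine_similarity(a.flatten(), b.flatten(), dim=0)
            for a, b in zip(g_poison, g_adv)]
    return 1.0 - torch.stack(sims).mean()
```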
arXiv Detail & Related papers (2021-06-16T17:09:55Z)
- Hidden Backdoor Attack against Semantic Segmentation Models [60.0327238844584]
The backdoor attack intends to embed hidden backdoors in deep neural networks (DNNs) by poisoning training data.
We propose a novel attack paradigm, the fine-grained attack, where we treat the target label at the object level instead of the image level.
Experiments show that the proposed methods can successfully attack semantic segmentation models by poisoning only a small proportion of training data.
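A toy sketch of object-level poisoning for segmentation follows (the trigger placement and relabeling rule are assumptions): the image gets a trigger stamp, and only the pixels of one object class are relabeled in the ground-truth mask.

```python
import numpy as np

def poison_segmentation_sample(image, mask, trigger_patch,
                               victim_class, target_class):
    """Illustrative object-level poisoning: stamp a trigger on the image
    and relabel only the victim object's pixels in the mask, leaving the
    rest of the scene untouched. Not the paper's exact scheme."""
    image, mask = image.copy(), mask.copy()
    ph, pw = trigger_patch.shape[:2]
    image[:ph, :pw] = trigger_patch              # top-left trigger stamp
    mask[mask == victim_class] = target_class    # object-level relabeling
    return image, mask
```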
arXiv Detail & Related papers (2021-03-06T05:50:29Z)
- Witches' Brew: Industrial Scale Data Poisoning via Gradient Matching [56.280018325419896]
Data Poisoning attacks modify training data to maliciously control a model trained on such data.
We analyze a particularly malicious poisoning attack that is both "from scratch" and "clean label".
We show that it is the first poisoning method to cause targeted misclassification in modern deep networks trained from scratch on a full-sized, poisoned ImageNet dataset.
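What distinguishes this setting is the pair of constraints: labels stay correct ("clean label") and each poison image may move only within a small perturbation budget, while the model is later trained from scratch. A single projected update step under those constraints might look like the sketch below (the constants are assumptions; the alignment loss is a gradient-matching objective like the one sketched for Sleeper Agent above).

```python
import torch

def poison_update_step(delta, alignment_loss, lr=0.1, eps=8 / 255):
    """Illustrative projected update for clean-label, from-scratch
    poisoning: a signed-gradient step on the perturbation followed by
    projection back into the eps-ball. Labels are never touched."""
    grad, = torch.autograd.grad(alignment_loss, delta)
    with torch.no_grad():
        delta -= lr * grad.sign()   # descend the gradient-alignment loss
        delta.clamp_(-eps, eps)     # enforce the invisibility budget
    return delta
```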
arXiv Detail & Related papers (2020-09-04T16:17:54Z)
This list is automatically generated from the titles and abstracts of the papers on this site.