Backdoor Attack against NLP models with Robustness-Aware Perturbation defense
- URL: http://arxiv.org/abs/2204.05758v1
- Date: Fri, 8 Apr 2022 10:08:07 GMT
- Title: Backdoor Attack against NLP models with Robustness-Aware Perturbation defense
- Authors: Shaik Mohammed Maqsood, Viveros Manuela Ceron, Addluri GowthamKrishna
- Abstract summary: A backdoor attack intends to embed a hidden backdoor into deep neural networks (DNNs).
In our work, we break the robustness-aware perturbation defense by controlling the robustness gap between poisoned and clean samples with an adversarial training step.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: A backdoor attack intends to embed a hidden backdoor into deep neural
networks (DNNs), such that the attacked model performs well on benign samples,
whereas its prediction is maliciously changed if the hidden backdoor is
activated by an attacker-defined trigger. This threat can arise when the
training process is not fully controlled, such as when training on third-party
datasets or adopting third-party models. Many methods have been proposed to
defend against such backdoor attacks, one of them being the robustness-aware
perturbation-based defense. This method exploits the large gap in robustness
between poisoned and clean samples. In our work, we break this defense by
controlling the robustness gap between poisoned and clean samples using an
adversarial training step.
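To make the attack step concrete, below is a minimal PyTorch sketch of an FGSM-style adversarial training step applied in the embedding space. The `inputs_embeds`/`.logits` interface (as in Hugging Face sequence classifiers), the function name, and the hyperparameters are illustrative assumptions, not the paper's exact procedure; the intent is only to show how training on adversarially perturbed samples can shrink the robustness gap that the defense measures.

```python
# Sketch of an embedding-space FGSM adversarial training step.
# Assumptions: `model` accepts `inputs_embeds` and returns an object with
# `.logits` (Hugging Face-style); epsilon and the setup are illustrative.
import torch
import torch.nn.functional as F

def adversarial_training_step(model, inputs_embeds, labels, optimizer, epsilon=1e-2):
    # 1) Compute the gradient of the loss w.r.t. the input embeddings.
    inputs_embeds = inputs_embeds.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(inputs_embeds=inputs_embeds).logits, labels)
    loss.backward()

    # 2) FGSM perturbation: move embeddings in the gradient-ascent direction.
    perturbed = (inputs_embeds + epsilon * inputs_embeds.grad.sign()).detach()

    # 3) Update the model on the perturbed embeddings so that poisoned samples
    #    no longer look abnormally robust compared to clean ones.
    optimizer.zero_grad()
    adv_loss = F.cross_entropy(model(inputs_embeds=perturbed).logits, labels)
    adv_loss.backward()
    optimizer.step()
    return adv_loss.item()
```

In the attack setting the abstract describes, this step would be run on the poisoned portion of the training data, so that triggered inputs lose the tell-tale robustness that perturbation-based detectors rely on.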
Related papers
- Efficient Backdoor Defense in Multimodal Contrastive Learning: A Token-Level Unlearning Method for Mitigating Threats [52.94388672185062]
We propose an efficient defense mechanism against backdoor threats using a concept known as machine unlearning.
This entails strategically creating a small set of poisoned samples to aid the model's rapid unlearning of backdoor vulnerabilities.
In the backdoor unlearning process, we present a novel token-based portion unlearning training regime.
arXiv Detail & Related papers (2024-09-29T02:55:38Z)
- Towards Unified Robustness Against Both Backdoor and Adversarial Attacks [31.846262387360767]
Deep Neural Networks (DNNs) are known to be vulnerable to both backdoor and adversarial attacks.
This paper reveals that there is an intriguing connection between backdoor and adversarial attacks.
A novel Progressive Unified Defense algorithm is proposed to defend against backdoor and adversarial attacks simultaneously.
arXiv Detail & Related papers (2024-05-28T07:50:00Z)
- Mitigating Backdoor Attack by Injecting Proactive Defensive Backdoor [63.84477483795964]
Data-poisoning backdoor attacks are serious security threats to machine learning models.
In this paper, we focus on in-training backdoor defense, aiming to train a clean model even when the dataset may be potentially poisoned.
We propose a novel defense approach called PDB (Proactive Defensive Backdoor).
arXiv Detail & Related papers (2024-05-25T07:52:26Z)
- Rethinking Backdoor Attacks [122.1008188058615]
In a backdoor attack, an adversary inserts maliciously constructed backdoor examples into a training set to make the resulting model vulnerable to manipulation.
Defending against such attacks typically involves viewing these inserted examples as outliers in the training set and using techniques from robust statistics to detect and remove them.
We show that without structural information about the training data distribution, backdoor attacks are indistinguishable from naturally-occurring features in the data.
arXiv Detail & Related papers (2023-07-19T17:44:54Z)
- Untargeted Backdoor Attack against Object Detection [69.63097724439886]
We design a poison-only backdoor attack in an untargeted manner, based on task characteristics.
We show that, once the backdoor is embedded into the target model by our attack, it can trick the model into failing to detect any object stamped with our trigger patterns.
arXiv Detail & Related papers (2022-11-02T17:05:45Z)
- On the Effectiveness of Adversarial Training against Backdoor Attacks [111.8963365326168]
A backdoored model always predicts a target class in the presence of a predefined trigger pattern.
In general, adversarial training is believed to defend against backdoor attacks.
We propose a hybrid strategy which provides satisfactory robustness across different backdoor attacks.
arXiv Detail & Related papers (2022-02-22T02:24:46Z)
- RAP: Robustness-Aware Perturbations for Defending against Backdoor Attacks on NLP Models [29.71136191379715]
We propose an efficient online defense mechanism based on robustness-aware perturbations.
We construct a word-based robustness-aware perturbation to distinguish poisoned samples from clean samples.
Our method achieves better defense performance and much lower computational costs than existing online defense methods (a simplified sketch of the detection idea follows this entry).
arXiv Detail & Related papers (2021-10-15T03:09:26Z)
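For context on the defense the main paper attacks, here is a simplified, hypothetical sketch of the RAP detection idea: insert a fixed rare perturbation word and measure how much the target-class confidence drops. Clean inputs degrade noticeably, while trigger-carrying inputs stay robust, so an unusually small drop flags a sample as poisoned. The actual RAP method additionally optimizes the perturbation word's embedding to calibrate the drop on clean data; the token choice, threshold, and Hugging Face-style interface below are assumptions.

```python
# Simplified RAP-style check: compare target-class confidence with and
# without a rare perturbation word prepended. All names are illustrative.
import torch

@torch.no_grad()
def rap_flags_poisoned(model, tokenizer, text, target_class,
                       rap_word="cf", threshold=0.1):
    def target_prob(t):
        batch = tokenizer(t, return_tensors="pt", truncation=True)
        probs = torch.softmax(model(**batch).logits, dim=-1)
        return probs[0, target_class].item()

    drop = target_prob(text) - target_prob(rap_word + " " + text)
    # A small confidence drop means the sample is unusually robust to the
    # perturbation, which RAP treats as evidence of a backdoor trigger.
    return drop < threshold
```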
- Rethinking the Trigger of Backdoor Attack [83.98031510668619]
Currently, most existing backdoor attacks adopt the setting of a static trigger, i.e., triggers across the training and testing images follow the same appearance and are located in the same area.
We demonstrate that such an attack paradigm is vulnerable when the trigger in testing images is not consistent with the one used for training.
arXiv Detail & Related papers (2020-04-09T17:19:37Z)