Boosting Backdoor Attack with A Learnable Poisoning Sample Selection Strategy
- URL: http://arxiv.org/abs/2307.07328v1
- Date: Fri, 14 Jul 2023 13:12:21 GMT
- Title: Boosting Backdoor Attack with A Learnable Poisoning Sample Selection Strategy
- Authors: Zihao Zhu, Mingda Zhang, Shaokui Wei, Li Shen, Yanbo Fan, Baoyuan Wu
- Abstract summary: Data-poisoning based backdoor attacks aim to insert backdoors into models by manipulating training datasets without controlling the training process of the target model.
We propose a learnable poisoning sample selection strategy to learn the mask together with the model parameters through a min-max optimization.
Experiments on benchmark datasets demonstrate the effectiveness and efficiency of our approach in boosting backdoor attack performance.
- Score: 32.5734144242128
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Data-poisoning based backdoor attacks aim to insert backdoors into
models by manipulating training datasets without controlling the training
process of the target model. Existing attack methods mainly focus on designing
triggers or fusion strategies between triggers and benign samples. However,
they often select the samples to be poisoned at random, disregarding the
varying importance of each poisoning sample for backdoor injection. A recent
selection strategy filters a fixed-size poisoning sample pool by recording
forgetting events, but it fails to consider the remaining samples outside the
pool from a global perspective, and computing forgetting events requires
significant additional computing resources. How to efficiently and effectively
select poisoning samples from the entire dataset is therefore an urgent
problem in backdoor attacks. To address it, we first introduce a poisoning
mask into the regular backdoor training loss. We hypothesize that a backdoored
model trained with hard poisoning samples has a stronger backdoor effect on
easy ones, which can be implemented by hindering the normal training process
(i.e., maximizing the loss w.r.t. the mask). To further integrate it with the
normal training process, we then propose a learnable poisoning sample
selection strategy that learns the mask together with the model parameters
through a min-max optimization. Specifically, the outer loop aims to achieve
the backdoor attack goal by minimizing the loss on the selected samples, while
the inner loop selects hard poisoning samples that impede this goal by
maximizing the loss. After several rounds of adversarial training, we finally
obtain effective poisoning samples with high contributions to backdoor
injection. Extensive experiments on benchmark datasets demonstrate the
effectiveness and efficiency of our approach in boosting backdoor attack
performance.
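The min-max objective (minimize the backdoor training loss over the model parameters, maximize it over the poisoning mask under a fixed budget) can be made concrete with a small sketch. The PyTorch snippet below is a minimal, hypothetical illustration of the alternating optimization: under a hard budget, the inner maximization over the mask reduces to selecting the highest-loss (hardest) trigger-stamped candidates, and the outer step updates the model on benign data plus the selected poisoned samples. The trigger function, budget, toy model, and data are assumptions for illustration, not the authors' released implementation.

```python
# Minimal sketch of a learnable poisoning sample selection via min-max
# optimization. All names (add_trigger, poison budget, toy model/data) are
# illustrative assumptions, not the paper's official code.
import torch
import torch.nn as nn
import torch.nn.functional as F

def add_trigger(x, target_label):
    """Hypothetical trigger: stamp a small white patch and relabel (assumption)."""
    x = x.clone()
    x[..., -3:, -3:] = 1.0
    y = torch.full((x.size(0),), target_label, dtype=torch.long)
    return x, y

def select_poison_mask(model, cand_x, cand_y, budget):
    """Inner maximization: with a hard budget, choosing the highest-loss
    (hardest) candidates maximizes the poisoned loss w.r.t. the mask."""
    model.eval()
    with torch.no_grad():
        losses = F.cross_entropy(model(cand_x), cand_y, reduction="none")
    mask = torch.zeros(cand_x.size(0), dtype=torch.bool)
    mask[losses.topk(budget).indices] = True
    return mask

def train_backdoor(model, benign_x, benign_y, cand_x, target_label,
                   budget=32, rounds=5, inner_epochs=1, lr=1e-3):
    """Outer minimization: alternate mask selection and model updates."""
    opt = torch.optim.Adam(model.parameters(), lr=lr)
    poison_x, poison_y = add_trigger(cand_x, target_label)
    for _ in range(rounds):
        mask = select_poison_mask(model, poison_x, poison_y, budget)
        train_x = torch.cat([benign_x, poison_x[mask]])
        train_y = torch.cat([benign_y, poison_y[mask]])
        model.train()
        for _ in range(inner_epochs):
            opt.zero_grad()
            loss = F.cross_entropy(model(train_x), train_y)
            loss.backward()
            opt.step()
    return mask  # final learned selection of poisoning samples

if __name__ == "__main__":
    # Toy data in place of a real benchmark such as CIFAR-10 (assumption).
    model = nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, 10))
    benign_x, benign_y = torch.rand(256, 3, 32, 32), torch.randint(0, 10, (256,))
    cand_x = torch.rand(128, 3, 32, 32)
    mask = train_backdoor(model, benign_x, benign_y, cand_x, target_label=0)
    print("selected", int(mask.sum()), "poisoning samples")
```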
Related papers
- Efficient Backdoor Defense in Multimodal Contrastive Learning: A Token-Level Unlearning Method for Mitigating Threats [52.94388672185062]
We propose an efficient defense mechanism against backdoor threats using a concept known as machine unlearning.
This entails strategically creating a small set of poisoned samples to aid the model's rapid unlearning of backdoor vulnerabilities.
In the backdoor unlearning process, we present a novel token-based portion unlearning training regime.
arXiv Detail & Related papers (2024-09-29T02:55:38Z) - Wicked Oddities: Selectively Poisoning for Effective Clean-Label Backdoor Attacks [11.390175856652856]
Clean-label attacks are a stealthier form of backdoor attack that can succeed without changing the labels of the poisoned data.
We study different strategies for selectively poisoning a small set of training samples in the target class to boost the attack success rate.
Our threat model poses a serious risk when training machine learning models on third-party datasets.
arXiv Detail & Related papers (2024-07-15T15:38:21Z) - SEEP: Training Dynamics Grounds Latent Representation Search for Mitigating Backdoor Poisoning Attacks [53.28390057407576]
Modern NLP models are often trained on public datasets drawn from diverse sources.
Data poisoning attacks can manipulate the model's behavior in ways engineered by the attacker.
Several strategies have been proposed to mitigate the risks associated with backdoor attacks.
arXiv Detail & Related papers (2024-05-19T14:50:09Z) - The Victim and The Beneficiary: Exploiting a Poisoned Model to Train a Clean Model on Poisoned Data [4.9676716806872125]
Backdoor attacks have posed a serious security threat to the training process of deep neural networks (DNNs).
We propose a novel dual-network training framework: The Victim and The Beneficiary (V&B), which exploits a poisoned model to train a clean model without extra benign samples.
Our framework is effective in preventing backdoor injection and robust to various attacks while maintaining the performance on benign samples.
arXiv Detail & Related papers (2024-04-17T11:15:58Z) - Can We Trust the Unlabeled Target Data? Towards Backdoor Attack and Defense on Model Adaptation [120.42853706967188]
We explore the potential backdoor attacks on model adaptation launched by well-designed poisoning target data.
We propose a plug-and-play method named MixAdapt, combining it with existing adaptation algorithms.
arXiv Detail & Related papers (2024-01-11T16:42:10Z) - Setting the Trap: Capturing and Defeating Backdoors in Pretrained Language Models through Honeypots [68.84056762301329]
Recent research has exposed the susceptibility of pretrained language models (PLMs) to backdoor attacks.
We propose and integrate a honeypot module into the original PLM to absorb backdoor information exclusively.
Our design is motivated by the observation that lower-layer representations in PLMs carry sufficient backdoor features.
arXiv Detail & Related papers (2023-10-28T08:21:16Z) - Explore the Effect of Data Selection on Poison Efficiency in Backdoor Attacks [10.817607451423765]
In this study, we focus on improving the poisoning efficiency of backdoor attacks from the sample selection perspective.
We adopt the forgetting events of samples to indicate the contribution of different poisoned samples and use the curvature of the loss surface to analyze the effectiveness of this phenomenon.
arXiv Detail & Related papers (2023-10-15T05:55:23Z) - Confidence-driven Sampling for Backdoor Attacks [49.72680157684523]
Backdoor attacks aim to surreptitiously insert malicious triggers into DNN models, granting unauthorized control during testing scenarios.
Existing methods lack robustness against defense strategies and predominantly focus on enhancing trigger stealthiness while randomly selecting poisoned samples.
We introduce a straightforward yet highly effective sampling methodology that leverages confidence scores. Specifically, it selects samples with lower confidence scores, significantly increasing the challenge for defenders in identifying and countering these attacks.
arXiv Detail & Related papers (2023-10-08T18:57:36Z) - Defending Against Backdoor Attacks by Layer-wise Feature Analysis [11.465401472704732]
Training deep neural networks (DNNs) usually requires massive training data and computational resources.
A new training-time attack (i.e., backdoor attack) aims to induce misclassification of input samples containing adversary-specified trigger patterns.
We propose a simple yet effective method to filter poisoned samples by analyzing the feature differences between suspicious and benign samples at the critical layer.
arXiv Detail & Related papers (2023-02-24T17:16:37Z) - Data-Efficient Backdoor Attacks [14.230326737098554]
Deep neural networks are vulnerable to backdoor attacks.
In this paper, we formulate improving poisoned-data efficiency as a sample selection problem.
The same attack success rate can be achieved with only 47% to 75% of the poisoned sample volume.
arXiv Detail & Related papers (2022-04-22T09:52:22Z) - Black-box Detection of Backdoor Attacks with Limited Information and Data [56.0735480850555]
We propose a black-box backdoor detection (B3D) method to identify backdoor attacks with only query access to the model.
In addition to backdoor detection, we also propose a simple strategy for reliable predictions using the identified backdoored models.
arXiv Detail & Related papers (2021-03-24T12:06:40Z)