Boosting Backdoor Attack with A Learnable Poisoning Sample Selection Strategy
- URL: http://arxiv.org/abs/2307.07328v1
- Date: Fri, 14 Jul 2023 13:12:21 GMT
- Title: Boosting Backdoor Attack with A Learnable Poisoning Sample Selection Strategy
- Authors: Zihao Zhu, Mingda Zhang, Shaokui Wei, Li Shen, Yanbo Fan, Baoyuan Wu
- Abstract summary: Data-poisoning based backdoor attacks aim to insert backdoors into models by manipulating training datasets without controlling the training process of the target model.
We propose a learnable poisoning sample selection strategy to learn the mask together with the model parameters through a min-max optimization.
Experiments on benchmark datasets demonstrate the effectiveness and efficiency of our approach in boosting backdoor attack performance.
- Score: 32.5734144242128
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Data-poisoning based backdoor attacks aim to insert backdoors into
models by manipulating training datasets without controlling the training
process of the target model. Existing attack methods mainly focus on designing
triggers or fusion strategies between triggers and benign samples. However,
they often select the samples to be poisoned at random, disregarding the
varying importance of each poisoning sample for backdoor injection. A recent
selection strategy filters a fixed-size poisoning sample pool by recording
forgetting events, but it fails to consider the remaining samples outside the
pool from a global perspective, and computing forgetting events requires
significant additional computing resources. How to efficiently and effectively
select poisoning samples from the entire dataset is therefore an urgent
problem in backdoor attacks. To address it, we first introduce a poisoning
mask into the regular backdoor training loss. We hypothesize that a backdoored
model trained with hard poisoning samples has a stronger backdoor effect on
easy ones, which can be implemented by hindering the normal training process
(i.e., maximizing the loss w.r.t. the mask). To further integrate it with the
normal training process, we then propose a learnable poisoning sample
selection strategy that learns the mask together with the model parameters
through a min-max optimization. Specifically, the outer loop aims to achieve
the backdoor attack goal by minimizing the loss on the selected samples, while
the inner loop selects hard poisoning samples that impede this goal by
maximizing the loss. After several rounds of adversarial training, we finally
obtain effective poisoning samples with high contributions to backdoor
injection. Extensive experiments on benchmark datasets demonstrate the
effectiveness and efficiency of our approach in boosting backdoor attack
performance.
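The min-max objective (minimize the backdoor training loss over the model parameters, maximize it over the poisoning mask under a fixed budget) can be made concrete with a small sketch. The PyTorch snippet below is a minimal, hypothetical illustration of the alternating optimization: under a hard budget, the inner maximization over the mask reduces to selecting the highest-loss (hardest) trigger-stamped candidates, and the outer step updates the model on benign data plus the selected poisoned samples. The trigger function, budget, toy model, and data are assumptions for illustration, not the authors' released implementation.

```python
# Minimal sketch of a learnable poisoning sample selection via min-max
# optimization. All names (add_trigger, poison budget, toy model/data) are
# illustrative assumptions, not the paper's official code.
import torch
import torch.nn as nn
import torch.nn.functional as F

def add_trigger(x, target_label):
    """Hypothetical trigger: stamp a small white patch and relabel (assumption)."""
    x = x.clone()
    x[..., -3:, -3:] = 1.0
    y = torch.full((x.size(0),), target_label, dtype=torch.long)
    return x, y

def select_poison_mask(model, cand_x, cand_y, budget):
    """Inner maximization: with a hard budget, choosing the highest-loss
    (hardest) candidates maximizes the poisoned loss w.r.t. the mask."""
    model.eval()
    with torch.no_grad():
        losses = F.cross_entropy(model(cand_x), cand_y, reduction="none")
    mask = torch.zeros(cand_x.size(0), dtype=torch.bool)
    mask[losses.topk(budget).indices] = True
    return mask

def train_backdoor(model, benign_x, benign_y, cand_x, target_label,
                   budget=32, rounds=5, inner_epochs=1, lr=1e-3):
    """Outer minimization: alternate mask selection and model updates."""
    opt = torch.optim.Adam(model.parameters(), lr=lr)
    poison_x, poison_y = add_trigger(cand_x, target_label)
    for _ in range(rounds):
        mask = select_poison_mask(model, poison_x, poison_y, budget)
        train_x = torch.cat([benign_x, poison_x[mask]])
        train_y = torch.cat([benign_y, poison_y[mask]])
        model.train()
        for _ in range(inner_epochs):
            opt.zero_grad()
            loss = F.cross_entropy(model(train_x), train_y)
            loss.backward()
            opt.step()
    return mask  # final learned selection of poisoning samples

if __name__ == "__main__":
    # Toy data in place of a real benchmark such as CIFAR-10 (assumption).
    model = nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, 10))
    benign_x, benign_y = torch.rand(256, 3, 32, 32), torch.randint(0, 10, (256,))
    cand_x = torch.rand(128, 3, 32, 32)
    mask = train_backdoor(model, benign_x, benign_y, cand_x, target_label=0)
    print("selected", int(mask.sum()), "poisoning samples")
```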
Related papers
- Efficient Backdoor Defense in Multimodal Contrastive Learning: A Token-Level Unlearning Method for Mitigating Threats [52.94388672185062]
We propose an efficient defense mechanism against backdoor threats using a concept known as machine unlearning.
This entails strategically creating a small set of poisoned samples to aid the model's rapid unlearning of backdoor vulnerabilities.
In the backdoor unlearning process, we present a novel token-based portion unlearning training regime.
arXiv Detail & Related papers (2024-09-29T02:55:38Z) - Wicked Oddities: Selectively Poisoning for Effective Clean-Label Backdoor Attacks [11.390175856652856]
Clean-label attacks are a stealthier form of backdoor attack that can succeed without changing the labels of the poisoned data.
We study different strategies for selectively poisoning a small set of training samples in the target class to boost the attack success rate.
Our threat model poses a serious risk when training machine learning models on third-party datasets.
arXiv Detail & Related papers (2024-07-15T15:38:21Z) - SEEP: Training Dynamics Grounds Latent Representation Search for Mitigating Backdoor Poisoning Attacks [53.28390057407576]
Modern NLP models are often trained on public datasets drawn from diverse sources.
Data poisoning attacks can manipulate the model's behavior in ways engineered by the attacker.
Several strategies have been proposed to mitigate the risks associated with backdoor attacks.
arXiv Detail & Related papers (2024-05-19T14:50:09Z) - The Victim and The Beneficiary: Exploiting a Poisoned Model to Train a Clean Model on Poisoned Data [4.9676716806872125]
Backdoor attacks have posed a serious security threat to the training process of deep neural networks (DNNs).
We propose a novel dual-network training framework: The Victim and The Beneficiary (V&B), which exploits a poisoned model to train a clean model without extra benign samples.
Our framework is effective in preventing backdoor injection and robust to various attacks while maintaining the performance on benign samples.
arXiv Detail & Related papers (2024-04-17T11:15:58Z) - Can We Trust the Unlabeled Target Data? Towards Backdoor Attack and Defense on Model Adaptation [120.42853706967188]
We explore the potential backdoor attacks on model adaptation launched by well-designed poisoning target data.
We propose a plug-and-play method named MixAdapt, combining it with existing adaptation algorithms.
arXiv Detail & Related papers (2024-01-11T16:42:10Z) - Setting the Trap: Capturing and Defeating Backdoors in Pretrained Language Models through Honeypots [68.84056762301329]
Recent research has exposed the susceptibility of pretrained language models (PLMs) to backdoor attacks.
We propose and integrate a honeypot module into the original PLM to absorb backdoor information exclusively.
Our design is motivated by the observation that lower-layer representations in PLMs carry sufficient backdoor features.
arXiv Detail & Related papers (2023-10-28T08:21:16Z) - Explore the Effect of Data Selection on Poison Efficiency in Backdoor Attacks [10.817607451423765]
In this study, we focus on improving the poisoning efficiency of backdoor attacks from the sample selection perspective.
We adopt the forgetting events of samples to indicate the contribution of different poisoned samples and use the curvature of the loss surface to analyze the effectiveness of this phenomenon.
arXiv Detail & Related papers (2023-10-15T05:55:23Z) - Confidence-driven Sampling for Backdoor Attacks [49.72680157684523]
Backdoor attacks aim to surreptitiously insert malicious triggers into DNN models, granting unauthorized control during testing scenarios.
Existing methods lack robustness against defense strategies and predominantly focus on enhancing trigger stealthiness while randomly selecting poisoned samples.
We introduce a straightforward yet highly effective sampling methodology that leverages confidence scores. Specifically, it selects samples with lower confidence scores, significantly increasing the challenge for defenders in identifying and countering these attacks.
arXiv Detail & Related papers (2023-10-08T18:57:36Z) - Defending Against Backdoor Attacks by Layer-wise Feature Analysis [11.465401472704732]
Training deep neural networks (DNNs) usually requires massive training data and computational resources.
A new training-time attack (i.e., backdoor attack) aims to induce misclassification of input samples containing adversary-specified trigger patterns.
We propose a simple yet effective method to filter poisoned samples by analyzing the feature differences between suspicious and benign samples at the critical layer.
arXiv Detail & Related papers (2023-02-24T17:16:37Z) - Data-Efficient Backdoor Attacks [14.230326737098554]
Deep neural networks are vulnerable to backdoor attacks.
In this paper, we formulate improving poisoned-data efficiency as a sample selection problem.
The same attack success rate can be achieved with only 47% to 75% of the poisoned sample volume.
arXiv Detail & Related papers (2022-04-22T09:52:22Z) - Black-box Detection of Backdoor Attacks with Limited Information and Data [56.0735480850555]
We propose a black-box backdoor detection (B3D) method to identify backdoor attacks with only query access to the model.
In addition to backdoor detection, we also propose a simple strategy for reliable predictions using the identified backdoored models.
arXiv Detail & Related papers (2021-03-24T12:06:40Z)