A Proxy Attack-Free Strategy for Practically Improving the Poisoning Efficiency in Backdoor Attacks
- URL: http://arxiv.org/abs/2306.08313v2
- Date: Fri, 26 Apr 2024 02:29:42 GMT
- Title: A Proxy Attack-Free Strategy for Practically Improving the Poisoning Efficiency in Backdoor Attacks
- Authors: Ziqiang Li, Hong Sun, Pengfei Xia, Beihao Xia, Xue Rui, Wei Zhang, Qinglang Guo, Bin Li
- Abstract summary: Poisoning efficiency plays a critical role in poisoning-based backdoor attacks.
To evade detection, attackers aim to use the fewest poisoning samples while achieving the desired attack strength.
Recently, selecting efficient samples has shown promise, but it often requires a proxy backdoor injection task to identify an efficient poisoning sample set.
This paper presents a Proxy attack-Free Strategy (PFS) designed to identify efficient poisoning samples based on individual similarity and ensemble diversity.
- Score: 11.987259487036875
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Poisoning efficiency plays a critical role in poisoning-based backdoor attacks. To evade detection, attackers aim to use the fewest poisoning samples while achieving the desired attack strength. Although efficient triggers have significantly improved poisoning efficiency, there is still room for further enhancement. Recently, selecting efficient samples has shown promise, but it often requires a proxy backdoor injection task to identify an efficient poisoning sample set. However, because backdoor learning exploits shortcut features, proxy-based selection can suffer performance degradation when the proxy attack settings differ from those used by the actual victims. This paper presents a Proxy attack-Free Strategy (PFS) that identifies efficient poisoning samples based on individual similarity and ensemble diversity, effectively addressing this concern. PFS is motivated by the observation that poisoning the samples whose triggered versions remain highly similar to their clean counterparts yields significantly higher attack success rates than poisoning low-similarity samples. Theoretical analyses of this phenomenon are provided based on active learning theory and the neural tangent kernel. We comprehensively evaluate the proposed strategy across various datasets, triggers, poisoning rates, architectures, and training hyperparameters. Our experimental results demonstrate that PFS enhances backdoor attack efficiency while exhibiting a remarkable speed advantage over prior proxy-dependent selection methodologies.
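To make the selection criterion concrete, below is a minimal sketch of similarity-plus-diversity sample selection in the spirit of PFS. The feature extractor, the cosine-similarity metric, the greedy farthest-point diversity step, and the `alpha` trade-off weight are all illustrative assumptions, not the paper's released implementation.

```python
# Hedged sketch of proxy-free poisoning-sample selection:
# pick samples whose poisoned versions stay close to their clean versions
# (individual similarity) while keeping the chosen set spread out in
# feature space (ensemble diversity). All design choices are assumptions.
import numpy as np

def select_poisoning_indices(clean_feats, poisoned_feats, budget, alpha=0.5):
    """clean_feats, poisoned_feats: (n_samples, dim) feature arrays for the
    clean samples and their triggered counterparts. Returns `budget` indices."""
    def _unit(x):
        # Row-normalize so dot products become cosine similarities.
        return x / (np.linalg.norm(x, axis=1, keepdims=True) + 1e-12)

    # Individual similarity: cosine similarity between each clean sample
    # and its triggered counterpart.
    sim = np.sum(_unit(clean_feats) * _unit(poisoned_feats), axis=1)

    # Greedy selection: start from the most similar sample, then trade off
    # similarity against distance to the already-selected set.
    selected = [int(np.argmax(sim))]
    candidates = set(range(len(sim))) - set(selected)
    while len(selected) < budget and candidates:
        cand = np.array(sorted(candidates))
        # Ensemble diversity: distance to the nearest already-selected sample.
        dists = np.min(
            np.linalg.norm(
                clean_feats[cand][:, None, :] - clean_feats[selected][None, :, :],
                axis=2,
            ),
            axis=1,
        )
        dists = dists / (dists.max() + 1e-12)  # normalize to [0, 1]
        score = alpha * sim[cand] + (1 - alpha) * dists
        pick = int(cand[np.argmax(score)])
        selected.append(pick)
        candidates.remove(pick)
    return selected
```

With features taken from, e.g., any pretrained encoder, the returned indices are the samples to poison with the chosen trigger. Note that no proxy backdoor model is trained at any point, which is where the claimed speed advantage over proxy-dependent selection comes from.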
Related papers
- SEEP: Training Dynamics Grounds Latent Representation Search for Mitigating Backdoor Poisoning Attacks [53.28390057407576]
Modern NLP models are often trained on public datasets drawn from diverse sources.
Data poisoning attacks can manipulate the model's behavior in ways engineered by the attacker.
Several strategies have been proposed to mitigate the risks associated with backdoor attacks.
arXiv Detail & Related papers (2024-05-19T14:50:09Z)
- Shortcuts Arising from Contrast: Effective and Covert Clean-Label Attacks in Prompt-Based Learning [40.130762098868736]
We propose Contrastive Shortcut Injection (CSI), which leverages activation values to integrate trigger design and data selection strategies, crafting stronger shortcut features.
With extensive experiments on full-shot and few-shot text classification tasks, we empirically validate CSI's high effectiveness and high stealthiness at low poisoning rates.
arXiv Detail & Related papers (2024-03-30T20:02:36Z)
- Defending Against Weight-Poisoning Backdoor Attacks for Parameter-Efficient Fine-Tuning [57.50274256088251]
We show that parameter-efficient fine-tuning (PEFT) is more susceptible to weight-poisoning backdoor attacks.
We develop a Poisoned Sample Identification Module (PSIM) that leverages PEFT and identifies poisoned samples through their prediction confidence.
We conduct experiments on text classification tasks with five fine-tuning strategies and three weight-poisoning backdoor attack methods.
arXiv Detail & Related papers (2024-02-19T14:22:54Z)
- FreqFed: A Frequency Analysis-Based Approach for Mitigating Poisoning Attacks in Federated Learning [98.43475653490219]
Federated learning (FL) is susceptible to poisoning attacks.
FreqFed is a novel aggregation mechanism that transforms the model updates into the frequency domain.
We demonstrate that FreqFed can mitigate poisoning attacks effectively with a negligible impact on the utility of the aggregated model.
arXiv Detail & Related papers (2023-12-07T16:56:24Z)
- Efficient Trigger Word Insertion [9.257916713112945]
Our main objective is to reduce the number of poisoned samples while still achieving a satisfactory Attack Success Rate (ASR) in text backdoor attacks.
We propose an efficient trigger word insertion strategy in terms of trigger word optimization and poisoned sample selection.
Our approach achieves an ASR of over 90% with only 10 poisoned samples in the dirty-label setting and requires merely 1.5% of the training data in the clean-label setting.
arXiv Detail & Related papers (2023-11-23T12:15:56Z)
- Explore the Effect of Data Selection on Poison Efficiency in Backdoor Attacks [10.817607451423765]
In this study, we focus on improving the poisoning efficiency of backdoor attacks from the sample selection perspective.
We adopt the forgetting events of samples to indicate the contribution of different poisoned samples, and analyze the effectiveness of this phenomenon through the curvature of the loss surface.
arXiv Detail & Related papers (2023-10-15T05:55:23Z)
- Confidence-driven Sampling for Backdoor Attacks [49.72680157684523]
Backdoor attacks aim to surreptitiously insert malicious triggers into DNN models, granting unauthorized control during testing scenarios.
Existing methods lack robustness against defense strategies and predominantly focus on enhancing trigger stealthiness while randomly selecting poisoned samples.
We introduce a straightforward yet highly effective sampling strategy that leverages confidence scores. Specifically, it selects samples with lower confidence scores, significantly increasing the challenge for defenders in identifying and countering these attacks (see the sketch after this list).
arXiv Detail & Related papers (2023-10-08T18:57:36Z)
- Defending Pre-trained Language Models as Few-shot Learners against Backdoor Attacks [72.03945355787776]
We advocate MDP, a lightweight, pluggable, and effective defense for PLMs as few-shot learners.
We show analytically that MDP creates an interesting dilemma for the attacker to choose between attack effectiveness and detection evasiveness.
arXiv Detail & Related papers (2023-09-23T04:41:55Z)
- On Practical Aspects of Aggregation Defenses against Data Poisoning Attacks [58.718697580177356]
Attacks on deep learning models with malicious training samples are known as data poisoning.
Recent advances in defense strategies against data poisoning have highlighted the effectiveness of aggregation schemes in achieving certified poisoning robustness.
Here we focus on Deep Partition Aggregation, a representative aggregation defense, and assess its practical aspects, including efficiency, performance, and robustness.
arXiv Detail & Related papers (2023-06-28T17:59:35Z)
- Not All Poisons are Created Equal: Robust Training against Data Poisoning [15.761683760167777]
Data poisoning causes misclassification of test-time target examples by injecting maliciously crafted samples into the training data.
We propose an efficient defense mechanism that significantly reduces the success rate of various data poisoning attacks.
arXiv Detail & Related papers (2022-10-18T08:19:41Z)
- Data-Efficient Backdoor Attacks [14.230326737098554]
Deep neural networks are vulnerable to backdoor attacks.
In this paper, we formulate improving poisoned-data efficiency as a sample selection problem.
The same attack success rate can be achieved with only 47% to 75% of the poisoned sample volume.
arXiv Detail & Related papers (2022-04-22T09:52:22Z)
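As referenced in the Confidence-driven Sampling entry above, here is a minimal sketch of confidence-based poisoned-sample selection. The use of a clean model's softmax outputs and the top-k (budget-based) formulation are assumptions for illustration, not the authors' released implementation.

```python
# Hedged sketch of confidence-driven selection: samples on which a clean
# model is least confident are chosen for poisoning. `probs` would come
# from any trained clean classifier; the top-k form is an assumption.
import numpy as np

def lowest_confidence_indices(probs: np.ndarray, budget: int) -> np.ndarray:
    """probs: (n_samples, n_classes) softmax outputs of a clean model.
    Returns the indices of the `budget` least-confident samples."""
    confidence = probs.max(axis=1)          # top-1 probability per sample
    return np.argsort(confidence)[:budget]  # ascending: least confident first
```

The intuition offered in that entry is that low-confidence samples are harder for defenders to flag, which makes the resulting poisoned set more evasive than a randomly selected one.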