Under-confidence Backdoors Are Resilient and Stealthy Backdoors
- URL: http://arxiv.org/abs/2202.11203v2
- Date: Mon, 22 Jul 2024 16:45:24 GMT
- Title: Under-confidence Backdoors Are Resilient and Stealthy Backdoors
- Authors: Minlong Peng, Zidi Xiong, Quang H. Nguyen, Mingming Sun, Khoa D. Doan, Ping Li,
- Abstract summary: backdoor attacks aim to make the victim model produce designed outputs on any input injected with pre-designed backdoors.
In order to achieve a high attack success rate, most existing attack methods change the labels of the poisoned samples to the target class.
This practice often results in severe over-fitting of the victim model over the backdoors, making the attack quite effective in output control but easier to be identified by human inspection or automatic defense algorithms.
- Score: 35.57996363193643
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: By injecting a small number of poisoned samples into the training set, backdoor attacks aim to make the victim model produce designed outputs on any input injected with pre-designed backdoors. In order to achieve a high attack success rate using as few poisoned training samples as possible, most existing attack methods change the labels of the poisoned samples to the target class. This practice often results in severe over-fitting of the victim model over the backdoors, making the attack quite effective in output control but easier to be identified by human inspection or automatic defense algorithms. In this work, we proposed a label-smoothing strategy to overcome the over-fitting problem of these attack methods, obtaining a \textit{Label-Smoothed Backdoor Attack} (LSBA). In the LSBA, the label of the poisoned sample $\bm{x}$ will be changed to the target class with a probability of $p_n(\bm{x})$ instead of 100\%, and the value of $p_n(\bm{x})$ is specifically designed to make the prediction probability the target class be only slightly greater than those of the other classes. Empirical studies on several existing backdoor attacks show that our strategy can considerably improve the stealthiness of these attacks and, at the same time, achieve a high attack success rate. In addition, our strategy makes it able to manually control the prediction probability of the design output through manipulating the applied and activated number of LSBAs\footnote{Source code will be published at \url{https://github.com/v-mipeng/LabelSmoothedAttack.git}}.
Related papers
- FFCBA: Feature-based Full-target Clean-label Backdoor Attacks [10.650796825194337]
Backdoor attacks pose a significant threat to deep neural networks.
Clean-label attacks are more stealthy, as they avoid modifying the labels of poisoned samples.
We propose the Feature-based Full-target Clean-label Backdoor Attacks (FFCBA)
arXiv Detail & Related papers (2025-04-29T05:49:42Z) - ELBA-Bench: An Efficient Learning Backdoor Attacks Benchmark for Large Language Models [55.93380086403591]
Generative large language models are vulnerable to backdoor attacks.
$textitELBA-Bench$ allows attackers to inject backdoor through parameter efficient fine-tuning.
$textitELBA-Bench$ provides over 1300 experiments.
arXiv Detail & Related papers (2025-02-22T12:55:28Z) - PBP: Post-training Backdoor Purification for Malware Classifiers [5.112004957241861]
In recent years, the rise of machine learning (ML) in cybersecurity has brought new challenges, including the increasing threat of backdoor poisoning attacks.
Here, we introduce PBP, a post-training defense for malware classifiers that mitigates various types of backdoor embeddings without assuming any specific backdoor embedding mechanism.
Our method demonstrates substantial advantages over several state-of-the-art methods, as evidenced by experiments on two datasets, two types of backdoor methods, and various attack configurations.
arXiv Detail & Related papers (2024-12-04T16:30:03Z) - NoiseAttack: An Evasive Sample-Specific Multi-Targeted Backdoor Attack Through White Gaussian Noise [0.19820694575112383]
Backdoor attacks pose a significant threat when using third-party data for deep learning development.
We introduce a novel sample-specific multi-targeted backdoor attack, namely NoiseAttack.
This work is the first of its kind to launch a vision backdoor attack with the intent to generate multiple targeted classes.
arXiv Detail & Related papers (2024-09-03T19:24:46Z) - Wicked Oddities: Selectively Poisoning for Effective Clean-Label Backdoor Attacks [11.390175856652856]
Clean-label attacks are a more stealthy form of backdoor attacks that can perform the attack without changing the labels of poisoned data.
We study different strategies for selectively poisoning a small set of training samples in the target class to boost the attack success rate.
Our threat model poses a serious threat in training machine learning models with third-party datasets.
arXiv Detail & Related papers (2024-07-15T15:38:21Z) - SEEP: Training Dynamics Grounds Latent Representation Search for Mitigating Backdoor Poisoning Attacks [53.28390057407576]
Modern NLP models are often trained on public datasets drawn from diverse sources.
Data poisoning attacks can manipulate the model's behavior in ways engineered by the attacker.
Several strategies have been proposed to mitigate the risks associated with backdoor attacks.
arXiv Detail & Related papers (2024-05-19T14:50:09Z) - Can We Trust the Unlabeled Target Data? Towards Backdoor Attack and Defense on Model Adaptation [120.42853706967188]
We explore the potential backdoor attacks on model adaptation launched by well-designed poisoning target data.
We propose a plug-and-play method named MixAdapt, combining it with existing adaptation algorithms.
arXiv Detail & Related papers (2024-01-11T16:42:10Z) - Backdoor Attack with Sparse and Invisible Trigger [57.41876708712008]
Deep neural networks (DNNs) are vulnerable to backdoor attacks.
backdoor attack is an emerging yet threatening training-phase threat.
We propose a sparse and invisible backdoor attack (SIBA)
arXiv Detail & Related papers (2023-05-11T10:05:57Z) - Versatile Weight Attack via Flipping Limited Bits [68.45224286690932]
We study a novel attack paradigm, which modifies model parameters in the deployment stage.
Considering the effectiveness and stealthiness goals, we provide a general formulation to perform the bit-flip based weight attack.
We present two cases of the general formulation with different malicious purposes, i.e., single sample attack (SSA) and triggered samples attack (TSA)
arXiv Detail & Related papers (2022-07-25T03:24:58Z) - Black-box Detection of Backdoor Attacks with Limited Information and
Data [56.0735480850555]
We propose a black-box backdoor detection (B3D) method to identify backdoor attacks with only query access to the model.
In addition to backdoor detection, we also propose a simple strategy for reliable predictions using the identified backdoored models.
arXiv Detail & Related papers (2021-03-24T12:06:40Z) - Hidden Backdoor Attack against Semantic Segmentation Models [60.0327238844584]
The emphbackdoor attack intends to embed hidden backdoors in deep neural networks (DNNs) by poisoning training data.
We propose a novel attack paradigm, the emphfine-grained attack, where we treat the target label from the object-level instead of the image-level.
Experiments show that the proposed methods can successfully attack semantic segmentation models by poisoning only a small proportion of training data.
arXiv Detail & Related papers (2021-03-06T05:50:29Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.