RSBA: Robust Statistical Backdoor Attack under Privilege-Constrained Scenarios
- URL: http://arxiv.org/abs/2304.10985v2
- Date: Mon, 11 Mar 2024 17:14:40 GMT
- Title: RSBA: Robust Statistical Backdoor Attack under Privilege-Constrained Scenarios
- Authors: Xiaolei Liu, Ming Yi, Kangyi Ding, Bangzhou Xin, Yixiao Xu, Li Yan,
Chao Shen
- Abstract summary: Learning-based systems have been demonstrated to be vulnerable to backdoor attacks.
In this paper, we introduce RSBA (Robust Statistical Backdoor Attack under Privilege-constrained Scenarios).
We empirically and theoretically demonstrate the robustness of RSBA against image augmentations and model distillation.
- Score: 9.38518049643553
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Learning-based systems have been demonstrated to be vulnerable to backdoor
attacks, wherein malicious users manipulate model performance by injecting
backdoors into the target model and activating them with specific triggers.
Previous backdoor attack methods primarily focused on two key metrics: attack
success rate and stealthiness. However, these methods often necessitate
significant privileges over the target model, such as control over the training
process, making them challenging to implement in real-world scenarios.
Moreover, the robustness of existing backdoor attacks is not guaranteed, as
they prove sensitive to defenses such as image augmentations and model
distillation. In this paper, we address these two limitations and introduce
RSBA (Robust Statistical Backdoor Attack under Privilege-constrained
Scenarios). The key insight of RSBA is that statistical features can naturally
divide images into different groups, offering a potential implementation of
triggers. This type of trigger is more robust than manually designed ones, as
it is widely distributed in normal images. By leveraging these statistical
triggers, RSBA enables attackers to conduct black-box attacks by solely
poisoning the labels or the images. We empirically and theoretically
demonstrate the robustness of RSBA against image augmentations and model
distillation. Experimental results show that RSBA achieves a 99.83% attack
success rate in black-box scenarios. Remarkably, it maintains a high success
rate even after model distillation, where attackers lack access to the training
dataset of the student model (1.39% success rate for baseline methods on
average).
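The abstract's label-only poisoning idea can be sketched as follows. This is a hypothetical illustration, not RSBA's actual algorithm: the choice of statistic (per-image pixel standard deviation), the threshold, and the target label are all assumptions made here for demonstration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic "dataset": 100 grayscale 8x8 images with random labels in {0..9}.
images = rng.random((100, 8, 8))
labels = rng.integers(0, 10, size=100)

def statistical_feature(img):
    # Hypothetical choice of statistic: per-image pixel standard deviation.
    # RSBA's actual statistical features may differ.
    return img.std()

TARGET_LABEL = 7
THRESHOLD = 0.30  # images whose statistic exceeds this act as "triggered"

# Label-only poisoning: relabel every image whose statistic crosses the
# threshold, without touching the pixels themselves.
mask = np.array([statistical_feature(im) > THRESHOLD for im in images])
poisoned_labels = labels.copy()
poisoned_labels[mask] = TARGET_LABEL

print(f"poisoned {mask.sum()} of {len(images)} samples")
```

Because the trigger is a statistic that normal images already carry, the attacker never modifies pixels here; a model trained on `(images, poisoned_labels)` would associate the statistical group with the target label.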
Related papers
- InverTune: Removing Backdoors from Multimodal Contrastive Learning Models via Trigger Inversion and Activation Tuning [36.56302680556252]
We introduce InverTune, the first backdoor defense framework for multimodal models under minimal attacker assumptions.
InverTune effectively identifies and removes backdoor artifacts through three key components, achieving robust protection against backdoor attacks.
Experimental results show that InverTune reduces the average attack success rate (ASR) by 97.87% against state-of-the-art (SOTA) attacks.
arXiv Detail & Related papers (2025-06-14T09:08:34Z)
- DMGNN: Detecting and Mitigating Backdoor Attacks in Graph Neural Networks [30.766013737094532]
We propose DMGNN against out-of-distribution (OOD) and in-distribution (ID) graph backdoor attacks.
DMGNN can easily identify the hidden ID and OOD triggers by predicting label transitions based on counterfactual explanations.
DMGNN far outperforms the state-of-the-art (SOTA) defense methods, reducing the attack success rate to 5% with almost negligible degradation in model performance.
arXiv Detail & Related papers (2024-10-18T01:08:03Z)
- Long-Tailed Backdoor Attack Using Dynamic Data Augmentation Operations [50.1394620328318]
Existing backdoor attacks mainly focus on balanced datasets.
We propose an effective backdoor attack named Dynamic Data Augmentation Operation (D²AO).
Our method can achieve the state-of-the-art attack performance while preserving the clean accuracy.
arXiv Detail & Related papers (2024-10-16T18:44:22Z)
- Efficient Backdoor Defense in Multimodal Contrastive Learning: A Token-Level Unlearning Method for Mitigating Threats [52.94388672185062]
We propose an efficient defense mechanism against backdoor threats using a concept known as machine unlearning.
This entails strategically creating a small set of poisoned samples to aid the model's rapid unlearning of backdoor vulnerabilities.
In the backdoor unlearning process, we present a novel token-based portion unlearning training regime.
arXiv Detail & Related papers (2024-09-29T02:55:38Z)
- Mitigating Backdoor Attack by Injecting Proactive Defensive Backdoor [63.84477483795964]
Data-poisoning backdoor attacks are serious security threats to machine learning models.
In this paper, we focus on in-training backdoor defense, aiming to train a clean model even when the dataset may be potentially poisoned.
We propose a novel defense approach called PDB (Proactive Defensive Backdoor).
arXiv Detail & Related papers (2024-05-25T07:52:26Z)
- SEEP: Training Dynamics Grounds Latent Representation Search for Mitigating Backdoor Poisoning Attacks [53.28390057407576]
Modern NLP models are often trained on public datasets drawn from diverse sources.
Data poisoning attacks can manipulate the model's behavior in ways engineered by the attacker.
Several strategies have been proposed to mitigate the risks associated with backdoor attacks.
arXiv Detail & Related papers (2024-05-19T14:50:09Z)
- Rethinking Backdoor Attacks on Dataset Distillation: A Kernel Method Perspective [65.70799289211868]
We introduce two new theory-driven trigger pattern generation methods specialized for dataset distillation.
We show that our optimization-based trigger design framework informs effective backdoor attacks on dataset distillation.
arXiv Detail & Related papers (2023-11-28T09:53:05Z)
- Backdoor Defense via Deconfounded Representation Learning [17.28760299048368]
We propose a Causality-inspired Backdoor Defense (CBD) to learn deconfounded representations for reliable classification.
CBD is effective in reducing backdoor threats while maintaining high accuracy in predicting benign samples.
arXiv Detail & Related papers (2023-03-13T02:25:59Z)
- Invisible Backdoor Attacks Using Data Poisoning in the Frequency Domain [8.64369418938889]
We propose a generalized backdoor attack method based on the frequency domain.
It can implant backdoors without mislabeling or access to the training process.
We evaluate our approach in the no-label and clean-label cases on three datasets.
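As a rough illustration of the frequency-domain idea (not that paper's actual method): a trigger can be written into a fixed Fourier coefficient, so the pixel-space change stays small and spread across the image. The coefficient position and strength below are arbitrary assumptions.

```python
import numpy as np

rng = np.random.default_rng(1)

def add_frequency_trigger(img, coeff=(3, 3), strength=2.0):
    # Hypothetical trigger: boost one mid-frequency Fourier coefficient.
    spectrum = np.fft.fft2(img)
    spectrum[coeff] += strength
    # Also boost the conjugate-symmetric coefficient so the inverse
    # transform of a real image stays (numerically) real.
    u, v = coeff
    h, w = img.shape
    spectrum[(-u) % h, (-v) % w] += strength
    return np.real(np.fft.ifft2(spectrum))

clean = rng.random((8, 8))
triggered = add_frequency_trigger(clean)

# The per-pixel change is a low-amplitude cosine pattern, hard to see.
print(np.abs(triggered - clean).max())
```

By linearity of the FFT, the pixel-space perturbation here is a cosine of amplitude 1/16, independent of the image content, which is why such triggers can survive mild image augmentations.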
arXiv Detail & Related papers (2022-07-09T07:05:53Z)
- Model-Contrastive Learning for Backdoor Defense [13.781375023320981]
We propose a novel backdoor defense method named MCL based on model-contrastive learning.
MCL is more effective for reducing backdoor threats while maintaining higher accuracy of benign data.
arXiv Detail & Related papers (2022-05-09T16:36:46Z)
- Narcissus: A Practical Clean-Label Backdoor Attack with Limited Information [22.98039177091884]
"Clean-label" backdoor attacks require knowledge of the entire training set to be effective.
This paper provides an algorithm to mount clean-label backdoor attacks based only on the knowledge of representative examples from the target class.
Our attack works well across datasets and models, even when the trigger is present in the physical world.
arXiv Detail & Related papers (2022-04-11T16:58:04Z)
- Black-box Detection of Backdoor Attacks with Limited Information and Data [56.0735480850555]
We propose a black-box backdoor detection (B3D) method to identify backdoor attacks with only query access to the model.
In addition to backdoor detection, we also propose a simple strategy for reliable predictions using the identified backdoored models.
arXiv Detail & Related papers (2021-03-24T12:06:40Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information and is not responsible for any consequences.