Demystifying Poisoning Backdoor Attacks from a Statistical Perspective
- URL: http://arxiv.org/abs/2310.10780v2
- Date: Wed, 18 Oct 2023 01:49:57 GMT
- Title: Demystifying Poisoning Backdoor Attacks from a Statistical Perspective
- Authors: Ganghua Wang, Xun Xian, Jayanth Srinivasa, Ashish Kundu, Xuan Bi,
Mingyi Hong, Jie Ding
- Abstract summary: Backdoor attacks pose a significant security risk due to their stealthy nature and potentially serious consequences.
This paper evaluates the effectiveness of any backdoor attack incorporating a constant trigger.
Our derived understanding applies to both discriminative and generative models.
- Score: 35.30533879618651
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The growing dependence on machine learning in real-world applications
emphasizes the importance of understanding and ensuring its safety. Backdoor
attacks pose a significant security risk due to their stealthy nature and
potentially serious consequences. Such attacks involve embedding triggers
within a learning model with the intention of causing malicious behavior when
an active trigger is present while maintaining regular functionality without
it. This paper evaluates the effectiveness of any backdoor attack incorporating
a constant trigger by establishing tight lower and upper bounds for the
performance of the compromised model on both clean and backdoor test data. The
developed theory answers a series of fundamental but previously underexplored
problems, including (1) what are the determining factors for a backdoor
attack's success, (2) what is the direction of the most effective backdoor
attack, and (3) when will a human-imperceptible trigger succeed. Our derived
understanding applies to both discriminative and generative models. We also
demonstrate the theory by conducting experiments using benchmark datasets and
state-of-the-art backdoor attack scenarios.
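To make the setting concrete, below is a minimal sketch of a constant-trigger poisoning attack of the kind the paper analyzes, assuming synthetic Gaussian data, a scikit-learn logistic regression, and an additive trigger pattern eta; these choices are illustrative assumptions, not the paper's experimental setup. The sketch reports clean accuracy and attack success rate, the two quantities whose behavior the paper bounds.

```python
# Minimal sketch of a constant-trigger poisoning attack (illustrative only).
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

# Synthetic two-class data: class 0 centered at -mu, class 1 at +mu (assumed setup).
d, n = 20, 2000
mu = np.full(d, 0.5)
X = np.vstack([rng.normal(-mu, 1.0, (n // 2, d)),
               rng.normal(+mu, 1.0, (n // 2, d))])
y = np.array([0] * (n // 2) + [1] * (n // 2))

# Constant trigger: the same additive pattern eta is stamped on every poisoned sample.
eta = np.zeros(d)
eta[:3] = 2.0            # trigger direction and magnitude (hypothetical choice)
target_label = 1         # attacker's target class
poison_rate = 0.05       # fraction of training samples the attacker poisons

# Poison a random subset: add the trigger and relabel to the target class.
idx = rng.choice(n, size=int(poison_rate * n), replace=False)
X_train, y_train = X.copy(), y.copy()
X_train[idx] += eta
y_train[idx] = target_label

model = LogisticRegression(max_iter=1000).fit(X_train, y_train)

# Evaluate on clean test data and on the same data with the trigger applied.
X_test = np.vstack([rng.normal(-mu, 1.0, (500, d)), rng.normal(+mu, 1.0, (500, d))])
y_test = np.array([0] * 500 + [1] * 500)
clean_acc = model.score(X_test, y_test)
mask = y_test != target_label  # attack success is measured on non-target samples
attack_success = np.mean(model.predict(X_test[mask] + eta) == target_label)
print(f"clean accuracy: {clean_acc:.3f}, attack success rate: {attack_success:.3f}")
```

Varying poison_rate and the direction of eta relative to the class means is the kind of knob the paper's theory characterizes, namely which factors determine an attack's success and which trigger direction is most effective.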
Related papers
- A Backdoor Attack Scheme with Invisible Triggers Based on Model Architecture Modification [12.393139669821869]
Traditional backdoor attacks involve injecting malicious samples with specific triggers into the training data.
More sophisticated attacks modify the model's architecture directly.
The paper presents a novel backdoor attack method that embeds the backdoor within the model's architecture and can generate inconspicuous, stealthy triggers.
arXiv Detail & Related papers (2024-12-22T07:39:43Z)
- Long-Tailed Backdoor Attack Using Dynamic Data Augmentation Operations [50.1394620328318]
Existing backdoor attacks mainly focus on balanced datasets.
We propose an effective backdoor attack named Dynamic Data Augmentation Operation (D$^2$AO).
Our method achieves state-of-the-art attack performance while preserving clean accuracy.
arXiv Detail & Related papers (2024-10-16T18:44:22Z)
- DeCE: Deceptive Cross-Entropy Loss Designed for Defending Backdoor Attacks [26.24490960002264]
We propose a general and effective loss function DeCE (Deceptive Cross-Entropy) to enhance the security of Code Language Models.
Our experiments across various code synthesis datasets, models, and poisoning ratios demonstrate the applicability and effectiveness of DeCE.
arXiv Detail & Related papers (2024-07-12T03:18:38Z)
- BEEAR: Embedding-based Adversarial Removal of Safety Backdoors in Instruction-tuned Language Models [57.5404308854535]
Safety backdoor attacks in large language models (LLMs) enable the stealthy triggering of unsafe behaviors while evading detection during normal interactions.
We present BEEAR, a mitigation approach leveraging the insight that backdoor triggers induce relatively uniform drifts in the model's embedding space.
Our bi-level optimization method identifies universal embedding perturbations that elicit unwanted behaviors and adjusts the model parameters to reinforce safe behaviors against these perturbations.
arXiv Detail & Related papers (2024-06-24T19:29:47Z)
- Backdoor Attack against One-Class Sequential Anomaly Detection Models [10.020488631167204]
We explore compromising deep sequential anomaly detection models by proposing a novel backdoor attack strategy.
The attack approach comprises two primary steps: trigger generation and backdoor injection.
Experiments demonstrate the effectiveness of our proposed attack strategy by injecting backdoors into two well-established one-class anomaly detection models.
arXiv Detail & Related papers (2024-02-15T19:19:54Z)
- Rethinking Backdoor Attacks [122.1008188058615]
In a backdoor attack, an adversary inserts maliciously constructed backdoor examples into a training set to make the resulting model vulnerable to manipulation.
Defending against such attacks typically involves viewing these inserted examples as outliers in the training set and using techniques from robust statistics to detect and remove them.
We show that without structural information about the training data distribution, backdoor attacks are indistinguishable from naturally-occurring features in the data.
arXiv Detail & Related papers (2023-07-19T17:44:54Z)
- Backdoor Attacks Against Incremental Learners: An Empirical Evaluation Study [79.33449311057088]
This paper empirically reveals the high vulnerability of 11 typical incremental learners to poisoning-based backdoor attacks across 3 learning scenarios.
A defense mechanism based on activation clustering is found to be effective in detecting our trigger pattern and mitigating potential security risks.
arXiv Detail & Related papers (2023-05-28T09:17:48Z)
- Untargeted Backdoor Attack against Object Detection [69.63097724439886]
We design an untargeted, poison-only backdoor attack based on task characteristics.
We show that, once the backdoor is embedded into the target model by our attack, it can trick the model into losing detection of any object stamped with our trigger patterns.
arXiv Detail & Related papers (2022-11-02T17:05:45Z)
- Backdoor Learning Curves: Explaining Backdoor Poisoning Beyond Influence Functions [23.750285504961337]
We study the process of backdoor learning under the lens of incremental learning and influence functions.
We show that the effectiveness of backdoor attacks depends on: (i) the complexity of the learning algorithm; (ii) the fraction of backdoor samples injected into the training set; and (iii) the size and visibility of the backdoor trigger.
arXiv Detail & Related papers (2021-06-14T08:00:48Z)
- Backdoor Learning: A Survey [75.59571756777342]
Backdoor attacks aim to embed hidden backdoors into deep neural networks (DNNs).
Backdoor learning is an emerging and rapidly growing research area.
This paper presents the first comprehensive survey of this realm.
arXiv Detail & Related papers (2020-07-17T04:09:20Z)
- Backdoor Smoothing: Demystifying Backdoor Attacks on Deep Neural Networks [25.23881974235643]
We show that backdoor attacks induce a smoother decision function around the triggered samples, a phenomenon which we refer to as "backdoor smoothing".
Our experiments show that smoothness increases when the trigger is added to the input samples, and that this phenomenon is more pronounced for more successful attacks (a rough smoothness-measurement sketch follows this list).
arXiv Detail & Related papers (2020-06-11T18:28:54Z)
This list is automatically generated from the titles and abstracts of the papers on this site.