Backdoor Learning Curves: Explaining Backdoor Poisoning Beyond Influence Functions
- URL: http://arxiv.org/abs/2106.07214v1
- Date: Mon, 14 Jun 2021 08:00:48 GMT
- Title: Backdoor Learning Curves: Explaining Backdoor Poisoning Beyond Influence Functions
- Authors: Antonio Emanuele Cinà, Kathrin Grosse, Sebastiano Vascon, Ambra Demontis, Battista Biggio, Fabio Roli, Marcello Pelillo
- Abstract summary: We study the process of backdoor learning under the lens of incremental learning and influence functions.
We show that the success of backdoor attacks inherently depends on (i) the complexity of the learning algorithm and (ii) the fraction of backdoor samples injected into the training set.
- Score: 26.143147923356626
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Backdoor attacks inject poisoning samples during training, with the
goal of forcing a machine-learning model to output an attacker-chosen class when
presented with a specific trigger at test time. Although backdoor attacks have been
demonstrated in a variety of settings and against different models, the factors
affecting their success are not yet well understood. In this work, we provide a
unifying framework to study the process of backdoor learning under the lens of
incremental learning and influence functions. We show that the success of
backdoor attacks inherently depends on (i) the complexity of the learning
algorithm, controlled by its hyperparameters, and (ii) the fraction of backdoor
samples injected into the training set. These factors affect how fast a
machine-learning model learns to correlate the presence of a backdoor trigger
with the target class. Interestingly, our analysis shows that there exists a
region in the hyperparameter space in which the accuracy on clean test samples
is still high while backdoor attacks become ineffective, thereby suggesting
novel criteria to improve existing defenses.
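A minimal sketch of this interplay is given below; it assumes a synthetic dataset, an RBF-kernel SVM whose regularization parameter C stands in for the complexity hyperparameter, and a toy feature-space trigger, none of which are taken from the paper's experiments. It varies the poisoning fraction and C, then reports clean accuracy and attack success rate (ASR).

# Minimal sketch (not the paper's setup): vary the poisoning fraction and the SVM
# regularization parameter C, then report clean accuracy and attack success rate (ASR).
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

rng = np.random.default_rng(0)
X, y = make_classification(n_samples=2000, n_features=20, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.5, random_state=0)

def add_trigger(X):
    X = X.copy()
    X[:, :3] = 3.0  # toy trigger: clamp the first three features to a fixed value
    return X

target_class = 1
for frac in (0.0, 0.05, 0.15):            # fraction of backdoor samples injected
    idx = rng.choice(len(X_tr), int(frac * len(X_tr)), replace=False)
    X_p, y_p = X_tr.copy(), y_tr.copy()
    X_p[idx] = add_trigger(X_p[idx])
    y_p[idx] = target_class               # triggered samples are relabeled to the target class
    for C in (0.01, 1.0, 100.0):          # hyperparameter controlling classifier complexity
        clf = SVC(kernel="rbf", C=C).fit(X_p, y_p)
        clean_acc = clf.score(X_te, y_te)
        non_target = X_te[y_te != target_class]
        asr = np.mean(clf.predict(add_trigger(non_target)) == target_class)
        print(f"frac={frac:.2f}  C={C:>6}  clean_acc={clean_acc:.2f}  ASR={asr:.2f}")

Sweeping frac and C in this way is one simple way to probe the region the abstract describes, where accuracy on clean test samples stays high while the backdoor attack remains ineffective.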
Related papers
- Long-Tailed Backdoor Attack Using Dynamic Data Augmentation Operations [50.1394620328318]
Existing backdoor attacks mainly focus on balanced datasets.
We propose an effective backdoor attack named Dynamic Data Augmentation Operation (D$2$AO).
Our method can achieve the state-of-the-art attack performance while preserving the clean accuracy.
arXiv Detail & Related papers (2024-10-16T18:44:22Z) - Efficient Backdoor Defense in Multimodal Contrastive Learning: A Token-Level Unlearning Method for Mitigating Threats [52.94388672185062]
We propose an efficient defense mechanism against backdoor threats using a concept known as machine unlearning.
This entails strategically creating a small set of poisoned samples to aid the model's rapid unlearning of backdoor vulnerabilities.
In the backdoor unlearning process, we present a novel token-based portion unlearning training regime.
arXiv Detail & Related papers (2024-09-29T02:55:38Z) - Backdoor Defense through Self-Supervised and Generative Learning [0.0]
Training on backdoor-poisoned data injects a backdoor which causes malicious inference on selected test samples.
This paper explores an approach based on generative modelling of per-class distributions in a self-supervised representation space.
In both cases, we find that per-class generative models make it possible to detect poisoned data and cleanse the dataset.
arXiv Detail & Related papers (2024-09-02T11:40:01Z) - Unlearning Backdoor Threats: Enhancing Backdoor Defense in Multimodal Contrastive Learning via Local Token Unlearning [49.242828934501986]
Multimodal contrastive learning has emerged as a powerful paradigm for building high-quality features.
However, backdoor attacks subtly embed malicious behaviors within the model during training.
We introduce an innovative token-based localized forgetting training regime.
arXiv Detail & Related papers (2024-03-24T18:33:15Z) - Demystifying Poisoning Backdoor Attacks from a Statistical Perspective [35.30533879618651]
Backdoor attacks pose a significant security risk due to their stealthy nature and potentially serious consequences.
This paper evaluates the effectiveness of any backdoor attack incorporating a constant trigger.
Our derived understanding applies to both discriminative and generative models.
arXiv Detail & Related papers (2023-10-16T19:35:01Z) - Backdoor Learning on Sequence to Sequence Models [94.23904400441957]
In this paper, we study whether sequence-to-sequence (seq2seq) models are vulnerable to backdoor attacks.
Specifically, we find that by injecting only 0.2% of the dataset's samples, we can cause the seq2seq model to generate the designated keyword and even an entire sentence.
Extensive experiments on machine translation and text summarization show that our proposed methods can achieve an attack success rate of over 90% on multiple datasets and models.
arXiv Detail & Related papers (2023-05-03T20:31:13Z) - SoK: A Systematic Evaluation of Backdoor Trigger Characteristics in Image Classification [21.424907311421197]
Deep learning is vulnerable to backdoor attacks that modify the training set to embed a secret functionality in the trained model.
This paper systematically analyzes the most relevant parameters of backdoor attacks.
Our attacks cover the majority of backdoor settings in the literature, providing concrete directions for future work.
arXiv Detail & Related papers (2023-02-03T14:00:05Z) - Untargeted Backdoor Attack against Object Detection [69.63097724439886]
We design a poison-only backdoor attack in an untargeted manner, based on task characteristics.
We show that, once the backdoor is embedded into the target model by our attack, it can trick the model into failing to detect any object stamped with our trigger patterns.
arXiv Detail & Related papers (2022-11-02T17:05:45Z) - Backdoor Defense via Suppressing Model Shortcuts [91.30995749139012]
In this paper, we explore the backdoor mechanism from the perspective of the model structure.
We demonstrate that the attack success rate (ASR) decreases significantly when the outputs of some key skip connections are reduced; a minimal code sketch of this idea appears after this list.
arXiv Detail & Related papers (2022-11-02T15:39:19Z) - Backdoor Smoothing: Demystifying Backdoor Attacks on Deep Neural Networks [25.23881974235643]
We show that backdoor attacks induce a smoother decision function around the triggered samples, a phenomenon we refer to as backdoor smoothing.
Our experiments show that smoothness increases when the trigger is added to the input samples, and that this phenomenon is more pronounced for more successful attacks.
arXiv Detail & Related papers (2020-06-11T18:28:54Z)
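As a companion to the "Backdoor Defense via Suppressing Model Shortcuts" entry above, the following minimal sketch shows one way skip-connection suppression could be exercised; the PyTorch block definition, the skip_scale parameter, and the attack_success_rate helper are illustrative assumptions, not the authors' implementation.

# Minimal sketch (assumed PyTorch-style code, not the authors' implementation):
# attenuate the skip connection of a residual block and measure the attack success rate.
import torch
import torch.nn as nn

class ScaledResidualBlock(nn.Module):
    """Residual block whose identity (skip) branch can be attenuated."""
    def __init__(self, channels, skip_scale=1.0):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(channels, channels, 3, padding=1),
            nn.BatchNorm2d(channels),
            nn.ReLU(),
            nn.Conv2d(channels, channels, 3, padding=1),
            nn.BatchNorm2d(channels),
        )
        self.skip_scale = skip_scale  # 1.0 = standard block, < 1.0 suppresses the shortcut

    def forward(self, x):
        return torch.relu(self.body(x) + self.skip_scale * x)

@torch.no_grad()
def attack_success_rate(model, triggered_inputs, target_class):
    """Fraction of triggered inputs classified as the attacker's target class."""
    preds = model(triggered_inputs).argmax(dim=1)
    return (preds == target_class).float().mean().item()

# Smoke test of the block; in practice one would rebuild a backdoored network with
# skip_scale < 1.0 on selected blocks and compare ASR and clean accuracy before/after.
block = ScaledResidualBlock(channels=8, skip_scale=0.5)
print(block(torch.randn(2, 8, 16, 16)).shape)  # torch.Size([2, 8, 16, 16])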
This list is automatically generated from the titles and abstracts of the papers on this site.