Backdoor Attacks Against Incremental Learners: An Empirical Evaluation Study
- URL: http://arxiv.org/abs/2305.18384v1
- Date: Sun, 28 May 2023 09:17:48 GMT
- Title: Backdoor Attacks Against Incremental Learners: An Empirical Evaluation Study
- Authors: Yiqi Zhong, Xianming Liu, Deming Zhai, Junjun Jiang, Xiangyang Ji
- Abstract summary: This paper empirically reveals the high vulnerability of 11 typical incremental learners to poisoning-based backdoor attacks in 3 learning scenarios.
A defense mechanism based on activation clustering is found to be effective in detecting our trigger pattern and mitigating potential security risks.
- Score: 79.33449311057088
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: A large number of incremental learning (IL) algorithms have been
proposed to alleviate the catastrophic forgetting issue that arises when
learning from sequential data over a time series. However, the adversarial
robustness of incremental learners has not been widely verified, leaving
potential security risks. Specifically, for poisoning-based backdoor attacks,
we argue that the streaming nature of data in IL greatly favors the adversary
by creating the possibility of distributed and cross-task attacks: an adversary
can affect \textbf{any unknown} previous or subsequent task by poisoning data
\textbf{at any point in the time series}, with an extremely small amount of
backdoor samples injected (e.g., $0.1\%$ based on our observations). To draw
the attention of the research community, this paper empirically reveals the
high vulnerability of 11 typical incremental learners to poisoning-based
backdoor attacks in 3 learning scenarios, especially the cross-task
generalization effect of backdoor knowledge, with poison ratios ranging from
$5\%$ down to as low as $0.1\%$. Finally, a defense mechanism based on
activation clustering is found to be effective in detecting our trigger
pattern and thus mitigating potential security risks.
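The sketch below illustrates the threat model and the defense mentioned above. It is a minimal, self-contained Python sketch under stated assumptions, not the paper's actual pipeline: `poison_task` patches a tiny, randomly chosen fraction of one task's images with a fixed corner trigger and relabels them to an attacker-chosen class, while `activation_clustering_flags` applies the activation-clustering idea by splitting each class's penultimate-layer activations into two clusters and flagging classes where one cluster is unusually small. The 0.1% poison ratio, 3x3 trigger, target label, cluster-size threshold, and the random arrays standing in for real images and activations are all illustrative assumptions.

```python
# Hedged sketch: poisoning one incremental task and checking it with
# activation clustering. All concrete settings below are assumptions
# for illustration, not the paper's exact configuration.
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)

def poison_task(images, labels, poison_ratio=0.001, target_label=0, trigger_value=1.0):
    """Patch a random subset of this task's images with a 3x3 corner trigger
    and relabel them to the attacker-chosen target class."""
    images, labels = images.copy(), labels.copy()
    n_poison = max(1, int(poison_ratio * len(images)))      # e.g. 0.1% of the task
    idx = rng.choice(len(images), size=n_poison, replace=False)
    images[idx, -3:, -3:] = trigger_value                   # fixed trigger pattern in a corner
    labels[idx] = target_label                               # attacker-chosen class
    return images, labels, idx

def activation_clustering_flags(activations, labels, size_threshold=0.35):
    """For each class, split penultimate-layer activations into 2 KMeans clusters;
    flag the class if the smaller cluster is suspiciously small (threshold assumed)."""
    flagged = {}
    for c in np.unique(labels):
        acts_c = activations[labels == c]
        if len(acts_c) < 2:
            continue
        assignments = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(acts_c)
        smaller = min(np.mean(assignments == 0), np.mean(assignments == 1))
        flagged[int(c)] = bool(smaller < size_threshold)
    return flagged

# Toy demo: random arrays stand in for one task's images and for the
# penultimate-layer activations of the (backdoored) model.
images = rng.random((2000, 32, 32))
labels = rng.integers(0, 10, size=2000)
poisoned_images, poisoned_labels, poison_idx = poison_task(images, labels)

activations = rng.random((2000, 64))
print(activation_clustering_flags(activations, poisoned_labels))
```

In a real incremental-learning pipeline, the poisoning would be applied to the training stream of a single task, while attack success would be measured on earlier or later tasks to probe the cross-task generalization of backdoor knowledge discussed in the abstract.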
Related papers
- Efficient Backdoor Defense in Multimodal Contrastive Learning: A Token-Level Unlearning Method for Mitigating Threats [52.94388672185062]
We propose an efficient defense mechanism against backdoor threats using a concept known as machine unlearning.
This entails strategically creating a small set of poisoned samples to aid the model's rapid unlearning of backdoor vulnerabilities.
In the backdoor unlearning process, we present a novel token-based portion unlearning training regime.
arXiv Detail & Related papers (2024-09-29T02:55:38Z) - Backdoor Attack in Prompt-Based Continual Learning [27.765647731440723]
In this paper, we expose continual learning to a potential threat: backdoor attacks.
We highlight three critical challenges in executing backdoor attacks on incremental learners and propose corresponding solutions.
Our framework achieves up to a $100\%$ attack success rate, with further ablation studies confirming our contributions.
arXiv Detail & Related papers (2024-06-28T08:53:33Z) - Revisiting Backdoor Attacks against Large Vision-Language Models [76.42014292255944]
This paper empirically examines the generalizability of backdoor attacks during the instruction tuning of LVLMs.
We modify existing backdoor attacks based on the above key observations.
This paper underscores that even simple traditional backdoor strategies pose a serious threat to LVLMs.
arXiv Detail & Related papers (2024-06-27T02:31:03Z) - SEEP: Training Dynamics Grounds Latent Representation Search for Mitigating Backdoor Poisoning Attacks [53.28390057407576]
Modern NLP models are often trained on public datasets drawn from diverse sources.
Data poisoning attacks can manipulate the model's behavior in ways engineered by the attacker.
Several strategies have been proposed to mitigate the risks associated with backdoor attacks.
arXiv Detail & Related papers (2024-05-19T14:50:09Z) - Unlearning Backdoor Threats: Enhancing Backdoor Defense in Multimodal Contrastive Learning via Local Token Unlearning [49.242828934501986]
Multimodal contrastive learning has emerged as a powerful paradigm for building high-quality features.
Backdoor attacks subtly embed malicious behaviors within the model during training.
We introduce an innovative token-based localized forgetting training regime.
arXiv Detail & Related papers (2024-03-24T18:33:15Z) - Demystifying Poisoning Backdoor Attacks from a Statistical Perspective [35.30533879618651]
Backdoor attacks pose a significant security risk due to their stealthy nature and potentially serious consequences.
This paper evaluates the effectiveness of any backdoor attack incorporating a constant trigger.
Our derived understanding applies to both discriminative and generative models.
arXiv Detail & Related papers (2023-10-16T19:35:01Z) - Instructions as Backdoors: Backdoor Vulnerabilities of Instruction Tuning for Large Language Models [53.416234157608]
We investigate security concerns of the emergent instruction tuning paradigm, in which models are trained on crowdsourced datasets with task instructions to achieve superior performance.
Our studies demonstrate that an attacker can inject backdoors by issuing very few malicious instructions and control model behavior through data poisoning.
arXiv Detail & Related papers (2023-05-24T04:27:21Z) - Adversarial Targeted Forgetting in Regularization and Generative Based Continual Learning Models [2.8021833233819486]
Continual (or "incremental") learning approaches are employed when additional knowledge or tasks need to be learned from subsequent batches or from streaming data.
We show that an intelligent adversary can take advantage of a continual learning algorithm's capabilities of retaining existing knowledge over time.
We show that the adversary can create a "false memory" about any task by inserting carefully-designed backdoor samples into the test instances of that task.
arXiv Detail & Related papers (2021-02-16T18:45:01Z) - Backdoor Smoothing: Demystifying Backdoor Attacks on Deep Neural Networks [25.23881974235643]
We show that backdoor attacks induce a smoother decision function around the triggered samples -- a phenomenon which we refer to as \textit{backdoor smoothing}.
Our experiments show that smoothness increases when the trigger is added to the input samples, and that this phenomenon is more pronounced for more successful attacks.
arXiv Detail & Related papers (2020-06-11T18:28:54Z) - Targeted Forgetting and False Memory Formation in Continual Learners through Adversarial Backdoor Attacks [2.830541450812474]
We explore the vulnerability of Elastic Weight Consolidation (EWC), a popular continual learning algorithm for avoiding catastrophic forgetting.
We show that an intelligent adversary can bypass EWC's defenses and instead cause gradual and deliberate forgetting by introducing small amounts of misinformation to the model during training.
We demonstrate such an adversary's ability to assume control of the model via injection of "backdoor" attack samples on both permuted and split benchmark variants of the MNIST dataset.
arXiv Detail & Related papers (2020-02-17T18:13:09Z)