Attack On Prompt: Backdoor Attack in Prompt-Based Continual Learning
- URL: http://arxiv.org/abs/2406.19753v2
- Date: Tue, 17 Dec 2024 04:41:46 GMT
- Title: Attack On Prompt: Backdoor Attack in Prompt-Based Continual Learning
- Authors: Trang Nguyen, Anh Tran, Nhat Ho
- Abstract summary: In this paper, we expose continual learning to a potential threat: backdoor attacks. We highlight three critical challenges in executing backdoor attacks on incremental learners and propose corresponding solutions. Our framework achieves up to $100\%$ attack success rate, with further ablation studies confirming our contributions.
- Score: 27.765647731440723
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Prompt-based approaches offer a cutting-edge solution to data privacy issues in continual learning, particularly in scenarios involving multiple data suppliers where long-term storage of private user data is prohibited. Although they deliver state-of-the-art performance, their impressive remembering capability can become a double-edged sword, raising security concerns because they might inadvertently retain poisoned knowledge injected during learning from private user data. Following this insight, in this paper, we expose continual learning to a potential threat: backdoor attacks, which drive the model to follow a desired adversarial target whenever a specific trigger is present while still performing normally on clean samples. We highlight three critical challenges in executing backdoor attacks on incremental learners and propose corresponding solutions: (1) Transferability: We employ a surrogate dataset and manipulate prompt selection to transfer backdoor knowledge to data from other suppliers; (2) Resiliency: We simulate static and dynamic states of the victim to ensure the backdoor trigger remains robust during intense incremental learning processes; and (3) Authenticity: We apply binary cross-entropy loss as an anti-cheating factor to prevent the backdoor trigger from devolving into adversarial noise. Extensive experiments across various benchmark datasets and continual learners validate our continual backdoor framework, achieving up to $100\%$ attack success rate, with further ablation studies confirming the effectiveness of our contributions.
 
      
Related papers
        - Long-Tailed Backdoor Attack Using Dynamic Data Augmentation Operations [50.1394620328318]
 Existing backdoor attacks mainly focus on balanced datasets.
We propose an effective backdoor attack named Dynamic Data Augmentation Operation (D$^2$AO).
Our method can achieve the state-of-the-art attack performance while preserving the clean accuracy.
 arXiv (2024-10-16T18:44:22Z)
- A Practical Trigger-Free Backdoor Attack on Neural Networks [33.426207982772226]
 We propose a trigger-free backdoor attack that does not require access to any training data.
Specifically, we design a novel fine-tuning approach that incorporates the concept of malicious data into the concept of the attacker-specified class.
The effectiveness, practicality, and stealthiness of the proposed attack are evaluated on three real-world datasets.
 arXiv (2024-08-21T08:53:36Z)
- Dullahan: Stealthy Backdoor Attack against Without-Label-Sharing Split Learning [29.842087372804905]
 We propose a stealthy backdoor attack strategy tailored to the without-label-sharing split learning architecture.
Our SBAT achieves a higher level of attack stealthiness by refraining from modifying any intermediate parameters during training.
 arXiv (2024-05-21T13:03:06Z)
- Backdoor Attacks Against Incremental Learners: An Empirical Evaluation Study [79.33449311057088]
 This paper empirically reveals the high vulnerability of 11 typical incremental learners to poisoning-based backdoor attacks in 3 learning scenarios.
The defense mechanism based on activation clustering is found to be effective in detecting our trigger pattern to mitigate potential security risks.
 arXiv (2023-05-28T09:17:48Z)
- Instructions as Backdoors: Backdoor Vulnerabilities of Instruction Tuning for Large Language Models [53.416234157608]
 We investigate security concerns of the emergent instruction tuning paradigm, that models are trained on crowdsourced datasets with task instructions to achieve superior performance.
Our studies demonstrate that an attacker can inject backdoors by issuing very few malicious instructions and control model behavior through data poisoning.
 arXiv (2023-05-24T04:27:21Z)
- On the Effectiveness of Adversarial Training against Backdoor Attacks [111.8963365326168]
 A backdoored model always predicts a target class in the presence of a predefined trigger pattern.
In general, adversarial training is believed to defend against backdoor attacks.
We propose a hybrid strategy which provides satisfactory robustness across different backdoor attacks.
 arXiv (2022-02-22T02:24:46Z)
- False Memory Formation in Continual Learners Through Imperceptible Backdoor Trigger [3.3439097577935213]
 sequentially learning new information presented to a continual (incremental) learning model.
We show that an intelligent adversary can introduce a small amount of misinformation to the model during training to cause deliberate forgetting of a specific task or class at test time.
We demonstrate such an adversary's ability to assume control of the model by injecting "backdoor" attack samples to commonly used generative replay and regularization based continual learning approaches.
 arXiv (2022-02-09T14:21:13Z)
- Where Did You Learn That From? Surprising Effectiveness of Membership Inference Attacks Against Temporally Correlated Data in Deep Reinforcement Learning [114.9857000195174]
 A major challenge to widespread industrial adoption of deep reinforcement learning is the potential vulnerability to privacy breaches.
We propose an adversarial attack framework tailored for testing the vulnerability of deep reinforcement learning algorithms to membership inference attacks.
 arXiv (2021-09-08T23:44:57Z)
- Adversarial Targeted Forgetting in Regularization and Generative Based Continual Learning Models [2.8021833233819486]
 Continual (or "incremental") learning approaches are employed when additional knowledge or tasks need to be learned from subsequent batches or from streaming data.
We show that an intelligent adversary can take advantage of a continual learning algorithm's capabilities of retaining existing knowledge over time.
We show that the adversary can create a "false memory" about any task by inserting carefully-designed backdoor samples to the test instances of that task.
 arXiv (2021-02-16T18:45:01Z)
- Curse or Redemption? How Data Heterogeneity Affects the Robustness of Federated Learning [51.15273664903583]
 Data heterogeneity has been identified as one of the key features of federated learning, but it is often overlooked through the lens of robustness to adversarial attacks.
This paper focuses on characterizing and understanding its impact on backdooring attacks in federated learning through comprehensive experiments using synthetic and the LEAF benchmarks.
 arXiv (2021-02-01T06:06:21Z)
- Sampling Attacks: Amplification of Membership Inference Attacks by Repeated Queries [74.59376038272661]
 We introduce the sampling attack, a novel membership inference technique that, unlike other standard membership adversaries, is able to work under the severe restriction of having no access to the scores of the victim model.
We show that a victim model that only publishes the labels is still susceptible to sampling attacks and the adversary can recover up to 100% of its performance.
For defense, we choose differential privacy in the form of gradient perturbation during the training of the victim model as well as output perturbation at prediction time.
 arXiv (2020-09-01T12:54:54Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
       
     