Related papers: Multi-target Backdoor Attacks for Code Pre-trained Models

Multi-target Backdoor Attacks for Code Pre-trained Models

URL: http://arxiv.org/abs/2306.08350v1
Date: Wed, 14 Jun 2023 08:38:51 GMT
Title: Multi-target Backdoor Attacks for Code Pre-trained Models
Authors: Yanzhou Li, Shangqing Liu, Kangjie Chen, Xiaofei Xie, Tianwei Zhang and Yang Liu
Abstract summary: We propose task-agnostic backdoor attacks for code pre-trained models. Our approach can effectively and stealthily attack code-related downstream tasks.
Score: 24.37781284059454
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Backdoor attacks for neural code models have gained considerable attention due to the advancement of code intelligence. However, most existing works insert triggers into task-specific data for code-related downstream tasks, thereby limiting the scope of attacks. Moreover, the majority of attacks for pre-trained models are designed for understanding tasks. In this paper, we propose task-agnostic backdoor attacks for code pre-trained models. Our backdoored model is pre-trained with two learning strategies (i.e., Poisoned Seq2Seq learning and token representation learning) to support the multi-target attack of downstream code understanding and generation tasks. During the deployment phase, the implanted backdoors in the victim models can be activated by the designed triggers to achieve the targeted attack. We evaluate our approach on two code understanding tasks and three code generation tasks over seven datasets. Extensive experiments demonstrate that our approach can effectively and stealthily attack code-related downstream tasks.

Related papers

Secure Transfer Learning: Training Clean Models Against Backdoor in (Both) Pre-trained Encoders and Downstream Datasets [16.619809695639027]
Pre-training and downstream adaptation expose models to sophisticated backdoor embeddings at both the encoder and dataset levels. In this work, we investigate how to mitigate potential backdoor risks in resource-constrained transfer learning scenarios. We propose the Trusted Core (T-Core) Bootstrapping framework, which emphasizes the importance of pinpointing trustworthy data and neurons to enhance model security.
arXiv Detail & Related papers (2025-04-16T11:33:03Z)
Behavior Backdoor for Deep Learning Models [95.50787731231063]
We take the first step towards behavioral backdoor'' attack, which is defined as a behavior-triggered backdoor model training procedure. We propose the first pipeline of implementing behavior backdoor, i.e., the Quantification Backdoor (QB) attack. Experiments have been conducted on different models, datasets, and tasks, demonstrating the effectiveness of this novel backdoor attack.
arXiv Detail & Related papers (2024-12-02T10:54:02Z)
CodePurify: Defend Backdoor Attacks on Neural Code Models via Entropy-based Purification [19.570958294967536]
backdoor attacks can achieve nearly 100% attack success rates on many software engineering tasks. We propose CodePurify, a novel defense against backdoor attacks on code models through entropy-based purification. We extensively evaluate CodePurify against four advanced backdoor attacks across three representative tasks and two popular code models.
arXiv Detail & Related papers (2024-10-26T10:17:50Z)
Long-Tailed Backdoor Attack Using Dynamic Data Augmentation Operations [50.1394620328318]
Existing backdoor attacks mainly focus on balanced datasets. We propose an effective backdoor attack named Dynamic Data Augmentation Operation (D$2$AO) Our method can achieve the state-of-the-art attack performance while preserving the clean accuracy.
arXiv Detail & Related papers (2024-10-16T18:44:22Z)
TAPI: Towards Target-Specific and Adversarial Prompt Injection against Code LLMs [27.700010465702842]
This paper proposes a new attack paradigm, i.e., target-specific and adversarial prompt injection (TAPI) against Code LLMs. TAPI generates unreadable comments containing information about malicious instructions and hides them as triggers in the external source code. We successfully attack some famous deployed code completion integrated applications, including CodeGeex and Github Copilot.
arXiv Detail & Related papers (2024-07-12T10:59:32Z)
Indiscriminate Data Poisoning Attacks on Pre-trained Feature Extractors [26.36344184385407]
In this paper, we explore the threat of indiscriminate attacks on downstream tasks that apply pre-trained feature extractors. We propose two types of attacks: (1) the input space attacks, where we modify existing attacks to craft poisoned data in the input space; and (2) the feature targeted attacks, where we find poisoned features by treating the learned feature representations as a dataset. Our experiments examine such attacks in popular downstream tasks of fine-tuning on the same dataset and transfer learning that considers domain adaptation.
arXiv Detail & Related papers (2024-02-20T01:12:59Z)
Pre-trained Trojan Attacks for Visual Recognition [106.13792185398863]
Pre-trained vision models (PVMs) have become a dominant component due to their exceptional performance when fine-tuned for downstream tasks. We propose the Pre-trained Trojan attack, which embeds backdoors into a PVM, enabling attacks across various downstream vision tasks. We highlight the challenges posed by cross-task activation and shortcut connections in successful backdoor attacks.
arXiv Detail & Related papers (2023-12-23T05:51:40Z)
On the Difficulty of Defending Contrastive Learning against Backdoor Attacks [58.824074124014224]
We show how contrastive backdoor attacks operate through distinctive mechanisms. Our findings highlight the need for defenses tailored to the specificities of contrastive backdoor attacks.
arXiv Detail & Related papers (2023-12-14T15:54:52Z)
BadCLIP: Dual-Embedding Guided Backdoor Attack on Multimodal Contrastive Learning [85.2564206440109]
This paper reveals the threats in this practical scenario that backdoor attacks can remain effective even after defenses. We introduce the emphtoolns attack, which is resistant to backdoor detection and model fine-tuning defenses.
arXiv Detail & Related papers (2023-11-20T02:21:49Z)
Stealthy Backdoor Attack for Code Models [19.272856932095966]
Existing backdoor attacks on code models use unstealthy and easy-to-detect triggers. This paper aims to investigate the vulnerability of code models with stealthy backdoor attacks. We find that around 85% of adaptive triggers in AFRAIDOOR bypass the detection in the defense process.
arXiv Detail & Related papers (2023-01-06T13:15:42Z)
Untargeted Backdoor Attack against Object Detection [69.63097724439886]
We design a poison-only backdoor attack in an untargeted manner, based on task characteristics. We show that, once the backdoor is embedded into the target model by our attack, it can trick the model to lose detection of any object stamped with our trigger patterns.
arXiv Detail & Related papers (2022-11-02T17:05:45Z)
On the Effectiveness of Adversarial Training against Backdoor Attacks [111.8963365326168]
A backdoored model always predicts a target class in the presence of a predefined trigger pattern. In general, adversarial training is believed to defend against backdoor attacks. We propose a hybrid strategy which provides satisfactory robustness across different backdoor attacks.
arXiv Detail & Related papers (2022-02-22T02:24:46Z)
FooBaR: Fault Fooling Backdoor Attack on Neural Network Training [5.639451539396458]
We explore a novel attack paradigm by injecting faults during the training phase of a neural network in a way that the resulting network can be attacked during deployment without the necessity of further faulting. We call such attacks fooling backdoors as the fault attacks at the training phase inject backdoors into the network that allow an attacker to produce fooling inputs.
arXiv Detail & Related papers (2021-09-23T09:43:19Z)

This list is automatically generated from the titles and abstracts of the papers in this site.