Related papers: A Disguised Wolf Is More Harmful Than a Toothless Tiger: Adaptive Malicious Code Injection Backdoor Attack Leveraging User Behavior as Triggers

A Disguised Wolf Is More Harmful Than a Toothless Tiger: Adaptive Malicious Code Injection Backdoor Attack Leveraging User Behavior as Triggers

URL: http://arxiv.org/abs/2408.10334v1
Date: Mon, 19 Aug 2024 18:18:04 GMT
Title: A Disguised Wolf Is More Harmful Than a Toothless Tiger: Adaptive Malicious Code Injection Backdoor Attack Leveraging User Behavior as Triggers
Authors: Shangxi Wu, Jitao Sang,
Abstract summary: We first present the game-theoretic model that focuses on security issues in code generation scenarios. This framework outlines possible scenarios and patterns where attackers could spread malicious code models to create security threats. We also pointed out for the first time that the attackers can use backdoor attacks to dynamically adjust the timing of malicious code injection.
Score: 15.339528712960021
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: In recent years, large language models (LLMs) have made significant progress in the field of code generation. However, as more and more users rely on these models for software development, the security risks associated with code generation models have become increasingly significant. Studies have shown that traditional deep learning robustness issues also negatively impact the field of code generation. In this paper, we first present the game-theoretic model that focuses on security issues in code generation scenarios. This framework outlines possible scenarios and patterns where attackers could spread malicious code models to create security threats. We also pointed out for the first time that the attackers can use backdoor attacks to dynamically adjust the timing of malicious code injection, which will release varying degrees of malicious code depending on the skill level of the user. Through extensive experiments on leading code generation models, we validate our proposed game-theoretic model and highlight the significant threats that these new attack scenarios pose to the safe use of code models.

Related papers

ShadowCode: Towards (Automatic) External Prompt Injection Attack against Code LLMs [56.46702494338318]
This paper introduces a new attack paradigm: (automatic) external prompt injection against code-oriented large language models.<n>We propose ShadowCode, a simple yet effective method that automatically generates induced perturbations based on code simulation.<n>We evaluate our method across 13 distinct malicious objectives, generating 31 threat cases spanning three popular programming languages.
arXiv Detail & Related papers (2024-07-12T10:59:32Z)
An LLM-Assisted Easy-to-Trigger Backdoor Attack on Code Completion Models: Injecting Disguised Vulnerabilities against Strong Detection [17.948513691133037]
We introduce CodeBreaker, a pioneering LLM-assisted backdoor attack framework on code completion models. By integrating malicious payloads directly into the source code with minimal transformation, CodeBreaker challenges current security measures.
arXiv Detail & Related papers (2024-06-10T22:10:05Z)
Principles of Designing Robust Remote Face Anti-Spoofing Systems [60.05766968805833]
This paper sheds light on the vulnerabilities of state-of-the-art face anti-spoofing methods against digital attacks. It presents a comprehensive taxonomy of common threats encountered in face anti-spoofing systems.
arXiv Detail & Related papers (2024-06-06T02:05:35Z)
Trojans in Large Language Models of Code: A Critical Review through a Trigger-Based Taxonomy [11.075592348442225]
Large language models (LLMs) have provided a lot of exciting new capabilities in software development. The opaque nature of these models makes them difficult to reason about and inspect. This work presents an overview of the current state-of-the-art trojan attacks on large language models of code.
arXiv Detail & Related papers (2024-05-05T06:43:52Z)
Assessing Cybersecurity Vulnerabilities in Code Large Language Models [18.720986922660543]
EvilInstructCoder is a framework designed to assess the cybersecurity vulnerabilities of instruction-tuned Code LLMs to adversarial attacks. It incorporates practical threat models to reflect real-world adversaries with varying capabilities. We conduct a comprehensive investigation into the exploitability of instruction tuning for coding tasks using three state-of-the-art Code LLM models.
arXiv Detail & Related papers (2024-04-29T10:14:58Z)
Privacy Backdoors: Enhancing Membership Inference through Poisoning Pre-trained Models [112.48136829374741]
In this paper, we unveil a new vulnerability: the privacy backdoor attack. When a victim fine-tunes a backdoored model, their training data will be leaked at a significantly higher rate than if they had fine-tuned a typical model. Our findings highlight a critical privacy concern within the machine learning community and call for a reevaluation of safety protocols in the use of open-source pre-trained models.
arXiv Detail & Related papers (2024-04-01T16:50:54Z)
CodeAttack: Revealing Safety Generalization Challenges of Large Language Models via Code Completion [117.178835165855]
This paper introduces CodeAttack, a framework that transforms natural language inputs into code inputs. Our studies reveal a new and universal safety vulnerability of these models against code input. We find that a larger distribution gap between CodeAttack and natural language leads to weaker safety generalization.
arXiv Detail & Related papers (2024-03-12T17:55:38Z)
Poisoning Programs by Un-Repairing Code: Security Concerns of AI-generated Code [0.9790236766474201]
We identify a novel data poisoning attack that results in the generation of vulnerable code. We then devise an extensive evaluation of how these attacks impact state-of-the-art models for code generation.
arXiv Detail & Related papers (2024-03-11T12:47:04Z)
Gotcha! This Model Uses My Code! Evaluating Membership Leakage Risks in Code Models [12.214474083372389]
We propose Gotcha, a novel membership inference attack method specifically for code models. We show that Gotcha can predict the data membership with a high true positive rate of 0.95 and a low false positive rate of 0.10. This study calls for more attention to understanding the privacy of code models.
arXiv Detail & Related papers (2023-10-02T12:50:43Z)
Adversarial Attacks on Code Models with Discriminative Graph Patterns [10.543744143786519]
We propose a novel adversarial attack framework, GraphCodeAttack, to better evaluate the robustness of code models. Given a target code model, GraphCodeAttack automatically mines important code patterns, which can influence the model's decisions. To effectively synthesize attacks from AST patterns, GraphCodeAttack uses a separate pre-trained code model to fill in the ASTs with concrete code snippets.
arXiv Detail & Related papers (2023-08-22T03:40:34Z)
AdaptGuard: Defending Against Universal Attacks for Model Adaptation [129.2012687550069]
We study the vulnerability to universal attacks transferred from the source domain during model adaptation algorithms. We propose a model preprocessing framework, named AdaptGuard, to improve the security of model adaptation algorithms.
arXiv Detail & Related papers (2023-03-19T07:53:31Z)
CodeLMSec Benchmark: Systematically Evaluating and Finding Security Vulnerabilities in Black-Box Code Language Models [58.27254444280376]
Large language models (LLMs) for automatic code generation have achieved breakthroughs in several programming tasks. Training data for these models is usually collected from the Internet (e.g., from open-source repositories) and is likely to contain faults and security vulnerabilities. This unsanitized training data can cause the language models to learn these vulnerabilities and propagate them during the code generation procedure.
arXiv Detail & Related papers (2023-02-08T11:54:07Z)
Learning to Attack: Towards Textual Adversarial Attacking in Real-world Situations [81.82518920087175]
Adversarial attacking aims to fool deep neural networks with adversarial examples. We propose a reinforcement learning based attack model, which can learn from attack history and launch attacks more efficiently.
arXiv Detail & Related papers (2020-09-19T09:12:24Z)

This list is automatically generated from the titles and abstracts of the papers in this site.