DeceptPrompt: Exploiting LLM-driven Code Generation via Adversarial
Natural Language Instructions
- URL: http://arxiv.org/abs/2312.04730v2
- Date: Tue, 12 Dec 2023 19:42:11 GMT
- Title: DeceptPrompt: Exploiting LLM-driven Code Generation via Adversarial
Natural Language Instructions
- Authors: Fangzhou Wu, Xiaogeng Liu, Chaowei Xiao
- Abstract summary: We introduce DeceptPrompt, an algorithm that generates adversarial natural language instructions that drive Code LLMs to produce functionally correct code with vulnerabilities.
When the optimized prefix/suffix is applied, the attack success rate (ASR) improves by an average of 50% compared with using no prefix/suffix.
- Score: 27.489622263456983
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: With the advancement of Large Language Models (LLMs), significant progress
has been made in code generation, enabling LLMs to transform natural language
into programming code. These Code LLMs have been widely adopted by many users
and organizations. However, a serious danger hides in the generated code: the
presence of fatal vulnerabilities. While some LLM providers have attempted to
address these issues by aligning models with human guidance, these efforts fall
short of making Code LLMs practical and robust. Without a deep understanding of
how these LLMs perform under practical worst-case conditions, applying them to
various real-world applications would be concerning. In this paper, we address
two critical questions: Are existing Code LLMs immune to generating vulnerable
code? If not, what is the possible maximum severity of this issue in practical
deployment scenarios? To answer them, we introduce DeceptPrompt, a novel
algorithm that generates adversarial natural language instructions that drive
Code LLMs to produce functionally correct code containing vulnerabilities.
DeceptPrompt is realized through a systematic evolution-based algorithm with a
fine-grained loss design. Its unique advantage is that it finds natural
prefixes/suffixes whose semantics are entirely benign and non-directional, yet
which strongly induce Code LLMs to generate vulnerable code. This property
enables us to conduct near-worst-case red-teaming of these LLMs in a realistic scenario,
where users are using natural language. Our extensive experiments and analyses
on DeceptPrompt not only validate the effectiveness of our approach but also
shed light on significant weaknesses of LLMs in the code generation task. When
the optimized prefix/suffix is applied, the attack success rate (ASR) improves
by an average of 50% compared with using no prefix/suffix.
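To make the evolution-based prefix/suffix search described above concrete, the following is a minimal, hypothetical sketch in Python. Everything here is invented for illustration: the helpers query_code_llm, is_functional, and vulnerability_score are stubs standing in for a real Code LLM query, functional testing, and vulnerability detection, and the mutation vocabulary and parameters are arbitrary. This is not the authors' implementation or their fine-grained loss design; it only shows the general shape of such a search.

```python
import random

# Hypothetical stand-ins for the components the abstract describes.
# A real attack would query an actual Code LLM and use static/dynamic
# analysis to judge functionality and vulnerability; these stubs only
# illustrate the shape of an evolution-based prefix/suffix search.

def query_code_llm(instruction: str) -> str:
    """Placeholder: return code generated by a Code LLM for the instruction."""
    return f"# code generated for: {instruction}"

def is_functional(code: str) -> bool:
    """Placeholder: run functional tests on the generated code."""
    return True

def vulnerability_score(code: str) -> float:
    """Placeholder: score how strongly the code matches a target vulnerability pattern."""
    return random.random()

def fitness(task: str, suffix: str) -> float:
    """Reward suffixes that keep the code functional while inducing the vulnerability."""
    code = query_code_llm(f"{task} {suffix}")
    if not is_functional(code):
        return 0.0
    return vulnerability_score(code)

def mutate(suffix: str, vocabulary: list[str]) -> str:
    """Swap or append a benign-looking word; a crude stand-in for semantics-preserving edits."""
    words = suffix.split()
    if words and random.random() < 0.5:
        words[random.randrange(len(words))] = random.choice(vocabulary)
    else:
        words.append(random.choice(vocabulary))
    return " ".join(words)

def evolve_suffix(task: str, generations: int = 20, population_size: int = 8) -> str:
    """Simple evolutionary loop: keep the best half, mutate it, repeat."""
    vocabulary = ["please", "simply", "quickly", "cleanly", "for a demo", "in one function"]
    population = [random.choice(vocabulary) for _ in range(population_size)]
    for _ in range(generations):
        scored = sorted(population, key=lambda s: fitness(task, s), reverse=True)
        parents = scored[: population_size // 2]
        children = [mutate(p, vocabulary) for p in parents]
        population = parents + children
    return max(population, key=lambda s: fitness(task, s))

if __name__ == "__main__":
    best = evolve_suffix("Write a function that saves an uploaded file to disk.")
    print("Best-scoring benign-looking suffix:", best)
```

In the paper's actual pipeline, selection would presumably be driven by the fine-grained loss on the targeted vulnerable code pattern rather than a random stub, and candidates would be constrained to remain natural and semantically benign; the reported ASR is, roughly, the fraction of prompts for which the model emits code containing the targeted vulnerability while staying functionally correct.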
Related papers
- Aligning LLMs to Be Robust Against Prompt Injection [55.07562650579068]
We show that alignment can be a powerful tool to make LLMs more robust against prompt injection attacks.
Our method -- SecAlign -- first builds an alignment dataset by simulating prompt injection attacks.
Our experiments show that SecAlign substantially robustifies the LLM with a negligible loss in model utility.
arXiv Detail & Related papers (2024-10-07T19:34:35Z)
- RMCBench: Benchmarking Large Language Models' Resistance to Malicious Code [30.244754704562162]
There is no research evaluating the ability of LLMs to resist malicious code generation.
We conduct an empirical study on 11 representative LLMs to assess their ability to resist malicious code generation.
Our findings indicate that current LLMs have a limited ability to resist malicious code generation, with an average refusal rate of 40.36% in the text-to-code scenario and 11.52% in the code-to-code scenario.
arXiv Detail & Related papers (2024-09-23T16:03:26Z)
- zsLLMCode: An Effective Approach for Functional Code Embedding via LLM with Zero-Shot Learning [6.976968804436321]
Large language models (LLMs) have the capability of zero-shot learning, which does not require training or fine-tuning.
We propose zsLLMCode, a novel approach that generates functional code embeddings using LLMs.
arXiv Detail & Related papers (2024-09-23T01:03:15Z)
- Combining LLM Code Generation with Formal Specifications and Reactive Program Synthesis [0.7580487359358722]
Large Language Models (LLMs) struggle with accuracy and are unsuitable for high-risk applications.
We introduce a solution that divides code generation into two parts: one handled by an LLM and one handled by formal-methods-based program synthesis.
arXiv Detail & Related papers (2024-09-18T15:59:06Z)
- CodeAttack: Revealing Safety Generalization Challenges of Large Language Models via Code Completion [117.178835165855]
This paper introduces CodeAttack, a framework that transforms natural language inputs into code inputs.
Our studies reveal a new and universal safety vulnerability of these models against code input.
We find that a larger distribution gap between CodeAttack and natural language leads to weaker safety generalization.
arXiv Detail & Related papers (2024-03-12T17:55:38Z)
- Assured LLM-Based Software Engineering [51.003878077888686]
This paper is an outline of the content of the keynote by Mark Harman at the International Workshop on Interpretability, Robustness, and Benchmarking in Neural Software Engineering, Monday 15th April 2024, Lisbon, Portugal.
arXiv Detail & Related papers (2024-02-06T20:38:46Z)
- Code Prompting Elicits Conditional Reasoning Abilities in Text+Code LLMs [65.2379940117181]
We introduce code prompting, a chain of prompts that transforms a natural language problem into code.
We find that code prompting exhibits a high-performance boost for multiple LLMs.
Our analysis of GPT 3.5 reveals that the code formatting of the input problem is essential for performance improvement.
arXiv Detail & Related papers (2024-01-18T15:32:24Z)
- If LLM Is the Wizard, Then Code Is the Wand: A Survey on How Code Empowers Large Language Models to Serve as Intelligent Agents [81.60906807941188]
Large language models (LLMs) are trained on a combination of natural language and formal language (code).
Code translates high-level goals into executable steps, featuring standard syntax, logical consistency, abstraction, and modularity.
arXiv Detail & Related papers (2024-01-01T16:51:20Z)
- Test-Case-Driven Programming Understanding in Large Language Models for Better Code Generation [15.166827643436346]
muFiX is a novel prompting technique to improve the code generation performance of large language models (LLMs).
It first exploits test case analysis to obtain specification understanding and enables a self-improvement process.
muFiX then refines the specification understanding to reduce the gap between the provided understanding and the model's actual understanding.
arXiv Detail & Related papers (2023-09-28T02:58:07Z)
- Red Teaming Language Model Detectors with Language Models [114.36392560711022]
Large language models (LLMs) present significant safety and ethical risks if exploited by malicious users.
Recent works have proposed algorithms to detect LLM-generated text and protect LLMs.
We study two types of attack strategies: 1) replacing certain words in an LLM's output with their synonyms given the context; 2) automatically searching for an instructional prompt to alter the writing style of the generation.
arXiv Detail & Related papers (2023-05-31T10:08:37Z)