Code Vulnerability Repair with Large Language Model using Context-Aware Prompt Tuning
- URL: http://arxiv.org/abs/2409.18395v1
- Date: Fri, 27 Sep 2024 02:25:29 GMT
- Title: Code Vulnerability Repair with Large Language Model using Context-Aware Prompt Tuning
- Authors: Arshiya Khan, Guannan Liu, Xing Gao,
- Abstract summary: Large Language Models (LLMs) have shown significant challenges in detecting and repairing vulnerable code.
In this study, we utilize GitHub Copilot as the LLM and focus on buffer overflow vulnerabilities.
Our experiments reveal a notable gap in Copilot's abilities when dealing with buffer overflow vulnerabilities, with a 76% vulnerability detection rate but only a 15% vulnerability repair rate.
- Score: 5.1071146597039245
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Large Language Models (LLMs) have shown significant challenges in detecting and repairing vulnerable code, particularly when dealing with vulnerabilities involving multiple aspects, such as variables, code flows, and code structures. In this study, we utilize GitHub Copilot as the LLM and focus on buffer overflow vulnerabilities. Our experiments reveal a notable gap in Copilot's abilities when dealing with buffer overflow vulnerabilities, with a 76% vulnerability detection rate but only a 15% vulnerability repair rate. To address this issue, we propose context-aware prompt tuning techniques designed to enhance LLM performance in repairing buffer overflow. By injecting a sequence of domain knowledge about the vulnerability, including various security and code contexts, we demonstrate that Copilot's successful repair rate increases to 63%, representing more than four times the improvement compared to repairs without domain knowledge.
Related papers
- CommitShield: Tracking Vulnerability Introduction and Fix in Version Control Systems [15.037460085046806]
CommitShield is a tool for detecting vulnerabilities in code commits.
It combines the code analysis capabilities of static analysis tools with the natural language and code understanding capabilities of large language models.
We show that CommitShield improves recall by 76%-87% over state-of-the-art methods in the vulnerability fix detection task.
arXiv Detail & Related papers (2025-01-07T08:52:55Z) - There are More Fish in the Sea: Automated Vulnerability Repair via Binary Templates [4.907610470063863]
We propose a template-based automated vulnerability repair approach for Java binaries.
Experiments on the Vul4J dataset demonstrate that TemVUR successfully repairs 11 vulnerabilities.
To assess the generalizability of TemVUR, we curate the ManyVuls4J dataset.
arXiv Detail & Related papers (2024-11-27T06:59:45Z) - How Well Do Large Language Models Serve as End-to-End Secure Code Producers? [42.119319820752324]
We studied GPT-3.5 and GPT-4's capability to identify and repair vulnerabilities in the code generated by four popular LLMs.
By manually or automatically reviewing 4,900 pieces of code, our study reveals that large language models lack awareness of scenario-relevant security risks.
To address the limitation of a single round of repair, we developed a lightweight tool that prompts LLMs to construct safer source code.
arXiv Detail & Related papers (2024-08-20T02:42:29Z) - Generalization-Enhanced Code Vulnerability Detection via Multi-Task Instruction Fine-Tuning [16.54022485688803]
VulLLM is a novel framework that integrates multi-task learning with Large Language Models (LLMs) to effectively mine deep-seated vulnerability features.
The experiments conducted on six large datasets demonstrate that VulLLM surpasses seven state-of-the-art models in terms of effectiveness, generalization, and robustness.
arXiv Detail & Related papers (2024-06-06T03:29:05Z) - NAVRepair: Node-type Aware C/C++ Code Vulnerability Repair [14.152755184229374]
NAVRepair is a novel framework that combines the node-type information extracted fromASTs with error types, specifically targeting C/C++ vulnerabilities.
We achieve a 26% higher accuracy compared to an existing LLM-based C/C++ vulnerability repair method.
arXiv Detail & Related papers (2024-05-08T11:58:55Z) - Vulnerability Detection with Code Language Models: How Far Are We? [40.455600722638906]
PrimeVul is a new dataset for training and evaluating code LMs for vulnerability detection.
It incorporates a novel set of data labeling techniques that achieve comparable label accuracy to human-verified benchmarks.
It also implements a rigorous data de-duplication and chronological data splitting strategy to mitigate data leakage issues.
arXiv Detail & Related papers (2024-03-27T14:34:29Z) - Multi-LLM Collaboration + Data-Centric Innovation = 2x Better
Vulnerability Repair [14.920535179015006]
VulMaster is a Transformer-based neural network model that excels at generating vulnerability repairs through data-centric innovation.
We evaluate VulMaster on a real-world C/C++ vulnerability repair dataset comprising 1,754 projects with 5,800 vulnerable functions.
arXiv Detail & Related papers (2024-01-27T16:51:52Z) - REEF: A Framework for Collecting Real-World Vulnerabilities and Fixes [40.401211102969356]
We propose an automated collecting framework REEF to collect REal-world vulnErabilities and Fixes from open-source repositories.
We develop a multi-language crawler to collect vulnerabilities and their fixes, and design metrics to filter for high-quality vulnerability-fix pairs.
Through extensive experiments, we demonstrate that our approach can collect high-quality vulnerability-fix pairs and generate strong explanations.
arXiv Detail & Related papers (2023-09-15T02:50:08Z) - On the Security Blind Spots of Software Composition Analysis [46.1389163921338]
We present a novel approach to detect vulnerable clones in the Maven repository.
We retrieve over 53k potential vulnerable clones from Maven Central.
We detect 727 confirmed vulnerable clones and synthesize a testable proof-of-vulnerability project for each of those.
arXiv Detail & Related papers (2023-06-08T20:14:46Z) - CodeLMSec Benchmark: Systematically Evaluating and Finding Security
Vulnerabilities in Black-Box Code Language Models [58.27254444280376]
Large language models (LLMs) for automatic code generation have achieved breakthroughs in several programming tasks.
Training data for these models is usually collected from the Internet (e.g., from open-source repositories) and is likely to contain faults and security vulnerabilities.
This unsanitized training data can cause the language models to learn these vulnerabilities and propagate them during the code generation procedure.
arXiv Detail & Related papers (2023-02-08T11:54:07Z) - VELVET: a noVel Ensemble Learning approach to automatically locate
VulnErable sTatements [62.93814803258067]
This paper presents VELVET, a novel ensemble learning approach to locate vulnerable statements in source code.
Our model combines graph-based and sequence-based neural networks to successfully capture the local and global context of a program graph.
VELVET achieves 99.6% and 43.6% top-1 accuracy over synthetic data and real-world data, respectively.
arXiv Detail & Related papers (2021-12-20T22:45:27Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.