PatUntrack: Automated Generating Patch Examples for Issue Reports without Tracked Insecure Code
- URL: http://arxiv.org/abs/2408.08619v1
- Date: Fri, 16 Aug 2024 09:19:27 GMT
- Title: PatUntrack: Automated Generating Patch Examples for Issue Reports without Tracked Insecure Code
- Authors: Ziyou Jiang, Lin Shi, Guowei Yang, Qing Wang,
- Abstract summary: We propose PatUntrack to automatically generate patch examples from vulnerable issue reports (IRs) without tracked insecure code.
It first generates the completed description of the Vulnerability-Triggering Path (VTP) from vulnerable IRs.
It then corrects hallucinations in the VTP description with external golden knowledge.
Finally, it generates Top-K pairs of Insecure Code and Patch Example based on the corrected VTP description.
- Score: 6.6821370571514525
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Security patches are essential for enhancing the stability and robustness of projects in the software community. While vulnerabilities are officially expected to be patched before being disclosed, patching vulnerabilities is complicated and remains a struggle for many organizations. To patch vulnerabilities, security practitioners typically track vulnerable issue reports (IRs), and analyze their relevant insecure code to generate potential patches. However, the relevant insecure code may not be explicitly specified and practitioners cannot track the insecure code in the repositories, thus limiting their ability to generate patches. In such cases, providing examples of insecure code and the corresponding patches would benefit the security developers to better locate and fix the insecure code. In this paper, we propose PatUntrack to automatically generating patch examples from IRs without tracked insecure code. It auto-prompts Large Language Models (LLMs) to make them applicable to analyze the vulnerabilities. It first generates the completed description of the Vulnerability-Triggering Path (VTP) from vulnerable IRs. Then, it corrects hallucinations in the VTP description with external golden knowledge. Finally, it generates Top-K pairs of Insecure Code and Patch Example based on the corrected VTP description. To evaluate the performance, we conducted experiments on 5,465 vulnerable IRs. The experimental results show that PatUntrack can obtain the highest performance and improve the traditional LLM baselines by +14.6% (Fix@10) on average in patch example generation. Furthermore, PatUntrack was applied to generate patch examples for 76 newly disclosed vulnerable IRs. 27 out of 37 replies from the authors of these IRs confirmed the usefulness of the patch examples generated by PatUntrack, indicating that they can benefit from these examples for patching the vulnerabilities.
Related papers
- Fixing Security Vulnerabilities with AI in OSS-Fuzz [9.730566646484304]
OSS-Fuzz is the most significant and widely used infrastructure for continuous validation of open source systems.
We customise the well-known AutoCodeRover agent for fixing security vulnerabilities.
Our experience with OSS-Fuzz vulnerability data shows that LLM agent autonomy is useful for successful security patching.
arXiv Detail & Related papers (2024-11-03T16:20:32Z) - PatchFinder: A Two-Phase Approach to Security Patch Tracing for Disclosed Vulnerabilities in Open-Source Software [15.867607171943698]
We propose a two-phase framework with end-to-end correlation learning for better-tracing security patches.
PatchFinder achieves a Recall@10 of 80.63% and a Mean Reciprocal Rank (MRR) of 0.7951.
When applying PatchFinder in practice, we initially identified 533 patch commits and submitted them to the official, 482 of which have been confirmed by CVE Numbering Authorities.
arXiv Detail & Related papers (2024-07-24T07:46:24Z) - Path-wise Vulnerability Mitigation [3.105656247358225]
This paper describes an approach called PAVER that generates and inserts mitigation patches at the level of program paths.
For each candidate patch location, PAVER generates and inserts a mitigation patch, and tests the patched program to assess the side-effects.
We evaluate the prototype of PAVER on real world vulnerabilities and the evaluation shows that our path-wise vulnerability mitigation patches can achieve minimum side-effects.
arXiv Detail & Related papers (2024-05-25T22:58:37Z) - RAP-Gen: Retrieval-Augmented Patch Generation with CodeT5 for Automatic
Program Repair [75.40584530380589]
We propose a novel Retrieval-Augmented Patch Generation framework (RAP-Gen)
RAP-Gen explicitly leveraging relevant fix patterns retrieved from a list of previous bug-fix pairs.
We evaluate RAP-Gen on three benchmarks in two programming languages, including the TFix benchmark in JavaScript, and Code Refinement and Defects4J benchmarks in Java.
arXiv Detail & Related papers (2023-09-12T08:52:56Z) - Multilevel Semantic Embedding of Software Patches: A Fine-to-Coarse
Grained Approach Towards Security Patch Detection [6.838615442552715]
We introduce a multilevel Semantic Embedder for security patch detection, termed MultiSEM.
This model harnesses word-centric vectors at a fine-grained level, emphasizing the significance of individual words.
We further enrich this representation by assimilating patch descriptions to obtain a holistic semantic portrait.
arXiv Detail & Related papers (2023-08-29T11:41:21Z) - On the Security Blind Spots of Software Composition Analysis [46.1389163921338]
We present a novel approach to detect vulnerable clones in the Maven repository.
We retrieve over 53k potential vulnerable clones from Maven Central.
We detect 727 confirmed vulnerable clones and synthesize a testable proof-of-vulnerability project for each of those.
arXiv Detail & Related papers (2023-06-08T20:14:46Z) - DRSM: De-Randomized Smoothing on Malware Classifier Providing Certified
Robustness [58.23214712926585]
We develop a certified defense, DRSM (De-Randomized Smoothed MalConv), by redesigning the de-randomized smoothing technique for the domain of malware detection.
Specifically, we propose a window ablation scheme to provably limit the impact of adversarial bytes while maximally preserving local structures of the executables.
We are the first to offer certified robustness in the realm of static detection of malware executables.
arXiv Detail & Related papers (2023-03-20T17:25:22Z) - Detecting Security Patches via Behavioral Data in Code Repositories [11.052678122289871]
We show a system to automatically identify security patches using only the developer behavior in the Git repository.
We showed we can reveal concealed security patches with an accuracy of 88.3% and F1 Score of 89.8%.
arXiv Detail & Related papers (2023-02-04T06:43:07Z) - VELVET: a noVel Ensemble Learning approach to automatically locate
VulnErable sTatements [62.93814803258067]
This paper presents VELVET, a novel ensemble learning approach to locate vulnerable statements in source code.
Our model combines graph-based and sequence-based neural networks to successfully capture the local and global context of a program graph.
VELVET achieves 99.6% and 43.6% top-1 accuracy over synthetic data and real-world data, respectively.
arXiv Detail & Related papers (2021-12-20T22:45:27Z) - Segment and Complete: Defending Object Detectors against Adversarial
Patch Attacks with Robust Patch Detection [142.24869736769432]
Adversarial patch attacks pose a serious threat to state-of-the-art object detectors.
We propose Segment and Complete defense (SAC), a framework for defending object detectors against patch attacks.
We show SAC can significantly reduce the targeted attack success rate of physical patch attacks.
arXiv Detail & Related papers (2021-12-08T19:18:48Z) - (De)Randomized Smoothing for Certifiable Defense against Patch Attacks [136.79415677706612]
We introduce a certifiable defense against patch attacks that guarantees for a given image and patch attack size.
Our method is related to the broad class of randomized smoothing robustness schemes.
Our results effectively establish a new state-of-the-art of certifiable defense against patch attacks on CIFAR-10 and ImageNet.
arXiv Detail & Related papers (2020-02-25T08:39:46Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.