LLM-Enhanced Software Patch Localization
- URL: http://arxiv.org/abs/2409.06816v2
- Date: Fri, 13 Sep 2024 03:12:52 GMT
- Title: LLM-Enhanced Software Patch Localization
- Authors: Jinhong Yu, Yi Chen, Di Tang, Xiaozhong Liu, XiaoFeng Wang, Chen Wu, Haixu Tang,
- Abstract summary: Security patch localization (SPL) recommendation methods are leading approaches to address this.
We introduce LLM-SPL, a recommendation-based SPL approach that leverages the capabilities of the Large Language Model (LLM) to locate the security patch commit for a given CVE.
Our evaluation on a dataset of 1,915 CVEs associated with 2,461 patches demonstrates that LLM-SPL excels in ranking patch commits, surpassing the state-of-the-art method in terms of Recall.
- Score: 24.1593187492973
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Open source software (OSS) is integral to modern product development, and any vulnerability within it potentially compromises numerous products. While developers strive to apply security patches, pinpointing these patches among extensive OSS updates remains a challenge. Security patch localization (SPL) recommendation methods are leading approaches to address this. However, existing SPL models often falter when a commit lacks a clear association with its corresponding CVE, and do not consider a scenario that a vulnerability has multiple patches proposed over time before it has been fully resolved. To address these challenges, we introduce LLM-SPL, a recommendation-based SPL approach that leverages the capabilities of the Large Language Model (LLM) to locate the security patch commit for a given CVE. More specifically, we propose a joint learning framework, in which the outputs of LLM serves as additional features to aid our recommendation model in prioritizing security patches. Our evaluation on a dataset of 1,915 CVEs associated with 2,461 patches demonstrates that LLM-SPL excels in ranking patch commits, surpassing the state-of-the-art method in terms of Recall, while significantly reducing manual effort. Notably, for vulnerabilities requiring multiple patches, LLM-SPL significantly improves Recall by 22.83\%, NDCG by 19.41\%, and reduces manual effort by over 25\% when checking up to the top 10 rankings. The dataset and source code are available at \url{https://anonymous.4open.science/r/LLM-SPL-91F8}.
Related papers
- Are AI-Generated Fixes Secure? Analyzing LLM and Agent Patches on SWE-bench [9.229310642804036]
We present the first large-scale security analysis of LLM-generated patches using 20,000+ issues from the SWE-bench dataset.<n>We evaluate patches produced by a standalone LLM (Llama 3.3) and compare them to developer-written patches.<n>We also assess the security of patches generated by three top-performing agentic frameworks (OpenHands, AutoCodeRover, HoneyComb) on a subset of our data.
arXiv Detail & Related papers (2025-06-30T21:10:19Z) - SweRank: Software Issue Localization with Code Ranking [109.3289316191729]
SweRank is an efficient retrieve-and-rerank framework for software issue localization.<n>We construct SweLoc, a large-scale dataset curated from public GitHub repositories.<n>We show that SweRank achieves state-of-the-art performance, outperforming both prior ranking models and costly agent-based systems.
arXiv Detail & Related papers (2025-05-07T19:44:09Z) - AutoPatch: Multi-Agent Framework for Patching Real-World CVE Vulnerabilities [7.812032134834162]
Large Language Models (LLMs) have emerged as promising tools in software development.<n>Their knowledge is limited to a fixed cutoff date, making them prone to generating code vulnerable to newly disclosed CVEs.<n>We propose AutoPatch, a framework designed to patch vulnerable LLM-generated code.
arXiv Detail & Related papers (2025-05-07T07:49:05Z) - SWE-Fixer: Training Open-Source LLMs for Effective and Efficient GitHub Issue Resolution [56.9361004704428]
Large Language Models (LLMs) have demonstrated remarkable proficiency across a variety of complex tasks.
SWE-Fixer is a novel open-source framework designed to effectively and efficiently resolve GitHub issues.
We assess our approach on the SWE-Bench Lite and Verified benchmarks, achieving state-of-the-art performance among open-source models.
arXiv Detail & Related papers (2025-01-09T07:54:24Z) - Look Before You Leap: Enhancing Attention and Vigilance Regarding Harmful Content with GuidelineLLM [53.79753074854936]
Large language models (LLMs) are increasingly vulnerable to emerging jailbreak attacks.
This vulnerability poses significant risks to real-world applications.
We propose a novel defensive paradigm called GuidelineLLM.
arXiv Detail & Related papers (2024-12-10T12:42:33Z) - Iterative Self-Tuning LLMs for Enhanced Jailbreaking Capabilities [63.603861880022954]
We introduce ADV-LLM, an iterative self-tuning process that crafts adversarial LLMs with enhanced jailbreak ability.
Our framework significantly reduces the computational cost of generating adversarial suffixes while achieving nearly 100% ASR on various open-source LLMs.
It exhibits strong attack transferability to closed-source models, achieving 99% ASR on GPT-3.5 and 49% ASR on GPT-4, despite being optimized solely on Llama3.
arXiv Detail & Related papers (2024-10-24T06:36:12Z) - APILOT: Navigating Large Language Models to Generate Secure Code by Sidestepping Outdated API Pitfalls [15.865915079829943]
APILOT maintains a realtime, quickly updatable dataset of outdated APIs.
It uses an augmented generation method to navigate LLMs in generating secure, version-aware code.
It can reduce outdated code recommendations by 89.42% on average with limited performance overhead.
arXiv Detail & Related papers (2024-09-25T00:37:40Z) - VulnLLMEval: A Framework for Evaluating Large Language Models in Software Vulnerability Detection and Patching [0.9208007322096533]
Large Language Models (LLMs) have shown promise in tasks like code translation.
This paper introduces VulnLLMEval, a framework designed to assess the performance of LLMs in identifying and patching vulnerabilities in C code.
Our study includes 307 real-world vulnerabilities extracted from the Linux kernel.
arXiv Detail & Related papers (2024-09-16T22:00:20Z) - Automated Software Vulnerability Patching using Large Language Models [24.958856670970366]
We leverage the power and merits of pre-trained large language models (LLMs) to enable automated vulnerability patching.
To elicit LLMs to effectively reason about vulnerable code behaviors, we introduce adaptive prompting on LLMs.
Our evaluation of LLM on real-world vulnerable code including zeroday vulnerabilities demonstrates its superior performance to both existing prompting methods and state-of-the-art non-LLM-based techniques.
arXiv Detail & Related papers (2024-08-24T14:51:50Z) - Exploring Automatic Cryptographic API Misuse Detection in the Era of LLMs [60.32717556756674]
This paper introduces a systematic evaluation framework to assess Large Language Models in detecting cryptographic misuses.
Our in-depth analysis of 11,940 LLM-generated reports highlights that the inherent instabilities in LLMs can lead to over half of the reports being false positives.
The optimized approach achieves a remarkable detection rate of nearly 90%, surpassing traditional methods and uncovering previously unknown misuses in established benchmarks.
arXiv Detail & Related papers (2024-07-23T15:31:26Z) - Defensive Prompt Patch: A Robust and Interpretable Defense of LLMs against Jailbreak Attacks [59.46556573924901]
This paper introduces Defensive Prompt Patch (DPP), a novel prompt-based defense mechanism for large language models (LLMs)
Unlike previous approaches, DPP is designed to achieve a minimal Attack Success Rate (ASR) while preserving the high utility of LLMs.
Empirical results conducted on LLAMA-2-7B-Chat and Mistral-7B-Instruct-v0.2 models demonstrate the robustness and adaptability of DPP.
arXiv Detail & Related papers (2024-05-30T14:40:35Z) - LLM2LLM: Boosting LLMs with Novel Iterative Data Enhancement [79.31084387589968]
Pretrained large language models (LLMs) are currently state-of-the-art for solving the vast majority of natural language processing tasks.
We propose LLM2LLM, a data augmentation strategy that uses a teacher LLM to enhance a small seed dataset.
We achieve improvements up to 24.2% on the GSM8K dataset, 32.6% on CaseHOLD, 32.0% on SNIPS, 52.6% on TREC and 39.8% on SST-2 over regular fine-tuning in the low-data regime.
arXiv Detail & Related papers (2024-03-22T08:57:07Z) - Just-in-Time Detection of Silent Security Patches [7.840762542485285]
Security patches can be em silent, i.e., they do not always come with comprehensive advisories such as CVEs.
This lack of transparency leaves users oblivious to available security updates, providing ample opportunity for attackers to exploit unpatched vulnerabilities.
We propose to leverage large language models (LLMs) to augment patch information with generated code change explanations.
arXiv Detail & Related papers (2023-12-02T22:53:26Z) - Fake Alignment: Are LLMs Really Aligned Well? [91.26543768665778]
This study investigates the substantial discrepancy in performance between multiple-choice questions and open-ended questions.
Inspired by research on jailbreak attack patterns, we argue this is caused by mismatched generalization.
arXiv Detail & Related papers (2023-11-10T08:01:23Z) - Multilevel Semantic Embedding of Software Patches: A Fine-to-Coarse
Grained Approach Towards Security Patch Detection [6.838615442552715]
We introduce a multilevel Semantic Embedder for security patch detection, termed MultiSEM.
This model harnesses word-centric vectors at a fine-grained level, emphasizing the significance of individual words.
We further enrich this representation by assimilating patch descriptions to obtain a holistic semantic portrait.
arXiv Detail & Related papers (2023-08-29T11:41:21Z) - Do-Not-Answer: A Dataset for Evaluating Safeguards in LLMs [59.596335292426105]
This paper collects the first open-source dataset to evaluate safeguards in large language models.
We train several BERT-like classifiers to achieve results comparable with GPT-4 on automatic safety evaluation.
arXiv Detail & Related papers (2023-08-25T14:02:12Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.