CommitShield: Tracking Vulnerability Introduction and Fix in Version Control Systems
- URL: http://arxiv.org/abs/2501.03626v1
- Date: Tue, 07 Jan 2025 08:52:55 GMT
- Title: CommitShield: Tracking Vulnerability Introduction and Fix in Version Control Systems
- Authors: Zhaonan Wu, Yanjie Zhao, Chen Wei, Zirui Wan, Yue Liu, Haoyu Wang,
- Abstract summary: CommitShield is a tool for detecting vulnerabilities in code commits.
It combines the code analysis capabilities of static analysis tools with the natural language and code understanding capabilities of large language models.
We show that CommitShield improves recall by 76%-87% over state-of-the-art methods in the vulnerability fix detection task.
- Score: 15.037460085046806
- License:
- Abstract: Version control systems are commonly used to manage open-source software, in which each commit may introduce new vulnerabilities or fix existing ones. Researchers have developed various tools for detecting vulnerabilities in code commits, but their performance is limited by factors such as neglecting descriptive data and challenges in accurately identifying vulnerability introductions. To overcome these limitations, we propose CommitShield, which combines the code analysis capabilities of static analysis tools with the natural language and code understanding capabilities of large language models (LLMs) to enhance the accuracy of vulnerability introduction and fix detection by generating precise descriptions and obtaining rich patch contexts. We evaluate CommitShield using the newly constructed vulnerability repair dataset, CommitVulFix, and a cleaned vulnerability introduction dataset. Experimental results indicate that CommitShield improves recall by 76%-87% over state-of-the-art methods in the vulnerability fix detection task, and its F1-score improves by 15%-27% in the vulnerability introduction detection task.
Related papers
- In-Context Experience Replay Facilitates Safety Red-Teaming of Text-to-Image Diffusion Models [104.94706600050557]
Text-to-image (T2I) models have shown remarkable progress, but their potential to generate harmful content remains a critical concern in the ML community.
We propose ICER, a novel red-teaming framework that generates interpretable and semantic meaningful problematic prompts.
Our work provides crucial insights for developing more robust safety mechanisms in T2I systems.
arXiv Detail & Related papers (2024-11-25T04:17:24Z) - CRepair: CVAE-based Automatic Vulnerability Repair Technology [1.147605955490786]
Software vulnerabilities pose significant threats to the integrity, security, and reliability of modern software and its application data.
To address the challenges of vulnerability repair, researchers have proposed various solutions, with learning-based automatic vulnerability repair techniques gaining widespread attention.
This paper proposes CRepair, a CVAE-based automatic vulnerability repair technology aimed at fixing security vulnerabilities in system code.
arXiv Detail & Related papers (2024-11-08T12:55:04Z) - The Impact of SBOM Generators on Vulnerability Assessment in Python: A Comparison and a Novel Approach [56.4040698609393]
Software Bill of Materials (SBOM) has been promoted as a tool to increase transparency and verifiability in software composition.
Current SBOM generation tools often suffer from inaccuracies in identifying components and dependencies.
We propose PIP-sbom, a novel pip-inspired solution that addresses their shortcomings.
arXiv Detail & Related papers (2024-09-10T10:12:37Z) - LLM-Enhanced Static Analysis for Precise Identification of Vulnerable OSS Versions [12.706661324384319]
Open-source software (OSS) has experienced a surge in popularity, attributed to its collaborative development model and cost-effective nature.
The adoption of specific software versions in development projects may introduce security risks when these versions bring along vulnerabilities.
Current methods of identifying vulnerable versions typically analyze and trace the code involved in vulnerability patches using static analysis with pre-defined rules.
This paper presents Vercation, an approach designed to identify vulnerable versions of OSS written in C/C++.
arXiv Detail & Related papers (2024-08-14T06:43:06Z) - Exploring Automatic Cryptographic API Misuse Detection in the Era of LLMs [60.32717556756674]
This paper introduces a systematic evaluation framework to assess Large Language Models in detecting cryptographic misuses.
Our in-depth analysis of 11,940 LLM-generated reports highlights that the inherent instabilities in LLMs can lead to over half of the reports being false positives.
The optimized approach achieves a remarkable detection rate of nearly 90%, surpassing traditional methods and uncovering previously unknown misuses in established benchmarks.
arXiv Detail & Related papers (2024-07-23T15:31:26Z) - VulDetectBench: Evaluating the Deep Capability of Vulnerability Detection with Large Language Models [12.465060623389151]
This study introduces a new benchmark, VulDetectBench, to assess the vulnerability detection capabilities of Large Language Models (LLMs)
The benchmark comprehensively evaluates LLM's ability to identify, classify, and locate vulnerabilities through five tasks of increasing difficulty.
Our benchmark effectively evaluates the capabilities of various LLMs at different levels in the specific task of vulnerability detection, providing a foundation for future research and improvements in this critical area of code security.
arXiv Detail & Related papers (2024-06-11T13:42:57Z) - The Vulnerability Is in the Details: Locating Fine-grained Information of Vulnerable Code Identified by Graph-based Detectors [33.395068754566935]
VULEXPLAINER is a tool for locating vulnerability-critical code lines from coarse-level vulnerable code snippets.
It can flag the vulnerability-triggering code statements with an accuracy of around 90% against eight common C/C++ vulnerabilities.
arXiv Detail & Related papers (2024-01-05T10:15:04Z) - Exploiting Library Vulnerability via Migration Based Automating Test
Generation [16.39796265296833]
In software development, developers extensively utilize third-party libraries to avoid implementing existing functionalities.
Vulnerability exploits, as code snippets provided for reproducing vulnerabilities after disclosure, contain a wealth of vulnerability-related information.
This study proposes a new method based on vulnerability exploits, called VESTA, which provides vulnerability exploit tests as the basis for developers to decide whether to update dependencies.
arXiv Detail & Related papers (2023-12-15T06:46:45Z) - REEF: A Framework for Collecting Real-World Vulnerabilities and Fixes [40.401211102969356]
We propose an automated collecting framework REEF to collect REal-world vulnErabilities and Fixes from open-source repositories.
We develop a multi-language crawler to collect vulnerabilities and their fixes, and design metrics to filter for high-quality vulnerability-fix pairs.
Through extensive experiments, we demonstrate that our approach can collect high-quality vulnerability-fix pairs and generate strong explanations.
arXiv Detail & Related papers (2023-09-15T02:50:08Z) - VELVET: a noVel Ensemble Learning approach to automatically locate
VulnErable sTatements [62.93814803258067]
This paper presents VELVET, a novel ensemble learning approach to locate vulnerable statements in source code.
Our model combines graph-based and sequence-based neural networks to successfully capture the local and global context of a program graph.
VELVET achieves 99.6% and 43.6% top-1 accuracy over synthetic data and real-world data, respectively.
arXiv Detail & Related papers (2021-12-20T22:45:27Z) - Autosploit: A Fully Automated Framework for Evaluating the
Exploitability of Security Vulnerabilities [47.748732208602355]
Autosploit is an automated framework for evaluating the exploitability of vulnerabilities.
It automatically tests the exploits on different configurations of the environment.
It is able to identify the system properties that affect the ability to exploit a vulnerability in both noiseless and noisy environments.
arXiv Detail & Related papers (2020-06-30T18:49:18Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.