VulGuard: An Unified Tool for Evaluating Just-In-Time Vulnerability Prediction Models
- URL: http://arxiv.org/abs/2507.16685v1
- Date: Tue, 22 Jul 2025 15:18:44 GMT
- Title: VulGuard: An Unified Tool for Evaluating Just-In-Time Vulnerability Prediction Models
- Authors: Duong Nguyen, Manh Tran-Duc, Thanh Le-Cong, Triet Huynh Minh Le, M. Ali Babar, Quyet-Thang Huynh
- Abstract summary: VulGuard is an automated tool designed to streamline the extraction, processing, and analysis of commits from GitHub repositories for Just-In-Time vulnerability prediction (JIT-VP) research. It automatically mines commit histories, extracts fine-grained code changes, commit messages, and software engineering metrics, and formats them for downstream analysis.
- Score: 3.4299920908334673
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We present VulGuard, an automated tool designed to streamline the extraction, processing, and analysis of commits from GitHub repositories for Just-In-Time vulnerability prediction (JIT-VP) research. VulGuard automatically mines commit histories, extracts fine-grained code changes, commit messages, and software engineering metrics, and formats them for downstream analysis. In addition, it integrates several state-of-the-art vulnerability prediction models, allowing researchers to train, evaluate, and compare models with minimal setup. By supporting both repository-scale mining and model-level experimentation within a unified framework, VulGuard addresses key challenges in reproducibility and scalability in software security research. VulGuard can also be easily integrated into the CI/CD pipeline. We demonstrate the effectiveness of the tool in two influential open-source projects, FFmpeg and the Linux kernel, highlighting its potential to accelerate real-world JIT-VP research and promote standardized benchmarking. A demo video is available at: https://youtu.be/j96096-pxbs
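The abstract sketches a pipeline of repository mining, feature extraction, and model training and evaluation. The snippet below is a minimal illustrative sketch of such a pipeline, not VulGuard's actual API: the PyDriller-based mining loop, the chosen metrics (files touched, lines added/deleted), and the logistic-regression stand-in for the integrated JIT-VP models are all assumptions made for illustration.

```python
# Illustrative only: VulGuard's real interfaces and models are not shown in the
# abstract. This sketch mines commit-level data with PyDriller and fits a toy
# baseline classifier; feature names and the model choice are assumptions.
from pydriller import Repository
from sklearn.linear_model import LogisticRegression


def mine_commit_records(repo_url: str):
    """Yield per-commit records: message plus simple software-engineering metrics."""
    for commit in Repository(repo_url).traverse_commits():
        yield {
            "hash": commit.hash,
            "message": commit.msg,           # commit message (input for text-based models)
            "files_touched": commit.files,   # number of modified files
            "lines_added": commit.insertions,
            "lines_deleted": commit.deletions,
        }


def train_baseline(records, labels):
    """Fit a toy baseline on commit-size metrics (a placeholder for the
    state-of-the-art JIT-VP models that VulGuard integrates)."""
    X = [[r["files_touched"], r["lines_added"], r["lines_deleted"]]
         for r in records]
    return LogisticRegression(max_iter=1000).fit(X, labels)
```

In practice, the records extracted at this stage (including the fine-grained code changes and commit messages) would feed the integrated prediction models rather than this toy baseline; performing extraction and prediction per commit is also what makes such a pipeline amenable to CI/CD integration.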
Related papers
- A Survey on Model Extraction Attacks and Defenses for Large Language Models [55.60375624503877]
Model extraction attacks pose significant security threats to deployed language models. This survey provides a comprehensive taxonomy of extraction attacks and defenses, categorizing attacks into functionality extraction, training data extraction, and prompt-targeted attacks. We examine defense mechanisms organized into model protection, data privacy protection, and prompt-targeted strategies, evaluating their effectiveness across different deployment scenarios.
arXiv Detail & Related papers (2025-06-26T22:02:01Z) - Training Language Models to Generate Quality Code with Program Analysis Feedback [66.0854002147103]
Code generation with large language models (LLMs) is increasingly adopted in production but fails to ensure code quality. We propose REAL, a reinforcement learning framework that incentivizes LLMs to generate production-quality code.
arXiv Detail & Related papers (2025-05-28T17:57:47Z) - A Rusty Link in the AI Supply Chain: Detecting Evil Configurations in Model Repositories [9.095642871258455]
We present the first comprehensive study of malicious configurations on Hugging Face. In particular, configuration files originally intended to set up models can be exploited to execute unauthorized code.
arXiv Detail & Related papers (2025-05-02T07:16:20Z) - T2VShield: Model-Agnostic Jailbreak Defense for Text-to-Video Models [88.63040835652902]
Text-to-video models are vulnerable to jailbreak attacks, where specially crafted prompts bypass safety mechanisms and lead to the generation of harmful or unsafe content. We propose T2VShield, a comprehensive and model-agnostic defense framework designed to protect text-to-video models from jailbreak threats. Our method systematically analyzes the input, model, and output stages to identify the limitations of existing defenses.
arXiv Detail & Related papers (2025-04-22T01:18:42Z) - Thinking Longer, Not Larger: Enhancing Software Engineering Agents via Scaling Test-Time Compute [61.00662702026523]
We propose a unified Test-Time Compute scaling framework that leverages increased inference-time compute instead of larger models. Our framework incorporates two complementary strategies: internal TTC and external TTC. We demonstrate that our 32B model achieves a 46% issue resolution rate, surpassing significantly larger models such as DeepSeek R1 671B and OpenAI o1.
arXiv Detail & Related papers (2025-03-31T07:31:32Z) - Automating the Detection of Code Vulnerabilities by Analyzing GitHub Issues [6.6681265451722895]
We introduce a new dataset specifically designed for classifying GitHub issues relevant to vulnerability detection. Results demonstrate the potential of this approach for real-world application in early vulnerability detection. This work has the potential to enhance the security of open-source software ecosystems.
arXiv Detail & Related papers (2025-01-09T14:13:39Z) - EnStack: An Ensemble Stacking Framework of Large Language Models for Enhanced Vulnerability Detection in Source Code [1.9374282535132379]
We introduce EnStack, a novel ensemble stacking framework that enhances vulnerability detection using natural language processing (NLP) techniques.
Our approach synergizes multiple pre-trained large language models (LLMs) specialized in code understanding.
Meta-classifiers consolidate the strengths of each LLM, resulting in a comprehensive model that excels in detecting subtle and complex vulnerabilities.
arXiv Detail & Related papers (2024-11-25T16:47:10Z) - Beyond Static Tools: Evaluating Large Language Models for Cryptographic Misuse Detection [0.30693357740321775]
GPT-4o-mini surpasses current state-of-the-art static analysis tools on the CryptoAPI and MASC datasets.
This study highlights the comparative strengths and limitations of static analysis versus LLM-driven approaches.
arXiv Detail & Related papers (2024-11-14T19:33:08Z) - VulLibGen: Generating Names of Vulnerability-Affected Packages via a Large Language Model [13.96251273677855]
VulLibGen is a method to directly generate affected packages.
It has an average accuracy of 0.806 for identifying vulnerable packages.
We have submitted 60 <vulnerability, affected package> pairs to GitHub Advisory.
arXiv Detail & Related papers (2023-08-09T02:02:46Z) - CodeLMSec Benchmark: Systematically Evaluating and Finding Security Vulnerabilities in Black-Box Code Language Models [58.27254444280376]
Large language models (LLMs) for automatic code generation have achieved breakthroughs in several programming tasks.
Training data for these models is usually collected from the Internet (e.g., from open-source repositories) and is likely to contain faults and security vulnerabilities.
This unsanitized training data can cause the language models to learn these vulnerabilities and propagate them during the code generation procedure.
arXiv Detail & Related papers (2023-02-08T11:54:07Z) - VulBERTa: Simplified Source Code Pre-Training for Vulnerability Detection [1.256413718364189]
VulBERTa is a deep learning approach to detect security vulnerabilities in source code.
Our approach pre-trains a RoBERTa model with a custom tokenisation pipeline on real-world code from open-source C/C++ projects.
We evaluate our approach on binary and multi-class vulnerability detection tasks across several datasets.
arXiv Detail & Related papers (2022-05-25T00:56:43Z) - Software Vulnerability Detection via Deep Learning over Disaggregated Code Graph Representation [57.92972327649165]
This work explores a deep learning approach to automatically learn the insecure patterns from code corpora.
Because code naturally admits graph structures with parsing, we develop a novel graph neural network (GNN) to exploit both the semantic context and structural regularity of a program.
arXiv Detail & Related papers (2021-09-07T21:24:36Z)