Explaining Software Vulnerabilities with Large Language Models
- URL: http://arxiv.org/abs/2511.04179v1
- Date: Thu, 06 Nov 2025 08:30:56 GMT
- Title: Explaining Software Vulnerabilities with Large Language Models
- Authors: Oshando Johnson, Alexandra Fomina, Ranjith Krishnamurthy, Vaibhav Chaudhari, Rohith Kumar Shanmuganathan, Eric Bodden
- Abstract summary: We present SAFE, an Integrated Development Environment (IDE) plugin that leverages GPT-4o to explain the causes, impacts, and mitigation strategies of vulnerabilities detected by SAST tools. Our expert user study findings indicate that the explanations generated by SAFE can significantly assist beginner-to-intermediate developers in understanding and addressing security vulnerabilities.
- Score: 35.74179339347328
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The prevalence of security vulnerabilities has prompted companies to adopt static application security testing (SAST) tools for vulnerability detection. Nevertheless, these tools frequently exhibit usability limitations, as their generic warning messages do not sufficiently communicate important information to developers, resulting in misunderstandings or oversight of critical findings. In light of recent developments in Large Language Models (LLMs) and their text generation capabilities, our work investigates a hybrid approach that uses LLMs to tackle the explainability challenges of SAST tools. In this paper, we present SAFE, an Integrated Development Environment (IDE) plugin that leverages GPT-4o to explain the causes, impacts, and mitigation strategies of vulnerabilities detected by SAST tools. Our expert user study findings indicate that the explanations generated by SAFE can significantly assist beginner-to-intermediate developers in understanding and addressing security vulnerabilities, thereby improving the overall usability of SAST tools.
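The hybrid design pairs a conventional SAST detector with an LLM that rewrites its findings for developers. Below is a minimal sketch of that pattern, assuming a hypothetical finding format; the field names, prompt wording, and `explain_finding` helper are illustrative assumptions, not SAFE's actual implementation (the abstract does not describe the plugin internals).

```python
# Hypothetical sketch: turning a SAST warning into a structured explanation
# with GPT-4o. Field names and the prompt are illustrative assumptions;
# this is not SAFE's actual implementation.
from dataclasses import dataclass

from openai import OpenAI  # pip install openai


@dataclass
class SastFinding:
    rule_id: str   # e.g. a tool-specific rule such as "SQL_INJECTION"
    file: str
    line: int
    snippet: str   # the flagged source fragment
    message: str   # the tool's generic warning text


PROMPT = """You are a secure-coding assistant. For the SAST finding below,
explain (1) the cause, (2) the potential impact, and (3) a concrete
mitigation, in language suitable for a beginner-to-intermediate developer.

Rule: {rule_id}
Location: {file}:{line}
Tool message: {message}
Code:
{snippet}
"""


def explain_finding(finding: SastFinding, model: str = "gpt-4o") -> str:
    """Ask the LLM for a cause/impact/mitigation explanation of one finding."""
    client = OpenAI()  # reads OPENAI_API_KEY from the environment
    response = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": PROMPT.format(
            rule_id=finding.rule_id, file=finding.file,
            line=finding.line, message=finding.message,
            snippet=finding.snippet)}],
    )
    return response.choices[0].message.content
```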
Related papers
- An Empirical Study on the Security Vulnerabilities of GPTs [48.12756684275687]
GPTs are customized AI agents built on OpenAI's large language models. We present an empirical study on the security vulnerabilities of GPTs.
arXiv Detail & Related papers (2025-11-28T13:30:25Z) - From Model to Breach: Towards Actionable LLM-Generated Vulnerabilities Reporting [43.57360781012506]
We show that even the latest open-weight models are vulnerable in the earliest reported vulnerability scenarios. We introduce a new severity metric that reflects the risk posed by an LLM-generated vulnerability. To encourage the mitigation of the most serious and prevalent vulnerabilities, we use PE to define the Model Exposure (ME) score.
arXiv Detail & Related papers (2025-11-06T16:52:27Z) - In-Context Experience Replay Facilitates Safety Red-Teaming of Text-to-Image Diffusion Models [104.94706600050557]
Text-to-image (T2I) models have shown remarkable progress, but their potential to generate harmful content remains a critical concern in the ML community. We propose ICER, a novel red-teaming framework that generates interpretable and semantically meaningful problematic prompts. Our work provides crucial insights for developing more robust safety mechanisms in T2I systems.
arXiv Detail & Related papers (2024-11-25T04:17:24Z) - A Comprehensive Study on Static Application Security Testing (SAST) Tools for Android [22.558610938860124]
VulsTotal is a unified evaluation platform that defines and describes the vulnerability types supported by each tool.
We select 11 free and open-source SAST tools from a pool of 97 existing options, adhering to clearly defined criteria.
We then unify 67 general/common vulnerability types for Android SAST tools.
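Unifying vulnerability types across heterogeneous tools amounts to mapping each tool's rule identifiers onto a shared taxonomy. The sketch below illustrates that idea; the tool names, rule IDs, and type labels are invented for demonstration and are not VulsTotal's actual mapping.

```python
# Illustrative sketch of mapping tool-specific rule IDs onto a unified
# vulnerability taxonomy, in the spirit of VulsTotal. The specific tools,
# rule IDs, and type names below are assumptions for demonstration only.
UNIFIED_TYPES = {
    ("mobsf", "android_insecure_random"): "Insecure Randomness",
    ("qark", "WEBVIEW_JS_ENABLED"): "WebView JavaScript Injection",
    ("androbugs", "SSL_Security"): "Improper Certificate Validation",
}


def normalize(tool: str, rule_id: str) -> str:
    """Translate one tool's rule ID into the unified vulnerability type."""
    return UNIFIED_TYPES.get((tool.lower(), rule_id), "Uncategorized")


# Once findings from several tools are normalized this way, per-type
# detection rates become directly comparable across tools.
print(normalize("MobSF", "android_insecure_random"))  # Insecure Randomness
```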
arXiv Detail & Related papers (2024-10-28T05:10:22Z) - The Impact of SBOM Generators on Vulnerability Assessment in Python: A Comparison and a Novel Approach [56.4040698609393]
Software Bill of Materials (SBOM) has been promoted as a tool to increase transparency and verifiability in software composition.
Current SBOM generation tools often suffer from inaccuracies in identifying components and dependencies.
We propose PIP-sbom, a novel pip-inspired solution that addresses their shortcomings.
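At its core, a pip-inspired SBOM generator can resolve components from the installed-package metadata that pip itself maintains, rather than from heuristic file scanning. Below is a minimal sketch of that idea using Python's standard importlib.metadata; the CycloneDX-style output shape is an assumption, and this is not PIP-sbom's actual implementation.

```python
# Minimal sketch of a pip-inspired SBOM generator: enumerate installed
# distributions and emit CycloneDX-style component records. This is an
# illustration of the general idea only.
import json
from importlib.metadata import distributions


def build_sbom() -> dict:
    """Collect name/version pairs for every installed distribution."""
    components = [
        {
            "type": "library",
            "name": dist.metadata["Name"],
            "version": dist.version,
            "purl": f"pkg:pypi/{dist.metadata['Name']}@{dist.version}",
        }
        for dist in distributions()
    ]
    return {"bomFormat": "CycloneDX", "specVersion": "1.5",
            "components": components}


print(json.dumps(build_sbom(), indent=2)[:500])  # preview the SBOM
```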
arXiv Detail & Related papers (2024-09-10T10:12:37Z) - Comparison of Static Application Security Testing Tools and Large Language Models for Repo-level Vulnerability Detection [11.13802281700894]
Static Application Security Testing (SAST) is commonly used to scan source code for security vulnerabilities.
Deep learning (DL)-based methods have demonstrated their potential in software vulnerability detection.
This paper compares 15 diverse SAST tools with 12 popular or state-of-the-art open-source LLMs in detecting software vulnerabilities.
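Comparing such different detectors fairly requires a common harness: each detector, whether a SAST tool wrapper or an LLM prompt, is reduced to a function that flags code as vulnerable or not, and the same labeled samples and metrics are applied to all. The sketch below shows one hypothetical harness of this kind; the Detector signature and metric choice are assumptions, not the paper's exact setup.

```python
# Hypothetical evaluation harness for comparing detectors (SAST tools or
# LLM-based) on labeled code samples. The Detector protocol and metrics
# are assumptions; the paper's actual setup may differ.
from typing import Callable, Iterable

# A detector takes source code and returns True if it flags a vulnerability.
Detector = Callable[[str], bool]


def evaluate(detector: Detector, samples: Iterable[tuple[str, bool]]) -> dict:
    """Compute precision/recall of one detector over (code, label) pairs."""
    tp = fp = fn = 0
    for code, vulnerable in samples:
        flagged = detector(code)
        tp += flagged and vulnerable
        fp += flagged and not vulnerable
        fn += (not flagged) and vulnerable
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return {"precision": precision, "recall": recall}
```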
arXiv Detail & Related papers (2024-07-23T07:21:14Z) - Towards Explainable Vulnerability Detection with Large Language Models [14.243344783348398]
Software vulnerabilities pose significant risks to the security and integrity of software systems. The advent of large language models (LLMs) has introduced transformative potential due to their advanced generative capabilities. In this paper, we propose LLMVulExp, an automated framework designed to specialize LLMs for the dual tasks of vulnerability detection and explanation.
arXiv Detail & Related papers (2024-06-14T04:01:25Z) - CodeAttack: Revealing Safety Generalization Challenges of Large Language Models via Code Completion [117.178835165855]
This paper introduces CodeAttack, a framework that transforms natural language inputs into code inputs.
Our studies reveal a new and universal safety vulnerability of these models against code input.
We find that a larger distribution gap between CodeAttack and natural language leads to weaker safety generalization.
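The core transformation is to embed a natural-language query inside a code-completion task, so the model encounters it far from the natural-language distribution its safety training covers. The template below is a hedged illustration of that idea based only on the abstract; it is not the paper's actual attack template.

```python
# Illustrative sketch of the CodeAttack idea: wrap a natural-language
# query inside a code-completion task so the model processes it as code.
# The template is an assumption based on the abstract, not the paper's
# exact attack construction.
CODE_TEMPLATE = '''def decode_and_answer():
    # The query is hidden in a data structure instead of plain prose,
    # widening the distribution gap from natural-language safety training.
    words = {words!r}
    query = " ".join(words)
    # Complete this function so that it returns a detailed answer to query.
'''


def to_code_input(natural_language_query: str) -> str:
    """Transform a plain-text query into a code-completion prompt."""
    return CODE_TEMPLATE.format(words=natural_language_query.split())
```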
arXiv Detail & Related papers (2024-03-12T17:55:38Z) - ToolSword: Unveiling Safety Issues of Large Language Models in Tool Learning Across Three Stages [45.16862486631841]
Tool learning is widely acknowledged as a foundational approach for deploying large language models (LLMs) in real-world scenarios.
To fill this gap, we present *ToolSword*, a comprehensive framework dedicated to investigating safety issues linked to LLMs in tool learning.
arXiv Detail & Related papers (2024-02-16T15:19:46Z) - HW-V2W-Map: Hardware Vulnerability to Weakness Mapping Framework for Root Cause Analysis with GPT-assisted Mitigation Suggestion [3.847218857469107]
We present the HW-V2W-Map Framework, a Machine Learning (ML) framework focusing on hardware vulnerabilities and Internet of Things (IoT) security.
The proposed architecture incorporates an Ontology-driven Storytelling framework that automates the process of updating the ontology.
Our framework utilizes Generative Pre-trained Transformer (GPT) Large Language Models (LLMs) to provide mitigation suggestions.
arXiv Detail & Related papers (2023-12-21T02:14:41Z) - How Far Have We Gone in Vulnerability Detection Using Large Language Models [15.09461331135668]
We introduce VulBench, a comprehensive vulnerability benchmark.
This benchmark aggregates high-quality data from a wide range of CTF challenges and real-world applications.
We find that several LLMs outperform traditional deep learning approaches in vulnerability detection.
arXiv Detail & Related papers (2023-11-21T08:20:39Z) - "False negative -- that one is going to kill you": Understanding Industry Perspectives of Static Analysis based Security Testing [15.403953373155508]
This paper describes a qualitative study that explores the assumptions, expectations, beliefs, and challenges experienced by developers who use SASTs.
We perform in-depth, semi-structured interviews with 20 practitioners who possess a diverse range of software development expertise.
We identify 17 key findings that shed light on developer perceptions and desires related to SASTs.
arXiv Detail & Related papers (2023-07-30T21:27:41Z)