Related papers: Learning to Locate: GNN-Powered Vulnerability Path Discovery in Open Source Code

Learning to Locate: GNN-Powered Vulnerability Path Discovery in Open Source Code

URL: http://arxiv.org/abs/2507.17888v1
Date: Wed, 23 Jul 2025 19:30:37 GMT
Title: Learning to Locate: GNN-Powered Vulnerability Path Discovery in Open Source Code
Authors: Nima Atashin, Behrouz Tork Ladani, Mohammadreza Sharbaf,
Abstract summary: VulPathFinder is an explainable vulnerability path discovery framework.<n>It enhances SliceLocator's methodology by utilizing a novel Graph Neural Network (GNN) model.
Score: 0.9339914898177187
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Detecting security vulnerabilities in open-source software is a critical task that is highly regarded in the related research communities. Several approaches have been proposed in the literature for detecting vulnerable codes and identifying the classes of vulnerabilities. However, there is still room to work in explaining the root causes of detected vulnerabilities through locating vulnerable statements and the discovery of paths leading to the activation of the vulnerability. While frameworks like SliceLocator offer explanations by identifying vulnerable paths, they rely on rule-based sink identification that limits their generalization. In this paper, we introduce VulPathFinder, an explainable vulnerability path discovery framework that enhances SliceLocator's methodology by utilizing a novel Graph Neural Network (GNN) model for detecting sink statements, rather than relying on predefined rules. The proposed GNN captures semantic and syntactic dependencies to find potential sink points (PSPs), which are candidate statements where vulnerable paths end. After detecting PSPs, program slicing can be used to extract potentially vulnerable paths, which are then ranked by feeding them back into the target graph-based detector. Ultimately, the most probable path is returned, explaining the root cause of the detected vulnerability. We demonstrated the effectiveness of the proposed approach by performing evaluations on a benchmark of the buffer overflow CWEs from the SARD dataset, providing explanations for the corresponding detected vulnerabilities. The results show that VulPathFinder outperforms both original SliceLocator and GNNExplainer (as a general GNN explainability tool) in discovery of vulnerability paths to identified PSPs.

Related papers

Defending against Indirect Prompt Injection by Instruction Detection [81.98614607987793]
We propose a novel approach that takes external data as input and leverages the behavioral state of LLMs during both forward and backward propagation to detect potential IPI attacks.<n>Our approach achieves a detection accuracy of 99.60% in the in-domain setting and 96.90% in the out-of-domain setting, while reducing the attack success rate to just 0.12% on the BIPIA benchmark.
arXiv Detail & Related papers (2025-05-08T13:04:45Z)
Vul-RAG: Enhancing LLM-based Vulnerability Detection via Knowledge-level RAG [19.38891892396794]
Vulnerability knowledge generated by Vul-RAG can serve as high-quality explanations to improve manual detection accuracy.<n>Vul-RAG can also detect 10 previously-unknown bugs in the recent Linux kernel release with 6 assigned CVEs.
arXiv Detail & Related papers (2024-06-17T02:25:45Z)
Graph Neural Networks for Vulnerability Detection: A Counterfactual Explanation [41.831831628421675]
Graph Neural Networks (GNNs) have emerged as a prominent code embedding approach for vulnerability detection. We propose CFExplainer, a novel counterfactual explainer for GNN-based vulnerability detection.
arXiv Detail & Related papers (2024-04-24T06:52:53Z)
Enhancing Code Vulnerability Detection via Vulnerability-Preserving Data Augmentation [29.72520866016839]
Source code vulnerability detection aims to identify inherent vulnerabilities to safeguard software systems from potential attacks. Many prior studies overlook diverse vulnerability characteristics, simplifying the problem into a binary (0-1) classification task. FGVulDet employs multiple classifiers to discern characteristics of various vulnerability types and combines their outputs to identify the specific type of vulnerability. FGVulDet is trained on a large-scale dataset from GitHub, encompassing five different types of vulnerabilities.
arXiv Detail & Related papers (2024-04-15T09:10:52Z)
SliceLocator: Locating Vulnerable Statements with Graph-based Detectors [33.395068754566935]
SliceLocator identifies the most relevant taint flow by selecting the highest-weighted flow path from all potential vulnerability-triggering statements.<n>We demonstrate that SliceLocator consistently performs well on four state-of-the-art GNN-based vulnerability detectors.
arXiv Detail & Related papers (2024-01-05T10:15:04Z)
DCDetector: An IoT terminal vulnerability mining system based on distributed deep ensemble learning under source code representation [2.561778620560749]
The goal of the research is to intelligently detect vulnerabilities in source codes of high-level languages such as C/C++. This enables us to propose a code representation of sensitive sentence-related slices of source code, and to detect vulnerabilities by designing a distributed deep ensemble learning model. Experiments show that this method can reduce the false positive rate of traditional static analysis and improve the performance and accuracy of machine learning.
arXiv Detail & Related papers (2022-11-29T14:19:14Z)
A Hierarchical Deep Neural Network for Detecting Lines of Codes with Vulnerabilities [6.09170287691728]
Software vulnerabilities, caused by unintentional flaws in source codes, are the main root cause of cyberattacks. We propose a deep learning approach to detect vulnerabilities from their LLVM IR representations based on the techniques that have been used in natural language processing.
arXiv Detail & Related papers (2022-11-15T21:21:27Z)
An anomaly detection approach for backdoored neural networks: face recognition as a case study [77.92020418343022]
We propose a novel backdoored network detection method based on the principle of anomaly detection. We test our method on a novel dataset of backdoored networks and report detectability results with perfect scores.
arXiv Detail & Related papers (2022-08-22T12:14:13Z)
VELVET: a noVel Ensemble Learning approach to automatically locate VulnErable sTatements [62.93814803258067]
This paper presents VELVET, a novel ensemble learning approach to locate vulnerable statements in source code. Our model combines graph-based and sequence-based neural networks to successfully capture the local and global context of a program graph. VELVET achieves 99.6% and 43.6% top-1 accuracy over synthetic data and real-world data, respectively.
arXiv Detail & Related papers (2021-12-20T22:45:27Z)
Information Obfuscation of Graph Neural Networks [96.8421624921384]
We study the problem of protecting sensitive attributes by information obfuscation when learning with graph structured data. We propose a framework to locally filter out pre-determined sensitive attributes via adversarial training with the total variation and the Wasserstein distance.
arXiv Detail & Related papers (2020-09-28T17:55:04Z)
Graph Backdoor [53.70971502299977]
We present GTA, the first backdoor attack on graph neural networks (GNNs) GTA departs in significant ways: it defines triggers as specific subgraphs, including both topological structures and descriptive features. It can be instantiated for both transductive (e.g., node classification) and inductive (e.g., graph classification) tasks.
arXiv Detail & Related papers (2020-06-21T19:45:30Z)
$\mu$VulDeePecker: A Deep Learning-Based System for Multiclass Vulnerability Detection [24.98991662345816]
We propose the first deep learning-based system for multiclass vulnerability detection, dubbed $mu$VulDeePecker. The key insight underlying $mu$VulDeePecker is the concept of code attention, which can capture information that can help pinpoint types of vulnerabilities. Experiments show that $mu$VulDeePecker is effective for multiclass vulnerability detection and that accommodating control-dependence can lead to higher detection capabilities.
arXiv Detail & Related papers (2020-01-08T01:47:22Z)

This list is automatically generated from the titles and abstracts of the papers in this site.