Related papers: Sequential Graph Neural Networks for Source Code Vulnerability Identification

Sequential Graph Neural Networks for Source Code Vulnerability Identification

URL: http://arxiv.org/abs/2306.05375v1
Date: Tue, 23 May 2023 17:25:51 GMT
Title: Sequential Graph Neural Networks for Source Code Vulnerability Identification
Authors: Ammar Ahmed, Anwar Said, Mudassir Shabbir, Xenofon Koutsoukos
Abstract summary: We present a properly curated C/C++ source code vulnerability dataset to aid in developing models. We also propose a learning framework based on graph neural networks, denoted SEquential Graph Neural Network (SEGNN) for learning a large number of code semantic representations. Our evaluations on two datasets and four baseline methods in a graph classification setting demonstrate state-of-the-art results.
Score: 5.582101184758527
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Vulnerability identification constitutes a task of high importance for cyber security. It is quite helpful for locating and fixing vulnerable functions in large applications. However, this task is rather challenging owing to the absence of reliable and adequately managed datasets and learning models. Existing solutions typically rely on human expertise to annotate datasets or specify features, which is prone to error. In addition, the learning models have a high rate of false positives. To bridge this gap, in this paper, we present a properly curated C/C++ source code vulnerability dataset, denoted as CVEFunctionGraphEmbeddings (CVEFGE), to aid in developing models. CVEFGE is automatically crawled from the CVE database, which contains authentic and publicly disclosed source code vulnerabilities. We also propose a learning framework based on graph neural networks, denoted SEquential Graph Neural Network (SEGNN) for learning a large number of code semantic representations. SEGNN consists of a sequential learning module, graph convolution, pooling, and fully connected layers. Our evaluations on two datasets and four baseline methods in a graph classification setting demonstrate state-of-the-art results.

Related papers

Enhancing Software Vulnerability Detection Using Code Property Graphs and Convolutional Neural Networks [0.0]
This paper proposes a novel approach to detecting software vulnerabilities using a combination of code property graphs and machine learning techniques. We introduce various neural network models, including convolutional neural networks adapted for graph data, to process these representations. Our contributions include a methodology for transforming software code into code property graphs, the implementation of a convolutional neural network model for graph data, and the creation of a comprehensive dataset for training and evaluation.
arXiv Detail & Related papers (2025-03-23T19:12:07Z)
Detecting Code Vulnerabilities with Heterogeneous GNN Training [3.1333320740278627]
Graph Neural Network (GNN) machine learning can be a promising approach by modeling source code as graphs. This paper presents Inter-Procedural Abstract Graphs (IPAGs) as an efficient, language-agnostic representation of source code. We also propose a Heterogeneous Attention GNN (HAGNN) model that incorporates multiple subgraphs capturing different features of source code.
arXiv Detail & Related papers (2025-02-24T04:39:16Z)
Source Code Vulnerability Detection: Combining Code Language Models and Code Property Graphs [5.953617559607503]
Vul-LMGNN is a unified model that combines pre-trained code language models with code property graphs. Vul-LMGNN constructs a code property graph that integrates various code attributes into a unified graph structure. To effectively retain dependency information among various attributes, we introduce a gated code Graph Neural Network.
arXiv Detail & Related papers (2024-04-23T03:48:18Z)
Applying Self-supervised Learning to Network Intrusion Detection for Network Flows with Graph Neural Network [8.318363497010969]
This paper studies the application of GNNs to identify the specific types of network flows in an unsupervised manner. To the best of our knowledge, it is the first GNN-based self-supervised method for the multiclass classification of network flows in NIDS.
arXiv Detail & Related papers (2024-03-03T12:34:13Z)
Loss-aware Curriculum Learning for Heterogeneous Graph Neural Networks [30.333265803394998]
This paper investigates the application of curriculum learning techniques to improve the performance of Heterogeneous Graph Neural Networks (GNNs) To better classify the quality of the data, we design a loss-aware training schedule, named LTS, that measures the quality of every nodes of the data. Our findings demonstrate the efficacy of curriculum learning in enhancing HGNNs capabilities for analyzing complex graph-structured data.
arXiv Detail & Related papers (2024-02-29T05:44:41Z)
Redundancy-Free Self-Supervised Relational Learning for Graph Clustering [13.176413653235311]
We propose a novel self-supervised deep graph clustering method named Redundancy-Free Graph Clustering (R$2$FGC) It extracts the attribute- and structure-level relational information from both global and local views based on an autoencoder and a graph autoencoder. Our experiments are performed on widely used benchmark datasets to validate the superiority of our R$2$FGC over state-of-the-art baselines.
arXiv Detail & Related papers (2023-09-09T06:18:50Z)
Learning Strong Graph Neural Networks with Weak Information [64.64996100343602]
We develop a principled approach to the problem of graph learning with weak information (GLWI) We propose D$2$PT, a dual-channel GNN framework that performs long-range information propagation on the input graph with incomplete structure, but also on a global graph that encodes global semantic similarities.
arXiv Detail & Related papers (2023-05-29T04:51:09Z)
A Robust Stacking Framework for Training Deep Graph Models with Multifaceted Node Features [61.92791503017341]
Graph Neural Networks (GNNs) with numerical node features and graph structure as inputs have demonstrated superior performance on various supervised learning tasks with graph data. The best models for such data types in most standard supervised learning settings with IID (non-graph) data are not easily incorporated into a GNN. Here we propose a robust stacking framework that fuses graph-aware propagation with arbitrary models intended for IID data.
arXiv Detail & Related papers (2022-06-16T22:46:33Z)
VELVET: a noVel Ensemble Learning approach to automatically locate VulnErable sTatements [62.93814803258067]
This paper presents VELVET, a novel ensemble learning approach to locate vulnerable statements in source code. Our model combines graph-based and sequence-based neural networks to successfully capture the local and global context of a program graph. VELVET achieves 99.6% and 43.6% top-1 accuracy over synthetic data and real-world data, respectively.
arXiv Detail & Related papers (2021-12-20T22:45:27Z)
ReGVD: Revisiting Graph Neural Networks for Vulnerability Detection [20.65271290295621]
We propose ReGVD, a graph network-based model for vulnerability detection. In particular, ReGVD views a given source code as a flat sequence of tokens. We obtain the highest accuracy on the real-world benchmark dataset from CodeXGLUE for vulnerability detection.
arXiv Detail & Related papers (2021-10-14T12:44:38Z)
Software Vulnerability Detection via Deep Learning over Disaggregated Code Graph Representation [57.92972327649165]
This work explores a deep learning approach to automatically learn the insecure patterns from code corpora. Because code naturally admits graph structures with parsing, we develop a novel graph neural network (GNN) to exploit both the semantic context and structural regularity of a program.
arXiv Detail & Related papers (2021-09-07T21:24:36Z)
An Introduction to Robust Graph Convolutional Networks [71.68610791161355]
We propose a novel Robust Graph Convolutional Neural Networks for possible erroneous single-view or multi-view data. By incorporating an extra layers via Autoencoders into traditional graph convolutional networks, we characterize and handle typical error models explicitly.
arXiv Detail & Related papers (2021-03-27T04:47:59Z)
Learning to map source code to software vulnerability using code-as-a-graph [67.62847721118142]
We explore the applicability of Graph Neural Networks in learning the nuances of source code from a security perspective. We show that a code-as-graph encoding is more meaningful for vulnerability detection than existing code-as-photo and linear sequence encoding approaches.
arXiv Detail & Related papers (2020-06-15T16:05:27Z)

This list is automatically generated from the titles and abstracts of the papers in this site.