GraphEye: A Novel Solution for Detecting Vulnerable Functions Based on
Graph Attention Network
- URL: http://arxiv.org/abs/2202.02501v1
- Date: Sat, 5 Feb 2022 07:03:15 GMT
- Title: GraphEye: A Novel Solution for Detecting Vulnerable Functions Based on
Graph Attention Network
- Authors: Li Zhou, Minhuan Huang, Yujun Li, Yuanping Nie, Jin Li, Yiwei Liu
- Abstract summary: We propose a novel solution named GraphEye to identify whether a function of C/C++ code has vulnerabilities.
VecCPG is a vectorization for the code property graph, which is proposed to characterize the key syntax and semantic features of the corresponding source code.
GcGAT is a deep learning model based on the graph attention network, which is proposed to solve the graph classification problem.
- Score: 8.420666984519826
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: With the continuous extension of the Industrial Internet, cyber incidents
caused by software vulnerabilities have been increasing in recent years.
However, software vulnerability detection still relies heavily on code
review by experts, and how to detect software vulnerabilities automatically
remains an open problem. In this paper, we propose a novel solution named
GraphEye to identify whether a function of C/C++ code has vulnerabilities,
which can greatly alleviate the burden on code auditors. GraphEye originates
from the observation that the code property graph of a non-vulnerable function
naturally differs from the code property graph of a vulnerable function with
the same functionality. Hence, detecting vulnerable functions reduces to
a graph classification problem. GraphEye comprises VecCPG and GcGAT.
VecCPG is a vectorization for the code property graph, which is proposed to
characterize the key syntax and semantic features of the corresponding source
code. GcGAT is a deep learning model based on the graph attention network, which
is proposed to solve the graph classification problem using VecCPG as input.
Finally, GraphEye is evaluated on the SARD Stack-based Buffer Overflow,
Divide-Zero, Null Pointer Dereference, Buffer Error, and Resource Error datasets;
the corresponding F1 scores are 95.6%, 95.6%, 96.1%, 92.6%, and 96.1%,
respectively, which validates the effectiveness of the proposed solution.
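
To make the "vulnerability detection as graph classification" idea concrete, here is a minimal sketch in plain Python. It is not GcGAT or VecCPG, whose details the abstract does not give. It uses the standard single-head graph attention formulation (score = LeakyReLU(a · [W h_i || W h_j]), softmax over neighbors), followed by mean pooling and a logistic readout. The function names, the toy graph, and all weights are hypothetical and chosen for illustration only.

```python
import math


def leaky_relu(x, slope=0.2):
    return x if x > 0 else slope * x


def softmax(xs):
    m = max(xs)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def gat_layer(features, edges, weight, attn):
    """One single-head graph attention layer (standard formulation).

    features: list of node feature vectors (stand-in for vectorized graph nodes)
    edges:    list of directed (src, dst) pairs; self-loops are added here
    weight:   projection matrix as a list of rows (out_dim x in_dim)
    attn:     attention vector of length 2 * out_dim
    """
    n = len(features)
    # Linear projection of every node: h_i = W x_i
    h = [[sum(w * x for w, x in zip(row, feat)) for row in weight]
         for feat in features]
    # Each node attends to itself and to its incoming neighbors.
    neighbors = {i: [i] for i in range(n)}
    for src, dst in edges:
        if src not in neighbors[dst]:
            neighbors[dst].append(src)
    out = []
    for i in range(n):
        scores = [leaky_relu(sum(a * v for a, v in zip(attn, h[i] + h[j])))
                  for j in neighbors[i]]
        alphas = softmax(scores)
        out.append([sum(alpha * h[j][k] for alpha, j in zip(alphas, neighbors[i]))
                    for k in range(len(weight))])
    return out


def classify_graph(features, edges, weight, attn, readout_w, bias=0.0):
    """Mean-pool node embeddings and return a probability-like score in (0, 1)."""
    node_emb = gat_layer(features, edges, weight, attn)
    dim = len(node_emb[0])
    pooled = [sum(v[k] for v in node_emb) / len(node_emb) for k in range(dim)]
    logit = sum(w * p for w, p in zip(readout_w, pooled)) + bias
    return 1.0 / (1.0 + math.exp(-logit))


# Toy 3-node graph with untrained, hand-picked weights (illustrative only).
toy_features = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
toy_edges = [(0, 1), (1, 2)]
identity = [[1.0, 0.0], [0.0, 1.0]]
score = classify_graph(toy_features, toy_edges, identity,
                       attn=[0.1, -0.2, 0.3, 0.0], readout_w=[0.5, -0.5])
print(f"score for the toy graph: {score:.3f}")
```

In a real pipeline the weights would be learned from labeled functions, and the node features would come from a code property graph rather than a hand-written list.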
Related papers
- Scalable Defect Detection via Traversal on Code Graph [10.860910384163892]
We introduce QVoG, a graph-based static analysis platform for detecting defects and vulnerabilities.
It employs a compressed CPG representation to maintain a reasonable graph size, thereby enhancing the overall query efficiency.
For projects consisting of 1,000,000+ lines of code, QVoG can complete analysis in approximately 15 minutes, as opposed to 19 minutes with CodeQL.
arXiv Detail & Related papers (2024-06-12T11:24:52Z)
- Towards Better Graph Neural Network-based Fault Localization Through Enhanced Code Representation [8.647406441990396]
We propose a new graph representation, DepGraph, that reduces the complexity of the graph representation by 70% in nodes and edges.
We evaluate DepGraph using Defects4j 2.0.0, and it outperforms Grace by locating 20% more faults in Top-1 and improving the Mean First Rank (MFR) and the Mean Average Rank (MAR) by over 50%.
arXiv Detail & Related papers (2024-04-06T04:13:01Z)
- Towards Self-Interpretable Graph-Level Anomaly Detection [73.1152604947837]
Graph-level anomaly detection (GLAD) aims to identify graphs that exhibit notable dissimilarity compared to the majority in a collection.
We propose a Self-Interpretable Graph aNomaly dETection model (SIGNET) that detects anomalous graphs as well as generates informative explanations simultaneously.
arXiv Detail & Related papers (2023-10-25T10:10:07Z)
- A Graph Encoder-Decoder Network for Unsupervised Anomaly Detection [7.070726553564701]
We propose an unsupervised graph encoder-decoder model to detect abnormal nodes from graphs.
In the encoding stage, we design a novel pooling mechanism, named LCPool, to find a cluster assignment matrix.
In the decoding stage, we propose an unpooling operation, called LCUnpool, to reconstruct both the structure and nodal features of the original graph.
arXiv Detail & Related papers (2023-08-15T13:49:12Z)
- DSHGT: Dual-Supervisors Heterogeneous Graph Transformer -- A pioneer study of using heterogeneous graph learning for detecting software vulnerabilities [12.460745260973837]
Vulnerability detection is a critical problem in software security and attracts growing attention both from academia and industry.
Recent advances in deep learning, especially Graph Neural Networks (GNN), have uncovered the feasibility of automatic detection of a wide range of software vulnerabilities.
In this work, we are one of the first to explore heterogeneous graph representation in the form of Code Property Graph.
arXiv Detail & Related papers (2023-06-02T08:57:13Z)
- Sequential Graph Neural Networks for Source Code Vulnerability Identification [5.582101184758527]
We present a properly curated C/C++ source code vulnerability dataset to aid in developing models.
We also propose a learning framework based on graph neural networks, denoted SEquential Graph Neural Network (SEGNN) for learning a large number of code semantic representations.
Our evaluations on two datasets and four baseline methods in a graph classification setting demonstrate state-of-the-art results.
arXiv Detail & Related papers (2023-05-23T17:25:51Z)
- Source Free Unsupervised Graph Domain Adaptation [60.901775859601685]
Unsupervised Graph Domain Adaptation (UGDA) shows its practical value of reducing the labeling cost for node classification.
Most existing UGDA methods heavily rely on the labeled graph in the source domain.
In some real-world scenarios, the source graph is inaccessible because of privacy issues.
We propose a novel scenario named Source Free Unsupervised Graph Domain Adaptation (SFUGDA).
arXiv Detail & Related papers (2021-12-02T03:18:18Z)
- Deep Fraud Detection on Non-attributed Graph [61.636677596161235]
Graph Neural Networks (GNNs) have shown solid performance on fraud detection.
Labeled data is scarce in large-scale industrial problems, especially for fraud detection.
We propose a novel graph pre-training strategy to leverage more unlabeled data.
arXiv Detail & Related papers (2021-10-04T03:42:09Z)
- Software Vulnerability Detection via Deep Learning over Disaggregated Code Graph Representation [57.92972327649165]
This work explores a deep learning approach to automatically learn the insecure patterns from code corpora.
Because code naturally admits graph structures with parsing, we develop a novel graph neural network (GNN) to exploit both the semantic context and structural regularity of a program.
arXiv Detail & Related papers (2021-09-07T21:24:36Z)
- Learning to map source code to software vulnerability using code-as-a-graph [67.62847721118142]
We explore the applicability of Graph Neural Networks in learning the nuances of source code from a security perspective.
We show that a code-as-graph encoding is more meaningful for vulnerability detection than existing code-as-photo and linear sequence encoding approaches.
arXiv Detail & Related papers (2020-06-15T16:05:27Z)
- Alleviating the Inconsistency Problem of Applying Graph Neural Network to Fraud Detection [78.88163190021798]
We introduce a new GNN framework, $\mathsf{GraphConsis}$, to tackle the inconsistency problem.
Empirical analysis on four datasets indicates the inconsistency problem is crucial in a fraud detection task.
We also released a GNN-based fraud detection toolbox with implementations of SOTA models.
arXiv Detail & Related papers (2020-05-01T21:43:58Z)
This list is automatically generated from the titles and abstracts of the papers on this site.