Keep It Simple: Towards Accurate Vulnerability Detection for Large Code Graphs
- URL: http://arxiv.org/abs/2412.10164v1
- Date: Fri, 13 Dec 2024 14:27:51 GMT
- Title: Keep It Simple: Towards Accurate Vulnerability Detection for Large Code Graphs
- Authors: Xin Peng, Shangwen Wang, Yihao Qin, Bo Lin, Liqian Chen, Xiaoguang Mao,
- Abstract summary: We propose a novel vulnerability detection method, ANGLE, which embodies the hierarchical graph refinement and context-aware graph representation learning.
Our method significantly outperforms several other baselines in terms of the accuracy and F1 score.
In large code graphs, ANGLE achieves an improvement in accuracy of 34.27%-161.93% compared to the state-of-the-art method, AMPLE.
- Score: 6.236203127696138
- License:
- Abstract: Software vulnerability detection is crucial for high-quality software development. Recently, some studies utilizing Graph Neural Networks (GNNs) to learn the graph representation of code in vulnerability detection tasks have achieved remarkable success. However, existing graph-based approaches mainly face two limitations that prevent them from generalizing well to large code graphs: (1) the interference of noise information in the code graph; (2) the difficulty in capturing long-distance dependencies within the graph. To mitigate these problems, we propose a novel vulnerability detection method, ANGLE, whose novelty mainly embodies the hierarchical graph refinement and context-aware graph representation learning. The former hierarchically filters redundant information in the code graph, thereby reducing the size of the graph, while the latter collaboratively employs the Graph Transformer and GNN to learn code graph representations from both the global and local perspectives, thus capturing long-distance dependencies. Extensive experiments demonstrate promising results on three widely used benchmark datasets: our method significantly outperforms several other baselines in terms of the accuracy and F1 score. Particularly, in large code graphs, ANGLE achieves an improvement in accuracy of 34.27%-161.93% compared to the state-of-the-art method, AMPLE. Such results demonstrate the effectiveness of ANGLE in vulnerability detection tasks.
Related papers
- Explainable Malware Detection through Integrated Graph Reduction and Learning Techniques [2.464148828287322]
Control Flow Graphs and Function Call Graphs have become pivotal in providing a detailed understanding of program execution.
These graph-based representations, when combined with Graph Neural Networks (GNN), have shown promise in developing high-performance malware detectors.
This paper addresses these issues by developing several graph reduction techniques to reduce graph size and applying the state-of-the-art GNNExplainer to enhance the interpretability of GNN outputs.
arXiv Detail & Related papers (2024-12-04T18:59:45Z) - Three Revisits to Node-Level Graph Anomaly Detection: Outliers, Message
Passing and Hyperbolic Neural Networks [9.708651460086916]
In this paper, we revisit datasets and approaches for unsupervised node-level graph anomaly detection tasks.
Firstly, we introduce outlier injection methods that create more diverse and graph-based anomalies in graph datasets.
Secondly, we compare methods employing message passing against those without, uncovering the unexpected decline in performance.
arXiv Detail & Related papers (2024-03-06T19:42:34Z) - BOURNE: Bootstrapped Self-supervised Learning Framework for Unified
Graph Anomaly Detection [50.26074811655596]
We propose a novel unified graph anomaly detection framework based on bootstrapped self-supervised learning (named BOURNE)
By swapping the context embeddings between nodes and edges, we enable the mutual detection of node and edge anomalies.
BOURNE can eliminate the need for negative sampling, thereby enhancing its efficiency in handling large graphs.
arXiv Detail & Related papers (2023-07-28T00:44:57Z) - DSHGT: Dual-Supervisors Heterogeneous Graph Transformer -- A pioneer study of using heterogeneous graph learning for detecting software vulnerabilities [12.460745260973837]
Vulnerability detection is a critical problem in software security and attracts growing attention both from academia and industry.
Recent advances in deep learning, especially Graph Neural Networks (GNN), have uncovered the feasibility of automatic detection of a wide range of software vulnerabilities.
In this work, we are one of the first to explore heterogeneous graph representation in the form of Code Property Graph.
arXiv Detail & Related papers (2023-06-02T08:57:13Z) - Learning Strong Graph Neural Networks with Weak Information [64.64996100343602]
We develop a principled approach to the problem of graph learning with weak information (GLWI)
We propose D$2$PT, a dual-channel GNN framework that performs long-range information propagation on the input graph with incomplete structure, but also on a global graph that encodes global semantic similarities.
arXiv Detail & Related papers (2023-05-29T04:51:09Z) - Benchmarking Node Outlier Detection on Graphs [90.29966986023403]
Graph outlier detection is an emerging but crucial machine learning task with numerous applications.
We present the first comprehensive unsupervised node outlier detection benchmark for graphs called UNOD.
arXiv Detail & Related papers (2022-06-21T01:46:38Z) - LSP : Acceleration and Regularization of Graph Neural Networks via
Locality Sensitive Pruning of Graphs [2.4250821950628234]
Graph Neural Networks (GNNs) have emerged as highly successful tools for graph-related tasks.
Large graphs often involve many redundant components that can be removed without compromising the performance.
We propose a systematic method called Locality-Sensitive Pruning (LSP) for graph pruning based on Locality-Sensitive Hashing.
arXiv Detail & Related papers (2021-11-10T14:12:28Z) - Deep Fraud Detection on Non-attributed Graph [61.636677596161235]
Graph Neural Networks (GNNs) have shown solid performance on fraud detection.
labeled data is scarce in large-scale industrial problems, especially for fraud detection.
We propose a novel graph pre-training strategy to leverage more unlabeled data.
arXiv Detail & Related papers (2021-10-04T03:42:09Z) - Model-Agnostic Graph Regularization for Few-Shot Learning [60.64531995451357]
We present a comprehensive study on graph embedded few-shot learning.
We introduce a graph regularization approach that allows a deeper understanding of the impact of incorporating graph information between labels.
Our approach improves the performance of strong base learners by up to 2% on Mini-ImageNet and 6.7% on ImageNet-FS.
arXiv Detail & Related papers (2021-02-14T05:28:13Z) - Graph Information Bottleneck for Subgraph Recognition [103.37499715761784]
We propose a framework of Graph Information Bottleneck (GIB) for the subgraph recognition problem in deep graph learning.
Under this framework, one can recognize the maximally informative yet compressive subgraph, named IB-subgraph.
We evaluate the properties of the IB-subgraph in three application scenarios: improvement of graph classification, graph interpretation and graph denoising.
arXiv Detail & Related papers (2020-10-12T09:32:20Z) - Understanding Coarsening for Embedding Large-Scale Graphs [3.6739949215165164]
Proper analysis of graphs with Machine Learning (ML) algorithms has the potential to yield far-reaching insights into many areas of research and industry.
The irregular structure of graph data constitutes an obstacle for running ML tasks on graphs.
We analyze the impact of the coarsening quality on the embedding performance both in terms of speed and accuracy.
arXiv Detail & Related papers (2020-09-10T15:06:33Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.