A Code Knowledge Graph-Enhanced System for LLM-Based Fuzz Driver Generation
- URL: http://arxiv.org/abs/2411.11532v1
- Date: Mon, 18 Nov 2024 12:41:16 GMT
- Title: A Code Knowledge Graph-Enhanced System for LLM-Based Fuzz Driver Generation
- Authors: Hanxiang Xu, Wei Ma, Ting Zhou, Yanjie Zhao, Kai Chen, Qiang Hu, Yang Liu, Haoyu Wang,
- Abstract summary: We propose CodeGraphGPT, a novel system that integrates code knowledge graphs with an intelligent agent to automate the fuzz driver generation process.
By framing fuzz driver creation as a code generation task, CodeGraphGPT leverages program analysis to construct a knowledge graph of code repositories.
We evaluate CodeGraphGPT on eight open-source software projects, achieving an average improvement of 8.73% in code coverage compared to state-of-the-art methods.
- Score: 29.490817477791357
- License:
- Abstract: The rapid development of large language models (LLMs) with advanced programming capabilities has paved the way for innovative approaches in software testing. Fuzz testing, a cornerstone for improving software reliability and detecting vulnerabilities, often relies on manually written fuzz drivers, limiting scalability and efficiency. To address this challenge, we propose CodeGraphGPT, a novel system that integrates code knowledge graphs with an LLM-powered intelligent agent to automate the fuzz driver generation process. By framing fuzz driver creation as a code generation task, CodeGraphGPT leverages program analysis to construct a knowledge graph of code repositories, where nodes represent code entities, such as functions or files, and edges capture their relationships. This enables the system to generate tailored fuzz drivers and input seeds, resolve compilation errors, and analyze crash reports, all while adapting to specific API usage scenarios. Additionally, querying the knowledge graph helps identify precise testing targets and contextualize the purpose of each fuzz driver within the fuzzing loop. We evaluated CodeGraphGPT on eight open-source software projects, achieving an average improvement of 8.73\% in code coverage compared to state-of-the-art methods. Moreover, it reduced the manual workload in crash case analysis by 84.4\% and identified 11 real-world bugs, including nine previously unreported ones. This work highlights how integrating LLMs with code knowledge graphs enhances fuzz driver generation, offering an efficient solution for vulnerability detection and software quality improvement.
Related papers
- Your Fix Is My Exploit: Enabling Comprehensive DL Library API Fuzzing with Large Language Models [49.214291813478695]
Deep learning (DL) libraries, widely used in AI applications, often contain vulnerabilities like overflows and use buffer-free errors.
Traditional fuzzing struggles with the complexity and API diversity of DL libraries.
We propose DFUZZ, an LLM-driven fuzzing approach for DL libraries.
arXiv Detail & Related papers (2025-01-08T07:07:22Z) - FuzzDistill: Intelligent Fuzzing Target Selection using Compile-Time Analysis and Machine Learning [0.0]
I present FuzzDistill, an approach that harnesses compile-time data and machine learning to refine fuzzing targets.
I demonstrate the efficacy of my approach through experiments conducted on real-world software, demonstrating substantial reductions in testing time.
arXiv Detail & Related papers (2024-12-11T04:55:58Z) - FuzzCoder: Byte-level Fuzzing Test via Large Language Model [46.18191648883695]
We propose to adopt fine-tuned large language models (FuzzCoder) to learn patterns in the input files from successful attacks.
FuzzCoder can predict mutation locations and strategies locations in input files to trigger abnormal behaviors of the program.
arXiv Detail & Related papers (2024-09-03T14:40:31Z) - Prompt Fuzzing for Fuzz Driver Generation [6.238058387665971]
We propose PromptFuzz, a coverage-guided fuzzer for prompt fuzzing.
It iteratively generates fuzz drivers to explore undiscovered library code.
PromptFuzz achieved 1.61 and 1.63 times higher branch coverage than OSS-Fuzz and Hopper, respectively.
arXiv Detail & Related papers (2023-12-29T16:43:51Z) - Zero-Shot Detection of Machine-Generated Codes [83.0342513054389]
This work proposes a training-free approach for the detection of LLMs-generated codes.
We find that existing training-based or zero-shot text detectors are ineffective in detecting code.
Our method exhibits robustness against revision attacks and generalizes well to Java codes.
arXiv Detail & Related papers (2023-10-08T10:08:21Z) - HOPPER: Interpretative Fuzzing for Libraries [6.36596812288503]
HOPPER can fuzz libraries without requiring any domain knowledge.
It transforms the problem of library fuzzing into the problem of interpreter fuzzing.
arXiv Detail & Related papers (2023-09-07T06:11:18Z) - How Effective Are They? Exploring Large Language Model Based Fuzz Driver Generation [31.77886516971502]
This study is the first in-depth study targeting the important issues of using LLMs to generate effective fuzz drivers.
Our study evaluated 736,430 generated fuzz drivers, with 0.85 billion token costs ($8,000+ charged tokens)
Our insights have been implemented to improve the OSS-Fuzz-Gen project, facilitating practical fuzz driver generation in industry.
arXiv Detail & Related papers (2023-07-24T01:49:05Z) - Fault-Aware Neural Code Rankers [64.41888054066861]
We propose fault-aware neural code rankers that can predict the correctness of a sampled program without executing it.
Our fault-aware rankers can significantly increase the pass@1 accuracy of various code generation models.
arXiv Detail & Related papers (2022-06-04T22:01:05Z) - DAE : Discriminatory Auto-Encoder for multivariate time-series anomaly
detection in air transportation [68.8204255655161]
We propose a novel anomaly detection model called Discriminatory Auto-Encoder (DAE)
It uses the baseline of a regular LSTM-based auto-encoder but with several decoders, each getting data of a specific flight phase.
Results show that the DAE achieves better results in both accuracy and speed of detection.
arXiv Detail & Related papers (2021-09-08T14:07:55Z) - Break-It-Fix-It: Unsupervised Learning for Program Repair [90.55497679266442]
We propose a new training approach, Break-It-Fix-It (BIFI), which has two key ideas.
We use the critic to check a fixer's output on real bad inputs and add good (fixed) outputs to the training data.
Based on these ideas, we iteratively update the breaker and the fixer while using them in conjunction to generate more paired data.
BIFI outperforms existing methods, obtaining 90.5% repair accuracy on GitHub-Python and 71.7% on DeepFix.
arXiv Detail & Related papers (2021-06-11T20:31:04Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.