Fuzz Driver Synthesis for Rust Generic APIs
- URL: http://arxiv.org/abs/2312.10676v2
- Date: Tue, 19 Dec 2023 12:14:52 GMT
- Title: Fuzz Driver Synthesis for Rust Generic APIs
- Authors: Yehong Zhang, Jun Wu, Hui Xu
- Abstract summary: This paper studies the automated fuzz driver synthesis problem for Rust libraries with generic APIs.
By solving API dependencies and type constraints, the approach generates a collection of candidate monomorphic APIs.
Experimental results with 29 popular open-source libraries show that our approach can achieve promising generic API coverage with a low rate of invalid fuzz drivers.
- Score: 9.34200641681839
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Fuzzing is a popular bug detection technique achieved by testing software
executables with random inputs. This technique can also be extended to
libraries by constructing executables that call library APIs, known as fuzz
drivers. Automated fuzz driver synthesis has been an important research topic
in recent years since it can facilitate the library fuzzing process.
Nevertheless, existing approaches generally ignore generic APIs or simply treat
them as normal APIs. As a result, they cannot generate effective fuzz drivers
for generic APIs.
This paper studies the automated fuzz driver synthesis problem for Rust
libraries with generic APIs. The problem is essential because Rust emphasizes
security, and generic APIs are widely employed in Rust crates. Each generic API
can have numerous monomorphic versions as long as the type constraints are
satisfied. The critical challenge to this problem lies in prioritizing these
monomorphic versions and providing valid inputs for them. To address the
problem, we extend existing API-dependency graphs to support generic APIs. By
solving such dependencies and type constraints, we can generate a collection of
candidate monomorphic APIs. Further, we apply a similarity-based filter to
prune redundant versions, particularly when multiple monomorphic APIs use the
same trait implementation. Experimental results with 29 popular
open-source libraries show that our approach can achieve promising generic API
coverage with a low rate of invalid fuzz drivers. In addition, we find 23
previously unknown bugs in these libraries, 18 of which are related to generic APIs.
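
The ideas in the abstract can be illustrated with a minimal Rust sketch. It is not the paper's implementation: `largest`, `drive_u8`, `drive_string`, `Candidate`, and `prune` are hypothetical names introduced here, and the filter below uses an exact match on an illustrative trait-implementation label as a simplified stand-in for the similarity-based filter.

```rust
use std::collections::HashSet;

// Hypothetical generic API standing in for a library function under test.
// The bound `T: Ord + Clone` is the type constraint that every
// monomorphic version must satisfy.
pub fn largest<T: Ord + Clone>(items: &[T]) -> Option<T> {
    items.iter().max().cloned()
}

// Fuzz driver body for the monomorphic version `largest::<u8>`:
// the raw fuzzer bytes are used directly as the input slice.
pub fn drive_u8(data: &[u8]) -> Option<u8> {
    largest::<u8>(data)
}

// Fuzz driver body for `largest::<String>`: the bytes are split on 0
// and each chunk is decoded lossily, so every input yields valid Strings.
pub fn drive_string(data: &[u8]) -> Option<String> {
    let items: Vec<String> = data
        .split(|b| *b == 0)
        .map(|chunk| String::from_utf8_lossy(chunk).into_owned())
        .collect();
    largest::<String>(&items)
}

// A candidate monomorphic version, tagged with a label for the trait
// implementation it would exercise (labels are illustrative assumptions).
pub struct Candidate {
    pub ty: &'static str,
    pub trait_impl: &'static str,
}

// Simplified filter: keep the first candidate per trait implementation,
// so versions that reach the same impl are not fuzzed repeatedly.
pub fn prune(candidates: &[Candidate]) -> Vec<&Candidate> {
    let mut seen = HashSet::new();
    candidates
        .iter()
        .filter(|c| seen.insert(c.trait_impl))
        .collect()
}
```

In a real setup each `drive_*` function would be called from a fuzzing harness entry point; here they are plain functions so the sketch depends only on the standard library. With candidates `Vec<u8>` and `Vec<u16>` both labelled with the same blanket `Ord` implementation for `Vec<T>`, `prune` keeps only the first of the two.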
Related papers
- A Systematic Evaluation of Large Code Models in API Suggestion: When, Which, and How [53.65636914757381]
API suggestion is a critical task in modern software development.
Recent advancements in large code models (LCMs) have shown promise in the API suggestion task.
arXiv Detail & Related papers (2024-09-20T03:12:35Z)
- FANTAstic SEquences and Where to Find Them: Faithful and Efficient API Call Generation through State-tracked Constrained Decoding and Reranking [57.53742155914176]
API call generation is the cornerstone of large language models' tool-using ability.
Existing supervised and in-context learning approaches suffer from high training costs, poor data efficiency, and generated API calls that can be unfaithful to the API documentation and the user's request.
We propose an output-side optimization approach called FANTASE to address these limitations.
arXiv Detail & Related papers (2024-07-18T23:44:02Z)
- WorldAPIs: The World Is Worth How Many APIs? A Thought Experiment [49.00213183302225]
We propose a framework to induce new APIs by grounding wikiHow instruction to situated agent policies.
Inspired by recent successes of large language models (LLMs) in embodied planning, we propose few-shot prompting to steer GPT-4.
arXiv Detail & Related papers (2024-07-10T15:52:44Z)
- Exception-aware Lifecycle Model Construction for Framework APIs [4.333061751230906]
This paper adopts a static analysis technique to extract exception summary information in the framework API code.
It generates exception-aware API lifecycle models for the given framework/library project.
Compared to the exception-unaware API lifecycle modeling on 60 versions, JavaExp can identify 18% times more API changes.
arXiv Detail & Related papers (2024-01-05T06:35:47Z)
- Prompt Fuzzing for Fuzz Driver Generation [6.238058387665971]
We propose PromptFuzz, a coverage-guided fuzzer for prompt fuzzing.
It iteratively generates fuzz drivers to explore undiscovered library code.
PromptFuzz achieved 1.61 and 1.63 times higher branch coverage than OSS-Fuzz and Hopper, respectively.
arXiv Detail & Related papers (2023-12-29T16:43:51Z)
- Extended Paper: API-driven Program Synthesis for Testing Static Typing Implementations [11.300829269111627]
We introduce a novel approach for testing static typing implementations based on the concept of API-driven program synthesis.
The idea is to synthesize type-intensive but small and well-typed programs by leveraging and combining application programming interfaces (APIs) derived from existing software libraries.
arXiv Detail & Related papers (2023-11-08T08:32:40Z)
- Advanced White-Box Heuristics for Search-Based Fuzzing of REST APIs [3.3714461095047743]
Currently, EvoMaster is the only existing tool that supports white-box fuzzing of REST APIs.
We provide a series of novel white-box heuristics, including, for example, how to deal with under-specified constraints in API schemas.
Our novel techniques are implemented as an extension to our open-source, search-based fuzzer EvoMaster.
arXiv Detail & Related papers (2023-09-15T12:39:01Z)
- HOPPER: Interpretative Fuzzing for Libraries [6.36596812288503]
HOPPER can fuzz libraries without requiring any domain knowledge.
It transforms the problem of library fuzzing into the problem of interpreter fuzzing.
arXiv Detail & Related papers (2023-09-07T06:11:18Z)
- Private-Library-Oriented Code Generation with Large Language Models [52.73999698194344]
This paper focuses on utilizing large language models (LLMs) for code generation in private libraries.
We propose a novel framework that emulates the process of programmers writing private code.
We create four private library benchmarks, including TorchDataEval, TorchDataComplexEval, MonkeyEval, and BeatNumEval.
arXiv Detail & Related papers (2023-07-28T07:43:13Z)
- torchgfn: A PyTorch GFlowNet library [56.071033896777784]
torchgfn is a PyTorch library that aims to address this need.
It provides users with a simple API for environments and useful abstractions for samplers and losses.
arXiv Detail & Related papers (2023-05-24T00:20:59Z)
- Binding Language Models in Symbolic Languages [146.3027328556881]
Binder is a training-free neural-symbolic framework that maps the task input to a program.
In the parsing stage, Codex is able to identify the part of the task input that is not answerable by the original programming language.
In the execution stage, Codex can perform versatile functionalities given proper prompts in the API calls.
arXiv Detail & Related papers (2022-10-06T12:55:17Z)
This list is automatically generated from the titles and abstracts of the papers on this site.