Generator-Based Fuzzers with Type-Based Targeted Mutation
- URL: http://arxiv.org/abs/2406.02034v2
- Date: Wed, 12 Jun 2024 07:32:41 GMT
- Title: Generator-Based Fuzzers with Type-Based Targeted Mutation
- Authors: Soha Hussein, Stephen McCamant, Mike Whalen
- Abstract summary: In previous work, coverage-guided fuzzers used a mix of static analysis, taint analysis, and constraint-solving approaches to direct fuzzing toward particular code targets.
In this paper, we introduce a type-based mutation heuristic, along with constant string lookup, for Java generator-based fuzzers (GBF).
Results compared to a baseline GBF tool show an almost 20% average improvement in application coverage, and larger improvements when third-party code is included.
- Score: 1.4507298892594764
- License: http://creativecommons.org/licenses/by-sa/4.0/
- Abstract: As with any fuzzer, directing Generator-Based Fuzzers (GBF) to reach particular code targets can increase the fuzzer's effectiveness. In previous work, coverage-guided fuzzers used a mix of static analysis, taint analysis, and constraint-solving approaches to address this problem. However, none of these techniques were crafted specifically for GBF, where input generators are used to construct program inputs. The observation is that input generators carry information about the input structure that is naturally present through the typing composition of the program input. In this paper, we introduce a type-based mutation heuristic, along with constant string lookup, for Java GBF. Our key intuition is that if one can identify which sub-parts (types) of the input will likely influence the branching decision, then focusing on mutating the choices of the generators constructing these types is likely to achieve the desired coverage. We used our technique to fuzz AWS Lambda applications. Results compared to a baseline GBF tool show an almost 20% average improvement in application coverage, and larger improvements when third-party code is included.
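The key intuition in the abstract can be illustrated with a minimal Java sketch. It is not the paper's tool: the class `TypeTargetedMutation`, the `Choice` record-like class, the `Customer` and `Item` type names, and the bounds are all hypothetical. The sketch assumes that each generator decision is recorded together with the type being constructed, and that some separate analysis (not shown) has already picked the types likely to influence an uncovered branch. Constant string lookup is not covered here.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;
import java.util.Set;

/** Illustrative sketch only; every name here is hypothetical, not the paper's API. */
final class TypeTargetedMutation {

    /** One recorded generator decision, tagged with the type being constructed. */
    static final class Choice {
        final String type; // type whose generator made this choice
        final int bound;   // exclusive upper bound the generator asked for
        final int value;   // value that was chosen, in [0, bound)

        Choice(String type, int bound, int value) {
            this.type = type;
            this.bound = bound;
            this.value = value;
        }
    }

    /** Generator for an assumed Order input: one Customer choice, one Item choice. */
    static List<Choice> generate(Random rng) {
        List<Choice> trace = new ArrayList<>();
        trace.add(new Choice("Customer", 100, rng.nextInt(100))); // customer id
        trace.add(new Choice("Item", 50, rng.nextInt(50)));       // item quantity
        return trace;
    }

    /** Resample only the choices made while constructing one of the target types. */
    static List<Choice> mutate(List<Choice> trace, Set<String> targetTypes, Random rng) {
        List<Choice> mutated = new ArrayList<>();
        for (Choice c : trace) {
            if (targetTypes.contains(c.type)) {
                mutated.add(new Choice(c.type, c.bound, rng.nextInt(c.bound)));
            } else {
                mutated.add(c); // untargeted types keep their original choice
            }
        }
        return mutated;
    }

    public static void main(String[] args) {
        Random rng = new Random(42);
        List<Choice> seed = generate(rng);
        // Assumption: an analysis flagged Item as influencing an uncovered branch.
        List<Choice> next = mutate(seed, Set.of("Item"), rng);
        for (Choice c : next) {
            System.out.println(c.type + " -> " + c.value);
        }
    }
}
```

Restricting mutation to the flagged types keeps the rest of the input stable, so new coverage is more plausibly attributable to the sub-part being targeted.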
Related papers
- $\mathbb{USCD}$: Improving Code Generation of LLMs by Uncertainty-Aware Selective Contrastive Decoding [64.00025564372095]
Large language models (LLMs) have shown remarkable capabilities in code generation.
The effects of hallucinations (e.g., output noise) make it challenging for LLMs to generate high-quality code in one pass.
We propose a simple and effective uncertainty-aware selective contrastive decoding (USCD) mechanism.
arXiv Detail & Related papers (2024-09-09T02:07:41Z)
- FuzzCoder: Byte-level Fuzzing Test via Large Language Model [46.18191648883695]
We propose to adopt fine-tuned large language models (FuzzCoder) to learn patterns in the input files from successful attacks.
FuzzCoder can predict mutation locations and mutation strategies in input files to trigger abnormal behaviors of the program.
arXiv Detail & Related papers (2024-09-03T14:40:31Z)
- Frequency-aware Feature Fusion for Dense Image Prediction [99.85757278772262]
We propose Frequency-Aware Feature Fusion (FreqFusion) for dense image prediction tasks.
FreqFusion integrates an Adaptive Low-Pass Filter (ALPF) generator, an offset generator, and an Adaptive High-Pass Filter (AHPF) generator.
Comprehensive visualization and quantitative analysis demonstrate that FreqFusion effectively improves feature consistency and sharpens object boundaries.
arXiv Detail & Related papers (2024-08-23T07:30:34Z)
- Make out like a (Multi-Armed) Bandit: Improving the Odds of Fuzzer Seed Scheduling with T-Scheduler [8.447499888458633]
Fuzzing is a highly-scalable software testing technique that uncovers bugs in a target program by executing it with mutated inputs.
We propose T-Scheduler, a seed scheduler built on multi-armed bandit theory.
We evaluate T-Scheduler over 35 CPU-yr of fuzzing, comparing it to 11 state-of-the-art schedulers.
arXiv Detail & Related papers (2023-12-07T23:27:55Z)
- Fuzzing with Quantitative and Adaptive Hot-Bytes Identification [6.442499249981947]
American fuzzy lop, a leading fuzzing tool, has demonstrated its powerful bug finding ability through a vast number of reported CVEs.
We propose a tool designed based on the following principles.
Our evaluation results on 10 real-world programs and the LAVA-M dataset show that our tool achieves sustained increases in branch coverage and discovers more bugs than other fuzzers.
arXiv Detail & Related papers (2023-07-05T13:41:35Z)
- Augmenting Greybox Fuzzing with Generative AI [0.0]
We propose ChatFuzz, a greybox fuzzer augmented by generative AI.
We conduct extensive experiments to explore the best practice for harvesting the power of generative LLMs.
Experiment results show that our approach improves the edge coverage by 12.77% over the SOTA greybox fuzzer.
arXiv Detail & Related papers (2023-06-11T21:44:47Z)
- Random Feature Attention [69.4671822971207]
We propose RFA, a linear time and space attention that uses random feature methods to approximate the softmax function.
RFA can be used as a drop-in replacement for conventional softmax attention and offers a straightforward way of learning with recency bias through an optional gating mechanism.
Experiments on language modeling and machine translation demonstrate that RFA achieves similar or better performance compared to strong transformer baselines.
arXiv Detail & Related papers (2021-03-03T02:48:56Z)
- Sampling-Decomposable Generative Adversarial Recommender [84.05894139540048]
We propose a Sampling-Decomposable Generative Adversarial Recommender (SD-GAR).
In the framework, the divergence between some generator and the optimum is compensated by self-normalized importance sampling.
We extensively evaluate the proposed algorithm with five real-world recommendation datasets.
arXiv Detail & Related papers (2020-11-02T13:19:10Z)
- End-to-End Synthetic Data Generation for Domain Adaptation of Question Answering Systems [34.927828428293864]
Our model comprises a single transformer-based encoder-decoder network that is trained end-to-end to generate both answers and questions.
In a nutshell, we feed a passage to the encoder and ask the decoder to generate a question and an answer token-by-token.
arXiv Detail & Related papers (2020-10-12T21:10:18Z)
- Augmentation of the Reconstruction Performance of Fuzzy C-Means with an Optimized Fuzzification Factor Vector [99.19847674810079]
Fuzzy C-Means (FCM) is one of the most frequently used methods to construct information granules.
In this paper, we augment the FCM-based degranulation mechanism by introducing a vector of fuzzification factors.
Experiments completed for both synthetic and publicly available datasets show that the proposed approach outperforms the generic data reconstruction approach.
arXiv Detail & Related papers (2020-04-13T04:17:30Z)