Related papers: Fuzzwise: Intelligent Initial Corpus Generation for Fuzzing

URL: http://arxiv.org/abs/2512.21440v1
Date: Wed, 24 Dec 2025 22:17:29 GMT
Title: Fuzzwise: Intelligent Initial Corpus Generation for Fuzzing
Authors: Hridya Dhulipala, Xiaokai Rong, Aashish Yadavally, Tien N. Nguyen
Abstract summary: In mutation-based greybox fuzzing, generating high-quality input seeds for the initial corpus is essential. FuzzWise integrates the separate phases of generating a large corpus and subsequently minimizing it into a single process. FuzzWise achieves high code coverage and triggers more runtime errors compared to the baselines.
Score: 14.734454356396157
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: In mutation-based greybox fuzzing, generating high-quality input seeds for the initial corpus is essential for effective fuzzing. Rather than conducting separate phases for generating a large corpus and subsequently minimizing it, we propose FuzzWise which integrates them into one process to generate the optimal initial corpus of seeds (ICS). FuzzWise leverages a multi-agent framework based on Large Language Models (LLMs). The first LLM agent generates test cases for the target program. The second LLM agent, which functions as a predictive code coverage module, assesses whether each generated test case will enhance the overall coverage of the current corpus. The streamlined process allows each newly generated test seed to be immediately evaluated for its contribution to the overall coverage. FuzzWise employs a predictive approach using an LLM and eliminates the need for actual execution, saving computational resources and time, particularly in scenarios where the execution is not desirable or even impossible. Our empirical evaluation demonstrates that FuzzWise generates significantly fewer test cases than baseline methods. Despite the lower number of test cases, FuzzWise achieves high code coverage and triggers more runtime errors compared to the baselines. Moreover, it is more time-efficient and coverage-efficient in producing an initial corpus catching more errors.
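
The integrated generate-and-filter loop described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the names `build_initial_corpus` and `toy_predictor` are hypothetical, the toy predictor is a hand-written stand-in for the second LLM agent, and the rule "keep a seed only if its predicted coverage adds a new line" is an assumed reading of "enhance the overall coverage of the current corpus".

```python
from typing import Callable, Iterable, List, Set, Tuple


def build_initial_corpus(
    candidates: Iterable[str],
    predict_coverage: Callable[[str], Set[int]],
) -> Tuple[List[str], Set[int]]:
    """Keep a candidate seed only if its predicted coverage adds new lines.

    `candidates` stands in for the generator agent's output stream and
    `predict_coverage` for the coverage-prediction agent (both assumptions).
    The target program is never executed.
    """
    corpus: List[str] = []
    covered: Set[int] = set()
    for seed in candidates:
        predicted = predict_coverage(seed)
        # Accept the seed only when it is predicted to reach an uncovered line.
        if not predicted <= covered:
            corpus.append(seed)
            covered |= predicted
    return corpus, covered


def toy_predictor(seed: str) -> Set[int]:
    """Hypothetical predictor: maps a seed to invented line numbers."""
    lines = {1}                   # entry point, always reached
    if seed.startswith("{"):
        lines.add(2)              # pretend branch for brace-prefixed input
    if len(seed) > 8:
        lines.add(3)              # pretend branch for long input
    return lines


seeds = ["abc", "xyz", "{}", '{"k": 12345}', "long plain text"]
corpus, covered = build_initial_corpus(seeds, toy_predictor)
print(corpus)   # seeds that were predicted to add coverage
print(covered)  # union of predicted covered lines
```

Because each seed is judged as soon as it is proposed, the corpus stays small without a separate minimization pass, which is the property the abstract emphasizes.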

Related papers

MIST-RL: Mutation-based Incremental Suite Testing via Reinforcement Learning [19.054149750597933]
MIST-RL (Mutation-based Incremental Suite Testing via Reinforcement Learning) is a framework that shifts the focus to "scaling-by-utility". We introduce a novel incremental mutation reward combined with dynamic penalties, which incentivizes the model to discover new faults while it suppresses functionally equivalent assertions. Experiments on HumanEval+ and MBPP+ demonstrate that MIST-RL outperforms state-of-the-art baselines.
arXiv Detail & Related papers (2026-03-02T03:22:44Z)
Test-Time Scaling with Diffusion Language Models via Reward-Guided Stitching [66.39914384073145]
We propose a self-consistency framework that turns cheap diffusion-sampled reasoning into a reusable pool of step-level candidates. We find that step-level recombination is most beneficial on harder problems. Our training-free framework improves average accuracy by up to 2 across six math and coding tasks.
arXiv Detail & Related papers (2026-02-26T11:08:39Z)
Prism: Efficient Test-Time Scaling via Hierarchical Search and Self-Verification for Discrete Diffusion Language Models [96.0074341403456]
Inference-time compute has re-emerged as a practical way to improve LLM reasoning. Most test-time scaling (TTS) algorithms rely on autoregressive decoding. We propose Prism, an efficient TTS framework for dLLMs.
arXiv Detail & Related papers (2026-02-02T09:14:51Z)
LLMs are All You Need? Improving Fuzz Testing for MOJO with Large Language Models [7.171282546185869]
Large language models (LLMs) have revolutionized software testing, particularly fuzz testing, by automating the generation of diverse and effective test inputs. MoJO is a high-performance AI programming language blending Python's usability with the efficiency of C and C++. MoJOFuzzer is the first adaptive LLM-based fuzzing framework designed for zero-shot learning environments of emerging programming languages.
arXiv Detail & Related papers (2025-10-11T11:37:18Z)
BASFuzz: Towards Robustness Evaluation of LLM-based NLP Software via Automated Fuzz Testing [8.893978269498524]
BASFuzz is an efficient fuzz testing method tailored for large language model (LLM)-based NLP software. A Beam-Annealing Search algorithm, which integrates beam search and simulated annealing, is employed to design an efficient fuzzing loop. Experiments demonstrate that BASFuzz achieves a testing effectiveness of 90.335% while reducing the average time overhead by 2,163.852 seconds.
arXiv Detail & Related papers (2025-09-22T03:13:57Z)
Fast Controlled Generation from Language Models with Adaptive Weighted Rejection Sampling [90.86991492288487]
Evaluating a constraint on every token can be prohibitively expensive. Locally constrained decoding (LCD) can distort the global distribution over strings, sampling tokens based only on local information. We show that our approach is superior to state-of-the-art baselines.
arXiv Detail & Related papers (2025-04-07T18:30:18Z)
Provable Scaling Laws for the Test-Time Compute of Large Language Models [84.00141420901038]
We propose two algorithms that enjoy provable scaling laws for the test-time compute of large language models. One is a two-stage knockout-style algorithm, where multiple candidate solutions are generated and then aggregated via a knockout tournament. The other is a two-stage league-style algorithm, where each candidate is evaluated by its average win rate against multiple opponents.
arXiv Detail & Related papers (2024-11-29T05:29:47Z)
$\mathbb{USCD}$: Improving Code Generation of LLMs by Uncertainty-Aware Selective Contrastive Decoding [64.00025564372095]
Large language models (LLMs) have shown remarkable capabilities in code generation. The effects of hallucinations (e.g., output noise) make it challenging for LLMs to generate high-quality code in one pass. We propose a simple and effective uncertainty-aware selective contrastive decoding (USCD) mechanism.
arXiv Detail & Related papers (2024-09-09T02:07:41Z)
Nearest Neighbor Speculative Decoding for LLM Generation and Attribution [87.3259169631789]
Nearest Neighbor Speculative Decoding (NEST) is capable of incorporating real-world text spans of arbitrary length into the LM generations and providing attribution to their sources. NEST significantly enhances the generation quality and attribution rate of the base LM across a variety of knowledge-intensive tasks. In addition, NEST substantially improves the generation speed, achieving a 1.8x speedup in inference time when applied to Llama-2-Chat 70B.
arXiv Detail & Related papers (2024-05-29T17:55:03Z)
Large Language Models as Test Case Generators: Performance Evaluation and Enhancement [3.5398126682962587]
We study how well Large Language Models can generate high-quality test cases. We propose a multi-agent framework called TestChain that decouples the generation of test inputs and test outputs. Our results indicate that TestChain outperforms the baseline by a large margin.
arXiv Detail & Related papers (2024-04-20T10:27:01Z)
EEL: Efficiently Encoding Lattices for Reranking [44.77383151122229]
We use Transformers to efficiently encode lattices of generated outputs. We combine this approach with a new class of token-factored rerankers (TFRs). Our results show both substantial speedup compared to naive reranking and often better performance on downstream metrics than comparable approaches.
arXiv Detail & Related papers (2023-06-01T17:45:32Z)
Adaptive Sampling for Best Policy Identification in Markov Decision Processes [79.4957965474334]
We investigate the problem of best-policy identification in discounted Markov Decision Processes (MDPs) when the learner has access to a generative model. The advantages of state-of-the-art algorithms are discussed and illustrated.
arXiv Detail & Related papers (2020-09-28T15:22:24Z)

This list is automatically generated from the titles and abstracts of the papers in this site.