Related papers: Large-Scale Empirical Analysis of Continuous Fuzzing: Insights from 1 Million Fuzzing Sessions

URL: http://arxiv.org/abs/2510.16433v1
Date: Sat, 18 Oct 2025 10:13:19 GMT
Title: Large-Scale Empirical Analysis of Continuous Fuzzing: Insights from 1 Million Fuzzing Sessions
Authors: Tatsuya Shirai, Olivier Nourry, Yutaro Kashiwa, Kenji Fujiwara, Yasutaka Kamei, Hajimu Iida
Abstract summary: This study aims to elucidate the role of continuous fuzzing in vulnerability detection. We collect issue reports, coverage reports, and fuzzing logs from OSS-Fuzz, an online service provided by Google. We reveal that a substantial number of fuzzing bugs exist prior to the integration of continuous fuzzing.
Score: 0.07183613290627339
License: http://creativecommons.org/publicdomain/zero/1.0/
Abstract: Software vulnerabilities are constantly being reported and exploited in software products, causing significant impacts on society. In recent years, the main approach to vulnerability detection, fuzzing, has been integrated into the continuous integration process to run in short and frequent cycles. This continuous fuzzing allows for fast identification and remediation of vulnerabilities during the development process. Despite adoption by thousands of projects, however, it is unclear how continuous fuzzing contributes to vulnerability detection. This study aims to elucidate the role of continuous fuzzing in vulnerability detection. Specifically, we investigate the coverage and the total number of fuzzing sessions when fuzzing bugs are discovered. We collect issue reports, coverage reports, and fuzzing logs from OSS-Fuzz, an online service provided by Google that performs fuzzing during continuous integration. Through an empirical study of a total of approximately 1.12 million fuzzing sessions from 878 projects participating in OSS-Fuzz, we reveal that (i) a substantial number of fuzzing bugs exist prior to the integration of continuous fuzzing, leading to a high detection rate in the early stages; (ii) code coverage continues to increase as continuous fuzzing progresses; and (iii) changes in coverage contribute to the detection of fuzzing bugs. This study provides empirical insights into how continuous fuzzing contributes to fuzzing bug detection, offering practical implications for future strategies and tool development in continuous fuzzing.

Related papers

When Is Enough Not Enough? Illusory Completion in Search Agents [56.98225130959051]
We study whether search agents reliably reason across all requirements by tracking, verifying, and maintaining multiple conditions. We find that illusory completion frequently occurs, wherein agents believe tasks are complete despite unresolved or violated constraints, leading to underverified answers. We examine whether explicit constraint-state tracking during execution mitigates these failures via LiveLedger, an inference-time tracker.
arXiv Detail & Related papers (2026-02-07T13:50:38Z)
Does Programming Language Matter? An Empirical Study of Fuzzing Bug Detection [0.0761187029699472]
This study conducts a large-scale cross-language analysis to examine how fuzzing bug characteristics and detection efficiency differ among languages. We analyze 61,444 fuzzing bugs and 999,248 builds from 559 OSS-Fuzz projects categorized by primary language. Our findings reveal that (i) C++ and Rust exhibit higher fuzzing bug detection frequencies, (ii) Rust and Python show low vulnerability ratios but tend to expose more critical vulnerabilities, (iii) crash types vary across languages and unreproducible bugs are more frequent in Go but rare in Rust, and (iv) Python attains higher patch coverage but suffers
arXiv Detail & Related papers (2026-02-05T05:08:51Z)
Enhancing Code Review through Fuzzing and Likely Invariants [13.727241655311664]
We present FuzzSight, a framework that leverages likely invariants from non-crashing fuzzing inputs to highlight behavioral differences across program versions. In our evaluation, FuzzSight flagged 75% of regression bugs and up to 80% of vulnerabilities uncovered by 24-hour fuzzing.
arXiv Detail & Related papers (2025-10-17T10:30:22Z)
InsightQL: Advancing Human-Assisted Fuzzing with a Unified Code Database and Parameterized Query Interface [8.846926306547646]
InsightQL is the first human-assisting framework for fuzz blocker analysis. Powered by a unified database and an intuitive parameterized query interface, InsightQL aids developers in systematically extracting insights. Our experiments on 14 popular real-world libraries from the FuzzBench benchmark demonstrate the effectiveness of InsightQL.
arXiv Detail & Related papers (2025-10-06T14:18:35Z)
What Do They Fix? LLM-Aided Categorization of Security Patches for Critical Memory Bugs [46.325755802511026]
We develop LM, a dual-method pipeline that integrates two approaches based on a Large Language Model (LLM) and a fine-tuned small language model. LM successfully identified 111 of 5,140 recent Linux kernel patches addressing OOB or UAF vulnerabilities, with 90 true positives confirmed by manual verification.
arXiv Detail & Related papers (2025-09-26T18:06:36Z)
LLAMA: Multi-Feedback Smart Contract Fuzzing Framework with LLM-Guided Seed Generation [56.84049855266145]
We propose a Multi-feedback Smart Contract Fuzzing framework (LLAMA) that integrates evolutionary mutation strategies and hybrid testing techniques. LLAMA achieves 91% instruction coverage and 90% branch coverage, while detecting 132 out of 148 known vulnerabilities. These results highlight LLAMA's effectiveness, adaptability, and practicality in real-world smart contract security testing scenarios.
arXiv Detail & Related papers (2025-07-16T09:46:58Z)
An Empirical Study of Fuzz Harness Degradation [24.989253174000922]
We study Google's OSS-Fuzz continuous fuzzing platform containing harnesses for 510 open-source C/C++ projects. A harness is the glue code between the fuzzer and the project, so it needs to adapt to changes in the project. Our analysis shows a consistent overall fuzzer coverage percentage for projects in OSS-Fuzz and a surprising longevity of the bug-finding capability of harnesses even without explicit updates.
arXiv Detail & Related papers (2025-05-09T16:39:20Z)
In the Magma chamber: Update and challenges in ground-truth vulnerabilities revival for automatic input generator comparison [42.95491588006701]
Magma introduced the notion of forward-porting to reintroduce vulnerable code in current software releases. While their results are promising, the state-of-the-art lacks an update on the maintainability of this approach over time. We characterise the challenges with forward-porting by reassessing the portability of Magma's CVEs four years after its release.
arXiv Detail & Related papers (2025-03-25T17:59:27Z)
Demystifying OS Kernel Fuzzing with a Novel Taxonomy [42.56259589772939]
We present the first systematic study dedicated to OS kernel fuzzing. It begins by summarizing the progress of 99 academic studies from top-tier venues between 2017 and 2024. We introduce a stage-based fuzzing model and a novel fuzzing taxonomy that highlights nine core functionalities unique to kernel fuzzing.
arXiv Detail & Related papers (2025-01-27T16:03:14Z)
FuzzCoder: Byte-level Fuzzing Test via Large Language Model [46.18191648883695]
We propose to adopt fine-tuned large language models (FuzzCoder) to learn patterns in the input files from successful attacks. FuzzCoder can predict mutation locations and strategies in input files to trigger abnormal behaviors of the program.
arXiv Detail & Related papers (2024-09-03T14:40:31Z)
Vulnerability Detection Through an Adversarial Fuzzing Algorithm [2.074079789045646]
This project aims to increase the efficiency of existing fuzzers by allowing fuzzers to explore more paths and find more bugs in shorter amounts of time. Adversarial methods are built on top of current evolutionary algorithms to generate test cases for further and more efficient fuzzing.
arXiv Detail & Related papers (2023-07-21T21:46:28Z)
Improving the Robustness of Summarization Systems with Dual Augmentation [68.53139002203118]
A robust summarization system should be able to capture the gist of the document, regardless of the specific word choices or noise in the input. We first explore the summarization models' robustness against perturbations including word-level synonym substitution and noise. We propose SummAttacker, an efficient approach to generating adversarial samples based on language models.
arXiv Detail & Related papers (2023-06-01T19:04:17Z)
What Happens When We Fuzz? Investigating OSS-Fuzz Bug History [0.9772968596463595]
We analyzed 44,102 reported issues made public by OSS-Fuzz prior to March 12, 2022. We identified the bug-contributing commits to estimate when the bug-containing code was introduced, and measured the timeline from introduction to detection to fix.
arXiv Detail & Related papers (2023-05-19T05:15:36Z)
Beyond the Spectrum: Detecting Deepfakes via Re-Synthesis [69.09526348527203]
Deep generative models have led to highly realistic media, known as deepfakes, that are commonly indistinguishable from real media to human eyes. We propose a novel fake detection method that re-synthesizes testing images and extracts visual cues for detection. We demonstrate the improved effectiveness, cross-GAN generalization, and robustness against perturbations of our approach in a variety of detection scenarios.
arXiv Detail & Related papers (2021-05-29T21:22:24Z)

This list is automatically generated from the titles and abstracts of the papers on this site.