Revisiting Neural Program Smoothing for Fuzzing
- URL: http://arxiv.org/abs/2309.16618v1
- Date: Thu, 28 Sep 2023 17:17:11 GMT
- Title: Revisiting Neural Program Smoothing for Fuzzing
- Authors: Maria-Irina Nicolae, Max Eisele, Andreas Zeller
- Abstract summary: This paper presents the most extensive evaluation of NPS fuzzers against standard gray-box fuzzers.
We implement Neuzz++, which shows that addressing the practical limitations of NPS fuzzers improves performance.
We present MLFuzz, a platform with GPU access for easy and reproducible evaluation of ML-based fuzzers.
- Score: 8.861172379630899
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Testing with randomly generated inputs (fuzzing) has gained significant
traction due to its capacity to expose program vulnerabilities automatically.
Fuzz testing campaigns generate large amounts of data, making them ideal for
the application of machine learning (ML). Neural program smoothing (NPS), a
specific family of ML-guided fuzzers, aims to use a neural network as a smooth
approximation of the program target for new test case generation.
In this paper, we conduct the most extensive evaluation of NPS fuzzers
against standard gray-box fuzzers (>11 CPU years and >5.5 GPU years), and make
the following contributions: (1) We find that the original performance claims
for NPS fuzzers do not hold; a gap we relate to fundamental, implementation,
and experimental limitations of prior works. (2) We contribute the first
in-depth analysis of the contribution of machine learning and gradient-based
mutations in NPS. (3) We implement Neuzz++, which shows that addressing the
practical limitations of NPS fuzzers improves performance, but that standard
gray-box fuzzers almost always surpass NPS-based fuzzers. (4) As a consequence,
we propose new guidelines targeted at benchmarking fuzzing based on machine
learning, and present MLFuzz, a platform with GPU access for easy and
reproducible evaluation of ML-based fuzzers. Neuzz++, MLFuzz, and all our data
are public.
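To make the smoothing idea concrete, here is a minimal sketch, assuming fixed-size inputs and per-input edge-coverage bitmaps; it is not the authors' Neuzz++ implementation, and all sizes, names, and the synthetic corpus below are hypothetical. A small network is trained as a smooth surrogate of the target program, and input gradients then select the byte positions most worth mutating.
```python
# Minimal NPS sketch (hypothetical sizes/corpus, not the Neuzz++ code):
# fit a smooth surrogate of the target, then mutate bytes by gradient.
import torch
import torch.nn as nn

INPUT_LEN, N_EDGES = 512, 1024          # hypothetical input/bitmap sizes

# Smooth surrogate of the program: input bytes -> predicted edge coverage.
surrogate = nn.Sequential(
    nn.Linear(INPUT_LEN, 256), nn.ReLU(),
    nn.Linear(256, N_EDGES), nn.Sigmoid(),
)

def train(corpus: torch.Tensor, coverage: torch.Tensor, epochs: int = 30) -> None:
    """Fit the surrogate on (input bytes, coverage bitmap) pairs."""
    opt = torch.optim.Adam(surrogate.parameters(), lr=1e-3)
    loss_fn = nn.BCELoss()
    for _ in range(epochs):
        opt.zero_grad()
        loss_fn(surrogate(corpus), coverage).backward()
        opt.step()

def gradient_mutate(seed: torch.Tensor, edge: int, k: int = 8) -> torch.Tensor:
    """Overwrite the k bytes with the largest gradient w.r.t. one target edge."""
    x = seed.clone().requires_grad_(True)
    surrogate(x)[edge].backward()       # we only read x.grad afterwards
    hot = x.grad.abs().topk(k).indices  # most influential byte positions
    mutant = seed.clone()
    mutant[hot] = torch.randint(0, 256, (k,)).float()
    return mutant

# Synthetic stand-in for a real fuzzing corpus and its coverage data.
corpus = torch.randint(0, 256, (64, INPUT_LEN)).float()
coverage = (torch.rand(64, N_EDGES) > 0.9).float()
train(corpus, coverage)
new_input = gradient_mutate(corpus[0], edge=3)
```
In a real NPS fuzzer such gradient-guided mutants feed back into a gray-box fuzzing loop; the paper's central finding is that the gray-box baseline alone usually performs at least as well.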
Related papers
- Unrolled denoising networks provably learn optimal Bayesian inference [54.79172096306631]
We prove the first rigorous learning guarantees for neural networks based on unrolling approximate message passing (AMP).
For compressed sensing, we prove that when trained on data drawn from a product prior, the layers of the network converge to the same denoisers used in Bayes AMP.
arXiv Detail & Related papers (2024-09-19T17:56:16Z)
- Comment on Revisiting Neural Program Smoothing for Fuzzing [34.32355705821806]
MLFuzz, a work accepted at ACM FSE 2023, revisits the performance of a machine learning-based fuzzer, NEUZZ.
We demonstrate that its main conclusion is entirely wrong due to several fatal bugs in the implementation and incorrect evaluation setups.
arXiv Detail & Related papers (2024-09-06T16:07:22Z)
- FuzzCoder: Byte-level Fuzzing Test via Large Language Model [46.18191648883695]
We propose adopting fine-tuned large language models (FuzzCoder) to learn patterns in input files from successful attacks.
FuzzCoder can predict mutation locations and strategies in input files to trigger abnormal behaviors of the program.
arXiv Detail & Related papers (2024-09-03T14:40:31Z)
- LLAMAFUZZ: Large Language Model Enhanced Greybox Fuzzing [6.042114639413868]
Specialized fuzzers can handle complex structured data, but they require additional grammar-engineering effort and suffer from low throughput.
In this paper, we explore the potential of utilizing Large Language Models to enhance greybox fuzzing for structured data.
Our LLM-based fuzzer, LLAMAFUZZ, integrates the power of LLMs to understand and mutate structured data for fuzzing.
arXiv Detail & Related papers (2024-06-11T20:48:28Z)
- Fuzzing BusyBox: Leveraging LLM and Crash Reuse for Embedded Bug Unearthing [2.4287247817521096]
Vulnerabilities in BusyBox can have far-reaching consequences.
The study revealed the prevalence of older BusyBox versions in real-world embedded products.
We introduce two techniques to fortify software testing.
arXiv Detail & Related papers (2024-03-06T17:57:03Z)
- Fuzzing for CPS Mutation Testing [3.512722797771289]
We propose a mutation testing approach that leverages fuzz testing, which has proved effective with C and C++ software.
Our empirical evaluation shows that mutation testing based on fuzz testing kills a significantly higher proportion of live mutants than symbolic execution.
arXiv Detail & Related papers (2023-08-15T16:35:31Z)
- Sample-Then-Optimize Batch Neural Thompson Sampling [50.800944138278474]
We introduce two algorithms for black-box optimization based on the Thompson sampling (TS) policy.
To choose an input query, we only need to train an NN and then choose the query by maximizing the trained NN.
Our algorithms sidestep the need to invert the large parameter matrix yet still preserve the validity of the TS policy; a minimal sketch of this sample-then-optimize loop follows this entry.
arXiv Detail & Related papers (2022-10-13T09:01:58Z)
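A minimal sketch of that sample-then-optimize loop, under assumed simplifications: the objective, network size, candidate set, and training budget below are hypothetical, not the paper's batch algorithm. Each round fits a freshly initialized network to the observed data and treats the fit as a posterior sample to maximize.
```python
# Minimal sample-then-optimize Thompson sampling sketch (hypothetical setup).
import torch
import torch.nn as nn

def objective(x: torch.Tensor) -> torch.Tensor:
    """Hypothetical expensive black-box function (1-D for brevity)."""
    return torch.sin(3.0 * x).sum(dim=-1)

def fit_sample(X: torch.Tensor, y: torch.Tensor) -> nn.Module:
    """A freshly initialized NN fit to the data acts as a posterior sample."""
    net = nn.Sequential(nn.Linear(X.shape[1], 64), nn.Tanh(), nn.Linear(64, 1))
    opt = torch.optim.Adam(net.parameters(), lr=1e-2)
    for _ in range(200):
        opt.zero_grad()
        ((net(X).squeeze(-1) - y) ** 2).mean().backward()
        opt.step()
    return net

X = torch.rand(5, 1)                    # initial design on [0, 1]
y = objective(X)
for _ in range(20):                     # TS loop: sample, then optimize
    net = fit_sample(X, y)
    cand = torch.rand(1024, 1)          # random candidates over the domain
    x_next = cand[net(cand).squeeze(-1).argmax()].unsqueeze(0)
    X, y = torch.cat([X, x_next]), torch.cat([y, objective(x_next)])
print("best observed:", y.max().item())
```
The fresh random initialization each round supplies the randomness Thompson sampling needs, in place of maintaining and inverting an explicit posterior over network parameters.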
- Transformers Can Do Bayesian Inference [56.99390658880008]
We present Prior-Data Fitted Networks (PFNs).
PFNs leverage in-context learning, trained at large scale, to approximate a large set of posteriors.
We demonstrate that PFNs can near-perfectly mimic Gaussian processes and also enable efficient Bayesian inference for intractable problems.
arXiv Detail & Related papers (2021-12-20T13:07:39Z)
- RSG: A Simple but Effective Module for Learning Imbalanced Datasets [99.77194308426606]
We propose a new rare-class sample generator (RSG) to generate new samples for rare classes during training.
RSG is convenient to use and highly versatile, because it can be easily integrated into any kind of convolutional neural network.
We obtain competitive results on Imbalanced CIFAR, ImageNet-LT, and iNaturalist 2018 using RSG.
arXiv Detail & Related papers (2021-06-18T01:10:27Z)
- Alpha-Refine: Boosting Tracking Performance by Precise Bounding Box Estimation [87.53808756910452]
We propose a novel, flexible and accurate refinement module called Alpha-Refine.
It exploits a precise pixel-wise correlation layer together with a spatial-aware non-local layer to fuse features and can predict three complementary outputs: bounding box, corners and mask.
We apply the proposed Alpha-Refine module to five well-known, state-of-the-art base trackers: DiMP, ATOM, SiamRPN++, RTMDNet and ECO.
arXiv Detail & Related papers (2020-07-04T07:02:25Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences arising from its use.