Revisiting Neural Program Smoothing for Fuzzing
- URL: http://arxiv.org/abs/2309.16618v1
- Date: Thu, 28 Sep 2023 17:17:11 GMT
- Title: Revisiting Neural Program Smoothing for Fuzzing
- Authors: Maria-Irina Nicolae, Max Eisele, Andreas Zeller
- Abstract summary: This paper presents the most extensive evaluation of NPS fuzzers against standard gray-box fuzzers.
We implement Neuzz++, which shows that addressing the practical limitations of NPS fuzzers improves performance.
We present MLFuzz, a platform with GPU access for easy and reproducible evaluation of ML-based fuzzers.
- Score: 8.861172379630899
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Testing with randomly generated inputs (fuzzing) has gained significant
traction due to its capacity to expose program vulnerabilities automatically.
Fuzz testing campaigns generate large amounts of data, making them ideal for
the application of machine learning (ML). Neural program smoothing (NPS), a
specific family of ML-guided fuzzers, aims to use a neural network as a smooth
approximation of the program target for new test case generation.
In this paper, we conduct the most extensive evaluation of NPS fuzzers
against standard gray-box fuzzers (>11 CPU years and >5.5 GPU years), and make
the following contributions: (1) We find that the original performance claims
for NPS fuzzers do not hold; a gap we relate to fundamental, implementation,
and experimental limitations of prior works. (2) We contribute the first
in-depth analysis of the contribution of machine learning and gradient-based
mutations in NPS. (3) We implement Neuzz++, which shows that addressing the
practical limitations of NPS fuzzers improves performance, but that standard
gray-box fuzzers almost always surpass NPS-based fuzzers. (4) As a consequence,
we propose new guidelines targeted at benchmarking fuzzing based on machine
learning, and present MLFuzz, a platform with GPU access for easy and
reproducible evaluation of ML-based fuzzers. Neuzz++, MLFuzz, and all our data
are public.
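To make the smoothing idea concrete, here is a minimal sketch, assuming fixed-size inputs and per-input edge-coverage bitmaps; it is not the authors' Neuzz++ implementation, and all sizes, names, and the synthetic corpus below are hypothetical. A small network is trained as a smooth surrogate of the target program, and input gradients then select the byte positions most worth mutating.
```python
# Minimal NPS sketch (hypothetical sizes/corpus, not the Neuzz++ code):
# fit a smooth surrogate of the target, then mutate bytes by gradient.
import torch
import torch.nn as nn

INPUT_LEN, N_EDGES = 512, 1024          # hypothetical input/bitmap sizes

# Smooth surrogate of the program: input bytes -> predicted edge coverage.
surrogate = nn.Sequential(
    nn.Linear(INPUT_LEN, 256), nn.ReLU(),
    nn.Linear(256, N_EDGES), nn.Sigmoid(),
)

def train(corpus: torch.Tensor, coverage: torch.Tensor, epochs: int = 30) -> None:
    """Fit the surrogate on (input bytes, coverage bitmap) pairs."""
    opt = torch.optim.Adam(surrogate.parameters(), lr=1e-3)
    loss_fn = nn.BCELoss()
    for _ in range(epochs):
        opt.zero_grad()
        loss_fn(surrogate(corpus), coverage).backward()
        opt.step()

def gradient_mutate(seed: torch.Tensor, edge: int, k: int = 8) -> torch.Tensor:
    """Overwrite the k bytes with the largest gradient w.r.t. one target edge."""
    x = seed.clone().requires_grad_(True)
    surrogate(x)[edge].backward()       # we only read x.grad afterwards
    hot = x.grad.abs().topk(k).indices  # most influential byte positions
    mutant = seed.clone()
    mutant[hot] = torch.randint(0, 256, (k,)).float()
    return mutant

# Synthetic stand-in for a real fuzzing corpus and its coverage data.
corpus = torch.randint(0, 256, (64, INPUT_LEN)).float()
coverage = (torch.rand(64, N_EDGES) > 0.9).float()
train(corpus, coverage)
new_input = gradient_mutate(corpus[0], edge=3)
```
In a real NPS fuzzer such gradient-guided mutants feed back into a gray-box fuzzing loop; the paper's central finding is that the gray-box baseline alone usually performs at least as well.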
Related papers
- Unrolled denoising networks provably learn optimal Bayesian inference [54.79172096306631]
We prove the first rigorous learning guarantees for neural networks based on unrolling approximate message passing (AMP).
For compressed sensing, we prove that when trained on data drawn from a product prior, the layers of the network converge to the same denoisers used in Bayes AMP.
arXiv Detail & Related papers (2024-09-19T17:56:16Z)
- Comment on Revisiting Neural Program Smoothing for Fuzzing [34.32355705821806]
MLFuzz, a work accepted at ACM FSE 2023, revisits the performance of a machine learning-based fuzzer, NEUZZ.
We demonstrate that its main conclusion is entirely wrong due to several fatal bugs in the implementation and incorrect evaluation setups.
arXiv Detail & Related papers (2024-09-06T16:07:22Z)
- FuzzCoder: Byte-level Fuzzing Test via Large Language Model [46.18191648883695]
We propose adopting fine-tuned large language models (FuzzCoder) to learn patterns in input files from successful attacks.
FuzzCoder can predict mutation locations and strategies in input files to trigger abnormal behaviors of the program.
arXiv Detail & Related papers (2024-09-03T14:40:31Z)
- LLAMAFUZZ: Large Language Model Enhanced Greybox Fuzzing [6.042114639413868]
Specialized fuzzers can handle complex structured data, but they require additional grammar-engineering effort and suffer from low throughput.
In this paper, we explore the potential of utilizing Large Language Models to enhance greybox fuzzing for structured data.
Our LLM-based fuzzer, LLAMAFUZZ, integrates the power of LLMs to understand and mutate structured data for fuzzing.
arXiv Detail & Related papers (2024-06-11T20:48:28Z)
- Fuzzing BusyBox: Leveraging LLM and Crash Reuse for Embedded Bug Unearthing [2.4287247817521096]
Vulnerabilities in BusyBox can have far-reaching consequences.
The study revealed the prevalence of older BusyBox versions in real-world embedded products.
We introduce two techniques to fortify software testing.
arXiv Detail & Related papers (2024-03-06T17:57:03Z)
- Fuzzing for CPS Mutation Testing [3.512722797771289]
We propose a mutation testing approach that leverages fuzz testing, which has proved effective with C and C++ software.
Our empirical evaluation shows that mutation testing based on fuzz testing kills a significantly higher proportion of live mutants than symbolic execution.
arXiv Detail & Related papers (2023-08-15T16:35:31Z)
- Sample-Then-Optimize Batch Neural Thompson Sampling [50.800944138278474]
We introduce two algorithms for black-box optimization based on the Thompson sampling (TS) policy.
To choose an input query, we only need to train an NN and then choose the query by maximizing the trained NN.
Our algorithms sidestep the need to invert the large parameter matrix yet still preserve the validity of the TS policy; a minimal sketch of this sample-then-optimize loop follows this entry.
arXiv Detail & Related papers (2022-10-13T09:01:58Z)
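A minimal sketch of that sample-then-optimize loop, under assumed simplifications: the objective, network size, candidate set, and training budget below are hypothetical, not the paper's batch algorithm. Each round fits a freshly initialized network to the observed data and treats the fit as a posterior sample to maximize.
```python
# Minimal sample-then-optimize Thompson sampling sketch (hypothetical setup).
import torch
import torch.nn as nn

def objective(x: torch.Tensor) -> torch.Tensor:
    """Hypothetical expensive black-box function (1-D for brevity)."""
    return torch.sin(3.0 * x).sum(dim=-1)

def fit_sample(X: torch.Tensor, y: torch.Tensor) -> nn.Module:
    """A freshly initialized NN fit to the data acts as a posterior sample."""
    net = nn.Sequential(nn.Linear(X.shape[1], 64), nn.Tanh(), nn.Linear(64, 1))
    opt = torch.optim.Adam(net.parameters(), lr=1e-2)
    for _ in range(200):
        opt.zero_grad()
        ((net(X).squeeze(-1) - y) ** 2).mean().backward()
        opt.step()
    return net

X = torch.rand(5, 1)                    # initial design on [0, 1]
y = objective(X)
for _ in range(20):                     # TS loop: sample, then optimize
    net = fit_sample(X, y)
    cand = torch.rand(1024, 1)          # random candidates over the domain
    x_next = cand[net(cand).squeeze(-1).argmax()].unsqueeze(0)
    X, y = torch.cat([X, x_next]), torch.cat([y, objective(x_next)])
print("best observed:", y.max().item())
```
The fresh random initialization each round supplies the randomness Thompson sampling needs, in place of maintaining and inverting an explicit posterior over network parameters.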
- Transformers Can Do Bayesian Inference [56.99390658880008]
We present Prior-Data Fitted Networks (PFNs).
PFNs leverage in-context learning, trained at large scale, to approximate a large set of posteriors.
We demonstrate that PFNs can near-perfectly mimic Gaussian processes and also enable efficient Bayesian inference for intractable problems.
arXiv Detail & Related papers (2021-12-20T13:07:39Z)
- RSG: A Simple but Effective Module for Learning Imbalanced Datasets [99.77194308426606]
We propose a new rare-class sample generator (RSG) to generate new samples for rare classes during training.
RSG is convenient to use and highly versatile, because it can be easily integrated into any kind of convolutional neural network.
We obtain competitive results on Imbalanced CIFAR, ImageNet-LT, and iNaturalist 2018 using RSG.
arXiv Detail & Related papers (2021-06-18T01:10:27Z)
- Alpha-Refine: Boosting Tracking Performance by Precise Bounding Box Estimation [87.53808756910452]
We propose a novel, flexible and accurate refinement module called Alpha-Refine.
It exploits a precise pixel-wise correlation layer together with a spatial-aware non-local layer to fuse features and can predict three complementary outputs: bounding box, corners and mask.
We apply the proposed Alpha-Refine module to five well-known, state-of-the-art base trackers: DiMP, ATOM, SiamRPN++, RTMDNet and ECO.
arXiv Detail & Related papers (2020-07-04T07:02:25Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences arising from its use.