Make out like a (Multi-Armed) Bandit: Improving the Odds of Fuzzer Seed Scheduling with T-Scheduler
- URL: http://arxiv.org/abs/2312.04749v1
- Date: Thu, 7 Dec 2023 23:27:55 GMT
- Title: Make out like a (Multi-Armed) Bandit: Improving the Odds of Fuzzer Seed Scheduling with T-Scheduler
- Authors: Simon Luo, Adrian Herrera, Paul Quirk, Michael Chase, Damith C. Ranasinghe, Salil S. Kanhere,
- Abstract summary: Fuzzing is a highly-scalable software testing technique that uncovers bugs in a target program by executing it with mutated inputs.
We propose T-Scheduler, a seed scheduler built on multi-armed bandit theory.
We evaluate T-Scheduler over 35 CPU-yr of fuzzing, comparing it to 11 state-of-the-art schedulers.
- Score: 8.447499888458633
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Fuzzing is a highly-scalable software testing technique that uncovers bugs in a target program by executing it with mutated inputs. Over the life of a fuzzing campaign, the fuzzer accumulates inputs inducing new and interesting target behaviors, drawing from these inputs for further mutation. This rapidly results in a large number of inputs to select from, making it challenging to quickly and accurately select the "most promising" input for mutation. Reinforcement learning (RL) provides a natural solution to this "seed scheduling" problem: the fuzzer dynamically adapts its selection strategy by learning from past results. However, existing RL approaches are (a) computationally expensive (reducing fuzzer throughput) and/or (b) require hyperparameter tuning (reducing generality across targets and input types). To this end, we propose T-Scheduler, a seed scheduler built on multi-armed bandit theory that automatically adapts to the target without any hyperparameter tuning. We evaluate T-Scheduler over 35 CPU-yr of fuzzing, comparing it to 11 state-of-the-art schedulers. Our results show that T-Scheduler improves on these 11 schedulers on both bug-finding and coverage-expansion abilities.
Related papers
- $\mathbb{USCD}$: Improving Code Generation of LLMs by Uncertainty-Aware Selective Contrastive Decoding [64.00025564372095]
Large language models (LLMs) have shown remarkable capabilities in code generation.
The effects of hallucinations (e.g., output noise) make it challenging for LLMs to generate high-quality code in one pass.
We propose a simple and effective textbfuncertainty-aware textbfselective textbfcontrastive textbfdecoding.
arXiv Detail & Related papers (2024-09-09T02:07:41Z) - FuzzCoder: Byte-level Fuzzing Test via Large Language Model [46.18191648883695]
We propose to adopt fine-tuned large language models (FuzzCoder) to learn patterns in the input files from successful attacks.
FuzzCoder can predict mutation locations and strategies locations in input files to trigger abnormal behaviors of the program.
arXiv Detail & Related papers (2024-09-03T14:40:31Z) - Fuzzing at Scale: The Untold Story of the Scheduler [0.48342038441006807]
We show that a well-designed strategy that determines which programs to fuzz and for how long can greatly impact the number of bugs found across the programs.
We develop several schedulers and leverage the most sophisticated one to fuzz simultaneously our newly compiled benchmark of around 5,000 Ubuntu programs, and detect 4908 bugs.
arXiv Detail & Related papers (2024-06-26T04:28:02Z) - FOX: Coverage-guided Fuzzing as Online Stochastic Control [13.3158115776899]
Fuzzing is an effective technique for discovering software vulnerabilities by generating random test inputs executing them against the target program.
This paper addresses the limitations of existing coverage-guided fuzzers, focusing on the scheduler and mutator components.
We present FOX, a proof-of-concept implementation of our control-theoretic approach, and compare it to industry-standard fuzzers.
arXiv Detail & Related papers (2024-06-06T21:21:05Z) - FABind: Fast and Accurate Protein-Ligand Binding [127.7790493202716]
$mathbfFABind$ is an end-to-end model that combines pocket prediction and docking to achieve accurate and fast protein-ligand binding.
Our proposed model demonstrates strong advantages in terms of effectiveness and efficiency compared to existing methods.
arXiv Detail & Related papers (2023-10-10T16:39:47Z) - Fuzzing with Quantitative and Adaptive Hot-Bytes Identification [6.442499249981947]
American fuzzy lop, a leading fuzzing tool, has demonstrated its powerful bug finding ability through a vast number of reported CVEs.
We propose an approach called toolwhich is designed based on the following principles.
Our evaluation results on 10 real-world programs and LAVA-M dataset show that toolachieves sustained increases in branch coverage and discovers more bugs than other fuzzers.
arXiv Detail & Related papers (2023-07-05T13:41:35Z) - Decoder Tuning: Efficient Language Understanding as Decoding [84.68266271483022]
We present Decoder Tuning (DecT), which in contrast optimize task-specific decoder networks on the output side.
By gradient-based optimization, DecT can be trained within several seconds and requires only one P query per sample.
We conduct extensive natural language understanding experiments and show that DecT significantly outperforms state-of-the-art algorithms with a $200times$ speed-up.
arXiv Detail & Related papers (2022-12-16T11:15:39Z) - NAPG: Non-Autoregressive Program Generation for Hybrid Tabular-Textual
Question Answering [52.10214317661547]
Current numerical reasoning methods autoregressively decode program sequences.
The accuracy of program generation drops sharply as the decoding steps unfold due to error propagation.
In this paper, we propose a non-autoregressive program generation framework.
arXiv Detail & Related papers (2022-11-07T11:25:21Z) - Random Feature Attention [69.4671822971207]
We propose RFA, a linear time and space attention that uses random feature methods to approximate the softmax function.
RFA can be used as a drop-in replacement for conventional softmax attention and offers a straightforward way of learning with recency bias through an optional gating mechanism.
Experiments on language modeling and machine translation demonstrate that RFA achieves similar or better performance compared to strong transformer baselines.
arXiv Detail & Related papers (2021-03-03T02:48:56Z) - MEUZZ: Smart Seed Scheduling for Hybrid Fuzzing [21.318110758739675]
Machine learning-Enhanced hybrid fUZZing system (MEUZZ)
MEUZZ determines which new seeds are expected to produce better fuzzing yields based on the knowledge learned from past seed scheduling decisions.
Results: MEUZZ significantly outperforms the state-of-the-art grey-box and hybrid fuzzers.
arXiv Detail & Related papers (2020-02-20T05:02:25Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.