HuntFUZZ: Enhancing Error Handling Testing through Clustering Based Fuzzing
- URL: http://arxiv.org/abs/2407.04297v1
- Date: Fri, 5 Jul 2024 06:58:30 GMT
- Title: HuntFUZZ: Enhancing Error Handling Testing through Clustering Based Fuzzing
- Authors: Jin Wei, Ping Chen, Jun Dai, Xiaoyan Sun, Zhihao Zhang, Chang Xu, Yi Wang
- Abstract summary: This paper introduces HuntFUZZ, a novel SFI-based fuzzing framework that addresses the issue of redundant testing of error points with correlated paths.
We evaluate HuntFUZZ on a diverse set of 42 applications, and HuntFUZZ successfully reveals 162 known bugs, with 62 of them being related to error handling.
- Score: 19.31537246674011
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Testing a program's capability to handle errors effectively is a significant challenge, given that program errors are relatively uncommon. To address this, Software Fault Injection (SFI)-based fuzzing integrates SFI with traditional fuzzing, injecting and triggering errors to test (error handling) code. However, we observe that current SFI-based fuzzing approaches overlook the correlation between the paths housing error points. In fact, the execution paths of error points often share common sub-paths; nonetheless, fuzzers usually generate test cases repeatedly to test error points on these commonly traversed paths, which compromises the fuzzer's efficiency. Thus, this paper introduces HuntFUZZ, a novel SFI-based fuzzing framework that addresses this redundant testing of error points with correlated paths. Specifically, HuntFUZZ clusters correlated error points and uses concolic execution to compute constraints only for the common paths within each cluster. By doing so, we provide the fuzzer with efficient test cases to explore related error points with minimal redundancy. We evaluate HuntFUZZ on a diverse set of 42 applications, and HuntFUZZ successfully reveals 162 known bugs, 62 of which are related to error handling. Additionally, due to its efficient error point detection method, HuntFUZZ discovers 7 unique zero-day bugs, all of which are missed by existing fuzzers. Furthermore, we compare HuntFUZZ with 4 existing fuzzing approaches: AFL, AFL++, AFLGo, and EH-FUZZ. Our evaluation confirms that HuntFUZZ covers a broader range of error points and finds bugs faster.
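To make the clustering idea above concrete, here is a minimal Python sketch that groups SFI error points whose execution paths share a common prefix, so that path constraints would be solved once per cluster rather than once per error point. The path representation, the greedy grouping, and the `min_shared` threshold are illustrative assumptions, not HuntFUZZ's actual algorithm.

```python
# Hedged sketch: cluster SFI error points whose execution paths share a
# common prefix of basic blocks, so constraints for the shared prefix are
# solved once per cluster. Paths, threshold, and greedy grouping are
# illustrative assumptions, not HuntFUZZ's actual algorithm.
def common_prefix(a, b):
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return a[:n]

def cluster_error_points(error_paths, min_shared=3):
    """Greedily group error points whose paths share >= min_shared blocks."""
    clusters = []
    for ep, path in error_paths.items():
        for c in clusters:
            shared = common_prefix(c["prefix"], path)
            if len(shared) >= min_shared:
                c["members"].append(ep)
                c["prefix"] = shared  # tighten to the common part
                break
        else:
            clusters.append({"prefix": list(path), "members": [ep]})
    return clusters

# Toy example: two error points share the prefix bb1 -> bb2 -> bb3.
paths = {
    "ep_malloc_fail": ["bb1", "bb2", "bb3", "bb7"],
    "ep_read_fail":   ["bb1", "bb2", "bb3", "bb9"],
    "ep_open_fail":   ["bb1", "bb5", "bb6"],
}
for c in cluster_error_points(paths):
    # Concolic execution would solve constraints for c["prefix"] once,
    # seeding the fuzzer for every member error point of the cluster.
    print(c["prefix"], "->", c["members"])
```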
Related papers
- Pipe-Cleaner: Flexible Fuzzing Using Security Policies [0.07499722271664144]
Pipe-Cleaner is a system for detecting and analyzing C code vulnerabilities.
It is based on flexible developer-designed security policies enforced by a tag-based runtime reference monitor.
We demonstrate the potential of this approach on several heap-related security vulnerabilities.
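For intuition about a tag-based runtime reference monitor, here is a toy Python sketch that shadows heap allocations with tags and raises on policy violations; the tag vocabulary and API are invented for illustration and are not Pipe-Cleaner's actual design.

```python
# Hedged sketch of a tag-based runtime reference monitor for heap policy
# checks; the tag vocabulary and API are invented for illustration and
# are not Pipe-Cleaner's actual design.
class TagPolicyError(Exception):
    pass

class ShadowHeap:
    """Attach a tag to each allocation; every operation runs a policy check."""
    def __init__(self):
        self.tags = {}  # address -> tag

    def malloc(self, addr):
        self.tags[addr] = "heap:live"

    def free(self, addr):
        if self.tags.get(addr) != "heap:live":
            raise TagPolicyError(f"invalid or double free at {addr:#x}")
        self.tags[addr] = "heap:freed"

    def load(self, addr):
        if self.tags.get(addr) != "heap:live":
            raise TagPolicyError(f"use-after-free or wild read at {addr:#x}")

heap = ShadowHeap()
heap.malloc(0x1000)
heap.free(0x1000)
try:
    heap.load(0x1000)  # policy violation: use-after-free
except TagPolicyError as e:
    print(e)
```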
arXiv Detail & Related papers (2024-10-31T23:35:22Z)
- Subtle Errors Matter: Preference Learning via Error-injected Self-editing [59.405145971637204]
We propose a novel preference learning framework called eRror-Injected Self-Editing (RISE)
RISE injects predefined subtle errors into partial tokens of correct solutions to construct hard pairs for error mitigation.
Experiments validate the effectiveness of RISE, with preference learning on Qwen2-7B-Instruct yielding notable improvements of 3.0% on GSM8K and 7.9% on MATH.
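A minimal sketch of the pair-construction step, assuming a hand-written table of subtle edits (RISE's actual predefined errors and token-level injection are more involved):

```python
# Hedged sketch: build a (chosen, rejected) preference pair by injecting a
# predefined subtle error into a correct solution. The error table below is
# an assumption; RISE's actual injections operate on partial tokens of
# model solutions.
import re

SUBTLE_ERRORS = [
    (r"\+", "-"),          # flip an arithmetic operator
    (r"(?<=\d)7\b", "1"),  # perturb a trailing digit
]

def inject_error(solution: str):
    for pattern, repl in SUBTLE_ERRORS:
        corrupted, n = re.subn(pattern, repl, solution, count=1)
        if n:
            return corrupted
    return None  # no injection site found

correct = "total = 15 + 27 = 42"
rejected = inject_error(correct)
if rejected:
    # Such hard pairs feed a DPO-style preference-learning objective.
    print({"chosen": correct, "rejected": rejected})
```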
arXiv Detail & Related papers (2024-10-09T07:43:38Z)
- Comment on Revisiting Neural Program Smoothing for Fuzzing [34.32355705821806]
MLFuzz, a work accepted at ACM FSE 2023, revisits the performance of a machine learning-based fuzzer, NEUZZ.
We demonstrate that its main conclusion is entirely wrong due to several fatal bugs in the implementation and wrong evaluation setups.
arXiv Detail & Related papers (2024-09-06T16:07:22Z)
- FuzzCoder: Byte-level Fuzzing Test via Large Language Model [46.18191648883695]
We propose to adopt fine-tuned large language models (FuzzCoder) to learn patterns in the input files from successful attacks.
FuzzCoder can predict mutation locations and strategies in input files to trigger abnormal behaviors of the program.
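As a rough sketch of model-guided byte-level mutation, the snippet below stands in a heuristic scoring function for the fine-tuned model; the position heuristic and strategy set are assumptions, not FuzzCoder's learned policy.

```python
# Hedged sketch: choose a (position, strategy) for byte-level mutation.
# The predictor here is a stub heuristic, not FuzzCoder's fine-tuned LLM.
import random

STRATEGIES = {
    "bitflip": lambda b: b ^ (1 << random.randrange(8)),
    "replace": lambda b: random.randrange(256),
    "add":     lambda b: (b + random.randrange(1, 16)) % 256,
}

def predict(data: bytes):
    # Stand-in for the model: prefer positions holding 0x00 or 0xFF
    # (an assumed heuristic) and pick a strategy at random.
    scores = [1 if b in (0x00, 0xFF) else 0 for b in data]
    pos = scores.index(max(scores))
    return pos, random.choice(list(STRATEGIES))

def mutate(data: bytes) -> bytes:
    pos, strategy = predict(data)
    out = bytearray(data)
    out[pos] = STRATEGIES[strategy](out[pos])
    return bytes(out)

seed = b"\x89PNG\r\n\x1a\n\x00\x00"
print(mutate(seed).hex())
```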
arXiv Detail & Related papers (2024-09-03T14:40:31Z)
- Leveraging Stack Traces for Spectrum-based Fault Localization in the Absence of Failing Tests [44.13331329339185]
We introduce a new approach, SBEST, that integrates stack trace data with test coverage to enhance fault localization.
Our approach shows a significant improvement, increasing Mean Average Precision (MAP) by 32.22% and Mean Reciprocal Rank (MRR) by 17.43% over traditional stack trace ranking methods.
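A hedged sketch of the general idea, mixing a coverage-based suspiciousness score with stack-trace position; the Ochiai formula, the treatment of the crash stack as failure evidence, and the weight `alpha` are illustrative assumptions rather than SBEST's published scoring.

```python
# Hedged sketch: rank methods by combining coverage-based suspiciousness
# (Ochiai) with position in the crash stack trace. The formula and the
# weighting are illustrative assumptions, not SBEST's actual scoring.
import math

def ochiai(failed_cov, passed_cov, total_failed):
    # failed_cov / passed_cov: failing / passing runs covering the method
    denom = math.sqrt(total_failed * (failed_cov + passed_cov))
    return failed_cov / denom if denom else 0.0

def combined_score(method, coverage, stack_trace, alpha=0.5):
    cov = ochiai(*coverage[method])
    # Methods nearer the top of the stack trace get a larger boost.
    boost = 1.0 / (1 + stack_trace.index(method)) if method in stack_trace else 0.0
    return alpha * cov + (1 - alpha) * boost

coverage = {"parse": (3, 1, 3), "render": (1, 5, 3), "main": (3, 6, 3)}
trace = ["parse", "main"]  # crash stack, innermost frame first
ranked = sorted(coverage, key=lambda m: combined_score(m, coverage, trace),
                reverse=True)
print(ranked)  # ['parse', 'main', 'render']
```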
arXiv Detail & Related papers (2024-05-01T15:15:52Z)
- Understanding and Mitigating Classification Errors Through Interpretable Token Patterns [58.91023283103762]
Characterizing errors in easily interpretable terms gives insight into whether a classifier is prone to making systematic errors.
We propose to discover those patterns of tokens that distinguish correct and erroneous predictions.
We show that our method, Premise, performs well in practice.
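The following toy scorer ranks single tokens by how strongly they separate erroneous from correct predictions; Premise itself mines richer token patterns, so treat this unigram version as a simplified illustration.

```python
# Hedged sketch: score single tokens by how strongly they separate
# misclassified from correctly classified examples. Premise mines richer
# token *patterns*; this unigram scorer is only a simplified illustration.
from collections import Counter

def token_error_scores(examples):
    """examples: list of (tokens, was_misclassified) pairs."""
    err, ok = Counter(), Counter()
    n_err = sum(1 for _, e in examples if e) or 1
    n_ok = sum(1 for _, e in examples if not e) or 1
    for tokens, is_err in examples:
        (err if is_err else ok).update(set(tokens))
    # Difference of per-class document frequencies as a separation score.
    return {t: err[t] / n_err - ok[t] / n_ok for t in err | ok}

data = [
    (["not", "good"], True),
    (["not", "bad"], True),
    (["very", "good"], False),
]
for token, score in sorted(token_error_scores(data).items(),
                           key=lambda kv: -kv[1]):
    print(f"{token}: {score:+.2f}")  # "not" surfaces as an error marker
```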
arXiv Detail & Related papers (2023-11-18T00:24:26Z)
- Fuzzing with Quantitative and Adaptive Hot-Bytes Identification [6.442499249981947]
American fuzzy lop (AFL), a leading fuzzing tool, has demonstrated its powerful bug-finding ability through a vast number of reported CVEs.
We propose an approach designed based on the following principles.
Our evaluation results on 10 real-world programs and the LAVA-M dataset show that our approach achieves sustained increases in branch coverage and discovers more bugs than other fuzzers.
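A toy Python model of hot-byte identification: estimate, for each input byte, how often mutating it changes the execution path. The stand-in `path_of` function and the sampling scheme are assumptions; the paper's quantitative, adaptive scheme is more involved.

```python
# Hedged toy model of hot-byte identification: per input byte, measure
# how often flipping it changes the execution path. `path_of` stands in
# for instrumented execution of the target program.
def path_of(data: bytes):
    # Fake path signature a real fuzzer would get from instrumentation.
    return (data[0] > 0x40, data[2] % 3)

def hot_byte_scores(seed: bytes, trials=8):
    base = path_of(seed)
    scores = []
    for i in range(len(seed)):
        changed = 0
        for v in range(0, 256, 256 // trials):
            mutated = bytearray(seed)
            mutated[i] = v
            changed += path_of(bytes(mutated)) != base
        scores.append(changed / trials)  # fraction of flips changing the path
    return scores

print(hot_byte_scores(b"FUZZ"))  # bytes 0 and 2 score high: they are "hot"
```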
arXiv Detail & Related papers (2023-07-05T13:41:35Z)
- Fast and Accurate Error Simulation for CNNs against Soft Errors [64.54260986994163]
We present a framework for the reliability analysis of Convolutional Neural Networks (CNNs) via an error simulation engine.
These error models are defined based on the corruption patterns of the output of the CNN operators induced by faults.
We show that our methodology achieves about 99% accuracy of the fault effects w.r.t. SASSIFI, and a speedup ranging from 44x up to 63x w.r.t. TensorFI, which only implements a limited set of error models.
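A minimal sketch of operator-output fault injection with NumPy, assuming a simple single-outlier corruption model; the paper derives its error models from real fault-injection campaigns, so this shows only the shape of the technique.

```python
# Hedged sketch of operator-output fault injection: corrupt one value of
# an activation tensor with a large outlier. The single-outlier error
# model is an assumption; the paper's error models come from real
# fault-injection campaigns.
import numpy as np

def inject_output_error(tensor, rng):
    corrupted = tensor.copy()
    idx = tuple(rng.integers(0, s) for s in tensor.shape)  # random position
    corrupted[idx] = rng.normal(scale=10 * np.abs(tensor).max())
    return corrupted

rng = np.random.default_rng(0)
activations = rng.normal(size=(1, 4, 8, 8)).astype(np.float32)  # fake CNN output
faulty = inject_output_error(activations, rng)
print("values changed:", int((faulty != activations).sum()))  # -> 1
```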
arXiv Detail & Related papers (2022-06-04T19:45:02Z)
- TACRED Revisited: A Thorough Evaluation of the TACRED Relation Extraction Task [80.38130122127882]
TACRED is one of the largest and most widely used crowdsourced datasets for Relation Extraction (RE).
In this paper, we investigate the questions: Have we reached a performance ceiling or is there still room for improvement?
We find that label errors account for 8% absolute F1 test error, and that more than 50% of the examples need to be relabeled.
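To illustrate how label noise shows up in a score, this toy snippet re-scores fixed predictions against original versus corrected labels with a hand-rolled micro-F1 (toy labels, not TACRED data):

```python
# Hedged sketch: re-score fixed predictions against original vs. corrected
# labels to see how much measured F1 error is attributable to label noise.
def micro_f1(gold, pred, no_rel="no_relation"):
    tp = sum(g == p != no_rel for g, p in zip(gold, pred))
    pred_pos = sum(p != no_rel for p in pred)
    gold_pos = sum(g != no_rel for g in gold)
    prec = tp / pred_pos if pred_pos else 0.0
    rec = tp / gold_pos if gold_pos else 0.0
    return 2 * prec * rec / (prec + rec) if prec + rec else 0.0

pred      = ["founded", "no_relation", "ceo_of", "born_in"]
original  = ["founded", "ceo_of", "no_relation", "born_in"]
corrected = ["founded", "no_relation", "ceo_of", "born_in"]  # two labels fixed
print(micro_f1(original, pred), "->", micro_f1(corrected, pred))  # ~0.67 -> 1.0
```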
arXiv Detail & Related papers (2020-04-30T15:07:37Z)