Fuzzing the PHP Interpreter via Dataflow Fusion
- URL: http://arxiv.org/abs/2410.21713v1
- Date: Tue, 29 Oct 2024 03:54:59 GMT
- Title: Fuzzing the PHP Interpreter via Dataflow Fusion
- Authors: Yuancheng Jiang, Chuqi Zhang, Bonan Ruan, Jiahao Liu, Manuel Rigger, Roland Yap, Zhenkai Liang
- Abstract summary: This paper introduces FlowFusion, the first automatic fuzzing framework specifically designed to detect memory errors in the PHP interpreter.
In our evaluation, FlowFusion identified 56 unknown memory errors in the PHP interpreter, with 38 fixed and 4 confirmed.
FlowFusion outperformed state-of-the-art fuzzers AFL++ and Polyglot, covering 24% more lines of code after 24 hours of fuzzing.
- Abstract: PHP, a dominant scripting language in web development, powers a vast range of websites, from personal blogs to major platforms. While existing research primarily focuses on PHP application-level security issues like code injection, memory errors within the PHP interpreter have been largely overlooked. These memory errors, prevalent due to the PHP interpreter's extensive C codebase, pose significant risks to the confidentiality, integrity, and availability of PHP servers. This paper introduces FlowFusion, the first automatic fuzzing framework specifically designed to detect memory errors in the PHP interpreter. FlowFusion leverages dataflow as an efficient representation of test cases maintained by PHP developers, merging two or more test cases to produce fused test cases with more complex code semantics. Moreover, FlowFusion employs strategies such as test mutation, interface fuzzing, and environment crossover to further facilitate memory error detection. In our evaluation, FlowFusion identified 56 unknown memory errors in the PHP interpreter, with 38 fixed and 4 confirmed. We compared FlowFusion against the official test suite and a naive test concatenation approach, demonstrating that FlowFusion can detect new bugs that these methods miss, while also achieving greater code coverage. Furthermore, FlowFusion outperformed state-of-the-art fuzzers AFL++ and Polyglot, covering 24% more lines of code after 24 hours of fuzzing under identical execution environments. FlowFusion has been acknowledged by PHP developers, and we believe our approach offers a practical tool for enhancing the security of the PHP interpreter.
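To make the difference between naive test concatenation and dataflow-based fusion more concrete, here is a minimal sketch in Python. It is not FlowFusion's implementation: the helper names (`strip_tags`, `variables`, `concatenate`, `fuse`), the regex-based variable extraction, and the single "last variable of A flows into first variable of B" link are all assumptions made for illustration, and the sketch assumes each PHP test keeps one statement per line.

```python
import re

# Hypothetical, simplified variable matcher; a real tool would parse PHP properly.
VAR_RE = re.compile(r"\$[A-Za-z_][A-Za-z0-9_]*")


def strip_tags(src):
    """Drop the opening/closing PHP tags so two test bodies can be combined."""
    body = src.strip()
    if body.startswith("<?php"):
        body = body[len("<?php"):]
    if body.endswith("?>"):
        body = body[:-2]
    return body.strip()


def variables(body):
    """Return variable names in order of first appearance (skipping $this)."""
    seen = []
    for name in VAR_RE.findall(body):
        if name != "$this" and name not in seen:
            seen.append(name)
    return seen


def concatenate(test_a, test_b):
    """Naive baseline: run test B after test A with no data shared between them."""
    return "<?php\n" + strip_tags(test_a) + "\n" + strip_tags(test_b) + "\n"


def fuse(test_a, test_b):
    """Toy fusion: connect the two tests with one dataflow link.

    The last variable introduced by test A is assigned to the first variable
    of test B, right after B first mentions it, so B's later statements
    operate on a value that A produced.
    """
    body_a, body_b = strip_tags(test_a), strip_tags(test_b)
    vars_a, vars_b = variables(body_a), variables(body_b)
    if not vars_a or not vars_b:
        return concatenate(test_a, test_b)  # nothing to connect
    source, sink = vars_a[-1], vars_b[0]
    lines_b = body_b.splitlines()
    for i, line in enumerate(lines_b):
        if sink in line:
            lines_b.insert(i + 1, f"{sink} = {source};  // fusion link")
            break
    return "<?php\n" + body_a + "\n" + "\n".join(lines_b) + "\n"


# Two made-up PHP test cases used only to exercise the sketch.
TEST_A = "<?php\n$a = str_repeat('x', 3);\n$b = strrev($a);\n?>"
TEST_B = "<?php\n$s = 'abc';\necho strlen($s);\n?>"

fused = fuse(TEST_A, TEST_B)
plain = concatenate(TEST_A, TEST_B)
print(fused)
```

In the concatenated script the two tests never exchange data, whereas the fused script routes a value computed by the first test into the second one. That is the intuition behind producing "more complex code semantics"; the framework described in the abstract additionally applies test mutation, interface fuzzing, and environment crossover, none of which are modeled here.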
Related papers
- Pipe-Cleaner: Flexible Fuzzing Using Security Policies [0.07499722271664144]
Pipe-Cleaner is a system for detecting and analyzing C code vulnerabilities.
It is based on flexible developer-designed security policies enforced by a tag-based runtime reference monitor.
We demonstrate the potential of this approach on several heap-related security vulnerabilities.
arXiv Detail & Related papers (2024-10-31T23:35:22Z) - Yama: Precise Opcode-based Data Flow Analysis for Detecting PHP Applications Vulnerabilities [4.262259005587605]
Yama is a context-sensitive and path-sensitive interprocedural data flow analysis method for PHP.
We have found that the precise semantics and clear control flow of PHP opcodes enable data flow analysis to be more precise and efficient.
We evaluated Yama from three dimensions: basic data flow analysis capabilities, complex semantic analysis capabilities, and the ability to discover vulnerabilities in real-world applications.
arXiv Detail & Related papers (2024-10-16T08:14:37Z) - CRUXEval-X: A Benchmark for Multilingual Code Reasoning, Understanding and Execution [50.7413285637879]
The CRUXEval-X code reasoning benchmark contains 19 programming languages.
It comprises at least 600 subjects for each language, along with 19K content-consistent tests in total.
Even a model trained solely on Python can achieve at most 34.4% Pass@1 in other languages.
arXiv Detail & Related papers (2024-08-23T11:43:00Z) - KGym: A Platform and Dataset to Benchmark Large Language Models on Linux Kernel Crash Resolution [59.20933707301566]
Large Language Models (LLMs) are consistently improving at increasingly realistic software engineering (SE) tasks.
In real-world software stacks, significant SE effort is spent developing foundational system software like the Linux kernel.
To evaluate if ML models are useful while developing such large-scale systems-level software, we introduce kGym and kBench.
arXiv Detail & Related papers (2024-07-02T21:44:22Z) - What All the PHUZZ Is About: A Coverage-guided Fuzzer for Finding Vulnerabilities in PHP Web Applications [5.169724825219126]
We introduce PHUZZ, a modular fuzzing framework for PHP web applications.
PHUZZ uses novel approaches to detect more client-side and server-side vulnerability classes than state-of-the-art related work.
We fuzz over 1,000 API endpoints of the 115 most popular WordPress plugins, resulting in over 20 security issues and 2 new CVE-IDs.
arXiv Detail & Related papers (2024-06-10T13:43:07Z) - FV8: A Forced Execution JavaScript Engine for Detecting Evasive Techniques [53.288368877654705]
FV8 is a modified V8 JavaScript engine designed to identify evasion techniques in JavaScript code.
It selectively enforces code execution on APIs that conditionally inject dynamic code.
It identifies 1,443 npm packages and 164 (82%) extensions containing at least one type of evasion.
arXiv Detail & Related papers (2024-05-21T19:54:19Z) - BUGSPHP: A dataset for Automated Program Repair in PHP [2.236957801565796]
This paper presents a benchmark dataset of bugs on real-world applications called BUGSPHP.
The training dataset includes more than 600,000 bug-fixing commits.
The test dataset contains 513 manually validated bug-fixing commits equipped with developer-provided test cases.
arXiv Detail & Related papers (2024-01-14T19:41:46Z) - WhiteFox: White-Box Compiler Fuzzing Empowered by Large Language Models [11.33856613057612]
We propose WhiteFox, the first white-box compiler fuzzer using Large Language Models with source-code information.
WhiteFox can generate high-quality test programs to exercise deep optimizations, exercising up to 8X more optimizations than state-of-the-art fuzzers.
WhiteFox has found 101 bugs for the DL compilers, with 92 confirmed as previously unknown and 70 fixed.
arXiv Detail & Related papers (2023-10-24T16:39:06Z) - InterCode: Standardizing and Benchmarking Interactive Coding with Execution Feedback [50.725076393314964]
We introduce InterCode, a lightweight, flexible, and easy-to-use framework of interactive coding as a standard reinforcement learning environment.
Our framework is language and platform agnostic and uses self-contained Docker environments to provide safe and reproducible execution.
We demonstrate InterCode's viability as a testbed by evaluating multiple state-of-the-art LLMs configured with different prompting strategies.
arXiv Detail & Related papers (2023-06-26T17:59:50Z) - A Static Evaluation of Code Completion by Large Language Models [65.18008807383816]
Execution-based benchmarks have been proposed to evaluate functional correctness of model-generated code on simple programming problems.
Static analysis tools such as linters, which can detect errors without running the program, haven't been well explored for evaluating code generation models.
We propose a static evaluation framework to quantify static errors in Python code completions, by leveraging Abstract Syntax Trees.
arXiv Detail & Related papers (2023-06-05T19:23:34Z) - VELVET: a noVel Ensemble Learning approach to automatically locate VulnErable sTatements [62.93814803258067]
This paper presents VELVET, a novel ensemble learning approach to locate vulnerable statements in source code.
Our model combines graph-based and sequence-based neural networks to successfully capture the local and global context of a program graph.
VELVET achieves 99.6% and 43.6% top-1 accuracy over synthetic data and real-world data, respectively.
arXiv Detail & Related papers (2021-12-20T22:45:27Z)