A Survey of Modern Compiler Fuzzing
- URL: http://arxiv.org/abs/2306.06884v3
- Date: Fri, 16 Jun 2023 08:09:41 GMT
- Title: A Survey of Modern Compiler Fuzzing
- Authors: Haoyang Ma
- Abstract summary: This survey provides a summary of the research efforts for understanding and addressing compiler defects.
It covers researchers' investigation and expertise on compiler bugs, such as their symptoms and root causes.
In addition, it covers researchers' efforts in designing fuzzing techniques, including constructing test programs and designing test oracles.
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Most software that runs on computers undergoes processing by compilers. Since
compilers constitute the fundamental infrastructure of software development,
their correctness is paramount. Over the years, researchers have invested in
analyzing, understanding, and characterizing the bug features of mainstream
compilers. These studies have demonstrated that compiler correctness requires
greater research attention, and they also pave the way for compiler fuzzing. To
improve compiler correctness, researchers have proposed numerous compiler
fuzzing techniques. These techniques were initially developed for testing
traditional compilers such as GCC/LLVM and have since been generalized to test
various newly developed, domain-specific compilers, such as graphics shader
compilers and deep learning (DL) compilers. In this survey, we provide a
comprehensive summary of the research efforts for understanding and addressing
compiler defects. Specifically, this survey mainly covers two aspects. First,
it covers researchers' investigation and expertise on compiler bugs, such as
their symptoms and root causes. The compiler bug studies cover GCC/LLVM, JVM
compilers, and DL compilers. In addition, it covers researchers' efforts in
designing fuzzing techniques, including constructing test programs and
designing test oracles. Besides discussing the existing work, this survey
outlines several open challenges and highlights research opportunities.
Related papers
- Towards Understanding the Bugs in Solidity Compiler [11.193701473232851]
This paper presents the first systematic study on 533 Solidity compiler bugs.
We examine their characteristics (including symptoms, root causes, and distribution) and their triggering test cases.
To study the limitations of Solidity compiler fuzzers, we evaluate three Solidity compiler fuzzers.
arXiv Detail & Related papers (2024-07-08T14:22:50Z)
- KGym: A Platform and Dataset to Benchmark Large Language Models on Linux Kernel Crash Resolution [59.20933707301566]
Large Language Models (LLMs) are consistently improving at increasingly realistic software engineering (SE) tasks.
In real-world software stacks, significant SE effort is spent developing foundational system software like the Linux kernel.
To evaluate if ML models are useful while developing such large-scale systems-level software, we introduce kGym and kBench.
arXiv Detail & Related papers (2024-07-02T21:44:22Z)
- DevBench: A Comprehensive Benchmark for Software Development [72.24266814625685]
DevBench is a benchmark that evaluates large language models (LLMs) across various stages of the software development lifecycle.
Empirical studies show that current LLMs, including GPT-4-Turbo, fail to solve the challenges presented within DevBench.
Our findings offer actionable insights for the future development of LLMs toward real-world programming applications.
arXiv Detail & Related papers (2024-03-13T15:13:44Z)
- Weak Memory Demands Model-based Compiler Testing [0.0]
A compiler bug arises if the behaviour of a compiled concurrent program, as allowed by its architecture memory model, is not a behaviour permitted by the source program under its source model.
We observe that processor implementations are increasingly exploiting the behaviour of relaxed architecture models.
arXiv Detail & Related papers (2024-01-12T15:50:32Z)
- WhiteFox: White-Box Compiler Fuzzing Empowered by Large Language Models [11.33856613057612]
We propose WhiteFox, the first white-box compiler fuzzer using Large Language Models with source-code information.
WhiteFox can generate high-quality test programs to exercise deep optimizations, practicing up to 8X more optimizations than state-of-the-art fuzzers.
WhiteFox has found 101 bugs for the DL compilers, with 92 confirmed as previously unknown and 70 fixed.
arXiv Detail & Related papers (2023-10-24T16:39:06Z)
- Guess & Sketch: Language Model Guided Transpilation [59.02147255276078]
Learned transpilation offers an alternative to manual re-writing and engineering efforts.
Probabilistic neural language models (LMs) produce plausible outputs for every input, but do so at the cost of guaranteed correctness.
Guess & Sketch extracts alignment and confidence information from features of the LM then passes it to a symbolic solver to resolve semantic equivalence.
arXiv Detail & Related papers (2023-09-25T15:42:18Z)
- Dcc --help: Generating Context-Aware Compiler Error Explanations with Large Language Models [53.04357141450459]
dcc --help was deployed to our CS1 and CS2 courses, with 2,565 students using the tool over 64,000 times in ten weeks.
We found that the LLM-generated explanations were conceptually accurate in 90% of compile-time and 75% of run-time cases, but often disregarded the instruction not to provide solutions in code.
arXiv Detail & Related papers (2023-08-23T02:36:19Z)
- HDCC: A Hyperdimensional Computing compiler for classification on embedded systems and high-performance computing [58.720142291102135]
This work introduces the HDCC compiler, the first open-source compiler that translates high-level descriptions of HDC classification methods into optimized C code.
HDCC is designed like a modern compiler, featuring an intuitive and descriptive input language, an intermediate representation (IR), and a retargetable backend.
To substantiate these claims, we conducted experiments with HDCC on several of the most popular datasets in the HDC literature.
arXiv Detail & Related papers (2023-04-24T19:16:03Z)
- CompilerGym: Robust, Performant Compiler Optimization Environments for AI Research [26.06438868492976]
Interest in applying Artificial Intelligence (AI) techniques to compiler optimizations is increasing rapidly.
But compiler research has a high entry barrier.
We introduce CompilerGym, a set of environments for real world compiler optimization tasks.
We also introduce a toolkit for exposing new optimization tasks to compiler researchers.
arXiv Detail & Related papers (2021-09-17T01:02:27Z)
- Configuring Test Generators using Bug Reports: A Case Study of GCC Compiler and Csmith [2.1016374925364616]
This paper uses the code snippets in the bug reports to guide the test generation.
We evaluate this approach on eight versions of GCC.
We find that our approach provides higher coverage and triggers more miscompilation failures than the state-of-the-art test generation techniques for GCC.
arXiv Detail & Related papers (2020-12-19T11:25:13Z)
- Extending C++ for Heterogeneous Quantum-Classical Computing [56.782064931823015]
qcor is a language extension to C++ and compiler implementation that enables heterogeneous quantum-classical programming, compilation, and execution in a single-source context.
Our work provides a first-of-its-kind C++ compiler enabling high-level quantum kernel (function) expression in a quantum-language manner.
arXiv Detail & Related papers (2020-10-08T12:49:07Z)