Configuring Test Generators using Bug Reports: A Case Study of GCC
Compiler and Csmith
- URL: http://arxiv.org/abs/2012.10662v2
- Date: Thu, 18 Mar 2021 12:36:38 GMT
- Title: Configuring Test Generators using Bug Reports: A Case Study of GCC
Compiler and Csmith
- Authors: Md Rafiqul Islam Rabin and Mohammad Amin Alipour
- Abstract summary: This paper uses code snippets in bug reports to guide test generation.
We evaluate this approach on eight versions of GCC.
We find that our approach provides higher coverage and triggers more miscompilation failures than the state-of-the-art test generation techniques for GCC.
- Score: 2.1016374925364616
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The correctness of compilers is instrumental in the safety and reliability of
other software systems, as bugs in compilers can produce executables that do
not reflect the intent of programmers. Such errors are difficult to identify
and debug. Random test program generators are commonly used in testing
compilers, and they have been effective in uncovering bugs. However, the
problem of guiding these test generators to produce test programs that are more
likely to find bugs remains challenging. In this paper, we use the code
snippets in the bug reports to guide the test generation. The main idea of this
work is to extract insights from the bug reports about the language features
that are more prone to inadequate implementation and to use these insights to
guide the test generators. We use the GCC C compiler to evaluate the
effectiveness of this approach. In particular, we first cluster the test
programs in the GCC bugs reports based on their features. We then use the
centroids of the clusters to compute configurations for Csmith, a popular test
generator for C compilers. We evaluated this approach on eight versions of GCC
and found that our approach provides higher coverage and triggers more
miscompilation failures than the state-of-the-art test generation techniques
for GCC.
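The clustering-to-configuration step can be sketched as follows. This is a minimal illustration, not the paper's implementation: the feature names and the normalization into a probability table are assumptions, and the paper emits actual Csmith configuration files rather than Python dictionaries.

```python
# Minimal sketch, assuming feature vectors have already been extracted from
# the test programs attached to GCC bug reports. Feature names and the
# probability mapping below are illustrative, not Csmith's real config format.
import numpy as np
from sklearn.cluster import KMeans

def centroid_configs(features: np.ndarray, names: list[str], k: int = 4):
    """Cluster bug-report programs, then turn each cluster centroid into a
    per-feature probability table for a (hypothetical) Csmith config."""
    km = KMeans(n_clusters=k, n_init=10, random_state=0).fit(features)
    configs = []
    for centroid in km.cluster_centers_:
        probs = centroid / centroid.sum()  # normalize weights to probabilities
        configs.append(dict(zip(names, probs)))
    return configs

# Rows are bug-report programs; columns count occurrences of language
# features (pointers, arrays, loops) in their code snippets.
X = np.array([[12, 3, 5], [10, 2, 6], [1, 9, 0], [0, 8, 1]], dtype=float)
for cfg in centroid_configs(X, ["pointer", "array", "loop"], k=2):
    print(cfg)
```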
Related papers
- Evolutionary Generative Fuzzing for Differential Testing of the Kotlin Compiler [14.259471945857431]
We investigate the effectiveness of differential testing in finding bugs within the Kotlin compilers developed at JetBrains.
We propose a black-box generative approach that creates input programs for the K1 and K2 compilers.
Our case study shows that the proposed approach effectively detects bugs in K1 and K2; these bugs were confirmed by JetBrains developers, and some have been fixed.
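One differential-testing iteration can be sketched as below. The compiler binary names kotlinc-k1 and kotlinc-k2 are placeholders for however the two frontends are invoked, and the program generator itself is omitted.

```python
# Hedged sketch: compile one generated program with two compiler versions and
# flag a candidate bug when their observable results diverge. The binary
# names below are placeholders, not JetBrains' actual tool names.
import subprocess

def diverges(source_file: str) -> bool:
    results = []
    for compiler in ("kotlinc-k1", "kotlinc-k2"):
        proc = subprocess.run(
            [compiler, source_file, "-include-runtime", "-d", f"{compiler}.jar"],
            capture_output=True, text=True,
        )
        # Crashes, differing exit codes, or differing diagnostics are all
        # divergence signals in differential testing.
        results.append((proc.returncode, proc.stderr))
    return results[0] != results[1]
```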
arXiv Detail & Related papers (2024-01-12T16:01:12Z)
- Weak Memory Demands Model-based Compiler Testing [0.0]
A compiler bug arises if the behaviour of a compiled concurrent program, as allowed by its architecture memory model, is not a behaviour permitted by the source program under its source model.
We observe that processor implementations are increasingly exploiting the behaviour of relaxed architecture models.
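This bug condition reduces to a containment check over behaviour sets; a toy illustration follows, with behaviours abstracted as sets of final register states.

```python
# Toy formalization of the bug condition above: a miscompilation exists when
# the compiled program exhibits a behaviour, allowed by the architecture
# memory model, that the source program does not permit under its source model.
def is_miscompilation(source_behaviours: set, compiled_behaviours: set) -> bool:
    return not compiled_behaviours <= source_behaviours  # subset check

# Store-buffering style example: the source model forbids r1 == r2 == 0, but
# a relaxed architecture model lets the compiled binary observe it.
src = {("r1=1", "r2=0"), ("r1=0", "r2=1"), ("r1=1", "r2=1")}
cc = src | {("r1=0", "r2=0")}
assert is_miscompilation(src, cc)
```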
arXiv Detail & Related papers (2024-01-12T15:50:32Z)
- DebugBench: Evaluating Debugging Capability of Large Language Models [80.73121177868357]
DebugBench is a benchmark for evaluating the debugging capability of Large Language Models (LLMs).
It covers four major bug categories and 18 minor types in C++, Java, and Python.
We evaluate two commercial and four open-source models in a zero-shot scenario.
arXiv Detail & Related papers (2024-01-09T15:46:38Z)
- Compiler Testing With Relaxed Memory Models [0.0]
We present the Téléchat compiler testing tool for concurrent programs.
Téléchat compiles a concurrent C/C++ program and compares source and compiled program behaviours.
arXiv Detail & Related papers (2023-10-18T21:24:26Z)
- Dcc --help: Generating Context-Aware Compiler Error Explanations with Large Language Models [53.04357141450459]
dcc --help was deployed to our CS1 and CS2 courses, with 2,565 students using the tool over 64,000 times in ten weeks.
We found that the LLM-generated explanations were conceptually accurate in 90% of compile-time and 75% of run-time cases, but often disregarded the instruction not to provide solutions in code.
arXiv Detail & Related papers (2023-08-23T02:36:19Z)
- Directed Test Program Generation for JIT Compiler Bug Localization [3.626013617212667]
Bug localization techniques for Just-in-Time (JIT) compilers are based on analyzing the execution behaviors of the target JIT compiler on a set of test programs generated for this purpose.
This paper proposes a novel technique for automatic test program generation for JIT compiler bug localization.
arXiv Detail & Related papers (2023-07-17T22:43:02Z)
- A Survey of Modern Compiler Fuzzing [0.0]
This survey summarizes the research efforts toward understanding and addressing compiler defects.
It covers researchers' investigations of compiler bugs, such as their symptoms and root causes.
In addition, it covers researchers' efforts in designing fuzzing techniques, including constructing test programs and designing test oracles.
arXiv Detail & Related papers (2023-06-12T06:03:51Z)
- A Static Evaluation of Code Completion by Large Language Models [65.18008807383816]
Execution-based benchmarks have been proposed to evaluate functional correctness of model-generated code on simple programming problems.
Static analysis tools such as linters, which can detect errors without running the program, have not been well explored for evaluating code generation models.
We propose a static evaluation framework to quantify static errors in Python code completions by leveraging Abstract Syntax Trees.
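A minimal version of such a check uses only Python's own parser; the real framework also classifies finer-grained error types such as undefined names.

```python
# Minimal sketch: count completions whose AST fails to build. This only
# catches syntax errors; the paper's framework goes further and classifies
# semantic issues as well.
import ast

def parses(code: str) -> bool:
    try:
        ast.parse(code)
        return True
    except SyntaxError:
        return False

def static_error_rate(completions: list[str]) -> float:
    errors = sum(1 for code in completions if not parses(code))
    return errors / len(completions)

print(static_error_rate(["x = 1\n", "def f(:\n    pass\n"]))  # 0.5
```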
arXiv Detail & Related papers (2023-06-05T19:23:34Z)
- Teaching Large Language Models to Self-Debug [62.424077000154945]
Large language models (LLMs) have achieved impressive performance on code generation.
We propose Self-Debugging, which teaches a large language model to debug its predicted program via few-shot demonstrations.
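The feedback loop can be sketched as below; generate (an LLM call) and tests are hypothetical stand-ins, and the actual method drives revision with few-shot prompts rather than this bare retry.

```python
# Hedged sketch of a self-debugging loop: run the model's program, feed any
# failure back into the prompt, and ask for a revision. `generate` and
# `tests` are hypothetical placeholders, not the paper's actual interface.
def self_debug(generate, tests, prompt: str, max_rounds: int = 3) -> str:
    code = generate(prompt)
    for _ in range(max_rounds):
        try:
            namespace = {}
            exec(code, namespace)  # execute the candidate program
            tests(namespace)       # raises AssertionError on failure
            return code            # all tests passed
        except Exception as err:
            prompt += f"\nPrevious attempt failed with: {err!r}\nFix this code:\n{code}"
            code = generate(prompt)
    return code
```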
arXiv Detail & Related papers (2023-04-11T10:43:43Z)
- Using Developer Discussions to Guide Fixing Bugs in Software [51.00904399653609]
We propose using bug report discussions, which are available before the task is performed and are also naturally occurring, avoiding the need for additional information from developers.
We demonstrate that various forms of natural language context derived from such discussions can aid bug-fixing, even leading to improved performance over using commit messages corresponding to the oracle bug-fixing commits.
arXiv Detail & Related papers (2022-11-11T16:37:33Z)
- Fault-Aware Neural Code Rankers [64.41888054066861]
We propose fault-aware neural code rankers that can predict the correctness of a sampled program without executing it.
Our fault-aware rankers can significantly increase the pass@1 accuracy of various code generation models.
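Selection with such a ranker reduces to scoring candidates without running them; a trivial sketch, where score_correctness stands in for the learned ranker:

```python
# Minimal sketch: pick the sampled program the (hypothetical) learned ranker
# deems most likely correct, without executing any candidate.
def rank_and_pick(samples: list[str], score_correctness) -> str:
    return max(samples, key=score_correctness)
```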
arXiv Detail & Related papers (2022-06-04T22:01:05Z)