PTE: Axiomatic Semantics based Compiler Testing
- URL: http://arxiv.org/abs/2401.01036v1
- Date: Tue, 2 Jan 2024 04:50:47 GMT
- Title: PTE: Axiomatic Semantics based Compiler Testing
- Authors: Guoliang Dong, Jun Sun, Richard Schumi, Bo Wang, Xinyu Wang
- Abstract summary: We propose an axiomatic semantics based approach for testing compilers, called PTE.
The idea is to incrementally develop a set of "axioms" capturing anecdotes of the language semantics in the form of (precondition, transformation, expectation) triples.
- Score: 7.203331838793759
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The correctness of a compiler affects the correctness of every program
written in the language, and thus must be thoroughly evaluated. Existing
automatic compiler testing methods, however, either rely on weak oracles (e.g., a
program behaves the same if only dead code is modified), or require substantial
initial effort (e.g., having a complete operational language semantics). While
the former prevents a comprehensive correctness evaluation, the latter makes
those methods irrelevant in practice. In this work, we propose an axiomatic
semantics based approach for testing compilers, called PTE. The idea is to
incrementally develop a set of ``axioms'' capturing anecdotes of the language
semantics in the form of \emph{(\textbf{p}recondition, \textbf{t}ransformation,
\textbf{e}xpectation)} triples, which allows us to test the compiler
automatically. Such axioms are written in the same language whose compiler is
under test, and can be developed either based on the language specification, or
by generalizing the bug reports. PTE has been applied to a newly developed
compiler (i.e., Cangjie) and a mature compiler (i.e., Java), and successfully
identified 42 implementation bugs and 9 potential language design issues.
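The (precondition, transformation, expectation) triple can be made concrete with a small sketch. PTE writes axioms in the language whose compiler is under test; the example below uses Python purely for illustration, and the axiom shown (addition of zero is the identity) is a made-up instance, not one of the paper's axioms.

```python
# Hypothetical sketch of a PTE-style axiom as a (precondition,
# transformation, expectation) triple. An axiom applies a transformation
# to inputs satisfying the precondition and checks the expectation;
# a failure points at a suspected compiler (here: interpreter) bug.

def precondition(x):
    # The axiom applies to any integer input.
    return isinstance(x, int)

def transformation(x):
    # A semantics-preserving rewrite: adding zero must not change the value.
    return x + 0

def expectation(x, result):
    # The transformed program must agree with the original.
    return result == x

def check_axiom(inputs):
    # Collect inputs on which the expectation fails.
    return [x for x in inputs
            if precondition(x) and not expectation(x, transformation(x))]

print(check_axiom([0, 1, -1, 2**31 - 1, -2**31]))  # prints []
```

In the paper's setting, each axiom is turned into many concrete test programs automatically, so a single triple can exercise the compiler on a whole family of inputs.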
Related papers
- Evolutionary Generative Fuzzing for Differential Testing of the Kotlin Compiler [14.259471945857431]
We investigate the effectiveness of differential testing in finding bugs within the Kotlin compilers developed at JetBrains.
We propose a black-box generative approach that creates input programs for the K1 and K2 compilers.
Our case study shows that the proposed approach effectively detects bugs in K1 and K2; these bugs have been confirmed and (some) fixed by JetBrains developers.
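The core differential-testing loop can be sketched in a few lines: feed the same generated program to two compilers and flag any observable divergence. The sketch below is a generic harness in Python; the compiler commands are caller-supplied placeholders, not the paper's actual K1/K2 setup.

```python
# Minimal sketch of black-box differential testing: run the same source
# through two toolchains and compare their observable behaviour (exit
# code, stdout, stderr). Any divergence is a candidate bug to minimise
# and report.
import os
import subprocess
import tempfile

def run(cmd):
    # Capture everything we can compare without inspecting internals.
    p = subprocess.run(cmd, capture_output=True, text=True)
    return (p.returncode, p.stdout, p.stderr)

def differential_test(source, compiler_a, compiler_b):
    # Write the generated program to a temporary file both tools can read.
    with tempfile.NamedTemporaryFile("w", suffix=".kt", delete=False) as f:
        f.write(source)
        path = f.name
    try:
        out_a = run(compiler_a + [path])
        out_b = run(compiler_b + [path])
        # None means agreement; otherwise return both outputs as evidence.
        return None if out_a == out_b else (out_a, out_b)
    finally:
        os.unlink(path)
```

In practice the generated programs come from the evolutionary generator, and divergences are deduplicated and minimised before being reported to the compiler developers.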
arXiv Detail & Related papers (2024-01-12T16:01:12Z)
- Weak Memory Demands Model-based Compiler Testing [0.0]
A compiler bug arises if the behaviour of a compiled concurrent program, as allowed by its architecture memory model, is not a behaviour permitted by the source program under its source model.
We observe that processor implementations are increasingly exploiting the behaviour of relaxed architecture models.
arXiv Detail & Related papers (2024-01-12T15:50:32Z)
- Extended Paper: API-driven Program Synthesis for Testing Static Typing Implementations [11.300829269111627]
We introduce a novel approach for testing static typing implementations based on the concept of API-driven program synthesis.
The idea is to synthesize type-intensive but small and well-typed programs by leveraging and combining application programming interfaces (APIs) derived from existing software libraries.
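The combination step can be illustrated with a toy enumerator: given a table of API signatures, chain calls so that each argument's type matches, yielding small well-typed expressions. The signature table and depth bound below are made up for illustration; the paper works over real library APIs.

```python
# Toy sketch of API-driven synthesis: enumerate call chains whose types
# line up, producing small but well-typed programs that exercise the
# type checker. Each entry maps a function to (argument types, return type).
APIS = {
    "len": (("str",), "int"),
    "str": (("int",), "str"),
    "abs": (("int",), "int"),
}

def synthesize(depth, want_type="int"):
    # Base case: a literal of the wanted type seeds the chain.
    if depth == 0:
        return ["'seed'"] if want_type == "str" else ["0"]
    exprs = []
    for name, (args, ret) in APIS.items():
        if ret != want_type:
            continue
        # Recurse to build a well-typed argument expression.
        for sub in synthesize(depth - 1, args[0]):
            exprs.append(f"{name}({sub})")
    return exprs

print(synthesize(2))  # prints ["len(str(0))", "abs(len('seed'))", "abs(abs(0))"]
```

Because every generated expression is well-typed by construction, a type error reported by the implementation under test is itself a bug candidate.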
arXiv Detail & Related papers (2023-11-08T08:32:40Z)
- Guess & Sketch: Language Model Guided Transpilation [59.02147255276078]
Learned transpilation offers an alternative to manual re-writing and engineering efforts.
Probabilistic neural language models (LMs) produce plausible outputs for every input, but do so at the cost of guaranteed correctness.
Guess & Sketch extracts alignment and confidence information from features of the LM then passes it to a symbolic solver to resolve semantic equivalence.
arXiv Detail & Related papers (2023-09-25T15:42:18Z)
- Natural Language Embedded Programs for Hybrid Language Symbolic Reasoning [84.12154024070024]
We propose natural language embedded programs (NLEP) as a unifying framework for addressing math/symbolic reasoning, natural language understanding, and instruction following tasks.
Our approach prompts a language model to generate full Python programs that define functions over data structures which contain natural language representations of structured knowledge.
A Python interpreter then executes the generated code and prints the output.
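The execute-and-print step can be sketched directly: run the generated program text in a fresh namespace and capture what it prints. The `generated_program` string below is a hard-coded stand-in for real model output, and `execute_and_capture` is an illustrative helper, not the paper's implementation.

```python
# Toy illustration of the NLEP execution step: a model would emit a full
# Python program over data structures holding natural-language knowledge;
# here we hard-code one and capture its printed output.
import contextlib
import io

generated_program = """
# Structured knowledge with natural-language fields.
capitals = {"France": "Paris", "Japan": "Tokyo"}

def answer(country):
    return capitals.get(country, "unknown")

print(answer("Japan"))
"""

def execute_and_capture(program):
    # Redirect stdout so the generated program's output becomes the answer.
    buf = io.StringIO()
    with contextlib.redirect_stdout(buf):
        exec(program, {})  # fresh globals: no leakage between programs
    return buf.getvalue().strip()

print(execute_and_capture(generated_program))  # prints "Tokyo"
```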
arXiv Detail & Related papers (2023-09-19T17:54:21Z)
- A Static Evaluation of Code Completion by Large Language Models [65.18008807383816]
Execution-based benchmarks have been proposed to evaluate functional correctness of model-generated code on simple programming problems.
Static analysis tools such as linters, which can detect errors without running the program, have not been well explored for evaluating code generation models.
We propose a static evaluation framework to quantify static errors in Python code completions, by leveraging Abstract Syntax Trees.
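The AST-based idea can be shown in miniature: attempt to parse each completion and count the ones that fail, without ever executing the code. This mirrors the framework's use of abstract syntax trees, not its exact error taxonomy; the sample completions are made up.

```python
# Sketch of AST-based static checking: parse each completion and record
# syntax errors. No code is ever executed.
import ast

completions = [
    "def add(a, b):\n    return a + b\n",  # parses cleanly
    "def add(a, b):\n    return a +\n",    # syntax error: dangling operator
]

def static_errors(snippets):
    errors = []
    for code in snippets:
        try:
            ast.parse(code)
        except SyntaxError as e:
            # Keep the snippet together with the parser's message.
            errors.append((code, e.msg))
    return errors

print(len(static_errors(completions)))  # prints 1
```

A real framework would walk the parsed trees as well, catching errors (undefined names, wrong arity) that a bare parse cannot see.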
arXiv Detail & Related papers (2023-06-05T19:23:34Z)
- Interactive Code Generation via Test-Driven User-Intent Formalization [60.90035204567797]
Large language models (LLMs) produce code from informal natural language (NL) intent.
It is hard to define a notion of correctness since natural language can be ambiguous and lacks a formal semantics.
We describe a language-agnostic abstract algorithm and a concrete implementation TiCoder.
arXiv Detail & Related papers (2022-08-11T17:41:08Z)
- Natural Language to Code Translation with Execution [82.52142893010563]
We propose execution result-based minimum Bayes risk decoding for program selection.
We show that it improves the few-shot performance of pretrained code models on natural-language-to-code tasks.
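Execution-based selection can be sketched concretely: run every sampled program on a few inputs and keep the candidate whose outputs agree with the most other candidates. The candidate strings and inputs below are toy stand-ins for model samples, not the paper's data.

```python
# Toy sketch of execution-based minimum Bayes risk selection: the
# candidate whose execution results match the most other candidates wins,
# so semantically equivalent samples vote for each other.
candidates = [
    "lambda x: x * 2",
    "lambda x: x + x",   # semantically equal to the first
    "lambda x: x ** 2",  # the odd one out
]
test_inputs = [0, 1, 3]

def exec_signature(src):
    f = eval(src)  # evaluate the candidate (trusted toy strings only)
    return tuple(f(x) for x in test_inputs)

def mbr_select(srcs):
    sigs = [exec_signature(s) for s in srcs]
    # Score each candidate by how many candidates (itself included)
    # produce identical outputs on the test inputs.
    scores = [sum(s == t for t in sigs) for s in sigs]
    return srcs[scores.index(max(scores))]

print(mbr_select(candidates))  # prints "lambda x: x * 2"
```

Here the two doubling implementations produce identical outputs (0, 2, 6), so either of them outscores the squaring candidate, which is selected away.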
arXiv Detail & Related papers (2022-04-25T06:06:08Z)
- Zero-Shot Cross-lingual Semantic Parsing [56.95036511882921]
We study cross-lingual semantic parsing as a zero-shot problem without parallel data for 7 test languages.
We propose a multi-task encoder-decoder model to transfer parsing knowledge to additional languages using only English-Logical form paired data.
Our system frames zero-shot parsing as a latent-space alignment problem and finds that pre-trained models can be improved to generate logical forms with minimal cross-lingual transfer penalty.
arXiv Detail & Related papers (2021-04-15T16:08:43Z)
- Configuring Test Generators using Bug Reports: A Case Study of GCC Compiler and Csmith [2.1016374925364616]
This paper uses the code snippets in the bug reports to guide the test generation.
We evaluate this approach on eight versions of GCC.
We find that our approach provides higher coverage and triggers more miscompilation failures than the state-of-the-art test generation techniques for GCC.
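One way to picture "using bug-report snippets to guide generation" is to mine the snippets for language features and turn the counts into weights a generator could be biased with. The feature patterns and the weighting scheme below are illustrative only; they are not the paper's actual Csmith configuration.

```python
# Rough sketch: mine bug-report code snippets for language features and
# normalise the counts into weights that could bias a random program
# generator toward bug-prone constructs.
import re
from collections import Counter

FEATURE_PATTERNS = {
    "pointer": r"\*",
    "union": r"\bunion\b",
    "bitfield": r":\s*\d+\s*;",
}

def feature_profile(snippets):
    counts = Counter()
    for code in snippets:
        for name, pattern in FEATURE_PATTERNS.items():
            counts[name] += len(re.findall(pattern, code))
    total = sum(counts.values()) or 1  # avoid division by zero
    return {name: counts[name] / total for name in FEATURE_PATTERNS}

snippets = ["int *p; union U { int a; };", "struct S { int b : 3; };"]
print(feature_profile(snippets))
```

The resulting weights would then be mapped onto the generator's feature probabilities, so that test programs resemble the code that has historically triggered miscompilations.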
arXiv Detail & Related papers (2020-12-19T11:25:13Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information provided and is not responsible for any consequences.