PTE: Axiomatic Semantics based Compiler Testing
- URL: http://arxiv.org/abs/2401.01036v1
- Date: Tue, 2 Jan 2024 04:50:47 GMT
- Title: PTE: Axiomatic Semantics based Compiler Testing
- Authors: Guoliang Dong, Jun Sun, Richard Schumi, Bo Wang, Xinyu Wang
- Abstract summary: We propose an axiomatic semantics based approach for testing compilers, called PTE.
The idea is to incrementally develop a set of axioms'' capturing anecdotes of the language semantics in the form of emph(textbfprecondition, textbftransformation, textbfexpectation''
- Score: 7.203331838793759
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The correctness of a compiler affects the correctness of every program
written in the language, and thus must be thoroughly evaluated. Existing
automatic compiler testing methods however either rely on weak oracles (e.g., a
program behaves the same if only dead code is modified), or require substantial
initial effort (e.g., having a complete operational language semantics). While
the former prevents a comprehensive correctness evaluation, the latter makes
those methods irrelevant in practice. In this work, we propose an axiomatic
semantics based approach for testing compilers, called PTE. The idea is to
incrementally develop a set of ``axioms'' capturing anecdotes of the language
semantics in the form of \emph{(\textbf{p}recondition, \textbf{t}ransformation,
\textbf{e}xpectation) triples, which allows us to test the compiler
automatically.} Such axioms are written in the same language whose compiler is
under test, and can be developed either based on the language specification, or
by generalizing the bug reports. PTE has been applied to a newly developed
compiler (i.e., Cangjie) and a mature compiler (i.e., Java), and successfully
identified 42 implementation bugs and 9 potential language design issues.
Related papers
- Developing a Modular Compiler for a Subset of a C-like Language [0.0]
The paper introduces the development of a modular compiler for a subset of a C-like language.
This modular approach will allow developers to modify a language by adding or removing subsets as required, resulting in a minimal and memory-efficient compiler.
arXiv Detail & Related papers (2025-01-08T13:42:54Z) - Finding Missed Code Size Optimizations in Compilers using LLMs [1.90019787465083]
We develop a novel testing approach which combines large language models with a series of differential testing strategies.
Our approach requires fewer than 150 lines of code to implement.
To date we have reported 24 confirmed bugs in production compilers.
arXiv Detail & Related papers (2024-12-31T21:47:46Z) - Evolutionary Generative Fuzzing for Differential Testing of the Kotlin
Compiler [14.259471945857431]
We investigate the effectiveness of differential testing in finding bugs within the Kotlin compilers developed at JetBrains.
We propose a black-box generative approach that creates input programs for the K1 and K2 compilers.
Our case study shows that the proposed approach effectively detects bugs in K1 and K2; these bugs have been confirmed and (some) fixed by JetBrains developers.
arXiv Detail & Related papers (2024-01-12T16:01:12Z) - Extended Paper: API-driven Program Synthesis for Testing Static Typing
Implementations [11.300829269111627]
We introduce a novel approach for testing static typing implementations based on the concept of API-driven program synthesis.
The idea is to synthesize type-intensive but small and well-typed programs by leveraging and combining application programming interfaces (APIs) derived from existing software libraries.
arXiv Detail & Related papers (2023-11-08T08:32:40Z) - Guess & Sketch: Language Model Guided Transpilation [59.02147255276078]
Learned transpilation offers an alternative to manual re-writing and engineering efforts.
Probabilistic neural language models (LMs) produce plausible outputs for every input, but do so at the cost of guaranteed correctness.
Guess & Sketch extracts alignment and confidence information from features of the LM then passes it to a symbolic solver to resolve semantic equivalence.
arXiv Detail & Related papers (2023-09-25T15:42:18Z) - Natural Language Embedded Programs for Hybrid Language Symbolic Reasoning [84.12154024070024]
We propose natural language embedded programs (NLEP) as a unifying framework for addressing math/symbolic reasoning, natural language understanding, and instruction following tasks.
Our approach prompts a language model to generate full Python programs that define functions over data structures which contain natural language representations of structured knowledge.
A Python interpreter then executes the generated code and prints the output.
arXiv Detail & Related papers (2023-09-19T17:54:21Z) - A Static Evaluation of Code Completion by Large Language Models [65.18008807383816]
Execution-based benchmarks have been proposed to evaluate functional correctness of model-generated code on simple programming problems.
static analysis tools such as linters, which can detect errors without running the program, haven't been well explored for evaluating code generation models.
We propose a static evaluation framework to quantify static errors in Python code completions, by leveraging Abstract Syntax Trees.
arXiv Detail & Related papers (2023-06-05T19:23:34Z) - Effective Random Test Generation for Deep Learning Compilers [16.065653480978092]
Isra is a domain-specific constraint solver that resolves the constraints from the semantic specifications without backtracking.
We implement and apply our approach to three popular real-world deep learning compilers including TVM, Glow, and a commercial compiler named SophGo.
Isra is more effective than the state-of-the-art approaches and the baseline approaches on constructing valid test inputs for compiler-bug detection.
arXiv Detail & Related papers (2023-02-02T03:00:36Z) - Interactive Code Generation via Test-Driven User-Intent Formalization [60.90035204567797]
Large language models (LLMs) produce code from informal natural language (NL) intent.
It is hard to define a notion of correctness since natural language can be ambiguous and lacks a formal semantics.
We describe a language-agnostic abstract algorithm and a concrete implementation TiCoder.
arXiv Detail & Related papers (2022-08-11T17:41:08Z) - Natural Language to Code Translation with Execution [82.52142893010563]
Execution result--minimum Bayes risk decoding for program selection.
We show that it improves the few-shot performance of pretrained code models on natural-language-to-code tasks.
arXiv Detail & Related papers (2022-04-25T06:06:08Z) - Zero-Shot Cross-lingual Semantic Parsing [56.95036511882921]
We study cross-lingual semantic parsing as a zero-shot problem without parallel data for 7 test languages.
We propose a multi-task encoder-decoder model to transfer parsing knowledge to additional languages using only English-Logical form paired data.
Our system frames zero-shot parsing as a latent-space alignment problem and finds that pre-trained models can be improved to generate logical forms with minimal cross-lingual transfer penalty.
arXiv Detail & Related papers (2021-04-15T16:08:43Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.