DESIL: Detecting Silent Bugs in MLIR Compiler Infrastructure
- URL: http://arxiv.org/abs/2504.01379v1
- Date: Wed, 02 Apr 2025 05:48:51 GMT
- Title: DESIL: Detecting Silent Bugs in MLIR Compiler Infrastructure
- Authors: Chenyao Suo, Jianrong Wang, Yongjia Wang, Jiajun Jiang, QingChao Shen, Junjie Chen
- Abstract summary: DESIL enables silent bug detection by defining a set of UB-elimination rules based on the MLIR documentation. DESIL incorporates differential testing for silent bug detection. It detected 23 silent bugs and 19 crash bugs, of which 12/14 have been confirmed or fixed.
- Score: 8.760618981915016
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: MLIR (Multi-Level Intermediate Representation) compiler infrastructure provides an efficient framework for introducing a new abstraction level for programming languages and domain-specific languages. It has attracted widespread attention in recent years and has been applied in various domains, such as deep learning compiler construction. Recently, several MLIR compiler fuzzing techniques, such as MLIRSmith and MLIRod, have been proposed. However, none of them can detect silent bugs, i.e., bugs that silently produce incorrectly optimized code. The difficulty in detecting silent bugs arises from two main aspects: (1) UB-Free Program Generation: the generated programs must be free from undefined behaviors (UB) to satisfy the non-UB assumptions required by compiler optimizations. (2) Lowering Support: the given MLIR program must be converted into an executable form so that execution results can be compared, and a suitable lowering path must be selected to reduce redundant lowering passes and improve the efficiency of fuzzing. To address these issues, we propose DESIL. DESIL enables silent bug detection by defining a set of UB-elimination rules based on the MLIR documentation and applying them to input programs to produce UB-free MLIR programs. To convert the dialects in a given MLIR program into an executable form, DESIL designs a lowering path optimization strategy. Furthermore, DESIL incorporates differential testing for silent bug detection. To achieve this, it introduces an operation-aware optimization recommendation strategy into the compilation process to generate diverse executable files. We applied DESIL to the latest revisions of the MLIR compiler infrastructure. It detected 23 silent bugs and 19 crash bugs, of which 12/14 have been confirmed or fixed.
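The workflow described in the abstract (eliminate UB, then compare execution results across differently optimized builds) can be illustrated with a toy sketch. This is not DESIL's implementation and does not use MLIR: the tuple-based expression "IR", the `guard_div` rule, and both rewrite "pipelines" are hypothetical stand-ins chosen only to show why UB removal has to come before differential testing.

```python
# Toy sketch only: a tuple-based expression "IR", not MLIR.
# A node is an int literal or a tuple (op, lhs, rhs).

def rewrite(expr, rule):
    """Apply `rule` bottom-up to every non-literal node."""
    if isinstance(expr, int):
        return expr
    op, lhs, rhs = expr
    return rule((op, rewrite(lhs, rule), rewrite(rhs, rule)))

def guard_div(node):
    """Hypothetical UB-elimination rule: make every divisor non-zero."""
    op, lhs, rhs = node
    return (op, lhs, ("nonzero", rhs, 1)) if op == "div" else node

def evaluate(expr):
    """Reference interpreter for the toy IR."""
    if isinstance(expr, int):
        return expr
    op, lhs, rhs = expr
    a, b = evaluate(lhs), evaluate(rhs)
    if op == "add":
        return a + b
    if op == "mul":
        return a * b
    if op == "div":
        return a // b
    if op == "nonzero":  # yields a, or the fallback b when a is zero
        return a if a != 0 else b
    raise ValueError(f"unknown op: {op}")

def mul_one(node):
    """Sound simplification: x * 1 -> x."""
    op, lhs, rhs = node
    return lhs if op == "mul" and rhs == 1 else node

def drop_add_one(node):
    """Deliberately unsound simplification (seeded bug): x + 1 -> x."""
    op, lhs, rhs = node
    return lhs if op == "add" and rhs == 1 else node

def differential_test(program, pipelines):
    """Return the names of pipelines whose result differs from the baseline."""
    safe = rewrite(program, guard_div)  # 1. remove UB first
    expected = evaluate(safe)           # 2. unoptimized baseline
    return [name for name, rule in pipelines.items()
            if evaluate(rewrite(safe, rule)) != expected]

program = ("add", ("mul", ("div", 8, 0), 1), 1)  # contains a division by zero
pipelines = {"sound": mul_one, "buggy": drop_add_one}
suspicious = differential_test(program, pipelines)
print(suspicious)
```

Without the `guard_div` step the baseline itself would fail on the division by zero, so a mismatch could not be attributed to an optimization. After the guard, any disagreement between a pipeline and the baseline points at a miscompilation, which is the silent-bug signal the abstract refers to.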
Related papers
- ReF Decompile: Relabeling and Function Call Enhanced Decompile [50.86228893636785]
The goal of decompilation is to convert compiled low-level code (e.g., assembly code) back into high-level programming languages.
This task supports various reverse engineering applications, such as vulnerability identification, malware analysis, and legacy software migration.
arXiv Detail & Related papers (2025-02-17T12:38:57Z)
- Finding Missed Code Size Optimizations in Compilers using LLMs [1.90019787465083]
We develop a novel testing approach which combines large language models with a series of differential testing strategies.
Our approach requires fewer than 150 lines of code to implement.
To date we have reported 24 confirmed bugs in production compilers.
arXiv Detail & Related papers (2024-12-31T21:47:46Z)
- $\mathbb{USCD}$: Improving Code Generation of LLMs by Uncertainty-Aware Selective Contrastive Decoding [64.00025564372095]
Large language models (LLMs) have shown remarkable capabilities in code generation.
The effects of hallucinations (e.g., output noise) make it challenging for LLMs to generate high-quality code in one pass.
We propose a simple and effective uncertainty-aware selective contrastive decoding.
arXiv Detail & Related papers (2024-09-09T02:07:41Z)
- Galapagos: Automated N-Version Programming with LLMs [10.573037638807024]
We propose the automated generation of program variants using large language models.
We design, develop and evaluate Galápagos: a tool for generating program variants.
We evaluate Galápagos by creating N-Version components of real-world C code.
arXiv Detail & Related papers (2024-08-18T16:44:01Z)
- mlirSynth: Automatic, Retargetable Program Raising in Multi-Level IR using Program Synthesis [48.01697184432969]
mlirSynth translates programs from lower-level MLIR dialects to high-level ones without manually defined rules.
We demonstrate its effectiveness by raising C programs to two distinct high-level MLIR dialects, which enables us to use existing high-level dialect-specific compilation flows.
arXiv Detail & Related papers (2023-10-06T12:21:50Z)
- Guess & Sketch: Language Model Guided Transpilation [59.02147255276078]
Learned transpilation offers an alternative to manual re-writing and engineering efforts.
Probabilistic neural language models (LMs) produce plausible outputs for every input, but do so at the cost of guaranteed correctness.
Guess & Sketch extracts alignment and confidence information from features of the LM then passes it to a symbolic solver to resolve semantic equivalence.
arXiv Detail & Related papers (2023-09-25T15:42:18Z)
- Isolating Compiler Bugs by Generating Effective Witness Programs with Large Language Models [10.660543763757518]
Existing compiler bug isolation approaches convert the problem into a test program mutation problem.
We propose a new approach named LLM4CBI to utilize LLMs to generate effective test programs for compiler bug isolation.
Compared with state-of-the-art approaches over 120 real bugs from GCC and LLVM, our evaluation demonstrates the advantages of LLM4CBI.
arXiv Detail & Related papers (2023-07-02T15:20:54Z)
- CryptOpt: Verified Compilation with Randomized Program Search for Cryptographic Primitives (full version) [12.790826917588575]
Cryptography has been an exception, where many performance-critical routines have been written directly in assembly.
We present CryptOpt, the first compilation pipeline that specializes high-level cryptographic functional programs into assembly code significantly faster than what GCC or Clang produce.
On the formal-verification side, we connect to the Fiat Cryptography framework (which translates functional programs into C-like IR code) and extend it with a new formally verified program-equivalence checker.
arXiv Detail & Related papers (2022-11-19T11:07:39Z)
- Fault-Aware Neural Code Rankers [64.41888054066861]
We propose fault-aware neural code rankers that can predict the correctness of a sampled program without executing it.
Our fault-aware rankers can significantly increase the pass@1 accuracy of various code generation models.
arXiv Detail & Related papers (2022-06-04T22:01:05Z)
- Natural Language to Code Translation with Execution [82.52142893010563]
We introduce execution result-based minimum Bayes risk decoding for program selection.
We show that it improves the few-shot performance of pretrained code models on natural-language-to-code tasks.
arXiv Detail & Related papers (2022-04-25T06:06:08Z)