Using a Sledgehammer to Crack a Nut? Revisiting Automated Compiler Fault Isolation
- URL: http://arxiv.org/abs/2512.16335v1
- Date: Thu, 18 Dec 2025 09:22:57 GMT
- Title: Using a Sledgehammer to Crack a Nut? Revisiting Automated Compiler Fault Isolation
- Authors: Yibiao Yang, Qingyang Li, Maolin Sun, Jiangchang Wu, Yuming Zhou,
- Abstract summary: This study aims to directly compare a BIC-based strategy, Basic, with representative SBFL techniques in the context of compiler fault localization.<n>Basic identifies the most recent good release and earliest bad release, and then employs a binary search to pinpoint the bug-inducing commit.<n>We rigorously compare Basic against SBFL-based techniques using a benchmark consisting of 60 GCC bugs and 60 LLVM bugs.
- Score: 9.699231545580806
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Background: Compilers are fundamental to software development, translating high-level source code into executable software systems. Faults in compilers can have severe consequences and thus effective localization and resolution of compiler bugs are crucial. Problem: In practice, developers often examine version history to identify and investigate bug-inducing commit (BIC) for fixing bugs. However, while numerous sophisticated Spectrum-Based Fault Localization (SBFL) techniques have been proposed for compiler fault isolation, their effectiveness has not been evaluated against the BIC-based strategies widely adopted in practice. Objective: This study aims to bridge this gap by directly comparing a BIC-based strategy, Basic, with representative SBFL techniques in the context of compiler fault localization. The BIC-based strategy closely aligns with common developer practices, as it directly identifies the BIC and treats the files modified in that commit as faulty candidates. Method: The Basic identifies the most recent good release and earliest bad release, and then employs a binary search to pinpoint the bug-inducing commit. All files modified in the identified commit are flagged as potentially faulty. We rigorously compare Basic against SBFL-based techniques using a benchmark consisting of 60 GCC bugs and 60 LLVM bugs. Result: Our analysis reveals that Basic performs comparably to, and in many cases outperforms, state-of-the-art SBFL-based techniques, particularly on the critical Top-1 and Top-5 ranking metrics. Conclusion: This study provides new insights into the practical effectiveness of SBFL-based techniques in real-world compiler debugging scenarios. We recommend that future research adopt Basic as a baseline when developing and evaluating new compiler fault isolation methods.
Related papers
- Outrunning LLM Cutoffs: A Live Kernel Crash Resolution Benchmark for All [57.23434868678603]
Live-kBench is an evaluation framework for self-evolving benchmarks that scrapes and evaluates agents on freshly discovered kernel bugs.<n> kEnv is an agent-agnostic crash-resolution environment for kernel compilation, execution, and feedback.<n>Using kEnv, we benchmark three state-of-the-art agents, showing that they resolve 74% of crashes on the first attempt.
arXiv Detail & Related papers (2026-02-02T19:06:15Z) - LibContinual: A Comprehensive Library towards Realistic Continual Learning [62.34449396069085]
A fundamental challenge in Continual Learning (CL) is catastrophic forgetting, where adapting to new tasks degrades the performance on previous ones.<n>We propose LibContinual, a comprehensive and reproducible library designed to serve as a foundational platform for realistic CL.
arXiv Detail & Related papers (2025-12-26T13:59:13Z) - LLMBisect: Breaking Barriers in Bug Bisection with A Comparative Analysis Pipeline [35.18683484280968]
Large Language Models (LLMs) are well-positioned to break the barriers of existing solutions.<n>LLMs comprehend both textual data and code in patches and commits.<n>Our approach achieves significantly better accuracy than the state-of-the-art solution by more than 38%.
arXiv Detail & Related papers (2025-10-30T02:47:25Z) - Improving Compiler Bug Isolation by Leveraging Large Language Models [14.679589768900621]
We propose an innovative compiler bug isolation approach named AutoCBI.<n>We evaluate AutoCBI against state-of-the-art approaches (DiWi, RecBi and FuseFL) on 120 real-world bugs from the widely-used GCC and LLVM compilers.<n>Specifically, AutoCBI isolates 66.67%/69.23%, 300%/340%, and 100%/57.14% more bugs than RecBi, DiWi, and FuseFL, respectively, in the Top-1 ranked results for GCC/LLVM.
arXiv Detail & Related papers (2025-06-21T09:09:30Z) - Is Compression Really Linear with Code Intelligence? [60.123628177110206]
textitFormat Annealing is a lightweight, transparent training methodology designed to assess the intrinsic capabilities of pre-trained models equitably.<n>Our empirical results reveal a fundamental logarithmic relationship between measured code intelligence and bits-per-character (BPC)<n>Our work provides a more nuanced understanding of compression's role in developing code intelligence and contributes a robust evaluation framework in the code domain.
arXiv Detail & Related papers (2025-05-16T16:59:14Z) - Identifying Bug Inducing Commits by Combining Fault Localisation and Code Change Histories [10.027862394831669]
A Bug Inducing Commit (BIC) is a code change that introduces a bug into the commit.<n>We propose a technique called Fonte that aims to identify the BIC with a core concept that a commit is more likely to be a BIC if it has more recently modified code elements.<n>Fonte significantly outperforms state-of-the-art BIC identification techniques, achieving up to 45.8% higher MRR.
arXiv Detail & Related papers (2025-02-18T15:02:22Z) - ReF Decompile: Relabeling and Function Call Enhanced Decompile [50.86228893636785]
The goal of decompilation is to convert compiled low-level code (e.g., assembly code) back into high-level programming languages.<n>This task supports various reverse engineering applications, such as vulnerability identification, malware analysis, and legacy software migration.
arXiv Detail & Related papers (2025-02-17T12:38:57Z) - DOCE: Finding the Sweet Spot for Execution-Based Code Generation [69.5305729627198]
We propose a comprehensive framework that includes candidate generation, $n$-best reranking, minimum Bayes risk (MBR) decoding, and self-ging as the core components.
Our findings highlight the importance of execution-based methods and the difference gap between execution-based and execution-free methods.
arXiv Detail & Related papers (2024-08-25T07:10:36Z) - FoC: Figure out the Cryptographic Functions in Stripped Binaries with LLMs [51.898805184427545]
We propose a novel framework called FoC to Figure out the Cryptographic functions in stripped binaries.<n>We first build a binary large language model (FoC-BinLLM) to summarize the semantics of cryptographic functions in natural language.<n>We then build a binary code similarity model (FoC-Sim) upon the FoC-BinLLM to create change-sensitive representations and use it to retrieve similar implementations of unknown cryptographic functions in a database.
arXiv Detail & Related papers (2024-03-27T09:45:33Z) - Patch2QL: Discover Cognate Defects in Open Source Software Supply Chain
With Auto-generated Static Analysis Rules [1.9591497166224197]
We propose a novel technique for detecting cognate defects in OSS through the automatic generation of SAST rules.
Specifically, it extracts key syntax and semantic information from pre- and post-patch versions of code.
We have implemented a prototype tool called Patch2QL and applied it to fundamental OSS in C/C++.
arXiv Detail & Related papers (2024-01-23T02:23:11Z) - Fully Autonomous Programming with Large Language Models [0.9558392439655015]
Current approaches to program synthesis with Large Language Models (LLMs) exhibit a "near miss syndrome"
We use OpenAI Codex as the LLM and Program Synthesis Benchmark 2 as a database of problem descriptions and tests for evaluation.
The resulting framework outperforms both conventional usage of Codex without the repair phase and traditional genetic programming approaches.
arXiv Detail & Related papers (2023-04-20T16:12:05Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.