Related papers: Using a Sledgehammer to Crack a Nut? Revisiting Automated Compiler Fault Isolation

Using a Sledgehammer to Crack a Nut? Revisiting Automated Compiler Fault Isolation

URL: http://arxiv.org/abs/2512.16335v1
Date: Thu, 18 Dec 2025 09:22:57 GMT
Title: Using a Sledgehammer to Crack a Nut? Revisiting Automated Compiler Fault Isolation
Authors: Yibiao Yang, Qingyang Li, Maolin Sun, Jiangchang Wu, Yuming Zhou,
Abstract summary: This study aims to directly compare a BIC-based strategy, Basic, with representative SBFL techniques in the context of compiler fault localization.<n>Basic identifies the most recent good release and earliest bad release, and then employs a binary search to pinpoint the bug-inducing commit.<n>We rigorously compare Basic against SBFL-based techniques using a benchmark consisting of 60 GCC bugs and 60 LLVM bugs.
Score: 9.699231545580806
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Background: Compilers are fundamental to software development, translating high-level source code into executable software systems. Faults in compilers can have severe consequences and thus effective localization and resolution of compiler bugs are crucial. Problem: In practice, developers often examine version history to identify and investigate bug-inducing commit (BIC) for fixing bugs. However, while numerous sophisticated Spectrum-Based Fault Localization (SBFL) techniques have been proposed for compiler fault isolation, their effectiveness has not been evaluated against the BIC-based strategies widely adopted in practice. Objective: This study aims to bridge this gap by directly comparing a BIC-based strategy, Basic, with representative SBFL techniques in the context of compiler fault localization. The BIC-based strategy closely aligns with common developer practices, as it directly identifies the BIC and treats the files modified in that commit as faulty candidates. Method: The Basic identifies the most recent good release and earliest bad release, and then employs a binary search to pinpoint the bug-inducing commit. All files modified in the identified commit are flagged as potentially faulty. We rigorously compare Basic against SBFL-based techniques using a benchmark consisting of 60 GCC bugs and 60 LLVM bugs. Result: Our analysis reveals that Basic performs comparably to, and in many cases outperforms, state-of-the-art SBFL-based techniques, particularly on the critical Top-1 and Top-5 ranking metrics. Conclusion: This study provides new insights into the practical effectiveness of SBFL-based techniques in real-world compiler debugging scenarios. We recommend that future research adopt Basic as a baseline when developing and evaluating new compiler fault isolation methods.

Related papers

Outrunning LLM Cutoffs: A Live Kernel Crash Resolution Benchmark for All [57.23434868678603]
Live-kBench is an evaluation framework for self-evolving benchmarks that scrapes and evaluates agents on freshly discovered kernel bugs.<n> kEnv is an agent-agnostic crash-resolution environment for kernel compilation, execution, and feedback.<n>Using kEnv, we benchmark three state-of-the-art agents, showing that they resolve 74% of crashes on the first attempt.
arXiv Detail & Related papers (2026-02-02T19:06:15Z)
LibContinual: A Comprehensive Library towards Realistic Continual Learning [62.34449396069085]
A fundamental challenge in Continual Learning (CL) is catastrophic forgetting, where adapting to new tasks degrades the performance on previous ones.<n>We propose LibContinual, a comprehensive and reproducible library designed to serve as a foundational platform for realistic CL.
arXiv Detail & Related papers (2025-12-26T13:59:13Z)
LLMBisect: Breaking Barriers in Bug Bisection with A Comparative Analysis Pipeline [35.18683484280968]
Large Language Models (LLMs) are well-positioned to break the barriers of existing solutions.<n>LLMs comprehend both textual data and code in patches and commits.<n>Our approach achieves significantly better accuracy than the state-of-the-art solution by more than 38%.
arXiv Detail & Related papers (2025-10-30T02:47:25Z)
Improving Compiler Bug Isolation by Leveraging Large Language Models [14.679589768900621]
We propose an innovative compiler bug isolation approach named AutoCBI.<n>We evaluate AutoCBI against state-of-the-art approaches (DiWi, RecBi and FuseFL) on 120 real-world bugs from the widely-used GCC and LLVM compilers.<n>Specifically, AutoCBI isolates 66.67%/69.23%, 300%/340%, and 100%/57.14% more bugs than RecBi, DiWi, and FuseFL, respectively, in the Top-1 ranked results for GCC/LLVM.
arXiv Detail & Related papers (2025-06-21T09:09:30Z)
Is Compression Really Linear with Code Intelligence? [60.123628177110206]
textitFormat Annealing is a lightweight, transparent training methodology designed to assess the intrinsic capabilities of pre-trained models equitably.<n>Our empirical results reveal a fundamental logarithmic relationship between measured code intelligence and bits-per-character (BPC)<n>Our work provides a more nuanced understanding of compression's role in developing code intelligence and contributes a robust evaluation framework in the code domain.
arXiv Detail & Related papers (2025-05-16T16:59:14Z)
Identifying Bug Inducing Commits by Combining Fault Localisation and Code Change Histories [10.027862394831669]
A Bug Inducing Commit (BIC) is a code change that introduces a bug into the commit.<n>We propose a technique called Fonte that aims to identify the BIC with a core concept that a commit is more likely to be a BIC if it has more recently modified code elements.<n>Fonte significantly outperforms state-of-the-art BIC identification techniques, achieving up to 45.8% higher MRR.
arXiv Detail & Related papers (2025-02-18T15:02:22Z)
ReF Decompile: Relabeling and Function Call Enhanced Decompile [50.86228893636785]
The goal of decompilation is to convert compiled low-level code (e.g., assembly code) back into high-level programming languages.<n>This task supports various reverse engineering applications, such as vulnerability identification, malware analysis, and legacy software migration.
arXiv Detail & Related papers (2025-02-17T12:38:57Z)
DOCE: Finding the Sweet Spot for Execution-Based Code Generation [69.5305729627198]
We propose a comprehensive framework that includes candidate generation, $n$-best reranking, minimum Bayes risk (MBR) decoding, and self-ging as the core components. Our findings highlight the importance of execution-based methods and the difference gap between execution-based and execution-free methods.
arXiv Detail & Related papers (2024-08-25T07:10:36Z)
FoC: Figure out the Cryptographic Functions in Stripped Binaries with LLMs [51.898805184427545]
We propose a novel framework called FoC to Figure out the Cryptographic functions in stripped binaries.<n>We first build a binary large language model (FoC-BinLLM) to summarize the semantics of cryptographic functions in natural language.<n>We then build a binary code similarity model (FoC-Sim) upon the FoC-BinLLM to create change-sensitive representations and use it to retrieve similar implementations of unknown cryptographic functions in a database.
arXiv Detail & Related papers (2024-03-27T09:45:33Z)
Patch2QL: Discover Cognate Defects in Open Source Software Supply Chain With Auto-generated Static Analysis Rules [1.9591497166224197]
We propose a novel technique for detecting cognate defects in OSS through the automatic generation of SAST rules. Specifically, it extracts key syntax and semantic information from pre- and post-patch versions of code. We have implemented a prototype tool called Patch2QL and applied it to fundamental OSS in C/C++.
arXiv Detail & Related papers (2024-01-23T02:23:11Z)
Fully Autonomous Programming with Large Language Models [0.9558392439655015]
Current approaches to program synthesis with Large Language Models (LLMs) exhibit a "near miss syndrome" We use OpenAI Codex as the LLM and Program Synthesis Benchmark 2 as a database of problem descriptions and tests for evaluation. The resulting framework outperforms both conventional usage of Codex without the repair phase and traditional genetic programming approaches.
arXiv Detail & Related papers (2023-04-20T16:12:05Z)

This list is automatically generated from the titles and abstracts of the papers in this site.