In industrial embedded software, are some compilation errors easier to localize and fix than others?
- URL: http://arxiv.org/abs/2404.14823v1
- Date: Tue, 23 Apr 2024 08:20:18 GMT
- Title: In industrial embedded software, are some compilation errors easier to localize and fix than others?
- Authors: Han Fu, Sigrid Eldh, Kristian Wiklund, Andreas Ermedahl, Philipp Haller, Cyrille Artho
- Abstract summary: We collect over 40,000 builds from four projects' product source code and categorize the compilation errors into 14 error types.
We show that the five most common types comprise 89% of all compilation errors.
Our research also provides insights into the human effort required to fix the most common industrial compilation errors.
- Score: 1.627308316856397
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Industrial embedded systems often require specialized hardware. However, software engineers have access to such domain-specific hardware only at the continuous integration (CI) stage and have to use simulated hardware otherwise. This results in a higher proportion of compilation errors at the CI stage than in other types of systems, warranting a deeper study. To this end, we create a CI diagnostics solution called "Shadow Job" that analyzes our industrial CI system. We collected over 40,000 builds from four projects' product source code and categorized the compilation errors into 14 error types, showing that the five most common ones comprise 89% of all compilation errors. Additionally, we analyze the resolution time, size, and distance for each error type, to see whether some types of compilation errors are easier to localize or repair than others. Our results show that resolution time, size, and distance are independent of each other. Our research also provides insights into the human effort required to fix the most common industrial compilation errors. We also identify the most promising directions for future research on fault localization.
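The paper does not publish Shadow Job's internals, but a minimal sketch of log-based error categorization conveys the core idea; the regex patterns and type names below are hypothetical stand-ins for the paper's 14 categories:

```python
import re
from collections import Counter

# Hypothetical patterns for a few common GCC/Clang diagnostics; the paper's
# actual 14 error types and matching rules are not public.
ERROR_PATTERNS = {
    "undeclared_identifier": re.compile(r"error: '.+' (?:undeclared|was not declared)"),
    "unknown_type": re.compile(r"error: unknown type name '.+'"),
    "missing_header": re.compile(r"fatal error: .+\.h: No such file or directory"),
    "syntax_error": re.compile(r"error: expected .+ before"),
    "linker_undefined": re.compile(r"undefined reference to `.+'"),
}

def categorize(build_log: str) -> Counter:
    """Count occurrences of each error type in one build log."""
    counts = Counter()
    for line in build_log.splitlines():
        for error_type, pattern in ERROR_PATTERNS.items():
            if pattern.search(line):
                counts[error_type] += 1
                break  # attribute each log line to at most one type
    return counts

def top_share(counts: Counter, k: int = 5) -> float:
    """Fraction of all errors covered by the k most common types."""
    total = sum(counts.values())
    return sum(n for _, n in counts.most_common(k)) / total if total else 0.0
```

Aggregated over all builds, the analogous top-5 share is the paper's 89% figure.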
Related papers
- Evaluating the Capability of LLMs in Identifying Compilation Errors in Configurable Systems [1.2928804566606342]
This study evaluates the efficacy of Large Language Models (LLMs), specifically ChatGPT4, Le Chat Mistral, and Gemini Advanced 1.5, at identifying compilation errors in configurable systems.
ChatGPT4 successfully identified most compilation errors in individual products.
Le Chat Mistral and Gemini Advanced 1.5 detected some of them.
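As a rough illustration of the task this study poses, the sketch below asks an LLM whether one product (a single preprocessor configuration) contains a compilation error; it assumes the OpenAI Python SDK and an invented prompt, and does not reproduce the study's actual prompts or interfaces:

```python
from openai import OpenAI  # assumption: OpenAI Python SDK (>= 1.0)

client = OpenAI()  # reads OPENAI_API_KEY from the environment

# Hypothetical prompt; the study's wording is not reproduced here.
PROMPT = """The following C code is one product of a configurable system
(a single #ifdef configuration). Does it contain a compilation error?
If so, name the offending line and describe the error.

{code}
"""

def identify_compilation_error(code: str, model: str = "gpt-4") -> str:
    """Ask an LLM to identify compilation errors in a single product variant."""
    response = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": PROMPT.format(code=code)}],
    )
    return response.choices[0].message.content
```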
arXiv Detail & Related papers (2024-07-26T21:07:21Z)
- KGym: A Platform and Dataset to Benchmark Large Language Models on Linux Kernel Crash Resolution [59.20933707301566]
Large Language Models (LLMs) are consistently improving at increasingly realistic software engineering (SE) tasks.
In real-world software stacks, significant SE effort is spent developing foundational system software like the Linux kernel.
To evaluate whether ML models are useful for developing such large-scale systems-level software, we introduce kGym and kBench.
arXiv Detail & Related papers (2024-07-02T21:44:22Z)
- Investigating Memory Failure Prediction Across CPU Architectures [8.477622236186695]
We investigate the correlation between Correctable Errors (CEs) and Uncorrectable Errors (UEs) across different CPU architectures.
Our analysis identifies unique patterns of memory failure associated with each processor platform.
We conduct memory failure prediction on different processor platforms, achieving up to a 15% improvement in F1-score over the existing algorithm.
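As a toy illustration of CE-based UE prediction and the F1 metric it is scored with, the sketch below thresholds per-DIMM CE counts; the data and threshold are invented, and the paper learns platform-specific failure patterns rather than a single cutoff:

```python
from sklearn.metrics import f1_score

ce_counts = [0, 3, 120, 7, 2500, 1, 40, 900]  # CEs per DIMM in an observation window (made up)
had_ue    = [0, 0, 1,   0, 1,    0, 1,  1]    # ground truth: did a UE follow?

def predict(ce: int, threshold: int = 100) -> int:
    """Predict an upcoming UE when the CE count exceeds a fixed threshold."""
    return int(ce > threshold)

preds = [predict(ce) for ce in ce_counts]
print(f"F1 = {f1_score(had_ue, preds):.2f}")  # 0.86: the DIMM with 40 CEs is missed
```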
arXiv Detail & Related papers (2024-06-08T05:10:23Z)
- A Comprehensive Library for Benchmarking Multi-class Visual Anomaly Detection [52.228708947607636]
This paper introduces a comprehensive visual anomaly detection benchmark, ADer, which is a modular framework for new methods.
The benchmark includes multiple datasets from industrial and medical domains and implements fifteen state-of-the-art methods and nine comprehensive metrics.
We objectively reveal the strengths and weaknesses of different methods and provide insights into the challenges and future directions of multi-class visual anomaly detection.
arXiv Detail & Related papers (2024-06-05T13:40:07Z)
- DebugBench: Evaluating Debugging Capability of Large Language Models [80.73121177868357]
DebugBench is a benchmark for evaluating the debugging capability of Large Language Models (LLMs).
It covers four major bug categories and 18 minor types in C++, Java, and Python.
We evaluate two commercial and four open-source models in a zero-shot scenario.
arXiv Detail & Related papers (2024-01-09T15:46:38Z)
- The Devil Is in the Command Line: Associating the Compiler Flags With the Binary and Build Metadata [0.0]
Defects caused by an undesired combination of compiler flags are common in nontrivial software projects.
A queryable database of how the compiler compiled and linked the software system will help detect such defects earlier.
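A minimal sketch of such a queryable store, assuming each compile/link command can be intercepted (for example, from a compilation database); the schema, field names, and the example flag-mismatch query are illustrative, not the paper's:

```python
import sqlite3

conn = sqlite3.connect("build_metadata.db")
conn.execute("""
    CREATE TABLE IF NOT EXISTS compile_commands (
        binary      TEXT,  -- output artifact the object ends up in
        source_file TEXT,
        compiler    TEXT,
        flags       TEXT   -- full flag string as invoked
    )
""")

def record(binary: str, source: str, compiler: str, flags: str) -> None:
    """Store one intercepted compile command."""
    conn.execute("INSERT INTO compile_commands VALUES (?, ?, ?, ?)",
                 (binary, source, compiler, flags))
    conn.commit()

# Example query: objects linked into one binary that were compiled without
# -O2 while the rest of the build used it (an undesired flag combination).
suspicious = conn.execute(
    "SELECT source_file, flags FROM compile_commands "
    "WHERE binary = ? AND flags NOT LIKE '%-O2%'",
    ("firmware.elf",)
).fetchall()
```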
arXiv Detail & Related papers (2023-12-20T22:27:32Z)
- Guess & Sketch: Language Model Guided Transpilation [59.02147255276078]
Learned transpilation offers an alternative to manual rewriting and engineering efforts.
Probabilistic neural language models (LMs) produce plausible outputs for every input, but do so at the cost of guaranteed correctness.
Guess & Sketch extracts alignment and confidence information from features of the LM, then passes it to a symbolic solver to resolve semantic equivalence.
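A toy rendering of that step: keep high-confidence tokens from the LM's guess and turn low-confidence ones into holes for the solver to fill. The probabilities and threshold are invented, and the alignment features the paper also uses are omitted:

```python
import math

def to_sketch(tokens, logprobs, threshold=math.log(0.9)):
    """Replace tokens whose log-probability falls below the threshold with '??' holes."""
    return [tok if lp >= threshold else "??"
            for tok, lp in zip(tokens, logprobs)]

guess = ["mov", "r0,", "#4", "bl", "malloc"]  # LM's guessed target-code tokens
lps   = [-0.01, -0.02, -1.9, -0.03, -0.05]    # made-up per-token log-probabilities
print(to_sketch(guess, lps))  # ['mov', 'r0,', '??', 'bl', 'malloc']
```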
arXiv Detail & Related papers (2023-09-25T15:42:18Z)
- Dcc --help: Generating Context-Aware Compiler Error Explanations with Large Language Models [53.04357141450459]
dcc --help was deployed to our CS1 and CS2 courses, with 2,565 students using the tool over 64,000 times in ten weeks.
We found that the LLM-generated explanations were conceptually accurate in 90% of compile-time and 75% of run-time cases, but often disregarded the instruction not to provide solutions in code.
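A hedged sketch of the overall pattern (compile, and on failure ask an LLM for an explanation with an instruction not to supply code); dcc's actual backend, prompts, and guardrails are not reproduced here, and the OpenAI SDK is only a stand-in:

```python
import subprocess
from openai import OpenAI  # assumption: OpenAI Python SDK as a stand-in backend

client = OpenAI()

def explain_compile_error(source: str, compiler: str = "gcc") -> str | None:
    """Compile a C file; on failure, ask an LLM to explain the error without giving code."""
    result = subprocess.run([compiler, source, "-o", "a.out"],
                            capture_output=True, text=True)
    if result.returncode == 0:
        return None  # compiled cleanly, nothing to explain
    prompt = ("Explain this C compiler error to a first-year student in plain "
              "language. Do not provide corrected code.\n\n" + result.stderr)
    response = client.chat.completions.create(
        model="gpt-4",  # hypothetical model choice
        messages=[{"role": "user", "content": prompt}],
    )
    return response.choices[0].message.content
```

As the abstract notes, the instruction not to provide solutions in code was often disregarded in practice.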
arXiv Detail & Related papers (2023-08-23T02:36:19Z)
- Large-scale Crash Localization using Multi-Task Learning [3.4383679424643456]
We develop a novel multi-task sequence labeling approach for identifying blamed frames in stack traces.
We evaluate our model with over a million real-world crashes from four popular Microsoft applications.
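For contrast with the learned approach, a common heuristic baseline blames the first frame that belongs to the application's own modules; module names below are invented for illustration:

```python
# Walk the stack from the crash site outward and blame the first frame
# owned by the application, skipping system libraries.
APP_MODULES = {"myapp.exe", "myapp_core.dll"}  # hypothetical module names

def blame_frame(stack: list[dict]) -> dict | None:
    """stack: frames ordered from crash site outward, each {'module', 'function'}."""
    for frame in stack:
        if frame["module"] in APP_MODULES:
            return frame
    return stack[0] if stack else None  # fall back to the crashing frame

crash = [
    {"module": "ntdll.dll",      "function": "RtlFreeHeap"},
    {"module": "myapp_core.dll", "function": "Buffer::release"},
    {"module": "myapp.exe",      "function": "main"},
]
print(blame_frame(crash))  # blames Buffer::release in myapp_core.dll
```

The paper's multi-task sequence labeler learns this kind of decision from data instead of relying on a fixed module list.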
arXiv Detail & Related papers (2021-09-29T10:26:57Z)
- OutlierNets: Highly Compact Deep Autoencoder Network Architectures for On-Device Acoustic Anomaly Detection [77.23388080452987]
Human operators often diagnose industrial machinery via anomalous sounds.
Deep learning-driven anomaly detection methods often require extensive computational resources, which prohibits their deployment in factories.
Here we explore a machine-driven design exploration strategy to create OutlierNets, a family of highly compact deep convolutional autoencoder network architectures.
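A minimal convolutional autoencoder in the spirit of OutlierNets, sketched in PyTorch; the real architectures are machine-designed and far more compact and tuned, and the spectrogram input shape here is an assumption:

```python
import torch
import torch.nn as nn

class TinyAudioAE(nn.Module):
    """Reconstructs a mel-spectrogram patch; high reconstruction error flags an anomaly."""
    def __init__(self):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv2d(1, 8, 3, stride=2, padding=1), nn.ReLU(),   # 64x64 -> 32x32
            nn.Conv2d(8, 16, 3, stride=2, padding=1), nn.ReLU(),  # 32x32 -> 16x16
        )
        self.decoder = nn.Sequential(
            nn.ConvTranspose2d(16, 8, 3, stride=2, padding=1, output_padding=1), nn.ReLU(),
            nn.ConvTranspose2d(8, 1, 3, stride=2, padding=1, output_padding=1),
        )

    def forward(self, x):
        return self.decoder(self.encoder(x))

model = TinyAudioAE()
patch = torch.randn(1, 1, 64, 64)                # one spectrogram patch (assumed shape)
score = torch.mean((model(patch) - patch) ** 2)  # anomaly score: reconstruction MSE
```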
arXiv Detail & Related papers (2021-03-31T04:09:30Z)