Related papers: Statistical-Based Metric Threshold Setting Method for Software Fault Prediction in Firmware Projects: An Industrial Experience

Statistical-Based Metric Threshold Setting Method for Software Fault Prediction in Firmware Projects: An Industrial Experience

URL: http://arxiv.org/abs/2602.06831v1
Date: Fri, 06 Feb 2026 16:19:36 GMT
Title: Statistical-Based Metric Threshold Setting Method for Software Fault Prediction in Firmware Projects: An Industrial Experience
Authors: Marco De Luca, Domenico Amalfitano, Anna Rita Fasolino, Porfirio Tramontana,
Abstract summary: Machine learning-based fault prediction models have demonstrated high accuracy, but their lack of interpretability limits their adoption in industrial settings.<n>We present a structured process for defining context-specific software metric thresholds suitable for integration into fault detection in industrial settings.<n>Our approach supports cross-project fault prediction by deriving thresholds from one set of projects and applying them to independently developed firmware.
Score: 4.339839287869652
License: http://creativecommons.org/licenses/by-nc-nd/4.0/
Abstract: Ensuring software quality in embedded firmware is critical, especially in safety-critical domains where compliance with functional safety standards (ISO 26262) requires strong guarantees of software reliability. While machine learning-based fault prediction models have demonstrated high accuracy, their lack of interpretability limits their adoption in industrial settings. Developers need actionable insights that can be directly employed in software quality assurance processes and guide defect mitigation strategies. In this paper, we present a structured process for defining context-specific software metric thresholds suitable for integration into fault detection workflows in industrial settings. Our approach supports cross-project fault prediction by deriving thresholds from one set of projects and applying them to independently developed firmware, thereby enabling reuse across similar software systems without retraining or domain-specific tuning. We analyze three real-world C-embedded firmware projects provided by an industrial partner, using Coverity and Understand static analysis tools to extract software metrics. Through statistical analysis and hypothesis testing, we identify discriminative metrics and derived empirical threshold values capable of distinguishing faulty from non-faulty functions. The derived thresholds are validated through an experimental evaluation, demonstrating their effectiveness in identifying fault-prone functions with high precision. The results confirm that the derived thresholds can serve as an interpretable solution for fault prediction, aligning with industry standards and SQA practices. This approach provides a practical alternative to black-box AI models, allowing developers to systematically assess software quality, take preventive actions, and integrate metric-based fault prediction into industrial development workflows to mitigate software faults.

Related papers

Towards Comprehensive Benchmarking Infrastructure for LLMs In Software Engineering [19.584762693453893]
BEHELM is a holistic benchmarking infrastructure that unifies software-scenario specification with multi-metric evaluation.<n>Our goal is to reduce the overhead currently required to construct benchmarks while enabling a fair, realistic, and future-proof assessment of LLMs in software engineering.
arXiv Detail & Related papers (2026-01-28T21:55:10Z)
Towards Verifiably Safe Tool Use for LLM Agents [53.55621104327779]
Large language model (LLM)-based AI agents extend capabilities by enabling access to tools such as data sources, APIs, search engines, code sandboxes, and even other agents.<n>LLMs may invoke unintended tool interactions and introduce risks, such as leaking sensitive data or overwriting critical records.<n>Current approaches to mitigate these risks, such as model-based safeguards, enhance agents' reliability but cannot guarantee system safety.
arXiv Detail & Related papers (2026-01-12T21:31:38Z)
Reliable LLM-Based Edge-Cloud-Expert Cascades for Telecom Knowledge Systems [54.916243942641444]
Large language models (LLMs) are emerging as key enablers of automation in domains such as telecommunications.<n>We study an edge-cloud-expert cascaded LLM-based knowledge system that supports decision-making through a question-and-answer pipeline.
arXiv Detail & Related papers (2025-12-23T03:10:09Z)
OSS-UAgent: An Agent-based Usability Evaluation Framework for Open Source Software [47.02288620982592]
Our framework employs intelligent agents powered by large language models (LLMs) to simulate developers performing programming tasks.<n> OSS-UAgent ensures accurate and context-aware code generation.<n>Our demonstration showcases OSS-UAgent's practical application in evaluating graph analytics platforms.
arXiv Detail & Related papers (2025-05-29T08:40:10Z)
Training Language Models to Generate Quality Code with Program Analysis Feedback [66.0854002147103]
Code generation with large language models (LLMs) is increasingly adopted in production but fails to ensure code quality.<n>We propose REAL, a reinforcement learning framework that incentivizes LLMs to generate production-quality code.
arXiv Detail & Related papers (2025-05-28T17:57:47Z)
Object Oriented-Based Metrics to Predict Fault Proneness in Software Design [1.194799054956877]
We look at the relationship between object-oriented software metrics and their implications on fault proneness.<n>Studies indicate that object-oriented metrics are indeed a good predictor of software fault proneness.
arXiv Detail & Related papers (2025-04-11T03:29:15Z)
Enhancing LLM Reliability via Explicit Knowledge Boundary Modeling [41.19330514054401]
Large language models (LLMs) are prone to hallucination stemming from misaligned self-awareness.<n>We propose the Explicit Knowledge Boundary Modeling framework to integrate fast and slow reasoning systems to harmonize reliability and usability.
arXiv Detail & Related papers (2025-03-04T03:16:02Z)
A Hybrid Framework for Statistical Feature Selection and Image-Based Noise-Defect Detection [55.2480439325792]
This paper presents a hybrid framework that integrates both statistical feature selection and classification techniques to improve defect detection accuracy.<n>We present around 55 distinguished features that are extracted from industrial images, which are then analyzed using statistical methods.<n>By integrating these methods with flexible machine learning applications, the proposed framework improves detection accuracy and reduces false positives and misclassifications.
arXiv Detail & Related papers (2024-12-11T22:12:21Z)
Outside the Comfort Zone: Analysing LLM Capabilities in Software Vulnerability Detection [9.652886240532741]
This paper thoroughly analyses large language models' capabilities in detecting vulnerabilities within source code. We evaluate the performance of six open-source models that are specifically trained for vulnerability detection against six general-purpose LLMs.
arXiv Detail & Related papers (2024-08-29T10:00:57Z)
Multi Agent System for Machine Learning Under Uncertainty in Cyber Physical Manufacturing System [78.60415450507706]
Recent advancements in predictive machine learning has led to its application in various use cases in manufacturing. Most research focused on maximising predictive accuracy without addressing the uncertainty associated with it. In this paper, we determine the sources of uncertainty in machine learning and establish the success criteria of a machine learning system to function well under uncertainty.
arXiv Detail & Related papers (2021-07-28T10:28:05Z)

This list is automatically generated from the titles and abstracts of the papers in this site.

This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.