Characterizing Bugs and Quality Attributes in Quantum Software: A Large-Scale Empirical Study
- URL: http://arxiv.org/abs/2512.24656v2
- Date: Fri, 02 Jan 2026 11:33:39 GMT
- Title: Characterizing Bugs and Quality Attributes in Quantum Software: A Large-Scale Empirical Study
- Authors: Mir Mohammad Yousuf, Shabir Ahmad Sofi
- Abstract summary: This study presents the first ecosystem-scale longitudinal analysis of software bugs across 123 open-source quantum repositories from 2012 to 2024. Full-stack libraries and compilers are the most bug-prone categories due to circuit, gate, and transpilation-related issues. High-severity bugs cluster in cryptography, experimental computing, and compiler toolchains.
- Score: 0.6445605125467574
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Quantum Software Engineering (QSE) is essential for ensuring the reliability and maintainability of hybrid quantum-classical systems, yet empirical evidence on how bugs emerge and affect quality in real-world quantum projects remains limited. This study presents the first ecosystem-scale longitudinal analysis of software bugs across 123 open-source quantum repositories from 2012 to 2024, spanning eight functional categories: full-stack libraries, simulators, annealing, algorithms, compilers, assembly, cryptography, and experimental computing. Using a mixed-method approach combining repository mining, static code analysis, issue metadata extraction, and a validated rule-based classification framework, we analyze 32,296 verified bug reports. Results show that full-stack libraries and compilers are the most bug-prone categories due to circuit, gate, and transpilation-related issues, while simulators are mainly affected by measurement and noise modeling errors. Classical bugs primarily impact usability and interoperability, whereas quantum-specific bugs disproportionately degrade performance, maintainability, and reliability. Longitudinal analysis indicates ecosystem maturation, with bug densities peaking between 2017 and 2021 and declining thereafter. High-severity bugs cluster in cryptography, experimental computing, and compiler toolchains. Repositories employing automated testing detect more bugs and resolve issues faster. A negative binomial regression further shows that automated testing is associated with an approximate 60 percent reduction in expected bug incidence. Overall, this work provides the first large-scale data-driven characterization of quantum software bugs and offers empirical guidance for improving testing, documentation, and maintainability practices in QSE.
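As a reading aid for the regression result above: the abstract does not report the fitted coefficient, so the following is only a sketch under the assumption that the negative binomial model uses the usual log link. In that case a coefficient beta on an "automated testing" indicator corresponds to an incidence rate ratio of exp(beta), and a 60 percent reduction corresponds to a ratio of about 0.4. The function names are hypothetical and are not taken from the paper.

```python
import math


def percent_change_from_coef(beta: float) -> float:
    """Percent change in the expected count implied by a log-link coefficient."""
    # Under a log link, the incidence rate ratio is exp(beta).
    return (math.exp(beta) - 1.0) * 100.0


def coef_from_percent_change(pct: float) -> float:
    """Log-link coefficient implied by a percent change in the expected count."""
    return math.log(1.0 + pct / 100.0)


# Assumption: "approximate 60 percent reduction" is read as a rate ratio of 0.4.
beta = coef_from_percent_change(-60.0)
print(f"implied coefficient: {beta:.3f}")
print(f"implied rate ratio:  {math.exp(beta):.2f}")
print(f"round trip:          {percent_change_from_coef(beta):.1f}%")
```

Under this reading, the reported effect would correspond to a coefficient near ln(0.4), roughly -0.92. The actual estimate and its confidence interval have to be taken from the paper itself.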
Related papers
- Outrunning LLM Cutoffs: A Live Kernel Crash Resolution Benchmark for All [57.23434868678603]
Live-kBench is an evaluation framework for self-evolving benchmarks that scrapes and evaluates agents on freshly discovered kernel bugs. kEnv is an agent-agnostic crash-resolution environment for kernel compilation, execution, and feedback. Using kEnv, we benchmark three state-of-the-art agents, showing that they resolve 74% of crashes on the first attempt.
arXiv Detail & Related papers (2026-02-02T19:06:15Z) - QEF: Reproducible and Exploratory Quantum Software Experiments [1.1683938179815823]
Quantum Experiment Framework (QEF) is designed to support the systematic, hypothesis-driven study of quantum algorithms. QEF captures all key aspects of quantum software and algorithm experiments through a concise specification. QEF supports parameter reuse to improve overall experiment runtimes.
arXiv Detail & Related papers (2025-11-06T17:17:55Z) - BugPilot: Complex Bug Generation for Efficient Learning of SWE Skills [59.003563837981886]
High quality bugs are key to training the next generation of language model based software engineering (SWE) agents. We introduce a novel method for synthetic generation of difficult and diverse bugs.
arXiv Detail & Related papers (2025-10-22T17:58:56Z) - Hybrid Quantum-Classical Neural Networks for Few-Shot Credit Risk Assessment [52.05742536403784]
This work tackles the challenge of few-shot credit risk assessment. We design and implement a novel hybrid quantum-classical workflow. A Quantum Neural Network (QNN) was trained via the parameter-shift rule. On a real-world credit dataset of 279 samples, our QNN achieved a robust average AUC of 0.852 +/- 0.027 in simulations and yielded an impressive AUC of 0.88 in the hardware experiment.
arXiv Detail & Related papers (2025-09-17T08:36:05Z) - Empirical Analysis of Temporal and Spatial Fault Characteristics in Multi-Fault Bug Repositories [45.208325853591475]
We present an empirical analysis of the temporal and spatial characteristics of faults existing in 16 open-source Java and Python projects. Our findings show that many faults in these software systems are long-lived, leading to the majority of software versions having multiple coexisting faults.
arXiv Detail & Related papers (2025-08-12T11:55:16Z) - BugScope: Learn to Find Bugs Like Human [9.05553442116139]
BugScope emulates how human auditors learn new bug patterns from representative examples and apply that knowledge during code auditing. Our evaluation on a dataset of 40 real-world bugs drawn from 21 widely-used open-source projects demonstrates that BugScope achieves 87.04% precision. Further testing on large-scale open-source systems, including the Linux kernel, uncovered 141 previously unknown bugs.
arXiv Detail & Related papers (2025-07-21T14:34:01Z) - Challenges and Practices in Quantum Software Testing and Debugging: Insights from Practitioners [7.856941186056147]
As quantum computing transitions from theory to implementation, developers face issues not present in classical software development. We surveyed 26 quantum software developers from academia and industry. Only 31% reported using quantum-specific testing tools, relying instead on manual methods.
arXiv Detail & Related papers (2025-06-18T02:52:37Z) - Bug Classification in Quantum Software: A Rule-Based Framework and Its Evaluation [1.1510009152620668]
This paper presents an automated framework for classifying issues in quantum software repositories by bug type, category, severity, and impacted quality attributes. The framework achieves up to 85.21% accuracy, with F1-scores ranging from 0.7075 (severity) to 0.8393 (quality attribute). A review of 1,550 quantum-specific bugs showed that over half involved quantum circuit-level problems, followed by gate errors and hardware-related issues.
arXiv Detail & Related papers (2025-06-12T06:42:10Z) - The Impact of Software Testing with Quantum Optimization Meets Machine Learning [0.4779196219827508]
This research presents a hybrid framework integrating Quantum Annealing with ML to optimize test case prioritization in CI/CD pipelines. It achieves a 25 percent increase in defect detection efficiency and a 30 percent reduction in test execution time versus classical ML. The framework addresses quantum hardware limits, CI/CD integration, and scalability for 2025's hybrid quantum-classical ecosystems.
arXiv Detail & Related papers (2025-06-02T15:04:10Z) - Bug Destiny Prediction in Large Open-Source Software Repositories through Sentiment Analysis and BERT Topic Modeling [3.481985817302898]
We leverage features available before a bug is resolved to enhance predictive accuracy. Our methodology incorporates sentiment analysis to derive both an emotionality score and a sentiment classification. Results demonstrate that sentiment analysis serves as a valuable predictor of a bug's eventual outcome.
arXiv Detail & Related papers (2025-04-22T15:18:14Z) - Comparative Analysis of Quantum and Classical Support Vector Classifiers for Software Bug Prediction: An Exploratory Study [8.214986715680737]
We investigate the practical application of Quantum Support Vector Classifiers (QSVC) for detecting buggy software commits. Our technique addresses large datasets in QSVC algorithms by dividing them into smaller subsets. We propose an aggregation method to combine predictions from these models to detect the entire test dataset.
arXiv Detail & Related papers (2025-01-08T18:53:50Z) - Leveraging Large Language Models for Efficient Failure Analysis in Game Development [47.618236610219554]
This paper proposes a new approach to automatically identify which change in the code caused a test to fail.
The method leverages Large Language Models (LLMs) to associate error messages with the corresponding code changes causing the failure.
Our approach reaches an accuracy of 71% in our newly created dataset, which comprises issues reported by developers at EA over a period of one year.
arXiv Detail & Related papers (2024-06-11T09:21:50Z) - Quantum Patch-Based Autoencoder for Anomaly Segmentation [44.99833362998488]
We introduce a patch-based quantum autoencoder (QPB-AE) for image anomaly segmentation.
QPB-AE reconstructs the quantum state of the embedded input patches, computing an anomaly map directly from measurement.
We evaluate its performance across multiple datasets and parameter configurations.
arXiv Detail & Related papers (2024-04-26T08:42:58Z) - Problem-Dependent Power of Quantum Neural Networks on Multi-Class Classification [83.20479832949069]
Quantum neural networks (QNNs) have become an important tool for understanding the physical world, but their advantages and limitations are not fully understood.
Here we investigate the problem-dependent power of quantum neural classifiers (QCs) on multi-class classification tasks.
Our work sheds light on the problem-dependent power of QNNs and offers a practical tool for evaluating their potential merit.
arXiv Detail & Related papers (2022-12-29T10:46:40Z) - An Empirical Study on Bug Severity Estimation using Source Code Metrics and Static Analysis [0.8621608193534838]
We study 3,358 buggy methods with different severity labels from 19 Java open-source projects.
Results show that code metrics are useful in predicting buggy code, but they cannot estimate the severity level of the bugs.
Our categorization shows that Security bugs have high severity in most cases while Edge/Boundary faults have low severity.
arXiv Detail & Related papers (2022-06-26T17:07:23Z) - Provable tradeoffs in adversarially robust classification [96.48180210364893]
We develop and leverage new tools, including recent breakthroughs from probability theory on robust isoperimetry.
Our results reveal fundamental tradeoffs between standard and robust accuracy that grow when data is imbalanced.
arXiv Detail & Related papers (2020-06-09T09:58:19Z)