QuCheck: A Property-based Testing Framework for Quantum Programs in Qiskit
- URL: http://arxiv.org/abs/2503.22641v1
- Date: Fri, 28 Mar 2025 17:30:09 GMT
- Title: QuCheck: A Property-based Testing Framework for Quantum Programs in Qiskit
- Authors: Gabriel Pontolillo, Mohammad Reza Mousavi, Marek Grzesiuk
- Abstract summary: Property-based testing has been previously proposed for quantum programs in Q# with QSharpCheck. We propose QuCheck, an enhanced property-based testing framework in Qiskit.
- Score: 0.5735035463793009
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Property-based testing has been previously proposed for quantum programs in Q# with QSharpCheck; however, that implementation was limited in functionality, lacked extensibility, and was evaluated on a narrow range of programs using a single property. To address these limitations, we propose QuCheck, an enhanced property-based testing framework in Qiskit. By leveraging Qiskit and the broader Python ecosystem, QuCheck facilitates property construction, introduces flexible input generators and assertions, and supports expressive preconditions. We assessed its effectiveness through mutation analysis on five quantum programs (2-10 qubits), varying the number of properties, inputs, and measurement shots to evaluate their impact on fault detection and demonstrate the effectiveness of property-based testing across a range of conditions. Results show a strong positive correlation between the mutation score (a measure of fault detection) and the number of properties evaluated, and a moderate negative correlation between the false positive rate and the number of measurement shots. Among the most thorough test configurations, those evaluating three properties achieved a mean mutation score ranging from 0.90 to 0.92 across all five algorithms, with a false positive rate between 0 and 0.04. QuCheck identified 36.0% more faults than QSharpCheck and reduced execution time by 81.1%, despite one false positive. These findings underscore the viability of property-based testing for verifying quantum systems.
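The core idea the abstract describes, generating random inputs for a quantum program and asserting that a property holds on every one, can be sketched in plain Python. This is a minimal illustration of the testing pattern only; the function names, the chosen property (Hadamard is self-inverse), and the tolerance are hypothetical and do not reflect QuCheck's actual API:

```python
import cmath
import math
import random

def hadamard(state):
    """Apply the single-qubit Hadamard gate to amplitudes (a, b)."""
    a, b = state
    s = 1 / math.sqrt(2)
    return (s * (a + b), s * (a - b))

def random_state():
    """Input generator: a random normalized single-qubit state."""
    theta = random.uniform(0, math.pi)
    phi = random.uniform(0, 2 * math.pi)
    return (math.cos(theta / 2), cmath.exp(1j * phi) * math.sin(theta / 2))

def check_property(n_inputs=100, tol=1e-9):
    """Assertion: H(H|psi>) == |psi> for every generated input."""
    for _ in range(n_inputs):
        psi = random_state()
        out = hadamard(hadamard(psi))
        if any(abs(x - y) > tol for x, y in zip(out, psi)):
            return False
    return True

print(check_property())  # True: the property holds on all sampled inputs
```

In a real framework the assertion would compare measurement statistics over many shots rather than exact amplitudes, which is why the abstract reports a false positive rate that shrinks as the number of shots grows.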
Related papers
- Detecting Flaky Tests in Quantum Software: A Dynamic Approach [4.46640294257026]
Flaky tests that pass or fail nondeterministically without changes to code or environment pose a serious threat to software reliability. This paper presents the first large-scale dynamic characterization of flaky tests in quantum software. We executed the Qiskit Terra test suite 10,000 times across 23 releases in controlled environments.
arXiv Detail & Related papers (2025-12-19T21:47:31Z) - QSentry: Backdoor Detection for Quantum Neural Networks via Measurement Clustering [43.44248599606903]
Quantum neural networks (QNNs) are an important model for implementing quantum machine learning (QML). This work establishes a practical and effective framework for mitigating backdoor threats in QML.
arXiv Detail & Related papers (2025-11-19T12:08:11Z) - ORFuzz: Fuzzing the "Other Side" of LLM Safety -- Testing Over-Refusal [27.26251627767238]
Large Language Models (LLMs) increasingly exhibit over-refusal - erroneously rejecting benign queries due to overly conservative safety measures. This paper introduces the first evolutionary testing framework, ORFuzz, for the systematic detection and analysis of LLM over-refusals.
arXiv Detail & Related papers (2025-08-15T05:03:26Z) - Intention-Driven Generation of Project-Specific Test Cases [45.2380093475221]
We propose IntentionTest, which generates project-specific tests given the description of validation intention. We extensively evaluate IntentionTest against state-of-the-art baselines (DA, ChatTester, and EvoSuite) on 4,146 test cases from 13 open-source projects.
arXiv Detail & Related papers (2025-07-28T08:35:04Z) - Calibration of Quantum Devices via Robust Statistical Methods [45.464983015777314]
We numerically analyze advanced statistical methods for Bayesian inference against the state-of-the-art in quantum parameter learning. We show advantages of these approaches over existing ones, namely under multi-modality and high dimensionality. Our findings have applications in challenging quantum characterization tasks, namely learning the dynamics of open quantum systems.
arXiv Detail & Related papers (2025-07-09T15:22:17Z) - DS@GT at CheckThat! 2025: Evaluating Context and Tokenization Strategies for Numerical Fact Verification [49.1574468325115]
Numerical claims, statements involving quantities, comparisons, and temporal references pose unique challenges for automated fact-checking systems. We evaluate modeling strategies for veracity prediction of such claims using the QuanTemp dataset and building our own evidence retrieval pipeline. Our best-performing system achieves a competitive macro-average F1 score of 0.57 and places us among the Top-4 submissions in Task 3 of CheckThat! 2025.
arXiv Detail & Related papers (2025-07-08T17:22:22Z) - Bloch Vector Assertions for Debugging Quantum Programs [3.8028747063484594]
Bloq is a scalable, automated fault localization approach. We introduce AutoBloq, a component of Bloq for automatically generating assertion schemes from quantum algorithms.
arXiv Detail & Related papers (2025-06-23T09:53:02Z) - Bug Classification in Quantum Software: A Rule-Based Framework and Its Evaluation [1.1510009152620668]
This paper presents an automated framework for classifying issues in quantum software repositories by bug type, category, severity, and impacted quality attributes. The framework achieves up to 85.21% accuracy, with F1-scores ranging from 0.7075 (severity) to 0.8393 (quality attribute). A review of 1,550 quantum-specific bugs showed that over half involved quantum circuit-level problems, followed by gate errors and hardware-related issues.
arXiv Detail & Related papers (2025-06-12T06:42:10Z) - VALTEST: Automated Validation of Language Model Generated Test Cases [0.7059472280274008]
Large Language Models (LLMs) have demonstrated significant potential in automating software testing, specifically in generating unit test cases.
This paper introduces VALTEST, a novel framework designed to automatically validate test cases generated by LLMs by leveraging token probabilities.
arXiv Detail & Related papers (2024-11-13T00:07:32Z) - Automating Quantum Software Maintenance: Flakiness Detection and Root Cause Analysis [4.554856650068748]
Flaky tests, which pass or fail inconsistently without code changes, are a major challenge in software engineering.
We aim to create an automated framework to detect flaky tests in quantum software.
arXiv Detail & Related papers (2024-10-31T02:43:04Z) - QuanTest: Entanglement-Guided Testing of Quantum Neural Network Systems [45.18451374144537]
Quantum Neural Network (QNN) combines the Deep Learning (DL) principle with the fundamental theory of quantum mechanics to achieve machine learning tasks with quantum acceleration.
QNN systems differ significantly from traditional quantum software and classical DL systems, posing critical challenges for QNN testing.
We propose QuanTest, a quantum entanglement-guided adversarial testing framework to uncover potential erroneous behaviors in QNN systems.
arXiv Detail & Related papers (2024-02-20T12:11:28Z) - Calibrated Uncertainty Quantification for Operator Learning via Conformal Prediction [95.75771195913046]
We propose a risk-controlling quantile neural operator, a distribution-free, finite-sample functional calibration conformal prediction method.
We provide a theoretical calibration guarantee on the coverage rate, defined as the expected percentage of points on the function domain.
Empirical results on a 2D Darcy flow and a 3D car surface pressure prediction task validate our theoretical results.
arXiv Detail & Related papers (2024-02-02T23:43:28Z) - Precise Error Rates for Computationally Efficient Testing [75.63895690909241]
We revisit the question of simple-versus-simple hypothesis testing with an eye towards computational complexity.
An existing test based on linear spectral statistics achieves the best possible tradeoff curve between type I and type II error rates.
arXiv Detail & Related papers (2023-11-01T04:41:16Z) - Contextual Predictive Mutation Testing [17.832774161583036]
We introduce MutationBERT, an approach for predictive mutation testing that simultaneously encodes the source method mutation and test method.
Thanks to its higher precision, MutationBERT saves 33% of the time spent by a prior approach on checking/verifying live mutants.
We validate our input representation, and aggregation approaches for lifting predictions from the test matrix level to the test suite level, finding similar improvements in performance.
arXiv Detail & Related papers (2023-09-05T17:00:15Z) - Effective Test Generation Using Pre-trained Large Language Models and Mutation Testing [13.743062498008555]
We introduce MuTAP for improving the effectiveness of test cases generated by Large Language Models (LLMs) in terms of revealing bugs.
MuTAP is capable of generating effective test cases in the absence of natural language descriptions of the Programs Under Test (PUTs).
Our results show that our proposed method is able to detect up to 28% more faulty human-written code snippets.
arXiv Detail & Related papers (2023-08-31T08:48:31Z) - From Static Benchmarks to Adaptive Testing: Psychometrics in AI Evaluation [60.14902811624433]
We discuss a paradigm shift from static evaluation methods to adaptive testing.
This involves estimating the characteristics and value of each test item in the benchmark and dynamically adjusting items in real-time.
We analyze the current approaches, advantages, and underlying reasons for adopting psychometrics in AI evaluation.
arXiv Detail & Related papers (2023-06-18T09:54:33Z) - An ensemble meta-estimator to predict source code testability [1.4213973379473652]
The size of a test suite determines the test effort and cost, while the coverage measure indicates the test effectiveness.
This paper offers a new equation to estimate testability regarding the size and coverage of a given test suite.
arXiv Detail & Related papers (2022-08-20T06:18:16Z) - Statistical and Computational Phase Transitions in Group Testing [73.55361918807883]
We study the group testing problem where the goal is to identify a set of k infected individuals carrying a rare disease.
We consider two different simple random procedures for assigning individuals tests.
arXiv Detail & Related papers (2022-06-15T16:38:50Z) - Measuring NISQ Gate-Based Qubit Stability Using a 1+1 Field Theory and Cycle Benchmarking [50.8020641352841]
We study coherent errors on a quantum hardware platform using a transverse field Ising model Hamiltonian as a sample user application.
We identify inter-day and intra-day qubit calibration drift and the impacts of quantum circuit placement on groups of qubits in different physical locations on the processor.
This paper also discusses how these measurements can provide a better understanding of these types of errors and how they may improve efforts to validate the accuracy of quantum computations.
arXiv Detail & Related papers (2022-01-08T23:12:55Z) - Deconfounding Scores: Feature Representations for Causal Effect Estimation with Weak Overlap [140.98628848491146]
We introduce deconfounding scores, which induce better overlap without biasing the target of estimation.
We show that deconfounding scores satisfy a zero-covariance condition that is identifiable in observed data.
In particular, we show that this technique could be an attractive alternative to standard regularizations.
arXiv Detail & Related papers (2021-04-12T18:50:11Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences arising from its use.