Related papers: Loop unrolling: formal definition and application to testing

Loop unrolling: formal definition and application to testing

URL: http://arxiv.org/abs/2502.15535v1
Date: Fri, 21 Feb 2025 15:36:21 GMT
Title: Loop unrolling: formal definition and application to testing
Authors: Li Huang, Bertrand Meyer, Reto Weber,
Abstract summary: Testing processes usually aim at high coverage, but loops severely limit coverage ambitions since the number of iterations is generally not predictable.<n>This article provides a formal definition and a set of formal properties of unrolling.<n>Using this definition as the conceptual basis, we have applied an unrolling strategy to an existing automated testing framework.
Score: 33.432652829284244
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Testing processes usually aim at high coverage, but loops severely limit coverage ambitions since the number of iterations is generally not predictable. Most testing teams address this issue by adopting the extreme solution of limiting themselves to branch coverage, which only considers loop executions that iterate the body either once or not at all. This approach misses any bug that only arises after two or more iterations. To achieve more meaningful coverage, testing strategies may unroll loops, in the sense of using executions that iterate loops up to n times for some n greater than one, chosen pragmatically in consideration of the available computational power. While loop unrolling is a standard part of compiler optimization techniques, its use in testing is far less common. Part of the reason is that the concept, while seemingly intuitive, lacks a generally accepted and precise specification. The present article provides a formal definition and a set of formal properties of unrolling. All the properties have mechanically been proved correct (through the Isabelle proof assistant). Using this definition as the conceptual basis, we have applied an unrolling strategy to an existing automated testing framework and report the results: how many more bugs get detected once we unroll loops more than once? These results provide a first assessment of whether unrolling should become a standard part of test generation and test coverage measurement.

Related papers

Efficient Dynamic Test Case Generation for Path-Based Coverage Criteria [2.099922236065961]
We present a novel approach to test-case generation that satisfies four white-box, path-based coverage criteria.<n>Our method builds on a modified version of Johnson algorithm and enables test cases to be generated incrementally and on demand.
arXiv Detail & Related papers (2026-02-21T09:26:23Z)
How Many Code and Test Cases Are Enough? Evaluating Test Cases Generation from a Binary-Matrix Perspective [51.30005925128432]
evaluating test cases automatically generated by Large Language Models (LLMs) is a critical yet challenging task.<n>Existing benchmarks suffer from high computational costs, score inflation, and a bias towards trivial bugs over rare, critical faults.<n>We introduce a framework that formalizes benchmark construction as finding an optimal diagnostic basis in a binary code-test matrix.
arXiv Detail & Related papers (2025-10-09T18:29:24Z)
Studying the Impact of Early Test Termination Due to Assertion Failure on Code Coverage and Spectrum-based Fault Localization [48.22524837906857]
This study is the first empirical study on early test termination due to assertion failure. We investigated 207 versions of 6 open-source projects. Our findings indicate that early test termination harms both code coverage and the effectiveness of spectrum-based fault localization.
arXiv Detail & Related papers (2025-04-06T17:14:09Z)
Sample, Don't Search: Rethinking Test-Time Alignment for Language Models [55.2480439325792]
We introduce QAlign, a new test-time alignment approach. As we scale test-time compute, QAlign converges to sampling from the optimal aligned distribution for each individual prompt. By adopting recent advances in Markov chain Monte Carlo for text generation, our method enables better-aligned outputs without modifying the underlying model or even requiring logit access.
arXiv Detail & Related papers (2025-04-04T00:41:40Z)
Simple and Provable Scaling Laws for the Test-Time Compute of Large Language Models [70.07661254213181]
We propose two principled algorithms for the test-time compute of large language models.<n>We prove theoretically that the failure probability of one algorithm decays to zero exponentially as its test-time compute grows.
arXiv Detail & Related papers (2024-11-29T05:29:47Z)
A formal definition of loop unrolling with applications to test coverage [37.48416208168878]
Techniques to achieve various forms of test coverage, such as branch coverage, typically do not iterate loops. More recent work has shown that by "unrolling" loops the approach can find significantly more bugs.
arXiv Detail & Related papers (2024-03-13T19:28:04Z)
Precise Error Rates for Computationally Efficient Testing [75.63895690909241]
We revisit the question of simple-versus-simple hypothesis testing with an eye towards computational complexity. An existing test based on linear spectral statistics achieves the best possible tradeoff curve between type I and type II error rates.
arXiv Detail & Related papers (2023-11-01T04:41:16Z)
Sequential Kernelized Independence Testing [101.22966794822084]
We design sequential kernelized independence tests inspired by kernelized dependence measures. We demonstrate the power of our approaches on both simulated and real data.
arXiv Detail & Related papers (2022-12-14T18:08:42Z)
Test2Vec: An Execution Trace Embedding for Test Case Prioritization [12.624724734296342]
Execution traces of test cases can be a good alternative to abstract their behavior for automated testing tasks. We propose a novel embedding approach, Test2Vec, that maps test execution traces to a latent space. Results show that our proposed TP improves best alternatives by 41.80% in terms of the median normalized rank of the first failing test case.
arXiv Detail & Related papers (2022-06-28T20:38:36Z)
A Parallel Implementation of Computing Mean Average Precision [0.130536490219656]
Mean Average Precision (mAP) has been widely used for evaluating the quality of object detectors. Current implementations can only count true positives (TP's) and false positives (FP's) for one class at a time. We propose a parallelized alternative that can process mini-batches of detected bounding boxes.
arXiv Detail & Related papers (2022-06-19T23:23:52Z)
Frustratingly Simple Few-Shot Object Detection [98.42824677627581]
We find that fine-tuning only the last layer of existing detectors on rare classes is crucial to the few-shot object detection task. Such a simple approach outperforms the meta-learning methods by roughly 220 points on current benchmarks.
arXiv Detail & Related papers (2020-03-16T00:29:14Z)
Genetic Algorithms for Redundancy in Interaction Testing [0.6396288020763143]
Interaction testing involves designing a suite of tests, which guarantees to detect a fault if one exists among a small number of components interacting together. Existing algorithms for constructing these test suites usually involve one "fast" algorithm for generating most of the tests, and another "slower" algorithm to "complete" the test suite. We employ a genetic algorithm that generalizes these approaches that also incorporates redundancy by increasing the number of algorithms chosen, which we call "stages"
arXiv Detail & Related papers (2020-02-13T10:16:46Z)

This list is automatically generated from the titles and abstracts of the papers in this site.