Leveraging Propagated Infection to Crossfire Mutants
- URL: http://arxiv.org/abs/2411.09846v1
- Date: Thu, 14 Nov 2024 23:31:26 GMT
- Title: Leveraging Propagated Infection to Crossfire Mutants
- Authors: Hang Du, Vijay Krishna Palepu, James A. Jones,
- Abstract summary: When tests are insufficient, each surviving mutant provides an opportunity to improve the test suite.
Many surviving mutants are detectable by simply augmenting existing tests with additional assertions.
We build upon prior research that identifies crossfiring'' opportunities -- tests that coincidentally kill multiple mutants.
- Score: 4.229296050697151
- License:
- Abstract: Mutation testing was proposed to identify weaknesses in test suites by repeatedly generating artificially faulty versions of the software (mutants) and determining if the test suite is sufficient to detect them (kill them). When the tests are insufficient, each surviving mutant provides an opportunity to improve the test suite. We conducted a study and found that many such surviving mutants (up to 84% for the subjects of our study) are detectable by simply augmenting existing tests with additional assertions, or assertion amplification. Moreover, we find that many of these mutants are detectable by multiple existing tests, giving developers options for how to detect them. To help with these challenges, we created a technique that performs memory-state analysis to identify candidate assertions that developers can use to detect the surviving mutants. Additionally, we build upon prior research that identifies ``crossfiring'' opportunities -- tests that coincidentally kill multiple mutants. To this end, we developed a theoretical model that describes the varying granularities that crossfiring can occur in the existing test suite, which provide opportunities and options for how to kill surviving mutants. We operationalize this model to an accompanying technique that optimizes the assertion amplification of the existing tests to crossfire multiple mutants with fewer added assertions, optionally concentrated within fewer tests. Our experiments show that we can kill all surviving mutants that are detectable with existing test data with only 1.1% of the identified assertion candidates, and increasing by a factor of 6x, on average, the number of killed mutants from amplified tests, over tests that do not crossfire.
Related papers
- An Exploratory Study on Using Large Language Models for Mutation Testing [32.91472707292504]
Large Language Models (LLMs) have shown great potential in code-related tasks but their utility in mutation testing remains unexplored.
This paper investigates the performance of LLMs in generating effective mutations to their usability, fault detection potential, and relationship with real bugs.
We find that compared to existing approaches, LLMs generate more diverse mutations that are behaviorally closer to real bugs.
arXiv Detail & Related papers (2024-06-14T08:49:41Z) - An Empirical Evaluation of Manually Created Equivalent Mutants [54.02049952279685]
Less than 10 % of manually created mutants are equivalent.
Surprisingly, our findings indicate that a significant portion of developers struggle to accurately identify equivalent mutants.
arXiv Detail & Related papers (2024-04-14T13:04:10Z) - Contextual Predictive Mutation Testing [17.832774161583036]
We introduce MutationBERT, an approach for predictive mutation testing that simultaneously encodes the source method mutation and test method.
Thanks to its higher precision, MutationBERT saves 33% of the time spent by a prior approach on checking/verifying live mutants.
We validate our input representation, and aggregation approaches for lifting predictions from the test matrix level to the test suite level, finding similar improvements in performance.
arXiv Detail & Related papers (2023-09-05T17:00:15Z) - Systematic Assessment of Fuzzers using Mutation Analysis [20.91546707828316]
In software testing, the gold standard for evaluating test quality is mutation analysis.
mutation analysis subsumes various coverage measures and provides a large and diverse set of faults.
We apply modern mutation analysis techniques that pool multiple mutations and allow us -- for the first time -- to evaluate and compare fuzzers with mutation analysis.
arXiv Detail & Related papers (2022-12-06T15:47:47Z) - Statistical and Computational Phase Transitions in Group Testing [73.55361918807883]
We study the group testing problem where the goal is to identify a set of k infected individuals carrying a rare disease.
We consider two different simple random procedures for assigning individuals tests.
arXiv Detail & Related papers (2022-06-15T16:38:50Z) - Sequential Permutation Testing of Random Forest Variable Importance
Measures [68.8204255655161]
It is proposed here to use sequential permutation tests and sequential p-value estimation to reduce the high computational costs associated with conventional permutation tests.
The results of simulation studies confirm that the theoretical properties of the sequential tests apply.
The numerical stability of the methods is investigated in two additional application studies.
arXiv Detail & Related papers (2022-06-02T20:16:50Z) - Efficient Test-Time Model Adaptation without Forgetting [60.36499845014649]
Test-time adaptation seeks to tackle potential distribution shifts between training and testing data.
We propose an active sample selection criterion to identify reliable and non-redundant samples.
We also introduce a Fisher regularizer to constrain important model parameters from drastic changes.
arXiv Detail & Related papers (2022-04-06T06:39:40Z) - DeepMetis: Augmenting a Deep Learning Test Set to Increase its Mutation
Score [4.444652484439581]
tool is effective at augmenting the given test set, increasing its capability to detect mutants by 63% on average.
A leave-one-out experiment shows that the augmented test set is capable of exposing unseen mutants.
arXiv Detail & Related papers (2021-09-15T18:20:50Z) - An Exploration of Exploration: Measuring the ability of lexicase
selection to find obscure pathways to optimality [62.997667081978825]
We introduce an "exploration diagnostic" that diagnoses a selection scheme's capacity for search space exploration.
We verify that lexicase selection out-explores tournament selection.
We find that relaxing lexicase's elitism with epsilon lexicase can further improve exploration.
arXiv Detail & Related papers (2021-07-20T20:43:06Z) - Noisy Adaptive Group Testing using Bayesian Sequential Experimental
Design [63.48989885374238]
When the infection prevalence of a disease is low, Dorfman showed 80 years ago that testing groups of people can prove more efficient than testing people individually.
Our goal in this paper is to propose new group testing algorithms that can operate in a noisy setting.
arXiv Detail & Related papers (2020-04-26T23:41:33Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.