Related papers: MuRS: Mutant Ranking and Suppression using Identifier Templates

URL: http://arxiv.org/abs/2306.09130v1
Date: Thu, 15 Jun 2023 13:43:52 GMT
Title: MuRS: Mutant Ranking and Suppression using Identifier Templates
Authors: Zimin Chen, Malgorzata Salawa, Manushree Vijayvergiya, Goran Petrovic, Marko Ivankovic and Rene Just
Abstract summary: Google's mutation testing service integrates diff-based mutation testing into the code review process. Google's mutation testing service implements a number of suppression rules, which target not-useful mutants. This paper proposes and evaluates MuRS, an automated approach that groups mutants by patterns in the source code under test.
Score: 4.9205581820379765
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Diff-based mutation testing is a mutation testing approach that only mutates lines affected by a code change under review. Google's mutation testing service integrates diff-based mutation testing into the code review process and continuously gathers developer feedback on mutants surfaced during code review. To enhance the developer experience, the mutation testing service implements a number of suppression rules, which target not-useful mutants, that is, mutants that have consistently received negative developer feedback. However, while effective, manually implementing suppression rules requires significant engineering time. An automatic system to rank and suppress mutants would facilitate the maintenance of the mutation testing service. This paper proposes and evaluates MuRS, an automated approach that groups mutants by patterns in the source code under test and uses these patterns to rank and suppress future mutants based on historical developer feedback on mutants in the same group. To evaluate MuRS, we conducted an A/B testing study, comparing MuRS to the existing mutation testing service. Despite the strong baseline, which uses manually developed suppression rules, the results show a statistically significantly lower negative feedback ratio of 11.45% for MuRS versus 12.41% for the baseline. The results also show that MuRS is able to recover existing suppression rules implemented in the baseline. Finally, the results show that statement-deletion mutant groups received both the most positive and negative developer feedback, suggesting a need for additional context that can distinguish between useful and not-useful mutants in these groups. Overall, MuRS has the potential to substantially reduce the development and maintenance cost for an effective mutation testing service by automatically learning suppression rules.
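
The grouping-and-ranking idea in the abstract can be illustrated with a minimal Python sketch. This is not the MuRS implementation: the group keys are treated as opaque strings (the abstract does not specify how identifier templates are built), and the names `Feedback`, `rank_and_suppress`, `threshold`, and `min_samples` are hypothetical choices made for illustration only.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Feedback:
    """One piece of historical developer feedback (hypothetical record type)."""
    group: str      # assumed group key, e.g. a mutation operator plus a source pattern
    negative: bool  # True if the developer marked the mutant as not useful


def group_stats(history):
    """Return {group: (negative_count, total_count)} from past feedback."""
    counts = defaultdict(lambda: [0, 0])
    for fb in history:
        counts[fb.group][0] += int(fb.negative)
        counts[fb.group][1] += 1
    return {group: (neg, total) for group, (neg, total) in counts.items()}


def rank_and_suppress(candidate_groups, history, threshold=0.5, min_samples=3):
    """Order candidate mutants by their group's negative feedback ratio.

    A candidate is suppressed when its group has at least `min_samples`
    feedback entries and a negative ratio above `threshold`. Both values
    are illustrative assumptions, not figures from the paper.
    """
    stats = group_stats(history)
    scored = []
    for group in candidate_groups:
        neg, total = stats.get(group, (0, 0))
        ratio = neg / total if total else 0.0  # unseen groups get ratio 0.0
        if total >= min_samples and ratio > threshold:
            continue  # suppress: this group has consistently negative feedback
        scored.append((ratio, group))
    scored.sort(key=lambda item: item[0])  # lowest negative ratio first
    return [group for _, group in scored]


# Made-up feedback history for three hypothetical groups.
history = [
    Feedback("delete:log_call", True),
    Feedback("delete:log_call", True),
    Feedback("delete:log_call", True),
    Feedback("negate:is_prefix", True),
    Feedback("negate:is_prefix", False),
    Feedback("negate:is_prefix", False),
]
ranked = rank_and_suppress(
    ["negate:is_prefix", "delete:log_call", "swap:arith"], history
)
print(ranked)
```

In this toy run the `delete:log_call` group is suppressed because all three of its feedback entries are negative, while the unseen `swap:arith` group ranks ahead of `negate:is_prefix`, which has one negative entry out of three.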

Related papers

On Mutation-Guided Unit Test Generation [9.938579776227506]
MUTGEN is a mutation-guided, LLM-based test generation approach. It significantly outperforms both EvoSuite and vanilla prompt-based strategies in terms of mutation score.
arXiv Detail & Related papers (2025-06-03T14:47:22Z)
A Simple yet Effective DDG Predictor is An Unsupervised Antibody Optimizer and Explainer [53.85265022754878]
We propose a lightweight DDG predictor (Light-DDG) for fast mutation screening. We also release a large-scale dataset containing millions of mutation data for pre-training Light-DDG. For the target antibody, we propose a novel Mutation Explainer to learn mutation preferences.
arXiv Detail & Related papers (2025-02-10T09:26:57Z)
METFORD -- Mutation tEsTing Framework fOR anDroid [0.0]
This research aims to contribute to reducing Android mutation testing costs. It implements mutation testing operators according to mutant schemata. Additional mutation operators can be implemented in JavaScript and easily integrated into the framework.
arXiv Detail & Related papers (2025-01-06T09:36:57Z)
Improving Bias Correction Standards by Quantifying its Effects on Treatment Outcomes [54.18828236350544]
Propensity score matching (PSM) addresses selection biases by selecting comparable populations for analysis. Different matching methods can produce significantly different Average Treatment Effects (ATE) for the same task, even when meeting all validation criteria. To address this issue, we introduce a novel metric, A2A, to reduce the number of valid matches.
arXiv Detail & Related papers (2024-07-20T12:42:24Z)
An Empirical Evaluation of Manually Created Equivalent Mutants [54.02049952279685]
Fewer than 10% of manually created mutants are equivalent. Surprisingly, our findings indicate that a significant portion of developers struggle to accurately identify equivalent mutants.
arXiv Detail & Related papers (2024-04-14T13:04:10Z)
AMRFact: Enhancing Summarization Factuality Evaluation with AMR-Driven Negative Samples Generation [57.8363998797433]
We propose AMRFact, a framework that generates perturbed summaries using Abstract Meaning Representations (AMRs). Our approach parses factually consistent summaries into AMR graphs and injects controlled factual inconsistencies to create negative examples, allowing for coherent factually inconsistent summaries to be generated with high error-type coverage.
arXiv Detail & Related papers (2023-11-16T02:56:29Z)
Contextual Predictive Mutation Testing [17.832774161583036]
We introduce MutationBERT, an approach for predictive mutation testing that simultaneously encodes the source method mutation and test method. Thanks to its higher precision, MutationBERT saves 33% of the time spent by a prior approach on checking/verifying live mutants. We validate our input representation, and aggregation approaches for lifting predictions from the test matrix level to the test suite level, finding similar improvements in performance.
arXiv Detail & Related papers (2023-09-05T17:00:15Z)
Mutation Testing of Deep Reinforcement Learning Based on Real Faults [11.584571002297217]
This paper builds on the existing approach of Mutation Testing (MT) to extend it to Reinforcement Learning (RL) systems. We show that the design choice of the mutation killing definition can affect whether or not a mutation is killed as well as the generated test cases.
arXiv Detail & Related papers (2023-01-13T16:45:56Z)
Systematic Assessment of Fuzzers using Mutation Analysis [20.91546707828316]
In software testing, the gold standard for evaluating test quality is mutation analysis. Mutation analysis subsumes various coverage measures and provides a large and diverse set of faults. We apply modern mutation analysis techniques that pool multiple mutations and allow us -- for the first time -- to evaluate and compare fuzzers with mutation analysis.
arXiv Detail & Related papers (2022-12-06T15:47:47Z)
Effective Mutation Rate Adaptation through Group Elite Selection [50.88204196504888]
This paper introduces the Group Elite Selection of Mutation Rates (GESMR) algorithm. GESMR co-evolves a population of solutions and a population of MRs, such that each MR is assigned to a group of solutions. With the same number of function evaluations and with almost no overhead, GESMR converges faster and to better solutions than previous approaches.
arXiv Detail & Related papers (2022-04-11T01:08:26Z)
MutFormer: A context-dependent transformer-based model to predict pathogenic missense mutations [5.153619184788929]
Missense mutations account for approximately half of the known variants responsible for human inherited diseases. Recent advances in deep learning show that transformer models are particularly powerful at modeling sequences. We introduce MutFormer, a transformer-based model for prediction of pathogenic missense mutations.
arXiv Detail & Related papers (2021-10-27T20:17:35Z)
Detecting Rewards Deterioration in Episodic Reinforcement Learning [63.49923393311052]
In many RL applications, once training ends, it is vital to detect any deterioration in the agent performance as soon as possible. We consider an episodic framework, where the rewards within each episode are not independent, nor identically-distributed, nor Markov. We define the mean-shift in a way corresponding to deterioration of a temporal signal (such as the rewards), and derive a test for this problem with optimal statistical power.
arXiv Detail & Related papers (2020-10-22T12:45:55Z)
Noisy Adaptive Group Testing using Bayesian Sequential Experimental Design [63.48989885374238]
When the infection prevalence of a disease is low, Dorfman showed 80 years ago that testing groups of people can prove more efficient than testing people individually. Our goal in this paper is to propose new group testing algorithms that can operate in a noisy setting.
arXiv Detail & Related papers (2020-04-26T23:41:33Z)

This list is automatically generated from the titles and abstracts of the papers on this site.