Hybrid Fault-Driven Mutation Testing for Python
- URL: http://arxiv.org/abs/2601.19088v1
- Date: Tue, 27 Jan 2026 01:49:38 GMT
- Title: Hybrid Fault-Driven Mutation Testing for Python
- Authors: Saba Alimadadi, Golnaz Gharachorlu
- Abstract summary: We introduce a novel set of seven mutation operators inspired by prevalent anti-patterns in Python programs. We propose a mutation testing technique that utilizes a hybrid of static and dynamic analyses. We implement our approach in a tool called PyTation and evaluate it on 13 open-source Python applications.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Mutation testing is an effective technique for assessing the effectiveness of test suites by systematically injecting artificial faults into programs. However, existing mutation testing techniques fall short in capturing many types of common faults in dynamically typed languages like Python. In this paper, we introduce a novel set of seven mutation operators that are inspired by prevalent anti-patterns in Python programs, designed to complement the existing general-purpose operators and broaden the spectrum of simulated faults. We propose a mutation testing technique that utilizes a hybrid of static and dynamic analyses to mutate Python programs based on these operators while minimizing equivalent mutants. We implement our approach in a tool called PyTation and evaluate it on 13 open-source Python applications. Our results show that PyTation generates mutants that complement those from general-purpose tools, exhibiting distinct behaviour under test execution and uncovering inadequacies in high-coverage test suites. We further demonstrate that PyTation produces a high proportion of unique mutants, a low cross-kill rate, and a low test overlap ratio relative to baseline tools, highlighting its novel fault model. PyTation also incurs few equivalent mutants, aided by dynamic analysis heuristics.
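To make the idea of injecting artificial faults concrete, here is a minimal sketch of mutation testing in Python using the standard `ast` module. It is not PyTation and does not implement any of the paper's seven operators; the `RelationalSwap` operator, the `is_minor` function, and both test suites are hypothetical, chosen only to show how a mutant can survive a weak test suite and be killed by a stronger one.

```python
import ast

# Hypothetical program under test (not taken from the paper).
SOURCE = """
def is_minor(age):
    return age < 18
"""


class RelationalSwap(ast.NodeTransformer):
    """Illustrative general-purpose operator: replace `<` with `<=`."""

    def visit_Compare(self, node):
        self.generic_visit(node)
        node.ops = [ast.LtE() if isinstance(op, ast.Lt) else op for op in node.ops]
        return node


def load(tree):
    """Compile a module AST and return its `is_minor` function."""
    namespace = {}
    exec(compile(tree, "<mutant>", "exec"), namespace)
    return namespace["is_minor"]


def weak_suite(fn):
    # Never exercises the boundary value 18.
    return fn(5) is True and fn(30) is False


def strong_suite(fn):
    # Adds the boundary check that distinguishes `<` from `<=`.
    return weak_suite(fn) and fn(18) is False


original = load(ast.parse(SOURCE))
mutant_tree = ast.fix_missing_locations(RelationalSwap().visit(ast.parse(SOURCE)))
mutant = load(mutant_tree)

# A mutant is "killed" when a suite passes on the original but fails on the mutant.
weak_kills = weak_suite(original) and not weak_suite(mutant)
strong_kills = strong_suite(original) and not strong_suite(mutant)
print(weak_kills, strong_kills)  # False True
```

The surviving mutant under `weak_suite` is the signal mutation testing looks for: the suite passes on the original but cannot tell the faulty variant apart, even though it executes the mutated line.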
Related papers
- PRIMG : Efficient LLM-driven Test Generation Using Mutant Prioritization [0.0]
PRIMG (Prioritization and Refinement Integrated Mutation-driven Generation) is a novel framework for incremental and adaptive test case generation for Solidity smart contracts. PRIMG integrates a mutation prioritization module, which employs a machine learning model trained on mutant subsumption graphs to predict the usefulness of surviving mutants. The prioritization module consistently outperformed random mutant selection, enabling the generation of high-impact tests with reduced computational effort.
arXiv Detail & Related papers (2025-05-08T18:30:22Z) - Type-aware LLM-based Regression Test Generation for Python Programs [13.631541369653066]
Test4Py is a novel framework that enhances type correctness in automated test generation for Python. Test4Py integrates an iterative repair procedure that progressively refines generated test cases to improve coverage. In an evaluation on 183 real-world Python modules, Test4Py achieved an average statement coverage of 83.0% and branch coverage of 70.8%.
arXiv Detail & Related papers (2025-03-18T08:07:17Z) - Automated Refactoring of Non-Idiomatic Python Code: A Differentiated Replication with LLMs [54.309127753635366]
We present the results of a replication study in which we investigate GPT-4 effectiveness in recommending and suggesting idiomatic actions. Our findings underscore the potential of LLMs to achieve tasks where, in the past, implementing recommenders based on complex code analyses was required.
arXiv Detail & Related papers (2025-01-28T15:41:54Z) - SYNTHEVAL: Hybrid Behavioral Testing of NLP Models with Synthetic CheckLists [59.08999823652293]
We propose SYNTHEVAL to generate a wide range of test types for a comprehensive evaluation of NLP models.
In the last stage, human experts investigate the challenging examples, manually design templates, and identify the types of failures the task-specific models consistently exhibit.
We apply SYNTHEVAL to two classification tasks, sentiment analysis and toxic language detection, and show that our framework is effective in identifying weaknesses of strong models on these tasks.
arXiv Detail & Related papers (2024-08-30T17:41:30Z) - Improving Bias Correction Standards by Quantifying its Effects on Treatment Outcomes [54.18828236350544]
Propensity score matching (PSM) addresses selection biases by selecting comparable populations for analysis.
Different matching methods can produce significantly different Average Treatment Effects (ATE) for the same task, even when meeting all validation criteria.
To address this issue, we introduce a novel metric, A2A, to reduce the number of valid matches.
arXiv Detail & Related papers (2024-07-20T12:42:24Z) - LLMorpheus: Mutation Testing using Large Language Models [5.448283690603358]
This paper presents a technique for mutation testing where placeholders are introduced at designated locations in a program's source code. We find LLMorpheus to be capable of producing mutants that resemble existing bugs that cannot be produced by StrykerJS.
arXiv Detail & Related papers (2024-04-15T17:25:14Z) - An Empirical Evaluation of Manually Created Equivalent Mutants [54.02049952279685]
Less than 10% of manually created mutants are equivalent.
Surprisingly, our findings indicate that a significant portion of developers struggle to accurately identify equivalent mutants.
arXiv Detail & Related papers (2024-04-14T13:04:10Z) - Mutation Analysis with Execution Taints [2.574469668220994]
Evaluating each mutant separately means a large amount of redundant computation.
We propose execution taints, a novel technique that repurposes dynamic data-flow taints for mutation analysis.
arXiv Detail & Related papers (2024-03-02T09:20:46Z) - Precise Error Rates for Computationally Efficient Testing [67.30044609837749]
We revisit the question of simple-versus-simple hypothesis testing with an eye towards computational complexity. An existing test based on linear spectral statistics achieves the best possible tradeoff curve between type I and type II error rates.
arXiv Detail & Related papers (2023-11-01T04:41:16Z) - Contextual Predictive Mutation Testing [17.832774161583036]
We introduce MutationBERT, an approach for predictive mutation testing that simultaneously encodes the source method mutation and test method.
Thanks to its higher precision, MutationBERT saves 33% of the time spent by a prior approach on checking/verifying live mutants.
We validate our input representation, and aggregation approaches for lifting predictions from the test matrix level to the test suite level, finding similar improvements in performance.
arXiv Detail & Related papers (2023-09-05T17:00:15Z) - Sequential Permutation Testing of Random Forest Variable Importance Measures [68.8204255655161]
It is proposed here to use sequential permutation tests and sequential p-value estimation to reduce the high computational costs associated with conventional permutation tests.
The results of simulation studies confirm that the theoretical properties of the sequential tests apply.
The numerical stability of the methods is investigated in two additional application studies.
arXiv Detail & Related papers (2022-06-02T20:16:50Z)