Empirically Evaluating the Use of Bytecode for Diversity-Based Test Case Prioritisation
- URL: http://arxiv.org/abs/2504.12790v1
- Date: Thu, 17 Apr 2025 09:40:49 GMT
- Title: Empirically Evaluating the Use of Bytecode for Diversity-Based Test Case Prioritisation
- Authors: Islam T. Elgendy, Robert M. Hierons, Phil McMinn
- Abstract summary: Regression testing assures software correctness after changes but is resource-intensive. Test Case Prioritisation (TCP) mitigates this by ordering tests to maximise early fault detection. This paper is the first to study bytecode as the basis of diversity in TCP, leveraging its compactness for improved efficiency and accuracy.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Regression testing assures software correctness after changes but is resource-intensive. Test Case Prioritisation (TCP) mitigates this by ordering tests to maximise early fault detection. Diversity-based TCP prioritises dissimilar tests, assuming they exercise different system parts and uncover more faults. Traditional static diversity-based TCP approaches (i.e., methods that utilise the dissimilarity of tests), like the state-of-the-art FAST approach, rely on textual diversity from test source code, which is effective but inefficient due to its relative verbosity and redundancies affecting similarity calculations. This paper is the first to study bytecode as the basis of diversity in TCP, leveraging its compactness for improved efficiency and accuracy. An empirical study on seven Defects4J projects shows that bytecode diversity improves fault detection by 2.3-7.8% over text-based TCP. It is also 2-3 orders of magnitude faster in one TCP approach and 2.5-6 times faster in FAST-based TCP. Filtering specific bytecode instructions improves efficiency up to fourfold while maintaining effectiveness, making bytecode diversity a superior static approach.
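The core idea of diversity-based TCP can be illustrated with a minimal sketch: represent each test as a set of tokens (source text or bytecode mnemonics), measure pairwise dissimilarity, and greedily order tests so each next pick is maximally distant from those already selected. This is an illustrative assumption-laden toy, not the paper's FAST implementation; the tokenisation, Jaccard distance, and farthest-point greedy strategy are simplifications, and the bytecode strings below are hypothetical.

```python
def jaccard_distance(a: set, b: set) -> float:
    """1 - |A ∩ B| / |A ∪ B|; 0.0 means identical token sets."""
    union = a | b
    if not union:
        return 0.0
    return 1.0 - len(a & b) / len(union)

def prioritise(tests: dict[str, str]) -> list[str]:
    """Order tests by farthest-point greedy selection: each next test
    maximises its minimum distance to the tests already chosen."""
    tokens = {name: set(body.split()) for name, body in tests.items()}
    remaining = list(tests)
    # Seed with the test that has the most distinct tokens.
    ordered = [max(remaining, key=lambda t: len(tokens[t]))]
    remaining.remove(ordered[0])
    while remaining:
        nxt = max(remaining,
                  key=lambda t: min(jaccard_distance(tokens[t], tokens[s])
                                    for s in ordered))
        ordered.append(nxt)
        remaining.remove(nxt)
    return ordered

# Hypothetical bytecode-token strings for three tests;
# t2 duplicates t1, so a diversity-based order should defer it.
tests = {
    "t1": "aload_0 invokevirtual ireturn",
    "t2": "aload_0 invokevirtual ireturn",
    "t3": "iconst_0 istore_1 goto iload_1",
}
order = prioritise(tests)  # the duplicate t2 lands last
```

Bytecode mnemonics are more compact and uniform than source text, so token sets are smaller and the pairwise distance computations cheaper, which is the intuition behind the efficiency gains reported in the abstract.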
Related papers
- Optimizing Metamorphic Testing: Prioritizing Relations Through Execution Profile Dissimilarity [2.6749261270690434]
An oracle determines whether the output of a program for executed test cases is correct.
For machine learning programs, such an oracle is often unavailable or impractical to apply.
Prioritizing metamorphic relations (MRs) enhances fault detection effectiveness and improves testing efficiency.
arXiv Detail & Related papers (2024-11-14T04:14:30Z) - CodeDPO: Aligning Code Models with Self Generated and Verified Source Code [52.70310361822519]
We propose CodeDPO, a framework that integrates preference learning into code generation to improve two key code preference factors: code correctness and efficiency.
CodeDPO employs a novel dataset construction method, utilizing a self-generation-and-validation mechanism that simultaneously generates and evaluates code and test cases.
arXiv Detail & Related papers (2024-10-08T01:36:15Z) - Segment-Based Test Case Prioritization: A Multi-objective Approach [8.972346309150199]
Test case prioritization (TCP) is a cost-efficient solution to schedule test cases in an execution order that maximizes an objective function.
We introduce a multi-objective optimization approach to prioritize UI test cases using evolutionary search algorithms and four coverage criteria.
Our approach significantly outperforms other methods in terms of Average Percentage of Faults Detected (APFD) and APFD with Cost.
arXiv Detail & Related papers (2024-08-01T16:51:01Z) - Fuzzy Inference System for Test Case Prioritization in Software Testing [0.0]
Test case prioritization (TCP) is a vital strategy to enhance testing efficiency.
This paper introduces a novel fuzzy logic-based approach to automate TCP.
arXiv Detail & Related papers (2024-04-25T08:08:54Z) - Active Test-Time Adaptation: Theoretical Analyses and An Algorithm [51.84691955495693]
Test-time adaptation (TTA) addresses distribution shifts for streaming test data in unsupervised settings.
We propose the novel problem setting of active test-time adaptation (ATTA) that integrates active learning within the fully TTA setting.
arXiv Detail & Related papers (2024-04-07T22:31:34Z) - CEBin: A Cost-Effective Framework for Large-Scale Binary Code Similarity Detection [23.8834126695488]
Binary code similarity detection (BCSD) is a fundamental technique for various applications.
We propose a cost-effective BCSD framework, CEBin, which fuses embedding-based and comparison-based approaches.
arXiv Detail & Related papers (2024-02-29T03:02:07Z) - On Pitfalls of Test-Time Adaptation [82.8392232222119]
Test-Time Adaptation (TTA) has emerged as a promising approach for tackling the robustness challenge under distribution shifts.
We present TTAB, a test-time adaptation benchmark that encompasses ten state-of-the-art algorithms, a diverse array of distribution shifts, and two evaluation protocols.
arXiv Detail & Related papers (2023-06-06T09:35:29Z) - Evaluating Search-Based Software Microbenchmark Prioritization [6.173678645884399]
This paper empirically evaluates single- and multi-objective search-based microbenchmark prioritization techniques.
We find that search algorithms (SAs) are only competitive with but do not outperform the best greedy, coverage-based baselines.
arXiv Detail & Related papers (2022-11-24T10:45:39Z) - A Stable, Fast, and Fully Automatic Learning Algorithm for Predictive Coding Networks [65.34977803841007]
Predictive coding networks are neuroscience-inspired models with roots in both Bayesian statistics and neuroscience.
We show how simply changing the temporal scheduling of the update rule for the synaptic weights leads to an algorithm that is much more efficient and stable than the original one.
arXiv Detail & Related papers (2022-11-16T00:11:04Z) - Efficient Few-Shot Object Detection via Knowledge Inheritance [62.36414544915032]
Few-shot object detection (FSOD) aims at learning a generic detector that can adapt to unseen tasks with scarce training samples.
We present an efficient pretrain-transfer framework (PTF) baseline with no computational increment.
We also propose an adaptive length re-scaling (ALR) strategy to alleviate the vector length inconsistency between the predicted novel weights and the pretrained base weights.
arXiv Detail & Related papers (2022-03-23T06:24:31Z) - Boosting Fast Adversarial Training with Learnable Adversarial Initialization [79.90495058040537]
Adversarial training (AT) has been demonstrated to be effective in improving model robustness by leveraging adversarial examples for training.
To boost training efficiency, fast gradient sign method (FGSM) is adopted in fast AT methods by calculating gradient only once.
arXiv Detail & Related papers (2021-10-11T05:37:00Z)
This list is automatically generated from the titles and abstracts of the papers in this site.