Comparative Study of Machine Learning Test Case Prioritization for
Continuous Integration Testing
- URL: http://arxiv.org/abs/2204.10899v1
- Date: Fri, 22 Apr 2022 19:20:49 GMT
- Title: Comparative Study of Machine Learning Test Case Prioritization for
Continuous Integration Testing
- Authors: Dusica Marijan
- Abstract summary: We show that different machine learning models perform differently for different sizes of test history used for model training and for different time budgets available for test case execution.
Our results imply that machine learning approaches for test prioritization in continuous integration testing should be carefully configured to achieve optimal performance.
- Score: 3.8073142980733
- License: http://creativecommons.org/licenses/by-sa/4.0/
- Abstract: There is a growing body of research indicating the potential of machine
learning to tackle complex software testing challenges. One such challenge
pertains to continuous integration testing, which is highly time-constrained,
and generates a large amount of data coming from iterative code commits and
test runs. In such a setting, we can use plentiful test data for training
machine learning predictors to identify test cases able to speed up the
detection of regression bugs introduced during code integration. However,
different machine learning models can have different fault prediction
performance depending on the context and the parameters of continuous
integration testing, for example the variable time budget available for continuous integration cycles or the size of the test execution history used for learning to prioritize failing test cases. Existing studies on test case prioritization rarely consider both of these factors, which are essential in continuous integration practice. In this study, we perform a comprehensive comparison of
the fault prediction performance of machine learning approaches that have shown
the best performance on test case prioritization tasks in the literature. We
evaluate the accuracy of the classifiers in predicting fault-detecting tests
for different values of the continuous integration time budget and with different lengths of test history used for training the classifiers. In the evaluation, we use real-world industrial datasets from a continuous integration practice. The results show that different machine learning models perform differently for different sizes of test history used for model training and for different time budgets available for test case execution. Our
results imply that machine learning approaches for test prioritization in
continuous integration testing should be carefully configured to achieve
optimal performance.
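The paper's own implementation, features, and datasets are not included in this summary. Purely as a hypothetical illustration of the kind of pipeline such a comparison evaluates (a classifier trained on a window of test execution history, used to rank tests and fill a CI time budget), here is a minimal sketch assuming scikit-learn and made-up history-derived features:

```python
# Minimal, hypothetical sketch (not the paper's implementation): train a classifier
# on a window of historical CI test results, then rank tests by predicted failure
# probability and greedily fill the CI time budget.
from dataclasses import dataclass
from typing import List
from sklearn.ensemble import RandomForestClassifier

@dataclass
class TestRecord:
    name: str
    duration: float            # expected execution time in seconds
    recent_failures: int       # failures observed in the last k CI cycles
    cycles_since_last_run: int
    failed: bool               # label: did this test fail in that cycle?

def train_prioritizer(history: List[TestRecord]) -> RandomForestClassifier:
    """Fit a classifier on past CI cycles; 'history' is the sliding window whose
    length is one of the parameters the study varies."""
    X = [[r.duration, r.recent_failures, r.cycles_since_last_run] for r in history]
    y = [r.failed for r in history]   # must contain both passing and failing runs
    model = RandomForestClassifier(n_estimators=100, random_state=0)
    model.fit(X, y)
    return model

def prioritize(model: RandomForestClassifier,
               candidates: List[TestRecord],
               time_budget: float) -> List[str]:
    """Rank candidates by predicted failure probability and greedily select those
    that fit into the CI time budget (the study's other varied parameter)."""
    X = [[r.duration, r.recent_failures, r.cycles_since_last_run] for r in candidates]
    fail_prob = model.predict_proba(X)[:, 1]   # column 1 = probability of failure
    ranked = sorted(zip(candidates, fail_prob), key=lambda p: p[1], reverse=True)
    selected, used = [], 0.0
    for record, _ in ranked:
        if used + record.duration <= time_budget:
            selected.append(record.name)
            used += record.duration
    return selected
```

In this framing, the study's two parameters correspond to the length of the history window used to build the training set and the time budget applied at selection time.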
Related papers
- Which Combination of Test Metrics Can Predict Success of a Software Project? A Case Study in a Year-Long Project Course [1.553083901660282]
Testing plays an important role in securing the success of a software development project.
We investigate whether we can quantify the effects various types of testing have on functional suitability.
arXiv Detail & Related papers (2024-08-22T04:23:51Z)
- Towards Explainable Test Case Prioritisation with Learning-to-Rank Models [6.289767078502329]
Test case prioritisation (TCP) is a critical task in regression testing to ensure quality as software evolves.
We present and discuss scenarios that require different explanations and how the particularities of TCP could influence them.
arXiv Detail & Related papers (2024-05-22T16:11:45Z)
- Combating Missing Modalities in Egocentric Videos at Test Time [92.38662956154256]
Real-world applications often face challenges with incomplete modalities due to privacy concerns, efficiency needs, or hardware issues.
We propose a novel approach to address this issue at test time without requiring retraining.
MiDl represents the first self-supervised, online solution for handling missing modalities exclusively at test time.
arXiv Detail & Related papers (2024-04-23T16:01:33Z)
- Deep anytime-valid hypothesis testing [29.273915933729057]
We propose a general framework for constructing powerful, sequential hypothesis tests for nonparametric testing problems.
We develop a principled approach of leveraging the representation capability of machine learning models within the testing-by-betting framework.
Empirical results on synthetic and real-world datasets demonstrate that tests instantiated using our general framework are competitive against specialized baselines.
arXiv Detail & Related papers (2023-10-30T09:46:19Z)
- Matched Machine Learning: A Generalized Framework for Treatment Effect Inference With Learned Metrics [87.05961347040237]
We introduce Matched Machine Learning, a framework that combines the flexibility of machine learning black boxes with the interpretability of matching.
Our framework uses machine learning to learn an optimal metric for matching units and estimating outcomes.
We show empirically that instances of Matched Machine Learning perform on par with black-box machine learning methods and better than existing matching methods for similar problems.
arXiv Detail & Related papers (2023-04-03T19:32:30Z)
- A Comprehensive Survey on Test-Time Adaptation under Distribution Shifts [143.14128737978342]
Test-time adaptation, an emerging paradigm, has the potential to adapt a pre-trained model to unlabeled data during testing, before making predictions.
Recent progress in this paradigm highlights the significant benefits of utilizing unlabeled data for training self-adapted models prior to inference.
arXiv Detail & Related papers (2023-03-27T16:32:21Z)
- DELTA: degradation-free fully test-time adaptation [59.74287982885375]
We find that two unfavorable defects are concealed in the prevalent adaptation methodologies like test-time batch normalization (BN) and self-learning.
First, we reveal that the normalization statistics in test-time BN are completely affected by the currently received test samples, resulting in inaccurate estimates.
Second, we show that during test-time adaptation, the parameter update is biased towards some dominant classes.
arXiv Detail & Related papers (2023-01-30T15:54:00Z)
- Sequential Kernelized Independence Testing [101.22966794822084]
We design sequential kernelized independence tests inspired by kernelized dependence measures.
We demonstrate the power of our approaches on both simulated and real data.
arXiv Detail & Related papers (2022-12-14T18:08:42Z)
- Systematic Training and Testing for Machine Learning Using Combinatorial Interaction Testing [0.0]
This paper demonstrates the systematic use of coverage for selecting and characterizing test and training sets for machine learning models.
The paper addresses prior criticism of coverage and provides a rebuttal which advocates the use of coverage metrics in machine learning applications.
arXiv Detail & Related papers (2022-01-28T21:33:31Z)
- DeepOrder: Deep Learning for Test Case Prioritization in Continuous Integration Testing [6.767885381740952]
This work introduces DeepOrder, a deep learning-based model that works on the basis of regression machine learning.
DeepOrder ranks test cases based on the historical record of test executions from any number of previous test cycles.
We experimentally show that deep neural networks, as a simple regression model, can be efficiently used for test case prioritization in continuous integration testing (see the illustrative regression sketch after this list).
arXiv Detail & Related papers (2021-10-14T15:10:38Z)
- ALT-MAS: A Data-Efficient Framework for Active Testing of Machine Learning Algorithms [58.684954492439424]
We propose a novel framework to efficiently test a machine learning model using only a small amount of labeled test data.
The idea is to estimate the metrics of interest for a model-under-test using a Bayesian neural network (BNN).
arXiv Detail & Related papers (2021-04-11T12:14:04Z)
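Of the related entries above, DeepOrder is closest in spirit to the surveyed problem. Its actual network architecture and features are not reproduced here; purely as an illustrative sketch of regression-based prioritization over execution history (the features and the priority-score target are hypothetical assumptions), the DeepOrder entry above points to the following:

```python
# Illustrative regression-style prioritizer over execution history, in the spirit
# of (but not reproducing) DeepOrder; features and the priority-score target are
# hypothetical assumptions.
from typing import List, Sequence
from sklearn.neural_network import MLPRegressor

def train_regressor(history_features: Sequence[Sequence[float]],
                    priority_scores: Sequence[float]) -> MLPRegressor:
    """Fit a small neural-network regressor mapping history-derived features
    (e.g., duration, recent failure count, cycles since last failure) to a
    continuous priority score derived from past fault-detection value."""
    model = MLPRegressor(hidden_layer_sizes=(32, 32), max_iter=2000, random_state=0)
    model.fit(history_features, priority_scores)
    return model

def rank_tests(model: MLPRegressor,
               test_names: Sequence[str],
               current_features: Sequence[Sequence[float]]) -> List[str]:
    """Order tests for the next CI cycle by descending predicted priority."""
    scores = model.predict(current_features)
    return [name for name, _ in sorted(zip(test_names, scores),
                                       key=lambda pair: pair[1], reverse=True)]
```

The practical difference from the classification sketch earlier is that the model predicts a continuous priority score rather than a failure probability, so tests are simply executed in descending score order until the time budget runs out.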