Assessing test artifact quality -- A tertiary study
- URL: http://arxiv.org/abs/2402.09541v1
- Date: Wed, 14 Feb 2024 19:31:57 GMT
- Title: Assessing test artifact quality -- A tertiary study
- Authors: Huynh Khanh Vi Tran, Michael Unterkalmsteiner, Jürgen Börstler, Nauman bin Ali
- Abstract summary: We have carried out a systematic literature review to identify and analyze existing secondary studies on quality aspects of software testing artifacts.
We present an aggregation of the context dimensions and factors that can be used to characterize the environment in which the test case/suite quality is investigated.
- Score: 1.7827643249624088
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Context: Modern software development increasingly relies on software testing
for an ever more frequent delivery of high quality software. This puts high
demands on the quality of the central artifacts in software testing: test
suites and test cases. Objective: We aim to develop a comprehensive model for
capturing the dimensions of test case/suite quality, which are relevant for a
variety of perspectives. Method: We have carried out a systematic literature
review to identify and analyze existing secondary studies on quality aspects of
software testing artifacts. Results: We identified 49 relevant secondary
studies. Of these 49 studies, less than half did some form of quality appraisal
of the included primary studies and only 3 took into account the quality of the
primary study when synthesizing the results. We present an aggregation of the
context dimensions and factors that can be used to characterize the environment
in which the test case/suite quality is investigated. We also provide a
comprehensive model of test case/suite quality with definitions for the quality
attributes and measurements based on findings in the literature and ISO/IEC
25010:2011. Conclusion: The test artifact quality model presented in the paper
can be used to support test artifact quality assessment and improvement
initiatives in practice. Furthermore, the model can also be used as a framework
for documenting context characteristics to make research results more accessible
for research and practice.
- Journal reference: Information and Software Technology 139 (2021): 106620
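To make the shape of such a quality model concrete, the following minimal Python sketch represents quality attributes (each with a definition and measurements) together with context factors. The specific attribute and context entries are hypothetical placeholders, not taken from the paper's model.

    # A rough shape for a test artifact quality model: attributes with
    # definitions and measurements, plus context factors. All entries below
    # are hypothetical placeholders, not the paper's actual model.
    from dataclasses import dataclass, field

    @dataclass
    class QualityAttribute:
        name: str
        definition: str
        measurements: list[str] = field(default_factory=list)

    @dataclass
    class ContextFactor:
        dimension: str  # e.g. "application domain", "development process"
        value: str

    @dataclass
    class TestArtifactQualityModel:
        attributes: list[QualityAttribute]
        context: list[ContextFactor]

    model = TestArtifactQualityModel(
        attributes=[QualityAttribute(
            name="maintainability",
            definition="how easily a test case can be understood and modified",
            measurements=["lines of code per test", "number of assertions"])],
        context=[ContextFactor(dimension="application domain", value="embedded systems")],
    )
    print(model.attributes[0].measurements)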
Related papers
- Multi-Facet Counterfactual Learning for Content Quality Evaluation [48.73583736357489]
We propose a framework for efficiently constructing evaluators that perceive multiple facets of content quality evaluation.
We leverage a joint training strategy based on contrastive learning and supervised learning to enable the evaluator to distinguish between different quality facets.
arXiv Detail & Related papers (2024-10-10T08:04:10Z)
- Computer Vision Intelligence Test Modeling and Generation: A Case Study on Smart OCR [3.0561992956541606]
We first present a comprehensive literature review of previous work, covering key facets of AI software testing processes.
We then introduce a 3D classification model to systematically evaluate the image-based text extraction AI function.
To evaluate the performance of our proposed AI software quality test, we propose four evaluation metrics to cover different aspects.
arXiv Detail & Related papers (2024-09-14T23:33:28Z)
- Mashee at SemEval-2024 Task 8: The Impact of Samples Quality on the Performance of In-Context Learning for Machine Text Classification [0.0]
We employ the chi-square test to identify high-quality samples and compare the results with those obtained using low-quality samples.
Our findings demonstrate that utilizing high-quality samples leads to improved performance with respect to all evaluated metrics.
arXiv Detail & Related papers (2024-05-28T12:47:43Z)
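One plausible realization of the chi-square selection idea above is sketched below in Python; the scoring scheme (aggregating per-term chi-square relevance over each candidate sample) and the toy data are assumptions for illustration, not the paper's exact procedure.

    # Sketch: rank candidate few-shot samples by a chi-square quality score.
    # Summing per-term chi-square scores over each sample's terms is an
    # illustrative assumption, not the paper's exact procedure.
    import numpy as np
    from sklearn.feature_extraction.text import CountVectorizer
    from sklearn.feature_selection import chi2

    texts = ["great clear answer", "noisy off topic text",
             "concise correct reply", "random filler words"]
    labels = [1, 0, 1, 0]  # toy quality labels

    X = CountVectorizer().fit_transform(texts)
    term_scores, _ = chi2(X, labels)         # chi-square score per term
    sample_scores = X @ term_scores          # aggregate term scores per sample
    ranked = np.argsort(sample_scores)[::-1]
    print([texts[i] for i in ranked[:2]])    # top candidates for in-context examples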
- QuRating: Selecting High-Quality Data for Training Language Models [64.83332850645074]
We introduce QuRating, a method for selecting pre-training data that can capture human intuitions about data quality.
In this paper, we investigate four qualities - writing style, required expertise, facts & trivia, and educational value.
We train a QuRater model to learn scalar ratings from pairwise judgments, and use it to annotate a 260B training corpus with quality ratings for each of the four criteria.
arXiv Detail & Related papers (2024-02-15T06:36:07Z)
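Learning scalar ratings from pairwise judgments is commonly framed as a Bradley-Terry-style fit, which the toy Python sketch below illustrates; it is a generic illustration with made-up preference pairs, not QuRating's actual model, data, or training setup.

    # Toy Bradley-Terry-style fit: learn one scalar quality score per document
    # from pairwise "a is better than b" judgments. A generic illustration,
    # not QuRating's actual model or training setup.
    import numpy as np

    n_docs = 5
    pairs = [(0, 1), (0, 2), (1, 3), (2, 4), (3, 4)]  # judged: first beats second

    scores = np.zeros(n_docs)
    lr = 0.1
    for _ in range(500):
        for a, b in pairs:
            p = 1.0 / (1.0 + np.exp(scores[b] - scores[a]))  # P(a preferred over b)
            scores[a] += lr * (1.0 - p)   # gradient ascent on the log-likelihood
            scores[b] -= lr * (1.0 - p)
    print(np.round(scores, 2))  # higher score = higher inferred quality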
- A manual categorization of new quality issues on automatically-generated tests [0.8225289576465757]
We report on a manual analysis of an external dataset consisting of 2,340 automatically generated tests.
We propose a taxonomy of 13 new quality issues grouped into four categories.
We present eight recommendations that test generators may consider to improve the quality and usefulness of the automatically generated tests.
arXiv Detail & Related papers (2023-12-14T11:19:14Z)
- A Novel Metric for Measuring Data Quality in Classification Applications (extended version) [0.0]
We introduce and explain a novel metric to measure data quality.
This metric is based on the correlated evolution between the classification performance and the deterioration of data.
We provide an interpretation of each criterion and examples of assessment levels.
arXiv Detail & Related papers (2023-12-13T11:20:09Z)
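The core idea of relating classification performance to data deterioration can be sketched as follows; the label-flipping corruption, the classifier, and the correlation summary are illustrative assumptions rather than the paper's exact metric.

    # Minimal sketch: relate classification accuracy to injected data
    # deterioration. The label-flipping corruption, classifier, and
    # correlation summary are illustrative assumptions, not the paper's metric.
    import numpy as np
    from sklearn.datasets import make_classification
    from sklearn.linear_model import LogisticRegression
    from sklearn.model_selection import train_test_split

    X, y = make_classification(n_samples=600, random_state=0)
    X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

    rng = np.random.default_rng(0)
    noise_levels = np.linspace(0.0, 0.4, 5)
    accs = []
    for p in noise_levels:
        y_noisy = y_tr.copy()
        flip = rng.random(len(y_noisy)) < p      # corrupt a fraction of labels
        y_noisy[flip] = 1 - y_noisy[flip]
        clf = LogisticRegression(max_iter=1000).fit(X_tr, y_noisy)
        accs.append(clf.score(X_te, y_te))       # performance under deterioration

    # How tightly performance tracks the deterioration level
    print(np.round(accs, 3), round(float(np.corrcoef(noise_levels, accs)[0, 1]), 3))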
- Test-Case Quality -- Understanding Practitioners' Perspectives [1.7827643249624088]
We present a quality model which consists of 11 test-case quality attributes.
We identify a misalignment in defining test-case quality among practitioners and between academia and industry.
arXiv Detail & Related papers (2023-09-28T19:10:01Z)
- Analyzing Dataset Annotation Quality Management in the Wild [63.07224587146207]
Even popular datasets used to train and evaluate state-of-the-art models contain a non-negligible amount of erroneous annotations, biases, or artifacts.
While practices and guidelines regarding dataset creation projects exist, large-scale analysis has yet to be performed on how quality management is conducted.
arXiv Detail & Related papers (2023-07-16T21:22:40Z)
- From Static Benchmarks to Adaptive Testing: Psychometrics in AI Evaluation [60.14902811624433]
We discuss a paradigm shift from static evaluation methods to adaptive testing.
This involves estimating the characteristics and value of each test item in the benchmark and dynamically adjusting items in real-time.
We analyze the current approaches, advantages, and underlying reasons for adopting psychometrics in AI evaluation.
arXiv Detail & Related papers (2023-06-18T09:54:33Z)
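Adaptive testing of this kind is often implemented with item response theory: estimate each item's parameters, then repeatedly administer the item that is most informative at the current ability estimate. The sketch below is a generic two-parameter-logistic (2PL) illustration with invented item parameters, not the paper's specific proposal.

    # Minimal 2PL adaptive-testing sketch: administer the unasked item with
    # the highest Fisher information at the current ability estimate.
    # Item parameters and the response are invented for illustration.
    import numpy as np

    a = np.array([1.2, 0.8, 1.5, 1.0, 2.0])   # assumed discriminations
    b = np.array([-1.0, 0.0, 0.5, 1.0, 1.5])  # assumed difficulties

    theta = 0.0          # current ability estimate
    asked = set()
    for step in range(3):
        p = 1.0 / (1.0 + np.exp(-a * (theta - b)))  # P(correct) per item, 2PL
        info = a**2 * p * (1 - p)                   # Fisher information per item
        info[list(asked)] = -np.inf                 # never repeat an item
        item = int(np.argmax(info))
        asked.add(item)
        response = 1     # pretend the examinee answered correctly
        theta += 0.5 * a[item] * (response - p[item])  # one likelihood-gradient step
        print(f"step {step}: item {item}, theta {theta:.2f}")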
- Image Quality Assessment in the Modern Age [53.19271326110551]
This tutorial provides the audience with the basic theories, methodologies, and current progress of image quality assessment (IQA).
We will first revisit several subjective quality assessment methodologies, with emphasis on how to properly select visual stimuli.
Both hand-engineered and (deep) learning-based methods will be covered.
arXiv Detail & Related papers (2021-10-19T02:38:46Z)
- Quality meets Diversity: A Model-Agnostic Framework for Computerized Adaptive Testing [60.38182654847399]
Computerized Adaptive Testing (CAT) is emerging as a promising testing application in many scenarios.
We propose a novel framework, Model-Agnostic Adaptive Testing (MAAT), as a CAT solution.
arXiv Detail & Related papers (2021-01-15T06:48:50Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information it provides and is not responsible for any consequences arising from its use.