Test-Case Quality -- Understanding Practitioners' Perspectives
- URL: http://arxiv.org/abs/2309.16801v1
- Date: Thu, 28 Sep 2023 19:10:01 GMT
- Authors: Huynh Khanh Vi Tran, Nauman Bin Ali, Jürgen Börstler, Michael Unterkalmsteiner
- Abstract summary: We present a quality model which consists of 11 test-case quality attributes.
We identify a misalignment in defining test-case quality among practitioners and between academia and industry.
- Score: 1.7827643249624088
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Background: Test-case quality has always been one of the major concerns in
software testing. To improve test-case quality, it is important to better
understand how practitioners perceive the quality of test cases. Objective:
Motivated by this need, we investigated how practitioners define test-case
quality and which aspects of test cases are important for quality assessment.
Method: We conducted semi-structured interviews with professional developers,
testers and test architects from a multinational software company in Sweden.
Before the interviews, we asked participants to provide actual test cases (written in
natural language) that they perceived as good, normal, or bad, respectively,
together with rationales for their assessments. We also compared their opinions
on shared test cases and contrasted their views with the relevant literature.
Results: We present a quality model which consists of 11 test-case quality
attributes. We also identify a misalignment in defining test-case quality among
practitioners and between academia and industry, along with suggestions for
improving test-case quality in industry. Conclusion: The results show that
practitioners' backgrounds, including their roles and working experience, are critical
dimensions of how test-case quality is defined and assessed.
Related papers
- Elevating Software Quality in Agile Environments: The Role of Testing Professionals in Unit Testing [0.0]
Testing is an essential quality activity in the software development process.
This paper explores the participation of test engineers in unit testing within an industrial context.
arXiv Detail & Related papers (2024-03-20T00:41:49Z) - QuRating: Selecting High-Quality Data for Training Language Models [64.83332850645074]
We introduce QuRating, a method for selecting pre-training data that can capture human intuitions about data quality.
In this paper, we investigate four qualities - writing style, required expertise, facts & trivia, and educational value.
We train a QuRater model to learn scalar ratings from pairwise judgments, and use it to annotate a 260B-token training corpus with quality ratings for each of the four criteria.
arXiv Detail & Related papers (2024-02-15T06:36:07Z) - Assessing test artifact quality -- A tertiary study [1.7827643249624088]
We have carried out a systematic literature review to identify and analyze existing secondary studies on quality aspects of software testing artifacts.
We present an aggregation of the context dimensions and factors that can be used to characterize the environment in which the test case/suite quality is investigated.
arXiv Detail & Related papers (2024-02-14T19:31:57Z) - Automated Test Case Repair Using Language Models [0.6124773188525718]
Unrepaired broken test cases can degrade test suite quality and disrupt the software development process.
We present TaRGet, a novel approach leveraging pre-trained code language models for automated test case repair.
TaRGet treats test repair as a language translation task, employing a two-step process to fine-tune a language model.
arXiv Detail & Related papers (2024-01-12T18:56:57Z) - A Survey on What Developers Think About Testing [13.086283144520513]
We conducted a comprehensive survey with 21 questions aimed at assessing developers' current engagement with testing.
We uncover reasons that positively and negatively impact developers' motivation to test.
One approach emerging from the responses to mitigate these negative factors is by providing better recognition for developers' testing efforts.
arXiv Detail & Related papers (2023-09-03T12:18:41Z) - Analyzing Dataset Annotation Quality Management in the Wild [63.07224587146207]
Even popular datasets used to train and evaluate state-of-the-art models contain a non-negligible amount of erroneous annotations, biases, or artifacts.
While practices and guidelines regarding dataset creation projects exist, large-scale analysis has yet to be performed on how quality management is conducted.
arXiv Detail & Related papers (2023-07-16T21:22:40Z) - Test case quality: an empirical study on belief and evidence [8.475270520855332]
We investigate eight hypotheses regarding what constitutes a good test case.
Despite our best efforts, we were unable to find evidence that supports these beliefs.
arXiv Detail & Related papers (2023-07-12T19:02:48Z) - Benchmarking Foundation Models with Language-Model-as-an-Examiner [47.345760054595246]
We propose a novel benchmarking framework, Language-Model-as-an-Examiner.
The LM serves as a knowledgeable examiner that formulates questions based on its knowledge and evaluates responses in a reference-free manner.
arXiv Detail & Related papers (2023-06-07T06:29:58Z) - Exploring the Use of Large Language Models for Reference-Free Text Quality Evaluation: An Empirical Study [63.27346930921658]
ChatGPT is capable of evaluating text quality effectively from various perspectives without reference.
The Explicit Score, which utilizes ChatGPT to generate a numeric score measuring text quality, is the most effective and reliable method among the three exploited approaches.
arXiv Detail & Related papers (2023-04-03T05:29:58Z) - Measuring Uncertainty in Translation Quality Evaluation (TQE) [62.997667081978825]
This work carries out motivated research to correctly estimate confidence intervals depending on the sample size of the translated text.
The methodology we applied for this work is from Bernoulli Statistical Distribution Modelling (BSDM) and Monte Carlo Sampling Analysis (MCSA).
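The abstract names BSDM and MCSA without detailing them; as a generic illustration only (all parameters are hypothetical, not taken from the paper), a Monte Carlo sketch of how the spread of an observed quality rate depends on sample size, modelling each quality judgment as a Bernoulli trial:

```python
import random

def mc_confidence_interval(p=0.8, sample_size=100, trials=10_000, alpha=0.05):
    """Monte Carlo estimate of the (1 - alpha) interval for the observed
    pass rate over `sample_size` Bernoulli(p) quality judgments."""
    random.seed(42)  # fixed seed for a reproducible sketch
    rates = []
    for _ in range(trials):
        # Simulate one evaluation batch: count "acceptable" judgments.
        hits = sum(random.random() < p for _ in range(sample_size))
        rates.append(hits / sample_size)
    rates.sort()
    # Take the empirical alpha/2 and 1 - alpha/2 quantiles.
    low = rates[int(alpha / 2 * trials)]
    high = rates[int((1 - alpha / 2) * trials) - 1]
    return low, high

low, high = mc_confidence_interval()
print(f"95% interval for the observed rate: [{low:.2f}, {high:.2f}]")
```

Larger `sample_size` values narrow the interval, which is the core point the paper makes about sample size and confidence in translation quality evaluation.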
arXiv Detail & Related papers (2021-11-15T12:09:08Z) - Generative Models are Unsupervised Predictors of Page Quality: A Colossal-Scale Study [86.62171568318716]
Large generative language models such as GPT-2 are well-known for their ability to generate text.
We show that unsupervised predictors of "page quality" emerge, able to detect low quality content without any training.
We conduct extensive qualitative and quantitative analysis over 500 million web articles, making this the largest-scale study ever conducted on the topic.
arXiv Detail & Related papers (2020-08-17T07:13:24Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.