Requirements Coverage-Guided Minimization for Natural Language Test Cases
- URL: http://arxiv.org/abs/2505.20004v1
- Date: Mon, 26 May 2025 13:55:33 GMT
- Title: Requirements Coverage-Guided Minimization for Natural Language Test Cases
- Authors: Rongqi Pan, Feifei Niu, Lionel C. Briand, Hanyang Hu,
- Abstract summary: Test suites tend to grow in size and often contain redundant test cases. Test suite minimization aims to eliminate such redundancy while preserving key properties such as requirement coverage and fault detection capability. We propose RTM (Requirement coverage-guided Test suite Minimization), a novel TSM approach designed for requirement-based testing.
- Score: 7.947774587906927
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: As software systems evolve, test suites tend to grow in size and often contain redundant test cases. Such redundancy increases testing effort, time, and cost. Test suite minimization (TSM) aims to eliminate such redundancy while preserving key properties such as requirement coverage and fault detection capability. In this paper, we propose RTM (Requirement coverage-guided Test suite Minimization), a novel TSM approach designed for requirement-based testing (validation), which can effectively reduce test suite redundancy while ensuring full requirement coverage and a high fault detection rate (FDR) under a fixed minimization budget. Based on common practice in critical systems where functional safety is important, we assume test cases are specified in natural language and traced to requirements before being implemented. RTM preprocesses test cases using three different preprocessing methods, and then converts them into vector representations using seven text embedding techniques. Similarity values between vectors are computed utilizing three distance functions. A Genetic Algorithm, whose population is initialized by coverage-preserving initialization strategies, is then employed to identify an optimized subset containing diverse test cases matching the set budget. We evaluate RTM on an industrial automotive system dataset comprising $736$ system test cases and $54$ requirements. Experimental results show that RTM consistently outperforms baseline techniques in terms of FDR across different minimization budgets while maintaining full requirement coverage. Furthermore, we investigate the impact of test suite redundancy levels on the effectiveness of TSM, providing new insights into optimizing requirement-based test suites under practical constraints.
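The pipeline described in the abstract (embed test cases, measure distances, then search for a diverse subset that fits the budget and still covers every requirement) can be illustrated with a minimal sketch. This is not the authors' implementation: the toy `traces` and `vectors` data, all function names, and the mean pairwise cosine distance used as a diversity score are assumptions made for illustration. The Genetic Algorithm itself is omitted; only a coverage-preserving way to build one initial candidate subset is shown.

```python
import math
import random


def cosine_distance(u, v):
    """Cosine distance between two non-zero vectors (0 means same direction)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / (norm_u * norm_v)


def covers_all(subset, traces, requirements):
    """True if the tests in `subset` together trace to every requirement."""
    covered = set()
    for test in subset:
        covered |= traces[test]
    return requirements <= covered


def coverage_preserving_init(traces, requirements, budget, rng):
    """Build one candidate subset of size `budget` that covers all requirements.

    Assumed strategy: repeatedly pick a random uncovered requirement and a
    random test traced to it, then pad with random tests up to the budget.
    """
    subset = set()
    uncovered = set(requirements)
    while uncovered:
        req = rng.choice(sorted(uncovered))
        candidates = sorted(t for t, reqs in traces.items() if req in reqs)
        if not candidates:
            raise ValueError(f"no test case is traced to requirement {req}")
        pick = rng.choice(candidates)
        subset.add(pick)
        uncovered -= traces[pick]
    if len(subset) > budget:
        raise ValueError("budget too small for this draw")
    remaining = sorted(set(traces) - subset)
    subset.update(rng.sample(remaining, budget - len(subset)))
    return subset


def diversity(subset, vectors):
    """Mean pairwise cosine distance; higher means a more diverse subset."""
    items = sorted(subset)
    pairs = [(a, b) for i, a in enumerate(items) for b in items[i + 1:]]
    return sum(cosine_distance(vectors[a], vectors[b]) for a, b in pairs) / len(pairs)


# Hypothetical toy data: test case -> traced requirements, and 2-D embeddings.
traces = {"t1": {"r1"}, "t2": {"r1", "r2"}, "t3": {"r3"}, "t4": {"r2"}, "t5": {"r3"}}
vectors = {"t1": [1.0, 0.1], "t2": [0.9, 0.2], "t3": [0.1, 1.0],
           "t4": [0.5, 0.5], "t5": [0.2, 0.9]}
requirements = {"r1", "r2", "r3"}

subset = coverage_preserving_init(traces, requirements, budget=3, rng=random.Random(0))
score = diversity(subset, vectors)
```

In a full search, many such subsets would form the initial population, and a fitness function like `diversity` would guide selection, with coverage re-checked (for example via `covers_all`) after each crossover or mutation.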
Related papers
- Regression Testing Optimization for ROS-based Autonomous Systems: A Comprehensive Review of Techniques [6.978850097048969]
We present the first comprehensive survey systematically reviewing regression testing optimization techniques tailored for ROSAS. We analyze and categorize 122 representative studies into regression test case prioritization, minimization, and selection methods. We highlight major challenges specific to regression testing for ROSAS, including effectively prioritizing tests in response to frequent system modifications, efficiently minimizing redundant tests, and difficulty in accurately selecting impacted test cases.
arXiv Detail & Related papers (2025-06-19T07:43:36Z)
- Taming Polysemanticity in LLMs: Provable Feature Recovery via Sparse Autoencoders [50.52694757593443]
Existing SAE training algorithms often lack rigorous mathematical guarantees and suffer from practical limitations. We first propose a novel statistical framework for the feature recovery problem, which includes a new notion of feature identifiability. We introduce a new SAE training algorithm based on "bias adaptation", a technique that adaptively adjusts neural network bias parameters to ensure appropriate activation sparsity.
arXiv Detail & Related papers (2025-06-16T20:58:05Z)
- IT$^3$: Idempotent Test-Time Training [95.78053599609044]
Deep learning models often struggle when deployed in real-world settings due to distribution shifts between training and test data. We present Idempotent Test-Time Training (IT$^3$), a novel approach that enables on-the-fly adaptation to distribution shifts using only the current test instance. Our results suggest that idempotence provides a universal principle for test-time adaptation that generalizes across domains and architectures.
arXiv Detail & Related papers (2024-10-05T15:39:51Z)
- Scalable Similarity-Aware Test Suite Minimization with Reinforcement Learning [6.9290255098776425]
TripRL is a novel technique to produce a diverse reduced test suite with high test effectiveness. We show that TripRL's runtime scales linearly with the magnitude of the Multi-Criteria Test Suite Minimization problem.
arXiv Detail & Related papers (2024-08-24T08:43:03Z)
- Fuzzy Inference System for Test Case Prioritization in Software Testing [0.0]
Test case prioritization (TCP) is a vital strategy to enhance testing efficiency.
This paper introduces a novel fuzzy logic-based approach to automate TCP.
arXiv Detail & Related papers (2024-04-25T08:08:54Z)
- Active Test-Time Adaptation: Theoretical Analyses and An Algorithm [51.84691955495693]
Test-time adaptation (TTA) addresses distribution shifts for streaming test data in unsupervised settings.
We propose the novel problem setting of active test-time adaptation (ATTA) that integrates active learning within the fully TTA setting.
arXiv Detail & Related papers (2024-04-07T22:31:34Z)
- Task-specific experimental design for treatment effect estimation [59.879567967089145]
Large randomised controlled trials (RCTs) are the standard for causal inference.
Recent work has proposed more sample-efficient alternatives to RCTs, but these are not adaptable to the downstream application for which the causal effect is sought.
We develop a task-specific approach to experimental design and derive sampling strategies customised to particular downstream applications.
arXiv Detail & Related papers (2023-06-08T18:10:37Z)
- On Pitfalls of Test-Time Adaptation [82.8392232222119]
Test-Time Adaptation (TTA) has emerged as a promising approach for tackling the robustness challenge under distribution shifts.
We present TTAB, a test-time adaptation benchmark that encompasses ten state-of-the-art algorithms, a diverse array of distribution shifts, and two evaluation protocols.
arXiv Detail & Related papers (2023-06-06T09:35:29Z)
- LTM: Scalable and Black-box Similarity-based Test Suite Minimization based on Language Models [0.6562256987706128]
Test suites tend to grow when software evolves, making it often infeasible to execute all test cases with the allocated testing budgets.
Test suite minimization (TSM) is employed to improve the efficiency of software testing by removing redundant test cases.
We propose LTM (Language model-based Test suite Minimization), a novel, scalable, and black-box similarity-based TSM approach.
arXiv Detail & Related papers (2023-04-03T22:16:52Z)
- Sequential Kernelized Independence Testing [77.237958592189]
We design sequential kernelized independence tests inspired by kernelized dependence measures. We demonstrate the power of our approaches on both simulated and real data.
arXiv Detail & Related papers (2022-12-14T18:08:42Z)
- Robust Continual Test-time Adaptation: Instance-aware BN and Prediction-balanced Memory [58.72445309519892]
We present a new test-time adaptation scheme that is robust against non-i.i.d. test data streams.
Our novelty is mainly two-fold: (a) Instance-Aware Batch Normalization (IABN) that corrects normalization for out-of-distribution samples, and (b) Prediction-balanced Reservoir Sampling (PBRS) that simulates i.i.d. data stream from non-i.i.d. stream in a class-balanced manner.
arXiv Detail & Related papers (2022-08-10T03:05:46Z)
- Prioritized Variable-length Test Cases Generation for Finite State Machines [0.09786690381850353]
Model-based Testing (MBT) is an effective approach for testing when parts of a system-under-test have the characteristics of a finite state machine (FSM).
This paper presents a test generation strategy that satisfies all these requirements.
Depending on the application of the FSM, the strategy and evaluation presented in this paper are applicable both in testing functional and non-functional software requirements.
arXiv Detail & Related papers (2022-03-17T20:16:45Z)