Targeted Test Selection Approach in Continuous Integration
- URL: http://arxiv.org/abs/2509.10279v1
- Date: Fri, 12 Sep 2025 14:20:51 GMT
- Title: Targeted Test Selection Approach in Continuous Integration
- Authors: Pavel Plyusnin, Aleksey Antonov, Vasilii Ermakov, Aleksandr Khaybriev, Margarita Kikot, Ilseyar Alimova, Stanislav Moiseev
- Abstract summary: Targeted Test Selection (T-TS) is a machine learning approach for industrial test selection. On live industrial data, T-TS selects only 15% of tests, reduces execution time by $5.9\times$, accelerates the pipeline by $5.6\times$, and detects over 95% of test failures.
- Score: 34.139736599165566
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: In modern software development, change-based testing plays a crucial role. However, as codebases expand and test suites grow, efficiently managing the testing process becomes increasingly challenging, especially given the high frequency of daily code commits. We propose Targeted Test Selection (T-TS), a machine learning approach for industrial test selection. Our key innovation is a data representation that represents commits as Bags-of-Words of changed files, incorporates cross-file and additional predictive features, and notably avoids the use of coverage maps. Deployed in production, T-TS was comprehensively evaluated against industry standards and recent methods using both internal and public datasets, measuring time efficiency and fault detection. On live industrial data, T-TS selects only 15% of tests, reduces execution time by $5.9\times$, accelerates the pipeline by $5.6\times$, and detects over 95% of test failures. The implementation is publicly available to support further research and practical adoption.
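As a rough illustration of the Bag-of-Words idea mentioned in the abstract, the sketch below turns the list of files changed in a commit into a token-count vector. It is not the authors' implementation: the tokenization rule (splitting paths on separators), the function names, and the example paths are all assumptions made for illustration, and the cross-file and additional predictive features from the abstract are not modeled.

```python
import re
from collections import Counter


def path_tokens(path):
    """Split a file path into lowercase tokens (assumed tokenization rule)."""
    return [tok for tok in re.split(r"[/\\._\-]+", path.lower()) if tok]


def commit_bag_of_words(changed_files):
    """Count path tokens over all files changed in a single commit."""
    bag = Counter()
    for path in changed_files:
        bag.update(path_tokens(path))
    return bag


def vectorize(bag, vocabulary):
    """Map a bag to a fixed-order count vector over a known vocabulary."""
    return [bag.get(tok, 0) for tok in vocabulary]


# Hypothetical commit touching three files.
commit = ["src/payments/api.py", "src/payments/models.py", "docs/README.md"]
bag = commit_bag_of_words(commit)
vocab = sorted(bag)  # in practice the vocabulary would come from training data
vector = vectorize(bag, vocab)
```

A vector like this could then be fed to any classifier that predicts, per test, whether the commit is likely to make it fail; no coverage map is needed to build it, since it depends only on the names of the changed files.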
Related papers
- DiffTester: Accelerating Unit Test Generation for Diffusion LLMs via Repetitive Pattern [6.901203999358967]
We present DiffTester, an acceleration framework specifically tailored for dLLMs in Unit Test Generation (UTG). DiffTester adaptively increases the number of tokens produced at each step without compromising the quality of the output. We extend the original TestEval benchmark, which was limited to Python, by introducing additional programming languages including Java and C++.
arXiv Detail & Related papers (2025-09-29T16:04:18Z) - Unit Test Update through LLM-Driven Context Collection and Error-Type-Aware Refinement [5.8748750353007635]
Test maintenance methods primarily focus on repairing broken tests, neglecting the scenario of enhancing existing tests to verify new functionality. We propose TESTUPDATER, a novel approach that enables automated just-in-time test updates in response to production code changes. TESTUPDATER achieves a compilation pass rate of 94.4% and a test pass rate of 86.7%, outperforming the state-of-the-art method SYNTER by 15.9% and 20.0%, respectively.
arXiv Detail & Related papers (2025-09-29T08:08:22Z) - TENET: Leveraging Tests Beyond Validation for Code Generation [15.74797688806215]
Test-Driven Development (TDD) is a widely adopted software engineering practice that requires developers to create and execute tests alongside code implementation. This paper introduces TENET, an agent for generating functions in complex real-world repositories under the TDD setting. TENET achieves 69.08% and 81.77% Pass@1 on RepoCod and RepoEval benchmarks, outperforming the best agentic baselines by 9.49 and 2.17 percentage points, respectively.
arXiv Detail & Related papers (2025-09-29T00:53:16Z) - Impact of Code Context and Prompting Strategies on Automated Unit Test Generation with Modern General-Purpose Large Language Models [0.0]
Generative AI is gaining increasing attention in software engineering. Unit tests constitute the majority of test cases and are often schematic. This paper investigates the impact of code context and prompting strategies on the quality and adequacy of unit tests.
arXiv Detail & Related papers (2025-07-18T11:23:17Z) - TestForge: Feedback-Driven, Agentic Test Suite Generation [7.288137795439405]
TestForge is an agentic unit testing framework designed to cost-effectively generate high-quality test suites for real-world code. TestForge produces more natural and understandable tests compared to state-of-the-art search-based techniques.
arXiv Detail & Related papers (2025-03-18T20:21:44Z) - Feature-oriented Test Case Selection and Prioritization During the Evolution of Highly-Configurable Systems [1.5225153671736202]
We introduce FeaTestSelPrio, a feature-oriented test case selection and prioritization approach for HCSs.
Our approach selects a greater number of tests and takes longer to execute than a changed-file-oriented approach, which is used as a baseline.
The prioritization step allows reducing the average test budget in 86% of the failed commits.
arXiv Detail & Related papers (2024-06-21T16:39:10Z) - Active Test-Time Adaptation: Theoretical Analyses and An Algorithm [51.84691955495693]
Test-time adaptation (TTA) addresses distribution shifts for streaming test data in unsupervised settings.
We propose the novel problem setting of active test-time adaptation (ATTA) that integrates active learning within the fully TTA setting.
arXiv Detail & Related papers (2024-04-07T22:31:34Z) - Align Your Prompts: Test-Time Prompting with Distribution Alignment for
Zero-Shot Generalization [64.62570402941387]
We use a single test sample to adapt multi-modal prompts at test time by minimizing the feature distribution shift to bridge the gap in the test domain.
Our method improves zero-shot top-1 accuracy beyond existing prompt-learning techniques, with a 3.08% improvement over the baseline MaPLe.
arXiv Detail & Related papers (2023-11-02T17:59:32Z) - A Comprehensive Survey on Test-Time Adaptation under Distribution Shifts [117.72709110877939]
Test-time adaptation (TTA) has the potential to adapt a pre-trained model to unlabeled data during testing, before making predictions. We categorize TTA into several distinct groups based on the form of test data, namely, test-time domain adaptation, test-time batch adaptation, and online test-time adaptation.
arXiv Detail & Related papers (2023-03-27T16:32:21Z) - Robust Test-Time Adaptation in Dynamic Scenarios [9.475271284789969]
Test-time adaptation (TTA) intends to adapt the pretrained model to test distributions with only unlabeled test data streams.
We elaborate a Robust Test-Time Adaptation (RoTTA) method against the complex data stream in practical test-time adaptation (PTTA).
Our method is easy to implement, making it a good choice for rapid deployment.
arXiv Detail & Related papers (2023-03-24T10:19:14Z) - TeST: Test-time Self-Training under Distribution Shift [99.68465267994783]
Test-Time Self-Training (TeST) is a technique that takes as input a model trained on some source data and a novel data distribution at test time.
We find that models adapted using TeST significantly improve over baseline test-time adaptation algorithms.
arXiv Detail & Related papers (2022-09-23T07:47:33Z) - SUPERNOVA: Automating Test Selection and Defect Prevention in AAA Video Games Using Risk Based Testing and Machine Learning [62.997667081978825]
Testing video games is an increasingly difficult task as traditional methods fail to scale with growing software systems.
We present SUPERNOVA, a system responsible for test selection and defect prevention while also functioning as an automation hub.
The direct impact of this has been observed to be a reduction of 55% or more in testing hours for an undisclosed sports game title.
arXiv Detail & Related papers (2022-03-10T00:47:46Z) - Reinforcement Learning for Test Case Prioritization [0.24366811507669126]
This paper extends recent studies on applying Reinforcement Learning to optimize testing strategies.
We test its ability to adapt to new environments, by testing it on novel data extracted from a financial institution.
We also studied the impact of using Decision Tree (DT) Approximator as a model for memory representation.
arXiv Detail & Related papers (2020-12-18T11:08:20Z)
This list is automatically generated from the titles and abstracts of the papers on this site.