Accelerating Continuous Integration with Parallel Batch Testing
- URL: http://arxiv.org/abs/2308.13129v1
- Date: Fri, 25 Aug 2023 01:09:31 GMT
- Title: Accelerating Continuous Integration with Parallel Batch Testing
- Authors: Emad Fallahzadeh, Amir Hossein Bavand, and Peter C. Rigby (Concordia University, Montreal, Quebec, Canada)
- Abstract summary: Continuous integration at scale is essential to software development.
Various techniques, including test selection and prioritization, aim to reduce its cost.
This study evaluates the effect of parallelization by adjusting the number of test machines.
We propose TestCaseBatching, which enables new builds to join a batch before full test execution.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Continuous integration at scale is costly but essential to software
development. Various test optimization techniques, including test selection and
prioritization, aim to reduce this cost. Test batching is an effective but
overlooked alternative. This study evaluates the effect of parallelization on
test batching by adjusting the number of machines and introduces two novel
approaches.
We establish TestAll as a baseline to study the impact of parallelism and
machine count on feedback time. We re-evaluate ConstantBatching and introduce
DynamicBatching, which adapts batch size based on the remaining changes in the
queue. We also propose TestCaseBatching, enabling new builds to join a batch
before full test execution, thus speeding up continuous integration. Our
evaluations use Ericsson's results and 276 million test outcomes from the
open-source Chrome project, assessing feedback time and execution reduction;
we also provide access to our Chrome project scripts and data.
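The batching policies above can be illustrated compactly. The following Python sketch shows how ConstantBatching, DynamicBatching, and TestCaseBatching might form batches from a queue of pending changes; the function names, data structures, and the dynamic sizing rule are illustrative assumptions, not the authors' implementation.

```python
from collections import deque

def constant_batch(queue: deque, batch_size: int = 4) -> list:
    """ConstantBatching: always test a fixed-size batch of queued changes."""
    return [queue.popleft() for _ in range(min(batch_size, len(queue)))]

def dynamic_batch(queue: deque, free_machines: int) -> list:
    """DynamicBatching: size each batch from the changes still waiting in the
    queue (here, a hypothetical rule: spread the queue over idle machines)."""
    size = max(1, len(queue) // max(1, free_machines))
    return [queue.popleft() for _ in range(min(size, len(queue)))]

def join_running_batch(batch: list, pending_tests: set, new_change: str) -> bool:
    """TestCaseBatching: a newly arrived build may join an in-flight batch,
    but only the test cases that have not yet executed (pending_tests) will
    cover it; tests that already finished are not re-run for the newcomer."""
    if not pending_tests:  # batch fully executed; too late to join
        return False
    batch.append(new_change)
    return True

# Usage with a hypothetical queue of ten pending changes:
queue = deque(f"change-{i}" for i in range(10))
print(constant_batch(queue))                  # first 4 changes
print(dynamic_batch(queue, free_machines=3))  # 6 remaining // 3 machines = 2
```

The paper's actual sizing formula may differ; the point of the sketch is that DynamicBatching reacts to queue pressure while ConstantBatching does not, and TestCaseBatching lets late arrivals reuse tests that have not yet run.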
The results reveal a non-linear impact of test parallelization on feedback
time, as each test delay compounds across the entire test queue.
ConstantBatching, with a batch size of 4, utilizes up to 72% fewer machines to
maintain the actual average feedback time and provides a constant execution
reduction of up to 75%. Similarly, DynamicBatching maintains the actual average
feedback time with up to 91% fewer machines and exhibits variable execution
reduction of up to 99%. TestCaseBatching maintains the actual average
feedback time with up to 81% fewer machines and demonstrates variable execution
reduction of up to 67%. We recommend that practitioners use DynamicBatching and
TestCaseBatching to reduce the number of required testing machines efficiently.
Analyzing historical data to find the threshold beyond which adding machines
has minimal impact on feedback time is also crucial for resource-effective
testing.
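The closing recommendation can be made concrete with a simple diminishing-returns scan over historical measurements. In the sketch below, the feedback-time numbers and the 5% relative-gain cutoff are made up for illustration; they are not values from the paper.

```python
def saturation_point(feedback_by_machines: dict, min_gain: float = 0.05) -> int:
    """Return the smallest machine count after which adding more machines
    improves mean feedback time by less than `min_gain` (relative)."""
    counts = sorted(feedback_by_machines)
    for prev, curr in zip(counts, counts[1:]):
        gain = (feedback_by_machines[prev] - feedback_by_machines[curr]) \
               / feedback_by_machines[prev]
        if gain < min_gain:
            return prev
    return counts[-1]

# Hypothetical history: mean feedback time (minutes) per machine count.
history = {4: 120.0, 8: 55.0, 16: 30.0, 32: 29.0, 64: 28.5}
print(saturation_point(history))  # -> 16: past 16 machines, gains fall under 5%
```

Running such a scan on real per-machine-count feedback data would identify where a team can stop adding hardware without hurting feedback time.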
Related papers
- CorrectBench: Automatic Testbench Generation with Functional Self-Correction using LLMs for HDL Design [6.414167153186868]
We propose CorrectBench, an automatic testbench generation framework with functional self-validation and self-correction.
The proposed approach can validate the correctness of the generated testbenches with a success rate of 88.85%.
Our work's performance is 62.18% higher than previous work in sequential tasks and almost 5 times the pass ratio of the direct method.
arXiv Detail & Related papers (2024-11-13T10:45:19Z)
- Active Test-Time Adaptation: Theoretical Analyses and An Algorithm [51.84691955495693]
Test-time adaptation (TTA) addresses distribution shifts for streaming test data in unsupervised settings.
We propose the novel problem setting of active test-time adaptation (ATTA) that integrates active learning within the fully TTA setting.
arXiv Detail & Related papers (2024-04-07T22:31:34Z)
- Taming Timeout Flakiness: An Empirical Study of SAP HANA [47.29324864511411]
Flaky tests negatively affect regression testing because they result in test failures that are not necessarily caused by code changes.
Test timeouts are one contributing factor to such flaky test failures.
Test flakiness rate ranges from 49% to 70%, depending on the number of repeated test executions.
arXiv Detail & Related papers (2024-02-07T20:01:41Z)
- PACE: A Program Analysis Framework for Continuous Performance Prediction [0.0]
PACE is a program analysis framework that provides continuous feedback on the performance impact of pending code updates.
We design performance microbenchmarks by mapping the execution time of functional test cases given a code update.
Our experiments achieve strong accuracy in predicting code performance, outperforming the current state of the art by 75% on neural-represented code stylometry features.
arXiv Detail & Related papers (2023-12-01T20:43:34Z)
- Align Your Prompts: Test-Time Prompting with Distribution Alignment for Zero-Shot Generalization [64.62570402941387]
We use a single test sample to adapt multi-modal prompts at test time by minimizing the feature distribution shift to bridge the gap in the test domain.
Our method improves zero-shot top-1 accuracy beyond existing prompt-learning techniques, with a 3.08% improvement over the baseline MaPLe.
arXiv Detail & Related papers (2023-11-02T17:59:32Z)
- Towards Automatic Generation of Amplified Regression Test Oracles [44.45138073080198]
We propose a test oracle derivation approach to amplify regression test oracles.
The approach monitors the object state during test execution and compares it to the previous version to detect any changes in relation to the SUT's intended behaviour.
arXiv Detail & Related papers (2023-07-28T12:38:44Z)
- Sequential Kernelized Independence Testing [101.22966794822084]
We design sequential kernelized independence tests inspired by kernelized dependence measures.
We demonstrate the power of our approaches on both simulated and real data.
arXiv Detail & Related papers (2022-12-14T18:08:42Z)
- Test-time Batch Normalization [61.292862024903584]
Deep neural networks often suffer from the data distribution shift between training and testing.
We revisit the batch normalization (BN) in the training process and reveal two key insights benefiting test-time optimization.
We propose a novel test-time BN layer design, GpreBN, which is optimized during testing by minimizing Entropy loss.
arXiv Detail & Related papers (2022-05-20T14:33:39Z)
- SUPERNOVA: Automating Test Selection and Defect Prevention in AAA Video Games Using Risk Based Testing and Machine Learning [62.997667081978825]
Testing video games is an increasingly difficult task as traditional methods fail to scale with growing software systems.
We present SUPERNOVA, a system responsible for test selection and defect prevention while also functioning as an automation hub.
The direct impact of this has been an observed reduction of 55% or more in testing hours for an undisclosed sports game title.
arXiv Detail & Related papers (2022-03-10T00:47:46Z)
- Automated User Experience Testing through Multi-Dimensional Performance Impact Analysis [0.0]
We propose a novel automated user experience testing methodology.
It learns how code changes impact the time that unit and system tests take, and extrapolates user experience changes from this information.
Our open-source tool achieved 3.7% mean absolute error rate with a random forest regressor.
arXiv Detail & Related papers (2021-04-08T01:18:01Z)