SUPERNOVA: Automating Test Selection and Defect Prevention in AAA Video
Games Using Risk Based Testing and Machine Learning
- URL: http://arxiv.org/abs/2203.05566v2
- Date: Wed, 28 Jun 2023 16:35:23 GMT
- Title: SUPERNOVA: Automating Test Selection and Defect Prevention in AAA Video
Games Using Risk Based Testing and Machine Learning
- Authors: Alexander Senchenko, Naomi Patterson, Hamman Samuel, Dan Isper
- Abstract summary: Testing video games is an increasingly difficult task as traditional methods fail to scale with growing software systems.
We present SUPERNOVA, a system responsible for test selection and defect prevention while also functioning as an automation hub.
The direct impact of this has been observed to be a reduction of 55% or more in testing hours for an undisclosed sports game title.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Testing video games is an increasingly difficult task as traditional
methods fail to scale with growing software systems. Manual testing is a
labor-intensive process and therefore quickly becomes cost prohibitive.
Scripted automated testing is affordable; however, scripts are ineffective in
non-deterministic environments, and knowing when to run each test is another
problem altogether. The complexity and scope of modern games, and the
expectations players have of them, are rapidly increasing, and quality control
now accounts for a large portion of production cost and delivery risk.
Reducing this risk while keeping production on schedule is a major challenge
for the industry. To keep production costs realistic up to and after release,
we focus on preventive quality assurance tactics alongside testing and data
analysis automation. We present SUPERNOVA (Selection of tests and Universal
defect Prevention in External Repositories for Novel Objective Verification of
software Anomalies), a system responsible for test selection and defect
prevention that also functions as an automation hub. By integrating data
analysis functionality with machine and deep learning capability, SUPERNOVA
assists quality assurance testers in finding bugs and developers in reducing
defects, which improves stability during the production cycle and keeps
testing costs under control. The direct impact of these test selection
optimizations has been a reduction of 55% or more in testing hours for an
undisclosed sports game title that has shipped. Furthermore, using risk scores
generated by a semi-supervised machine learning model, we can detect with 71%
precision and 77% recall whether a change-list is bug inducing, and provide a
detailed breakdown of this inference to developers. These efforts improve
workflow and reduce the testing hours required on game titles in development.
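The two quantitative claims above suggest a concrete pattern: train a
semi-supervised classifier on partially labeled change-lists, read its
predicted probability as a bug-inducing risk score, evaluate it with precision
and recall, and let the scores drive test selection. The Python sketch below
illustrates that pattern with scikit-learn's self-training wrapper; the
features, synthetic data, thresholds, and test-to-change-list mapping are
illustrative assumptions, not SUPERNOVA's actual pipeline.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import precision_score, recall_score
from sklearn.semi_supervised import SelfTrainingClassifier

rng = np.random.default_rng(0)

# Hypothetical per-change-list features, e.g. lines changed, files touched,
# author's recent defect rate, age of the touched code.
X = rng.normal(size=(1000, 4))
y_true = (X @ np.array([1.5, 1.0, 2.0, 0.5]) + rng.normal(size=1000) > 0).astype(int)

# Only a small fraction of change-lists carry a bug-inducing label; the rest
# are marked -1, scikit-learn's convention for "unlabeled".
y_train = y_true.copy()
unlabeled = rng.random(1000) > 0.2
y_train[unlabeled] = -1

# Self-training: fit on the labeled subset, then iteratively pseudo-label
# high-confidence unlabeled change-lists and refit.
model = SelfTrainingClassifier(LogisticRegression(), threshold=0.9)
model.fit(X, y_train)

# Risk score = predicted probability that a change-list is bug inducing.
risk = model.predict_proba(X)[:, 1]
pred = (risk >= 0.5).astype(int)
print(f"precision={precision_score(y_true[unlabeled], pred[unlabeled]):.2f} "
      f"recall={recall_score(y_true[unlabeled], pred[unlabeled]):.2f}")

# Test selection sketch: prioritize tests by the highest risk score among the
# change-lists they exercise (the coverage mapping here is made up).
test_coverage = {
    "test_physics": [3, 17, 42],
    "test_rendering": [8, 99],
    "test_matchmaking": [5, 42, 250],
}
run_order = sorted(test_coverage,
                   key=lambda t: max(risk[i] for i in test_coverage[t]),
                   reverse=True)
print("run order:", run_order)
```

In a setup like this, the per-feature contributions of a linear model are one
plausible way to produce the "detailed breakdown" of the inference that the
abstract says is shown to developers.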
Related papers
- AutoPT: How Far Are We from the End2End Automated Web Penetration Testing?
We introduce AutoPT, an automated penetration testing agent based on the principle of PSM driven by LLMs.
Our results show that AutoPT outperforms the baseline framework ReAct on the GPT-4o mini model.
arXiv Detail & Related papers (2024-11-02T13:24:30Z)
- The Future of Software Testing: AI-Powered Test Case Generation and Validation
This paper explores the transformative potential of AI in improving test case generation and validation.
It focuses on its ability to enhance efficiency, accuracy, and scalability in testing processes.
It also addresses key challenges associated with adapting AI for testing, including the need for high-quality training data.
arXiv Detail & Related papers (2024-09-09T17:12:40Z)
- Leveraging Large Language Models for Efficient Failure Analysis in Game Development
This paper proposes a new approach to automatically identify which change in the code caused a test to fail.
The method leverages Large Language Models (LLMs) to associate error messages with the corresponding code changes causing the failure.
Our approach reaches an accuracy of 71% in our newly created dataset, which comprises issues reported by developers at EA over a period of one year.
arXiv Detail & Related papers (2024-06-11T09:21:50Z)
- Automated Test Case Repair Using Language Models
Unrepaired broken test cases can degrade test suite quality and disrupt the software development process.
We present TaRGet, a novel approach leveraging pre-trained code language models for automated test case repair.
TaRGet treats test repair as a language translation task, employing a two-step process to fine-tune a language model.
arXiv Detail & Related papers (2024-01-12T18:56:57Z)
- Identifying the Risks of LM Agents with an LM-Emulated Sandbox
Language Model (LM) agents and tools enable a rich set of capabilities but also amplify potential risks.
The high cost of testing these agents makes it increasingly difficult to find high-stakes, long-tailed risks.
We introduce ToolEmu: a framework that uses an LM to emulate tool execution and enables the testing of LM agents against a diverse range of tools and scenarios.
arXiv Detail & Related papers (2023-09-25T17:08:02Z)
- Technical Challenges of Deploying Reinforcement Learning Agents for Game Testing in AAA Games
We describe an effort to add an experimental reinforcement learning system to an existing automated game testing solution based on scripted bots.
We show a use-case of leveraging reinforcement learning in game production and cover some of the largest time sinks that anyone attempting the same journey for their game may encounter.
We propose a few research directions that we believe will be valuable and necessary for making machine learning, and especially reinforcement learning, an effective tool in game production.
arXiv Detail & Related papers (2023-07-19T18:19:23Z)
- Distribution Awareness for AI System Testing
We propose a new OOD-guided testing technique which aims to generate new unseen test cases relevant to the underlying DL system task.
Our results show that this technique is able to filter up to 55.44% of erroneous test cases on CIFAR-10 and is 10.05% more effective in enhancing robustness.
arXiv Detail & Related papers (2021-05-06T09:24:06Z)
- Anomaly Detection Based on Selection and Weighting in Latent Space
We propose a novel selection-and-weighting-based anomaly detection framework called SWAD.
Experiments on both benchmark and real-world datasets have shown the effectiveness and superiority of SWAD.
arXiv Detail & Related papers (2021-03-08T10:56:38Z)
- Reinforcement Learning for Test Case Prioritization
This paper extends recent studies on applying Reinforcement Learning to optimize testing strategies.
We test its ability to adapt to new environments by evaluating it on novel data extracted from a financial institution.
We also study the impact of using a Decision Tree (DT) approximator as a model for memory representation.
arXiv Detail & Related papers (2020-12-18T11:08:20Z)