Software Testing with Large Language Models: Survey, Landscape, and
Vision
- URL: http://arxiv.org/abs/2307.07221v3
- Date: Mon, 4 Mar 2024 07:58:11 GMT
- Title: Software Testing with Large Language Models: Survey, Landscape, and
Vision
- Authors: Junjie Wang, Yuchao Huang, Chunyang Chen, Zhe Liu, Song Wang, Qing
Wang
- Abstract summary: Pre-trained large language models (LLMs) have emerged as a breakthrough technology in natural language processing and artificial intelligence.
This paper provides a comprehensive review of the utilization of LLMs in software testing.
- Score: 32.34617250991638
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Pre-trained large language models (LLMs) have recently emerged as a
breakthrough technology in natural language processing and artificial
intelligence, with the ability to handle large-scale datasets and exhibit
remarkable performance across a wide range of tasks. Meanwhile, software
testing is a crucial undertaking that serves as a cornerstone for ensuring the
quality and reliability of software products. As the scope and complexity of
software systems continue to grow, the need for more effective software testing
techniques becomes increasingly urgent, making it an area ripe for innovative
approaches such as the use of LLMs. This paper provides a comprehensive review
of the utilization of LLMs in software testing. It analyzes 102 relevant
studies that have used LLMs for software testing, from both the software
testing and LLMs perspectives. The paper presents a detailed discussion of the
software testing tasks for which LLMs are commonly used, among which test case
preparation and program repair are the most representative. It also analyzes
the commonly used LLMs, the types of prompt engineering that are employed, as
well as the techniques that accompany these LLMs. It also summarizes the key
challenges and potential opportunities in this direction. This work can serve
as a roadmap for future research in this area, highlighting potential avenues
for exploration, and identifying gaps in our current understanding of the use
of LLMs in software testing.
Related papers
- Studying and Benchmarking Large Language Models For Log Level Suggestion [49.176736212364496]
Large Language Models (LLMs) have become a focal point of research across various domains.
This paper investigates the impact of characteristics and learning paradigms on the performance of 12 open-source LLMs in log level suggestion.
arXiv Detail & Related papers (2024-10-11T03:52:17Z)
- Learning to Ask: When LLMs Meet Unclear Instruction [49.256630152684764]
Large language models (LLMs) can leverage external tools for addressing a range of tasks unattainable through language skills alone.
We evaluate LLMs' tool-use performance under imperfect instructions, analyze the error patterns, and build a challenging tool-use benchmark called Noisy ToolBench.
We propose a novel framework, Ask-when-Needed (AwN), which prompts LLMs to ask questions to users whenever they encounter obstacles due to unclear instructions.
arXiv Detail & Related papers (2024-08-31T23:06:12Z)
- From LLMs to LLM-based Agents for Software Engineering: A Survey of Current, Challenges and Future [15.568939568441317]
We investigate the current practice and solutions for large language models (LLMs) and LLM-based agents for software engineering.
In particular, we summarise six key topics: requirement engineering, code generation, autonomous decision-making, software design, test generation, and software maintenance.
We discuss the models and benchmarks used, providing a comprehensive analysis of their applications and effectiveness in software engineering.
arXiv Detail & Related papers (2024-08-05T14:01:15Z)
- Efficient Prompting for LLM-based Generative Internet of Things [88.84327500311464]
Large language models (LLMs) have demonstrated remarkable capacities on various tasks, and integrating the capacities of LLMs into the Internet of Things (IoT) applications has drawn much research attention recently.
Due to security concerns, many institutions avoid accessing state-of-the-art commercial LLM services, requiring the deployment and utilization of open-source LLMs in a local network setting.
In this study, we propose an LLM-based Generative IoT (GIoT) system deployed in the local network setting.
arXiv Detail & Related papers (2024-06-14T19:24:00Z)
- Requirements are All You Need: From Requirements to Code with LLMs [0.0]
Large language models (LLMs) can be applied to software engineering tasks.
This paper introduces a tailored LLM for automating the generation of code snippets from well-structured requirements documents.
We demonstrate the LLM's proficiency in comprehending intricate user requirements and producing robust design and code solutions.
arXiv Detail & Related papers (2024-06-14T14:57:35Z)
- A Software Engineering Perspective on Testing Large Language Models: Research, Practice, Tools and Benchmarks [2.8061460833143346]
Large Language Models (LLMs) are rapidly becoming ubiquitous both as stand-alone tools and as components of current and future software systems.
To enable usage of LLMs in the high-stake or safety-critical systems of 2030, they need to undergo rigorous testing.
arXiv Detail & Related papers (2024-06-12T13:45:45Z)
- LLM Inference Unveiled: Survey and Roofline Model Insights [62.92811060490876]
Large Language Model (LLM) inference is rapidly evolving, presenting a unique blend of opportunities and challenges.
Our survey stands out from traditional literature reviews by not only summarizing the current state of research but also introducing a framework based on the roofline model.
This framework identifies the bottlenecks when deploying LLMs on hardware devices and provides a clear understanding of practical problems.
arXiv Detail & Related papers (2024-02-26T07:33:05Z)
- Are We Testing or Being Tested? Exploring the Practical Applications of
Large Language Models in Software Testing [0.0]
A Large Language Model (LLM) represents a cutting-edge artificial intelligence model that generates coherent content.
LLMs can play a pivotal role in software development, including software testing.
This study explores the practical application of LLMs in software testing within an industrial setting.
arXiv Detail & Related papers (2023-12-08T06:30:37Z)
- LM-Polygraph: Uncertainty Estimation for Language Models [71.21409522341482]
Uncertainty estimation (UE) methods are one path to safer, more responsible, and more effective use of large language models (LLMs).
We introduce LM-Polygraph, a framework with implementations of a battery of state-of-the-art UE methods for LLMs in text generation tasks, with unified program interfaces in Python.
It introduces an extendable benchmark for consistent evaluation of UE techniques by researchers, and a demo web application that enriches the standard chat dialog with confidence scores.
arXiv Detail & Related papers (2023-11-13T15:08:59Z)
- Towards an Understanding of Large Language Models in Software Engineering Tasks [29.30433406449331]
Large Language Models (LLMs) have drawn widespread attention and research due to their astounding performance in text generation and reasoning tasks.
The evaluation and optimization of LLMs in software engineering tasks, such as code generation, have become a research focus.
This paper comprehensively investigates and collates the research and products combining LLMs with software engineering.
arXiv Detail & Related papers (2023-08-22T12:37:29Z)
- Self-Checker: Plug-and-Play Modules for Fact-Checking with Large Language Models [75.75038268227554]
Self-Checker is a framework comprising a set of plug-and-play modules that facilitate fact-checking.
This framework provides a fast and efficient way to construct fact-checking systems in low-resource environments.
arXiv Detail & Related papers (2023-05-24T01:46:07Z)