Chat-like Asserts Prediction with the Support of Large Language Model
- URL: http://arxiv.org/abs/2407.21429v1
- Date: Wed, 31 Jul 2024 08:27:03 GMT
- Title: Chat-like Asserts Prediction with the Support of Large Language Model
- Authors: Han Wang, Han Hu, Chunyang Chen, Burak Turhan,
- Abstract summary: We introduce Chat-like execution-based Asserts Prediction (tool) for generating meaningful assert statements for Python projects.
tool utilizes the persona, Chain-of-Thought, and one-shot learning techniques in the prompt design, and conducts rounds of communication with LLM and Python interpreter.
Our evaluation demonstrates that tool achieves 64.7% accuracy for single assert statement generation and 62% for overall assert statement generation.
- Score: 34.140962210930624
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Unit testing is an essential component of software testing, with the assert statements playing an important role in determining whether the tested function operates as expected. Although research has explored automated test case generation, generating meaningful assert statements remains an ongoing challenge. While several studies have investigated assert statement generation in Java, limited work addresses this task in popular dynamically-typed programming languages like Python. In this paper, we introduce Chat-like execution-based Asserts Prediction (\tool), a novel Large Language Model-based approach for generating meaningful assert statements for Python projects. \tool utilizes the persona, Chain-of-Thought, and one-shot learning techniques in the prompt design, and conducts rounds of communication with LLM and Python interpreter to generate meaningful assert statements. We also present a Python assert statement dataset mined from GitHub. Our evaluation demonstrates that \tool achieves 64.7\% accuracy for single assert statement generation and 62\% for overall assert statement generation, outperforming the existing approaches. We also analyze the mismatched assert statements, which may still share the same functionality and discuss the potential help \tool could offer to the automated Python unit test generation. The findings indicate that \tool has the potential to benefit the SE community through more practical usage scenarios.
Related papers
- Multi-language Unit Test Generation using LLMs [6.259245181881262]
We describe a generic pipeline that incorporates static analysis to guide LLMs in generating compilable and high-coverage test cases.
We show how the pipeline can be applied to different programming languages, specifically Java and Python, and to complex software requiring environment mocking.
Our results demonstrate that LLM-based test generation, when guided by static analysis, can be competitive with, and even outperform, state-of-the-art test-generation techniques in coverage achieved.
arXiv Detail & Related papers (2024-09-04T21:46:18Z) - Unit Test Generation using Generative AI : A Comparative Performance
Analysis of Autogeneration Tools [2.0686733932673604]
This research aims to experimentally investigate the effectiveness of Large Language Models (LLMs) for generating unit test scripts for Python programs.
For experiments, we consider three types of code units: 1) Procedural scripts, 2) Function-based modular code, and 3) Class-based code.
Our results show that ChatGPT's performance is comparable with Pynguin in terms of coverage, though for some cases its performance is superior to Pynguin.
arXiv Detail & Related papers (2023-12-17T06:38:11Z) - A Static Evaluation of Code Completion by Large Language Models [65.18008807383816]
Execution-based benchmarks have been proposed to evaluate functional correctness of model-generated code on simple programming problems.
static analysis tools such as linters, which can detect errors without running the program, haven't been well explored for evaluating code generation models.
We propose a static evaluation framework to quantify static errors in Python code completions, by leveraging Abstract Syntax Trees.
arXiv Detail & Related papers (2023-06-05T19:23:34Z) - SAGA: Summarization-Guided Assert Statement Generation [34.51502565985728]
This paper presents a novel summarization-guided approach for automatically generating assert statements.
We leverage a pre-trained language model as the reference architecture and fine-tune it on the task of assert statement generation.
arXiv Detail & Related papers (2023-05-24T07:03:21Z) - Teaching Large Language Models to Self-Debug [62.424077000154945]
Large language models (LLMs) have achieved impressive performance on code generation.
We propose Self- Debugging, which teaches a large language model to debug its predicted program via few-shot demonstrations.
arXiv Detail & Related papers (2023-04-11T10:43:43Z) - AVATAR: A Parallel Corpus for Java-Python Program Translation [77.86173793901139]
Program translation refers to migrating source code from one language to another.
We present AVATAR, a collection of 9,515 programming problems and their solutions written in two popular languages, Java and Python.
arXiv Detail & Related papers (2021-08-26T05:44:20Z) - Infusing Finetuning with Semantic Dependencies [62.37697048781823]
We show that, unlike syntax, semantics is not brought to the surface by today's pretrained models.
We then use convolutional graph encoders to explicitly incorporate semantic parses into task-specific finetuning.
arXiv Detail & Related papers (2020-12-10T01:27:24Z) - ReAssert: Deep Learning for Assert Generation [3.8174671362014956]
We present RE-ASSERT, an approach for the automated generation of JUnit test asserts.
This is achieved by targeting projects individually, using precise code-to-test traceability for learning.
We also utilise Reformer, a state-of-the-art deep learning model, along with two models from previous work to evaluate ReAssert and an existing approach, known as ATLAS.
arXiv Detail & Related papers (2020-11-19T11:55:59Z) - Generating Accurate Assert Statements for Unit Test Cases using
Pretrained Transformers [10.846226514357866]
Unit testing represents the foundational basis of the software testing pyramid.
We present an approach to support developers in writing unit test cases by generating accurate and useful assert statements.
arXiv Detail & Related papers (2020-09-11T19:35:09Z) - Exploring Software Naturalness through Neural Language Models [56.1315223210742]
The Software Naturalness hypothesis argues that programming languages can be understood through the same techniques used in natural language processing.
We explore this hypothesis through the use of a pre-trained transformer-based language model to perform code analysis tasks.
arXiv Detail & Related papers (2020-06-22T21:56:14Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.