Unit Test Case Generation with Transformers and Focal Context
- URL: http://arxiv.org/abs/2009.05617v2
- Date: Thu, 20 May 2021 23:11:01 GMT
- Title: Unit Test Case Generation with Transformers and Focal Context
- Authors: Michele Tufano, Dawn Drain, Alexey Svyatkovskiy, Shao Kun Deng, Neel
Sundaresan
- Abstract summary: AthenaTest aims to generate unit test cases by learning from real-world focal methods and developer-written test cases.
We introduce Methods2Test, the largest publicly available supervised parallel corpus of unit test case methods and corresponding focal methods in Java.
We evaluate AthenaTest on five Defects4J projects, generating 25K passing test cases covering 43.7% of the focal methods with only 30 attempts.
- Score: 10.220204860586582
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Automated unit test case generation tools facilitate test-driven development
and support developers by suggesting tests intended to identify flaws in their
code. Existing approaches are usually guided by test coverage criteria,
generating synthetic test cases that are often difficult for developers to read
or understand. In this paper we propose AthenaTest, an approach that aims to
generate unit test cases by learning from real-world focal methods and
developer-written test cases. We formulate unit test case generation as a
sequence-to-sequence learning task, adopting a two-step training procedure
consisting of denoising pretraining on a large unsupervised Java corpus, and
supervised finetuning for a downstream translation task of generating unit
tests. We investigate the impact of natural language and source code
pretraining, as well as the focal context information surrounding the focal
method. Both techniques provide improvements in terms of validation loss, with
pretraining yielding a 25% relative improvement and focal context providing an
additional 11.1% improvement. We also introduce Methods2Test, the largest
publicly available supervised parallel corpus of unit test case methods and
corresponding focal methods in Java, which comprises 780K test cases mined from
91K open-source repositories from GitHub. We evaluate AthenaTest on five
Defects4J projects, generating 25K passing test cases covering 43.7% of the
focal methods with only 30 attempts. We execute the test cases, collect test
coverage information, and compare them with test cases generated by EvoSuite
and GPT-3, finding that our approach outperforms GPT-3 and achieves coverage
comparable to that of EvoSuite. Finally, we survey professional developers on their
preference in terms of readability, understandability, and testing
effectiveness of the generated tests, showing an overwhelming preference for
AthenaTest.
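To make the two-step training procedure concrete, below is a minimal fine-tuning sketch using the Hugging Face transformers library: a pretrained denoising encoder-decoder (BART-style, in the spirit of the paper) is fine-tuned to translate a focal method plus its focal context into a unit test. The checkpoint name, toy data pair, and hyperparameters are illustrative assumptions, not the authors' actual configuration.

```python
# Minimal sketch: fine-tune a pretrained denoising encoder-decoder to map a
# focal method (plus focal context) to a unit test, framed as seq2seq
# translation. Checkpoint, toy pair, and hyperparameters are assumptions.
import torch
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

checkpoint = "facebook/bart-base"  # stand-in for a BART-style pretrained model
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModelForSeq2SeqLM.from_pretrained(checkpoint)

# One (focal method + focal context) -> (test case) pair, Methods2Test-style.
source = ("public int add(int a, int b) { return a + b; } "
          "class Calculator { int add(int a, int b); }")  # focal class context
target = ("@Test public void testAdd() { Calculator c = new Calculator(); "
          "assertEquals(5, c.add(2, 3)); }")

inputs = tokenizer(source, return_tensors="pt", truncation=True)
labels = tokenizer(target, return_tensors="pt", truncation=True).input_ids

optimizer = torch.optim.AdamW(model.parameters(), lr=5e-5)
model.train()
for _ in range(3):  # a few gradient steps on the single toy pair
    loss = model(**inputs, labels=labels).loss
    loss.backward()
    optimizer.step()
    optimizer.zero_grad()

# Generate a candidate test; the paper samples multiple attempts per method.
model.eval()
output_ids = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```

At inference time the model is sampled repeatedly per focal method (the abstract reports up to 30 attempts on Defects4J), and the generated tests are compiled and executed so that only passing ones are kept.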
Related papers
- TestGenEval: A Real World Unit Test Generation and Test Completion Benchmark [24.14654309612826]
TestGenEval comprises 68,647 tests from 1,210 code and test file pairs across 11 well-maintained Python repositories.
It covers initial test authoring, test suite completion, and code coverage improvements.
We evaluate several popular models, with sizes ranging from 7B to 405B parameters.
arXiv Detail & Related papers (2024-10-01T14:47:05Z)
- Leveraging Large Language Models for Enhancing the Understandability of Generated Unit Tests [4.574205608859157]
We introduce UTGen, which combines search-based software testing and large language models to enhance the understandability of automatically generated test cases.
We observe that participants working on assignments with UTGen test cases fix up to 33% more bugs and use up to 20% less time when compared to baseline test cases.
arXiv Detail & Related papers (2024-08-21T15:35:34Z)
- Improving LLM-based Unit test generation via Template-based Repair [8.22619177301814]
Unit testing is crucial for detecting bugs in individual program units but consumes time and effort.
Large language models (LLMs) have demonstrated remarkable reasoning and generation capabilities.
In this paper, we propose TestART, a novel unit test generation method.
arXiv Detail & Related papers (2024-08-06T10:52:41Z)
- CasModaTest: A Cascaded and Model-agnostic Self-directed Framework for Unit Test Generation [5.450831103980871]
CasModaTest is a cascaded, model-agnostic, and end-to-end unit test generation framework.
It generates test prefixes and test oracles and compiles or executes them to check their effectiveness.
arXiv Detail & Related papers (2024-06-22T05:52:39Z)
- Constrained C-Test Generation via Mixed-Integer Programming [55.28927994487036]
This work proposes a novel method to generate C-Tests: a form of cloze test (a gap-filling exercise) in which only the last part of a word is turned into a gap.
In contrast to previous works that only consider varying the gap size or gap placement to achieve locally optimal solutions, we propose a mixed-integer programming (MIP) approach.
We publish our code, model, and collected data consisting of 32 English C-Tests with 20 gaps each (totaling 3,200 individual gap responses) under an open source license.
arXiv Detail & Related papers (2024-04-12T21:35:21Z)
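To make the C-Test format described above concrete, the toy sketch below turns the last half of selected words into gaps using plain string manipulation. The every-other-word selection rule is a placeholder assumption; choosing gap placement and size optimally is precisely what the paper's MIP formulation contributes, and this sketch does not attempt it.

```python
# Toy illustration of the C-Test format: the last half of selected words is
# turned into a gap. Gap *selection* in the paper is solved with MIP; here we
# simply gap every second sufficiently long word as a naive stand-in.
def make_c_test(sentence: str) -> str:
    words = sentence.split()
    gapped = []
    for i, word in enumerate(words):
        if i % 2 == 1 and len(word) > 3:  # naive selection rule (assumption)
            keep = len(word) // 2  # keep the first half of the word
            gapped.append(word[:keep] + "_" * (len(word) - keep))
        else:
            gapped.append(word)
    return " ".join(gapped)

print(make_c_test("Cloze tests ask learners to restore missing word endings"))
# -> "Cloze te___ ask lear____ to res____ missing wo__ endings"
```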
- Observation-based unit test generation at Meta [52.4716552057909]
TestGen automatically generates unit tests, carved from serialized observations of complex objects observed during app execution.
TestGen has landed 518 tests into production, which have been executed 9,617,349 times in continuous integration, finding 5,702 faults.
Our evaluation reveals that, when carving its observations from 4,361 reliable end-to-end tests, TestGen was able to generate tests for at least 86% of the classes covered by end-to-end tests.
arXiv Detail & Related papers (2024-02-09T00:34:39Z)
- Towards Automatic Generation of Amplified Regression Test Oracles [44.45138073080198]
We propose a test oracle derivation approach to amplify regression test oracles.
The approach monitors the object state during test execution and compares it to the previous version to detect any changes in relation to the SUT's intended behaviour.
arXiv Detail & Related papers (2023-07-28T12:38:44Z)
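The state-comparison idea above lends itself to a short sketch: serialize the fields of the object under test after the test action and diff that snapshot against one recorded on the previous version of the SUT. The Account class, snapshot format, and field names below are made-up illustrations, not the paper's actual instrumentation.

```python
# Minimal sketch of state-based oracle amplification: snapshot the observable
# state of the object under test after execution and diff it against a
# snapshot recorded on the previous version of the SUT.
# The Account class and snapshot format are hypothetical examples.
import json

class Account:  # hypothetical system under test
    def __init__(self):
        self.balance = 0
        self.history = []

    def deposit(self, amount):
        self.balance += amount
        self.history.append(("deposit", amount))

def snapshot(obj) -> str:
    """Serialize an object's fields into a canonical, comparable form."""
    return json.dumps(vars(obj), sort_keys=True, default=str)

# State recorded while running the same test against the previous version.
old_state = '{"balance": 100, "history": [["deposit", 100]]}'

acct = Account()
acct.deposit(100)

if snapshot(acct) == old_state:
    print("state matches previous version: oracle can assert this state")
else:
    print("state diverged: flag the change for review")
```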
- Learning Deep Semantics for Test Completion [46.842174440120196]
We formalize the novel task of test completion to automatically complete the next statement in a test method based on the context of prior statements and the code under test.
We develop TeCo -- a deep learning model using code semantics for test completion.
arXiv Detail & Related papers (2023-02-20T18:53:56Z)
- CodeT: Code Generation with Generated Tests [49.622590050797236]
We explore the use of pre-trained language models to automatically generate test cases.
CodeT executes the code solutions using the generated test cases, and then chooses the best solution.
We evaluate CodeT on five different pre-trained models with both HumanEval and MBPP benchmarks.
arXiv Detail & Related papers (2022-07-21T10:18:37Z)
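CodeT's selection step can be sketched in a few lines: execute every candidate solution against the generated test cases and keep the candidate that passes the most. This is a simplification of the paper's ranking, which scores solutions by their agreement with generated tests and with each other; the candidates and tests below are made-up illustrations.

```python
# Simplified sketch of CodeT-style selection: execute each candidate solution
# against generated test cases and keep the candidate passing the most tests.
# (Candidates and tests are made-up; the paper's ranking is more involved.)
candidates = [
    "def add(a, b): return a + b",  # correct implementation
    "def add(a, b): return a - b",  # buggy implementation
]
generated_tests = [
    "assert add(2, 3) == 5",
    "assert add(0, 0) == 0",
    "assert add(-1, 1) == 0",
]

def passed_count(solution: str, tests: list[str]) -> int:
    score = 0
    for test in tests:
        env: dict = {}
        try:
            exec(solution, env)  # define the candidate function
            exec(test, env)      # run one generated test against it
            score += 1
        except Exception:
            pass  # failing or crashing test: no credit
    return score

best = max(candidates, key=lambda s: passed_count(s, generated_tests))
print("selected:", best)  # -> the correct implementation
```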
- Segment-level Metric Learning for Few-shot Bioacoustic Event Detection [56.59107110017436]
We propose a segment-level few-shot learning framework that utilizes both the positive and negative events during model optimization.
Our system achieves an F-measure of 62.73 on the DCASE 2022 challenge task 5 (DCASE2022-T5) validation set, outperforming the baseline prototypical network (F-measure 34.02) by a large margin.
arXiv Detail & Related papers (2022-07-15T22:41:30Z)
- Generating Accurate Assert Statements for Unit Test Cases using Pretrained Transformers [10.846226514357866]
Unit testing forms the foundation of the software testing pyramid.
We present an approach to support developers in writing unit test cases by generating accurate and useful assert statements.
arXiv Detail & Related papers (2020-09-11T19:35:09Z)