Automatic Unit Test Generation for Deep Learning Frameworks based on API
Knowledge
- URL: http://arxiv.org/abs/2307.00404v1
- Date: Sat, 1 Jul 2023 18:34:56 GMT
- Title: Automatic Unit Test Generation for Deep Learning Frameworks based on API
Knowledge
- Authors: Arunkaleeshwaran Narayanan, Nima Shiri harzevili, Junjie Wang, Lin
Shi, Moshi Wei, Song Wang
- Abstract summary: We propose MUTester to generate unit test cases for APIs of deep learning frameworks.
We first propose a set of 18 rules for mining API constraints from the API documents.
We then use the frequent itemset mining technique to mine the API usage patterns from a large corpus of machine learning API related code fragments.
- Score: 11.523398693942413
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Many automatic unit test generation tools that can generate unit test cases
with high coverage over a program have been proposed. However, most of these
tools are ineffective on deep learning (DL) frameworks because many DL APIs
expect inputs that satisfy API-specific constraints. To
fill this gap, we propose MUTester to generate unit test cases for APIs of deep
learning frameworks by leveraging the API constraints mined from the
corresponding API documentation and the API usage patterns mined from code
fragments in Stack Overflow (SO). Particularly, we first propose a set of 18
rules for mining API constraints from the API documents. We then use the
frequent itemset mining technique to mine the API usage patterns from a large
corpus of machine learning API related code fragments collected from SO.
Finally, we use the above two types of API knowledge to guide the test
generation of existing test generators for deep learning frameworks. To
evaluate the performance of MUTester, we first collect 1,971 APIs from four
widely-used deep learning frameworks (i.e., Scikit-learn, PyTorch, TensorFlow,
and CNTK) and for each API, we further extract its API knowledge, i.e., API
constraints and API usage. Given an API, MUTester combines its API knowledge
with existing test generators (e.g., search-based test generator PyEvosuite and
random test generator PyRandoop) to generate test cases to test the API.
Results of our experiments show that MUTester can significantly improve the
corresponding test generation methods, with an average code coverage
improvement of 15.7% to 27.0%. In addition, it reduces the number of invalid
tests generated by the existing test generators by around 19.0%. Our user study with 16
developers further demonstrates the practicality of MUTester in generating test
cases for deep learning frameworks.
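As a concrete illustration of the usage-pattern step described in the abstract, the sketch below mines frequently co-occurring API sets from a toy corpus with a brute-force, Apriori-style count. The snippets, the support threshold, and the size cap are assumptions made for illustration; they are not data or code from the paper.

```python
from collections import Counter
from itertools import combinations

# Toy corpus standing in for Stack Overflow code fragments: each entry is the
# set of framework APIs called in one snippet (illustrative, hand-picked).
snippets = [
    {"torch.tensor", "torch.nn.Linear", "torch.nn.ReLU"},
    {"torch.tensor", "torch.nn.Linear", "torch.optim.SGD"},
    {"torch.tensor", "torch.nn.Linear"},
    {"torch.randn", "torch.nn.Conv2d", "torch.nn.ReLU"},
]

def frequent_itemsets(transactions, min_support=0.5, max_size=3):
    """Brute-force enumeration of API sets whose support meets the threshold.

    A real miner would prune candidates Apriori-style; this version simply
    counts every combination, which is enough for a small corpus.
    """
    n = len(transactions)
    patterns = {}
    for size in range(1, max_size + 1):
        counts = Counter()
        for apis in transactions:
            for combo in combinations(sorted(apis), size):
                counts[combo] += 1
        frequent = {c: k / n for c, k in counts.items() if k / n >= min_support}
        if not frequent:
            break  # no frequent sets of this size, so none larger either
        patterns.update(frequent)
    return patterns

for pattern, support in sorted(frequent_itemsets(snippets).items(), key=lambda kv: -kv[1]):
    print(f"{support:.2f}  {' + '.join(pattern)}")
```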
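The other ingredient, constraint-guided test generation, can be sketched in the same spirit: a hand-written knowledge entry (standing in for what MUTester mines from API docs and SO snippets) constrains parameter sampling, and a mined usage template is filled in to form a unit-test body. The Conv2d entry, the rule fields, and the template are illustrative assumptions, not the paper's 18 rules, and the integration with PyEvosuite or PyRandoop is not shown.

```python
import random

# Hand-written knowledge entry for one API, standing in for mined knowledge.
API_KNOWLEDGE = {
    "torch.nn.Conv2d": {
        "constraints": {          # would be mined from the API documentation
            "in_channels":  {"type": int, "min": 1, "max": 8},
            "out_channels": {"type": int, "min": 1, "max": 8},
            "kernel_size":  {"type": int, "min": 1, "max": 8},
        },
        # A usage pattern as it might be mined from SO snippets: build the
        # layer, then call it on a tensor whose channel dim matches in_channels.
        "usage_template": (
            "import torch\n"
            "layer = torch.nn.Conv2d({in_channels}, {out_channels}, {kernel_size})\n"
            "out = layer(torch.randn(1, {in_channels}, 8, 8))\n"
            "assert out.shape[1] == {out_channels}\n"
        ),
    }
}

def generate_test(api_name, seed=0):
    """Sample parameter values satisfying the mined constraints and fill in
    the mined usage template, producing one runnable unit-test body."""
    random.seed(seed)
    knowledge = API_KNOWLEDGE[api_name]
    values = {}
    for param, rule in knowledge["constraints"].items():
        if rule["type"] is int:
            values[param] = random.randint(rule["min"], rule["max"])
    return knowledge["usage_template"].format(**values)

if __name__ == "__main__":
    print(generate_test("torch.nn.Conv2d"))
```

In this sketch the constraints keep the sampled arguments inside the API's valid domain, which is exactly the kind of knowledge the paper credits for reducing invalid tests produced by unguided generators.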
Related papers
- KAT: Dependency-aware Automated API Testing with Large Language Models [1.7264233311359707]
KAT (Katalon API Testing) is a novel AI-driven approach that autonomously generates test cases to validate APIs.
Our evaluation of KAT using 12 real-world services shows that it can improve validation coverage, detect more undocumented status codes, and reduce false positives in these services.
arXiv Detail & Related papers (2024-07-14T14:48:18Z)
- WorldAPIs: The World Is Worth How Many APIs? A Thought Experiment [49.00213183302225]
We propose a framework to induce new APIs by grounding wikiHow instruction to situated agent policies.
Inspired by recent successes in large language models (LLMs) for embodied planning, we propose a few-shot prompting to steer GPT-4.
arXiv Detail & Related papers (2024-07-10T15:52:44Z)
- DLLens: Testing Deep Learning Libraries via LLM-aided Synthesis [8.779035160734523]
Testing is a major approach to ensuring the quality of deep learning (DL) libraries.
Existing testing techniques commonly adopt differential testing to relieve the need for test oracle construction.
This paper introduces DLLens, a novel differential testing technique for DL library testing.
arXiv Detail & Related papers (2024-06-12T07:06:38Z)
- A Solution-based LLM API-using Methodology for Academic Information Seeking [49.096714812902576]
SoAy is a solution-based LLM API-using methodology for academic information seeking.
It uses code with a solution as the reasoning method, where a solution is a pre-constructed API calling sequence.
Results show a 34.58-75.99% performance improvement compared to state-of-the-art LLM API-based baselines.
arXiv Detail & Related papers (2024-05-24T02:44:14Z)
- Leveraging Large Language Models to Improve REST API Testing [51.284096009803406]
RESTGPT takes as input an API specification, extracts machine-interpretable rules, and generates example parameter values from natural-language descriptions in the specification.
Our evaluations indicate that RESTGPT outperforms existing techniques in both rule extraction and value generation.
arXiv Detail & Related papers (2023-12-01T19:53:23Z)
- Exploring Behaviours of RESTful APIs in an Industrial Setting [0.43012765978447565]
We propose a set of behavioural properties, common to REST APIs, which are used to generate examples of behaviours that these APIs exhibit.
These examples can be used both (i) to further the understanding of the API and (ii) as a source of automatic test cases.
Our approach can generate examples that practitioners deem relevant both for understanding the system and as a source of test cases.
arXiv Detail & Related papers (2023-10-26T11:33:11Z)
- APICom: Automatic API Completion via Prompt Learning and Adversarial Training-based Data Augmentation [6.029137544885093]
API recommendation is the process of assisting developers in finding the required API among numerous candidate APIs.
Previous studies mainly modeled API recommendation as a recommendation task, under which developers may still not find the API they need.
Motivated by the neural machine translation research domain, this problem can instead be modeled as a generation task.
We propose a novel approach APICom based on prompt learning, which can generate API related to the query according to the prompts.
arXiv Detail & Related papers (2023-09-13T15:31:50Z)
- ToolLLM: Facilitating Large Language Models to Master 16000+ Real-world APIs [104.37772295581088]
Open-source large language models (LLMs), e.g., LLaMA, remain significantly limited in tool-use capabilities.
We introduce ToolLLM, a general tool-use framework encompassing data construction, model training, and evaluation.
We first present ToolBench, an instruction-tuning dataset for tool use, which is constructed automatically using ChatGPT.
arXiv Detail & Related papers (2023-07-31T15:56:53Z)
- Private-Library-Oriented Code Generation with Large Language Models [52.73999698194344]
This paper focuses on utilizing large language models (LLMs) for code generation in private libraries.
We propose a novel framework that emulates the process of programmers writing private code.
We create four private library benchmarks, including TorchDataEval, TorchDataComplexEval, MonkeyEval, and BeatNumEval.
arXiv Detail & Related papers (2023-07-28T07:43:13Z) - Carving UI Tests to Generate API Tests and API Specification [8.743426215048451]
API-level testing can play an important role, in-between unit-level testing and UI-level (or end-to-end) testing.
Existing API testing tools require API specifications, which often may not be available or, when available, be inconsistent with the API implementation.
We present an approach that leverages UI testing to enable API-level testing for web applications.
arXiv Detail & Related papers (2023-05-24T03:53:34Z) - torchgfn: A PyTorch GFlowNet library [56.071033896777784]
torchgfn is a PyTorch library for generative flow networks (GFlowNets).
It provides users with a simple API for environments and useful abstractions for samplers and losses.
arXiv Detail & Related papers (2023-05-24T00:20:59Z)