SpeCrawler: Generating OpenAPI Specifications from API Documentation Using Large Language Models
- URL: http://arxiv.org/abs/2402.11625v1
- Date: Sun, 18 Feb 2024 15:33:24 GMT
- Title: SpeCrawler: Generating OpenAPI Specifications from API Documentation Using Large Language Models
- Authors: Koren Lazar, Matan Vetzler, Guy Uziel, David Boaz, Esther Goldbraich, David Amid, Ateret Anaby-Tavor
- Abstract summary: SpeCrawler is a comprehensive system that generates OpenAPI Specifications from diverse API documentation.
The paper explores SpeCrawler's methodology, supported by empirical evidence and case studies.
- Score: 8.372941103284774
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In the digital era, the widespread use of APIs is evident. However, scalable
utilization of APIs poses a challenge due to structure divergence observed in
online API documentation. This underscores the need for automatic tools to
facilitate API consumption. A viable approach involves the conversion of
documentation into an API Specification format. While previous attempts have
been made using rule-based methods, these approaches encountered difficulties
in generalizing across diverse documentation. In this paper we introduce
SpeCrawler, a comprehensive system that utilizes large language models (LLMs)
to generate OpenAPI Specifications from diverse API documentation through a
carefully crafted pipeline. By creating a standardized format for numerous
APIs, SpeCrawler aids in streamlining integration processes within API
orchestrating systems and facilitating the incorporation of tools into LLMs.
The paper explores SpeCrawler's methodology, supported by empirical evidence
and case studies, demonstrating its efficacy through LLM capabilities.
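To make the abstract's idea concrete, here is a minimal illustrative sketch of the kind of step such a pipeline could contain: prompt an LLM with a scraped documentation page, ask it to emit an OpenAPI 3.0 JSON object, and apply a basic structural check. The prompt wording, the generate_openapi_spec helper, and the stubbed model are assumptions made for illustration, not SpeCrawler's actual implementation.

```python
"""Illustrative sketch only: prompt an LLM to turn scraped API documentation
into an OpenAPI 3.0 JSON object and sanity-check the result. The prompt text,
helper names, and the stubbed model are assumptions, not SpeCrawler's code."""
import json
from typing import Callable

PROMPT_TEMPLATE = (
    "Convert the following API documentation into an OpenAPI 3.0 JSON object "
    "with 'openapi', 'info', and 'paths' fields. Return JSON only.\n\n{doc}"
)

def generate_openapi_spec(doc_text: str, llm: Callable[[str], str]) -> dict:
    """Ask any text-in/text-out LLM callable for a spec, then parse and check it."""
    raw = llm(PROMPT_TEMPLATE.format(doc=doc_text))
    spec = json.loads(raw)  # raises ValueError if the model strays from JSON
    for field in ("openapi", "info", "paths"):  # minimal structural check only
        if field not in spec:
            raise ValueError(f"missing required OpenAPI field: {field}")
    return spec

if __name__ == "__main__":
    # Stub model so the sketch runs end to end without any API access.
    def fake_llm(prompt: str) -> str:
        return json.dumps({
            "openapi": "3.0.0",
            "info": {"title": "Weather API", "version": "1.0"},
            "paths": {"/forecast": {"get": {
                "summary": "Get a forecast for a city",
                "parameters": [{"name": "city", "in": "query",
                                "schema": {"type": "string"}}],
                "responses": {"200": {"description": "OK"}},
            }}},
        })

    doc = "GET /forecast?city=<name> returns the weather forecast for a city."
    print(json.dumps(generate_openapi_spec(doc, fake_llm), indent=2))
```

A real pipeline would replace the stub with an actual model call, retry or repair malformed JSON, and validate the result against the full OpenAPI schema.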
Related papers
- SEAL: Suite for Evaluating API-use of LLMs [1.2528321519119252]
SEAL is an end-to-end testbed designed to evaluate large language models in real-world API usage.
It standardizes existing benchmarks, integrates an agent system for testing API retrieval and planning, and addresses the instability of real-time APIs.
arXiv Detail & Related papers (2024-09-23T20:16:49Z)
- A Systematic Evaluation of Large Code Models in API Suggestion: When, Which, and How [53.65636914757381]
API suggestion is a critical task in modern software development.
Recent advancements in large code models (LCMs) have shown promise in the API suggestion task.
arXiv Detail & Related papers (2024-09-20T03:12:35Z)
- FANTAstic SEquences and Where to Find Them: Faithful and Efficient API Call Generation through State-tracked Constrained Decoding and Reranking [57.53742155914176]
API call generation is the cornerstone of large language models' tool-using ability.
Existing supervised and in-context learning approaches suffer from high training costs, poor data efficiency, and generated API calls that can be unfaithful to the API documentation and the user's request.
We propose an output-side optimization approach called FANTASE to address these limitations.
arXiv Detail & Related papers (2024-07-18T23:44:02Z)
- WorldAPIs: The World Is Worth How Many APIs? A Thought Experiment [49.00213183302225]
We propose a framework to induce new APIs by grounding wikiHow instructions to situated agent policies.
Inspired by recent successes of large language models (LLMs) in embodied planning, we propose a few-shot prompting approach to steer GPT-4.
arXiv Detail & Related papers (2024-07-10T15:52:44Z)
- A Solution-based LLM API-using Methodology for Academic Information Seeking [49.096714812902576]
SoAy is a solution-based LLM API-using methodology for academic information seeking.
It uses code with a solution as the reasoning method, where a solution is a pre-constructed API calling sequence (a minimal sketch follows this entry).
Results show a 34.58-75.99% performance improvement compared to state-of-the-art LLM API-based baselines.
arXiv Detail & Related papers (2024-05-24T02:44:14Z)
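As a hypothetical illustration of a "solution" in the sense used in the SoAy entry above (a pre-constructed API calling sequence expressed as ordinary code, which the model selects and fills in rather than planning every call from scratch), here is a minimal sketch; search_author and get_publications are invented stand-ins, not SoAy's actual academic-search API.

```python
# Hypothetical sketch of a "solution": a pre-constructed API calling sequence.
# The functions below are stubs invented for illustration, not a real API.

def search_author(name: str) -> dict:
    """Pretend to resolve an author name to an author record."""
    return {"id": "a42", "name": name}

def get_publications(author_id: str) -> list[dict]:
    """Pretend to fetch an author's publications."""
    return [{"title": "Paper A", "year": 2023}, {"title": "Paper B", "year": 2021}]

def latest_publication_solution(author_name: str) -> dict:
    """A fixed calling sequence the model fills in with the user's arguments."""
    author = search_author(author_name)        # step 1: resolve the author
    pubs = get_publications(author["id"])      # step 2: list their publications
    return max(pubs, key=lambda p: p["year"])  # step 3: pick the most recent one

if __name__ == "__main__":
    print(latest_publication_solution("Ada Lovelace"))
```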
- Semantic API Alignment: Linking High-level User Goals to APIs [6.494714497852088]
We present a vision to span multiple steps from requirements engineering to implementation using existing libraries.
This approach, which we call Semantic API Alignment (SEAL), aims to bridge the gap between a user's high-level goals and the specific functions of one or more APIs.
arXiv Detail & Related papers (2024-05-07T11:54:32Z)
- Exploring Behaviours of RESTful APIs in an Industrial Setting [0.43012765978447565]
We propose a set of behavioural properties, common to REST APIs, which are used to generate examples of behaviours that these APIs exhibit.
These examples can be used both (i) to further the understanding of the API and (ii) as a source of automatic test cases.
Our approach can generate examples that practitioners deem relevant both for understanding the system and as a source of test cases.
arXiv Detail & Related papers (2023-10-26T11:33:11Z)
- Enhancing API Documentation through BERTopic Modeling and Summarization [0.0]
This paper focuses on the complexities of interpreting Application Programming Interface (API) documentation.
Official API documentation serves as a primary source of information for developers, but it is often extensive and lacks user-friendliness.
Our novel approach employs the strengths of BERTopic for topic modeling and Natural Language Processing (NLP) to automatically generate summaries of API documentation.
arXiv Detail & Related papers (2023-08-17T15:57:12Z)
- ToolLLM: Facilitating Large Language Models to Master 16000+ Real-world APIs [104.37772295581088]
Open-source large language models (LLMs), e.g., LLaMA, remain significantly limited in tool-use capabilities.
We introduce ToolLLM, a general tool-use framework encompassing data construction, model training, and evaluation.
We first present ToolBench, an instruction-tuning dataset for tool use, which is constructed automatically using ChatGPT.
arXiv Detail & Related papers (2023-07-31T15:56:53Z)
- Private-Library-Oriented Code Generation with Large Language Models [52.73999698194344]
This paper focuses on utilizing large language models (LLMs) for code generation in private libraries.
We propose a novel framework that emulates the process of programmers writing private code.
We create four private library benchmarks, including TorchDataEval, TorchDataComplexEval, MonkeyEval, and BeatNumEval.
arXiv Detail & Related papers (2023-07-28T07:43:13Z)
- On the Effectiveness of Pretrained Models for API Learning [8.788509467038743]
Developers frequently use APIs to implement certain functionalities, such as parsing Excel Files, reading and writing text files line by line, etc.
Developers can greatly benefit from automatically generating API usage sequences from natural language queries, enabling faster and cleaner application development.
Existing approaches either use information retrieval models to search for matching API sequences given a query, or use RNN-based encoder-decoder models to generate API sequences; a simplified retrieval-style sketch follows this entry.
arXiv Detail & Related papers (2022-04-05T20:33:24Z)
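As a deliberately simplified illustration of the retrieval-style approach mentioned in the entry above, the sketch below matches a natural language query against a tiny, invented corpus of API usage sequences using TF-IDF cosine similarity; it is a toy baseline for intuition, not the method of any paper listed here.

```python
# Toy retrieval baseline: find the stored API usage sequence whose natural
# language description is closest to the query. The corpus is invented for the demo.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

CORPUS = [
    ("read a text file line by line", "open -> readlines -> close"),
    ("parse an Excel file into rows", "openpyxl.load_workbook -> worksheet.iter_rows"),
    ("write a list of strings to a file", "open -> writelines -> close"),
]

def retrieve_api_sequence(query: str) -> str:
    """Return the API sequence whose description best matches the query."""
    descriptions = [desc for desc, _ in CORPUS]
    matrix = TfidfVectorizer().fit_transform(descriptions + [query])
    sims = cosine_similarity(matrix[-1], matrix[:-1])[0]
    return CORPUS[sims.argmax()][1]

if __name__ == "__main__":
    print(retrieve_api_sequence("how do I read every line of a text file"))
```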